waveletdeboshir committed on
Commit a33e071
1 Parent(s): 8c17c7e

Add usage example

Files changed (1): README.md (+25, −0)
README.md CHANGED
@@ -45,5 +45,30 @@ Model size is 15% less than original whisper-small:
 
 You can fine-tune this model on your data to achieve better performance.
 
+## Usage
+The model can be used in the same way as the original Whisper:
+
+```python
+>>> from transformers import WhisperProcessor, WhisperForConditionalGeneration
+>>> import torchaudio
+
+>>> # load audio
+>>> wav, sr = torchaudio.load("audio.wav")
+
+>>> # load model and processor
+>>> processor = WhisperProcessor.from_pretrained("waveletdeboshir/whisper-small-ru-pruned")
+>>> model = WhisperForConditionalGeneration.from_pretrained("waveletdeboshir/whisper-small-ru-pruned")
+
+>>> input_features = processor(wav[0], sampling_rate=sr, return_tensors="pt").input_features
+
+>>> # generate token ids
+>>> predicted_ids = model.generate(input_features)
+>>> # decode token ids to text
+>>> transcription = processor.batch_decode(predicted_ids, skip_special_tokens=False)
+['<|startoftranscript|><|ru|><|transcribe|><|notimestamps|> Начинаем работу.<|endoftext|>']
+```
+The context tokens can be removed from the start of the transcription by setting `skip_special_tokens=True`.
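As a standalone illustration of what `skip_special_tokens=True` removes: the context tokens all follow a `<|...|>` pattern, so they can also be stripped by hand. The helper below is a hypothetical sketch for illustration, not part of the `transformers` API:

```python
import re

def strip_special_tokens(text: str) -> str:
    # Whisper context tokens look like <|startoftranscript|>, <|ru|>,
    # <|transcribe|>, <|notimestamps|>, <|endoftext|>.
    # Remove every <|...|> marker and trim surrounding whitespace.
    return re.sub(r"<\|[^|]*\|>", "", text).strip()

raw = "<|startoftranscript|><|ru|><|transcribe|><|notimestamps|> Начинаем работу.<|endoftext|>"
print(strip_special_tokens(raw))  # prints: Начинаем работу.
```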
+
 ## Colab for pruning
 TODO