waveletdeboshir committed
Commit c21debc • Parent(s): bb8b24a
Update numbers

README.md CHANGED
@@ -22,15 +22,15 @@ This is a pruned version of [openai/whisper-small](https://huggingface.co/openai
Pruning was done without any fine-tuning. The method from [this post](https://medium.com/m/global-identity-2?redirectUrl=https%3A%2F%2Ftowardsdatascience.com%2Fhow-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90) was used.

## Size
-Only 10% of the tokens were kept, including the special Whisper tokens.
+Only 10% of the tokens were kept: the special Whisper tokens (no language tokens except \<|ru|\> and \<|en|\>, and no timestamp tokens), the 200 most popular tokens from the tokenizer, and the 4000 most popular Russian tokens computed by tokenizing a Russian text corpus.

The model is 15% smaller than the original whisper-small:
| | openai/whisper-small | waveletdeboshir/whisper-small-ru-pruned |
| :------ | :------ | :------ |
| n of parameters | 242 M | 205 M |
-| n of parameters (with proj_out layer) | 281 M |
-| model file size | 967 MB |
-| vocab_size | 51865 |
+| n of parameters (with proj_out layer) | 281 M | 208 M |
+| model file size | 967 MB | 834 MB |
+| vocab_size | 51865 | 4207 |

## Usage
The model can be used in the same way as the original Whisper:
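The actual usage snippet that follows this line in the README lies outside the hunk shown above. For completeness, here is a minimal sketch of what "used as the original Whisper" means with `transformers`: load the pruned checkpoint by its repo id and transcribe a local file. The file name `audio.wav` and the use of `torchaudio` for loading are illustrative assumptions, not taken from the README.

```python
# Minimal usage sketch: the pruned checkpoint is loaded exactly like the
# original whisper-small. "audio.wav" is a placeholder for any Russian
# recording; torchaudio is used here only to obtain a 16 kHz mono waveform.
import torch
import torchaudio
from transformers import WhisperForConditionalGeneration, WhisperProcessor

model_id = "waveletdeboshir/whisper-small-ru-pruned"
processor = WhisperProcessor.from_pretrained(model_id)
model = WhisperForConditionalGeneration.from_pretrained(model_id)

# Load the audio and resample it to the 16 kHz expected by Whisper.
wav, sr = torchaudio.load("audio.wav")
wav = torchaudio.functional.resample(wav, sr, 16000)

inputs = processor(wav[0].numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    predicted_ids = model.generate(inputs.input_features)

print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
```

Because only the vocabulary was pruned and the architecture is unchanged, the call pattern is identical to openai/whisper-small.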
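For the Size section above, the README only names the recipe: keep a small set of special tokens plus the most frequent tokens, then prune the vocabulary without any fine-tuning, following the linked T5 post. The sketch below illustrates that idea and is not the author's actual script: `russian_corpus.txt` is a placeholder path, all special ids are kept for brevity (the README keeps only a handful), and the remapping of the tokenizer files and of the config token ids to the new indices is omitted.

```python
# Rough sketch of the vocabulary-pruning recipe (NOT the author's script).
# Assumptions: "russian_corpus.txt" is a placeholder corpus; all special ids
# are kept for brevity; remapping config token ids / tokenizer files is omitted.
from collections import Counter

import torch
from transformers import WhisperForConditionalGeneration, WhisperTokenizer

tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-small")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# 1. Count token frequencies over a Russian text corpus.
counts = Counter()
with open("russian_corpus.txt", encoding="utf-8") as f:
    for line in f:
        counts.update(tokenizer.encode(line, add_special_tokens=False))

# 2. Choose the ids to keep: special tokens plus the most frequent corpus tokens
#    (the README keeps ~200 generic + 4000 Russian tokens, 4207 ids in total).
keep_ids = sorted(set(tokenizer.all_special_ids) | {i for i, _ in counts.most_common(4200)})
keep = torch.tensor(keep_ids)

# 3. Slice the decoder token embedding down to the kept rows and re-tie proj_out.
old_embed = model.model.decoder.embed_tokens.weight.data
new_embed = torch.nn.Embedding(len(keep_ids), old_embed.shape[1])
new_embed.weight.data = old_embed[keep].clone()
model.model.decoder.embed_tokens = new_embed

model.proj_out = torch.nn.Linear(old_embed.shape[1], len(keep_ids), bias=False)
model.proj_out.weight = model.model.decoder.embed_tokens.weight
model.config.vocab_size = len(keep_ids)

# With the tied proj_out counted once, this should land near the 205 M in the table.
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.0f} M parameters")
```

The numbers in the table follow from this step: with d_model = 768 in whisper-small, dropping (51865 − 4207) embedding rows removes roughly 36.6 M parameters, which matches the 242 M → 205 M drop (and 281 M → 208 M when proj_out is counted separately).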