Update README.md
README.md
CHANGED
@@ -41,7 +41,7 @@ More information needed
 
 ## Training and evaluation data
 ```python
-# datasets for each language
+# datasets for each language from the set {uz: Uzbek, en: English, ru: Russian}
 common_voice_train_uz = load_dataset("mozilla-foundation/common_voice_16_1", "uz", split='train', trust_remote_code=True, token=env('HUGGING_TOKEN'), streaming=True)
 common_voice_train_ru = load_dataset("mozilla-foundation/common_voice_16_1", "ru", split='train', trust_remote_code=True, token=env('HUGGING_TOKEN'), streaming=True)
 common_voice_train_en = load_dataset("mozilla-foundation/common_voice_16_1", "en", split='train', trust_remote_code=True, token=env('HUGGING_TOKEN'), streaming=True)
@@ -57,7 +57,7 @@ common_voice['train'] = concatenate_datasets([common_voice_train_uz, common_voic
 ## Training procedure
 
 Used Trainer from transformers.
-Training and evaluation process are described in the notebook, storing in the following github repository:
+The training and evaluation process is described in the Jupyter notebook stored in the following GitHub repository:
 
 https://github.com/fitlemon/whisper-small-uz-en-ru-lang-id
 
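
The loading calls in the first hunk use `env('HUGGING_TOKEN')`, which the visible part of the README does not define. As a minimal sketch, assuming the token lives in an environment variable of that name, the three calls could be driven from one table of language codes. The names `env`, `LANGUAGES`, `common_voice_kwargs`, and `load_train_streams` are hypothetical and are not taken from the repository.

```python
import os

# Language codes from the README diff, mapped to their names.
LANGUAGES = {"uz": "Uzbek", "en": "English", "ru": "Russian"}


def env(name: str) -> str:
    """Hypothetical stand-in for the env() helper the README calls but does not show."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value


def common_voice_kwargs(lang: str, token: str) -> dict:
    """Build keyword arguments mirroring the load_dataset calls in the diff."""
    if lang not in LANGUAGES:
        raise ValueError(f"unsupported language code: {lang}")
    return {
        "path": "mozilla-foundation/common_voice_16_1",
        "name": lang,
        "split": "train",
        "trust_remote_code": True,
        "token": token,
        "streaming": True,
    }


def load_train_streams(token: str) -> dict:
    """Load one streaming train split per language (needs network access)."""
    # Imported lazily so the helpers above work without the `datasets` package.
    from datasets import load_dataset

    return {
        lang: load_dataset(**common_voice_kwargs(lang, token))
        for lang in LANGUAGES
    }
```

A call such as `streams = load_train_streams(env("HUGGING_TOKEN"))` would then return a dictionary keyed by `uz`, `en`, and `ru`. The `trust_remote_code` argument is copied from the README; whether the installed `datasets` release still accepts it is worth checking before running.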