erkhem-gantulga/whisper-small-mn


Erkhembayar Gantulga committed on Aug 27

Commit 6036d4e • 1 Parent(s): a35f002

Updated README

Specified training datasets

Files changed (1)

README.md +39 -9

README.md CHANGED

@@ -1,15 +1,21 @@
 ---
 language:
 - mn
-license: apache-2.0
 base_model: openai/whisper-small
 tags:
-- generated_from_trainer
 metrics:
 - wer
 model-index:
 - name: Whisper Small Mn - Erkhembayar Gantulga
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -22,19 +28,43 @@ It achieves the following results on the evaluation set:
 - Loss: 0.1561
 - Wer: 19.4492
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
 ### Training hyperparameters

 ---
 language:
 - mn
 base_model: openai/whisper-small
 tags:
+- audio
+- automatic-speech-recognition
+library_name: transformers
 metrics:
 - wer
 model-index:
 - name: Whisper Small Mn - Erkhembayar Gantulga
   results: []
+datasets:
+- mozilla-foundation/common_voice_17_0
+- google/fleurs
+pipeline_tag: automatic-speech-recognition
+license: apache-2.0
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 - Loss: 0.1561
 - Wer: 19.4492
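
For readers unfamiliar with the metric: WER (word error rate) is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python illustration, not the evaluation code used for this model. The function name `word_error_rate` is hypothetical, and reading the reported 19.4492 as a percentage is an assumption based on common Whisper fine-tuning scripts.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage (hypothetical helper, for illustration only)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein recurrence).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("the cat sat down", "the cat sat town"))
```

In practice a library such as `evaluate` or `jiwer` would normally compute this, usually after text normalization, so scores from this sketch may differ slightly from reported ones.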
+## Training and evaluation data
+Datasets used for training:
+- [Common Voice 17.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_17_0)
+- [Google Fleurs](https://huggingface.co/datasets/google/fleurs)
+For training, the Common Voice 17.0 and Google Fleurs datasets were combined:
+```python
+from datasets import Audio, DatasetDict, concatenate_datasets, load_dataset
+
+# Common Voice 17.0 (Mongolian): the train, validation, and validated splits form the training set.
+# Caution: "validated" can contain clips that also appear in "test"; deduplicate if a clean held-out set is needed.
+# `token=True` replaces the deprecated `use_auth_token=True` in recent `datasets` releases.
+common_voice = DatasetDict()
+common_voice["train"] = load_dataset("mozilla-foundation/common_voice_17_0", "mn", split="train+validation+validated", token=True)
+common_voice["test"] = load_dataset("mozilla-foundation/common_voice_17_0", "mn", split="test", token=True)
+# Resample Common Voice audio to 16 kHz, the rate Whisper expects.
+common_voice = common_voice.cast_column("audio", Audio(sampling_rate=16000))
+# Keep only the "audio" and "sentence" columns.
+common_voice = common_voice.remove_columns(
+    ["accent", "age", "client_id", "down_votes", "gender", "locale", "path", "segment", "up_votes", "variant"]
+)
+
+# Google Fleurs (Mongolian): the train and validation splits form the training set.
+google_fleurs = DatasetDict()
+google_fleurs["train"] = load_dataset("google/fleurs", "mn_mn", split="train+validation", token=True)
+google_fleurs["test"] = load_dataset("google/fleurs", "mn_mn", split="test", token=True)
+google_fleurs = google_fleurs.remove_columns(
+    ["id", "num_samples", "path", "raw_transcription", "gender", "lang_id", "language", "lang_group_id"]
+)
+# Match the Common Voice column name so the two datasets share a schema.
+google_fleurs = google_fleurs.rename_column("transcription", "sentence")
+
+# Concatenate the two sources split by split.
+dataset = DatasetDict()
+dataset["train"] = concatenate_datasets([common_voice["train"], google_fleurs["train"]])
+dataset["test"] = concatenate_datasets([common_voice["test"], google_fleurs["test"]])
+```
 ### Training hyperparameters