Erkhembayar Gantulga committed on
Commit
6036d4e
1 Parent(s): a35f002

Updated README

Specified training datasets

Files changed (1)
  1. README.md +39 -9
README.md CHANGED
@@ -1,15 +1,21 @@
  ---
  language:
  - mn
- license: apache-2.0
  base_model: openai/whisper-small
  tags:
- - generated_from_trainer
  metrics:
  - wer
  model-index:
  - name: Whisper Small Mn - Erkhembayar Gantulga
  results: []
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -22,19 +28,43 @@ It achieves the following results on the evaluation set:
  - Loss: 0.1561
  - Wer: 19.4492

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

  ### Training hyperparameters

  ---
  language:
  - mn
  base_model: openai/whisper-small
  tags:
+ - audio
+ - automatic-speech-recognition
+ library_name: transformers
  metrics:
  - wer
  model-index:
  - name: Whisper Small Mn - Erkhembayar Gantulga
  results: []
+ datasets:
+ - mozilla-foundation/common_voice_17_0
+ - google/fleurs
+ pipeline_tag: automatic-speech-recognition
+ license: apache-2.0
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You

  - Loss: 0.1561
  - Wer: 19.4492
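
The `Wer` value reported above is a word error rate. As a minimal sketch of how such a figure is commonly computed, the function below takes the word-level edit distance between a reference transcript and a model hypothesis and divides it by the number of reference words. Reading 19.4492 as a percentage (100 × WER) is an assumption based on how Whisper fine-tuning scripts usually report the metric; the commit itself does not say so. The name `word_error_rate` and the sample sentences are illustrative only and do not come from the model card.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate of `hypothesis` against `reference`.

    WER = (substitutions + deletions + insertions) / number of reference words,
    computed with a standard dynamic-programming edit distance over words.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]  # deleting all i reference words yields an empty hypothesis
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref_words)


# One substitution ("cat" -> "hat") and one deletion ("down") over 4 reference words.
wer_percent = 100 * word_error_rate("the cat sat down", "the hat sat")
print(wer_percent)  # 50.0
```

Under that assumption, a score of 19.4492 would mean roughly one word-level error for every five reference words in the evaluation set.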

+ ## Training and evaluation data

+ Datasets used for training:
+ - [Common Voice 17.0](https://huggingface.co/datasets/mozilla-foundation/common_voice_17_0)
+ - [Google Fleurs](https://huggingface.co/datasets/google/fleurs)

+ The train and test splits of Common Voice 17.0 and Google Fleurs were combined as follows:

+ ```python
+ from datasets import Audio, DatasetDict, concatenate_datasets, load_dataset

+ # Common Voice 17.0, Mongolian ("mn").
+ # `token` replaces the deprecated `use_auth_token` argument (datasets >= 2.14).
+ common_voice = DatasetDict()
+
+ common_voice["train"] = load_dataset("mozilla-foundation/common_voice_17_0", "mn", split="train+validation+validated", token=True)
+ common_voice["test"] = load_dataset("mozilla-foundation/common_voice_17_0", "mn", split="test", token=True)
+
+ # Resample to the 16 kHz rate that Whisper expects.
+ common_voice = common_voice.cast_column("audio", Audio(sampling_rate=16000))
+
+ # Keep only the "audio" and "sentence" columns.
+ common_voice = common_voice.remove_columns(
+     ["accent", "age", "client_id", "down_votes", "gender", "locale", "path", "segment", "up_votes", "variant"]
+ )
+
+ # Google Fleurs, Mongolian ("mn_mn").
+ google_fleurs = DatasetDict()
+
+ google_fleurs["train"] = load_dataset("google/fleurs", "mn_mn", split="train+validation", token=True)
+ google_fleurs["test"] = load_dataset("google/fleurs", "mn_mn", split="test", token=True)

+ # Align the Fleurs schema with Common Voice: drop extra columns and rename the text column.
+ google_fleurs = google_fleurs.remove_columns(
+     ["id", "num_samples", "path", "raw_transcription", "gender", "lang_id", "language", "lang_group_id"]
+ )
+ google_fleurs = google_fleurs.rename_column("transcription", "sentence")

+ # Concatenate the two sources split by split.
+ dataset = DatasetDict()
+ dataset["train"] = concatenate_datasets([common_voice["train"], google_fleurs["train"]])
+ dataset["test"] = concatenate_datasets([common_voice["test"], google_fleurs["test"]])
+ ```
68
 
69
  ### Training hyperparameters
70