boumehdi committed on
Commit f72f5af
1 Parent(s): cf16445

Update README.md

Files changed (1)
  1. README.md +18 -4
README.md CHANGED
@@ -17,7 +17,7 @@ model-index:
  metrics:
  - name: Test WER
  type: wer
- value: 44.30
+ value: 23.44
  ---
  # Wav2Vec2-Large-XLSR-53-Moroccan-Darija
 
@@ -61,7 +61,21 @@ Here's the output: ڭالت ليا هاد السيد هادا ما كاينش ب
 
  ## Evaluation & Previous works
 
- ==================================================================================
+ ====================================
+
+ -v3 (fine-tuned on 10 hours of audio + changed hyperparameters + discovered a huge bug when using the letter ا)
+
+ **Wer**: 23.44
+
+ **Training Loss**: 15.96
+
+ **Validation Loss**: 33.92
+
+ The validation loss is still high also because the validation data contains words that have never been trained before. The solution is to add more data and more hours of training.
+
+ Further training to decrease the training Loss makes this model overfit a little bit.
+
+ ====================================
 
  -v2 (fine-tuned on 9 hours of audio + replaced أ and ى and إ with ا as it creates a lot of problems + tried to standardize the Moroccan Darija)
 
@@ -77,7 +91,7 @@ The validation loss is still high also because the validation data contains word
 
  Further training to decrease the training Loss makes this model overfit a little bit.
 
- ==================================================================================
+ ====================================
 
  -v1 (fine-tuned on 6 hours of audio)
 
@@ -87,7 +101,7 @@ Further training to decrease the training Loss makes this model overfit a little
 
  **Validation Loss**: 45.24
 
- ==================================================================================
+ ====================================
 
  ## Future Work
 
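
The v2 entry in this diff describes replacing أ, ى and إ with ا, and every entry reports a WER figure. The following is a minimal Python sketch of how such a normalization step and a word error rate could be computed together. The function names `normalize_darija` and `word_error_rate` are hypothetical and are not taken from the model's training or evaluation code, and the sketch assumes the reported WER values are percentages.

```python
# Illustrative helpers only; these names are assumptions, not the author's code.

# Map alef-with-hamza-above (U+0623), alef-with-hamza-below (U+0625) and
# alef maksura (U+0649) to bare alef (U+0627), as the v2 entry describes.
ALEF_VARIANTS = str.maketrans({
    "\u0623": "\u0627",
    "\u0625": "\u0627",
    "\u0649": "\u0627",
})


def normalize_darija(text: str) -> str:
    """Return text with the three alef variants replaced by bare alef."""
    return text.translate(ALEF_VARIANTS)


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by reference length, in percent."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)
```

Applying the same normalization to both the reference transcript and the model output before scoring keeps spelling variants of alef from being counted as word errors, which is one plausible reason such a step would lower the measured WER.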