M9and2M committed
Commit 61e2cf6
1 Parent(s): 961f14d

Update README.md

Files changed (1)
  1. README.md +67 -1
README.md CHANGED
@@ -7,4 +7,70 @@ language:
  metrics:
  - wer
  pipeline_tag: automatic-speech-recognition
- ---
+ ---
+
+ # Wolof ASR Model (Based on Whisper-Small)
+
+ ## Model Overview
+
+ This repository hosts an Automatic Speech Recognition (ASR) model for the Wolof language, fine-tuned from OpenAI's Whisper-small model. This model aims to provide accurate transcription of Wolof audio data.
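+
+ As a quick-start sketch, the model can be loaded with the transformers pipeline API (the repository id below is a placeholder, not confirmed by this card; substitute this model's actual Hugging Face id):
+
+ ```python
+ # Minimal inference sketch with the transformers pipeline API.
+ # "M9and2M/wolof-asr-whisper-small" is a hypothetical repo id.
+ from transformers import pipeline
+
+ asr = pipeline(
+     "automatic-speech-recognition",
+     model="M9and2M/wolof-asr-whisper-small",  # placeholder id
+ )
+
+ print(asr("wolof_clip.wav")["text"])  # path to a local audio file
+ ```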
+
+ ## Model Details
+
+ - **Model Base**: Whisper-small
+ - **Loss**: 0.123
+ - **WER**: 0.17
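+
+ For context, WER (word error rate) counts substitutions, deletions and insertions against the reference transcript, WER = (S + D + I) / N, so 0.17 corresponds to roughly 17 errors per 100 reference words. A minimal sketch of how such a score can be computed with the evaluate library (tooling assumed here, not stated in this card):
+
+ ```python
+ # Hypothetical sketch: computing WER with Hugging Face's evaluate library.
+ import evaluate
+
+ wer_metric = evaluate.load("wer")
+ score = wer_metric.compute(
+     predictions=["ñu ngi fi"],      # model transcripts (toy example)
+     references=["ñu ngi fi rekk"],  # ground-truth transcripts
+ )
+ print(score)  # 1 deletion over 4 reference words -> 0.25
+ ```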
+
+ ## Dataset
+
+ The dataset used for training and evaluating this model is a collection from various sources, ensuring a rich and diverse set of Wolof audio samples. From the collection, available in my Hugging Face account, only audio clips shorter than 6 seconds were kept (see the filtering sketch below).
+
+ - **Training Dataset**: 57 hours
+ - **Test Dataset**: 10 hours
+
+ For detailed information about the dataset, please refer to [M9and2M/Wolof_ASR_dataset](https://huggingface.co/datasets/M9and2M/Wolof_ASR_dataset).
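+
+ As an illustration of the 6-second filter described above, here is a minimal sketch using the datasets library (the split name and `audio` column are assumptions, not confirmed by this card):
+
+ ```python
+ # Hypothetical sketch: keep only clips shorter than 6 seconds.
+ from datasets import load_dataset
+
+ ds = load_dataset("M9and2M/Wolof_ASR_dataset", split="train")  # assumed split
+
+ def shorter_than_6s(example):
+     audio = example["audio"]  # assumed audio column
+     return len(audio["array"]) / audio["sampling_rate"] < 6.0
+
+ ds = ds.filter(shorter_than_6s)
+ ```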
+
+ ## Training
+
+ The training process was adapted from the code in [Finetune Wa2vec 2.0 For Speech Recognition](https://github.com/khanld/ASR-Wa2vec-Finetune), written to fine-tune Wav2Vec2.0 for speech recognition. Special thanks to the author, Duy Khanh Le, for providing a robust and flexible training framework.
+
+ The model was trained with the following configuration (an effective batch size of 1 × 8 × 2 = 16):
+
+ - **Seed**: 19
+ - **Training Batch Size**: 1
+ - **Gradient Accumulation Steps**: 8
+ - **Number of GPUs**: 2
+
+ ### Optimizer: AdamW
+
+ - **Learning Rate**: 1e-7
+
+ ### Scheduler: OneCycleLR
+
+ - **Max Learning Rate**: 5e-5
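+
+ For concreteness, a minimal PyTorch sketch of this optimizer/scheduler pairing (the model and step count are placeholders; only the hyperparameters above come from this card):
+
+ ```python
+ # Hypothetical sketch of the AdamW + OneCycleLR setup described above.
+ import torch
+
+ model = torch.nn.Linear(10, 10)  # stand-in for the Whisper model
+ total_steps = 1000               # placeholder; not stated in this card
+
+ # AdamW with the listed learning rate; note that OneCycleLR then
+ # drives the actual per-step rate up to max_lr and back down.
+ optimizer = torch.optim.AdamW(model.parameters(), lr=1e-7)
+ scheduler = torch.optim.lr_scheduler.OneCycleLR(
+     optimizer, max_lr=5e-5, total_steps=total_steps
+ )
+
+ for _ in range(total_steps):
+     optimizer.step()   # after loss.backward() in a real training loop
+     scheduler.step()
+     optimizer.zero_grad()
+ ```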
+
+ ## Acknowledgements
+
+ This model is built on OpenAI's Whisper-small and fine-tuned with a dataset collected from various sources; the training pipeline was adapted from code originally written for Facebook's [Wav2Vec2.0](https://huggingface.co/facebook/wav2vec2-large-xlsr-53). Special thanks to the creators and contributors of the dataset.
+
+ ## More Information
+
+ This model has been developed in the context of my Master's Thesis at ETSIT-UPM, Madrid, under the supervision of Prof. Luis A. Hernández Gómez.
+
+ ## Contact
+
+ For any inquiries or questions, please contact [email protected].