---
model-index:
- name: mHuBERT-147-br
results: []
language:
- br
metrics:
- wer
pipeline_tag: automatic-speech-recognition
---
# mHuBERT-147-br
This model was fine-tuned on the Breton subset of the Mozilla Common Voice 15 dataset and on the [Roadennoù](https://github.com/gweltou/roadennou) dataset.
## Model description
This model was trained to assess the performance of mHuBERT-147 as a base model for fine-tuning a Breton ASR system.
## Intended uses & limitations
This is a research model and should not be used in production.
## Training and evaluation data
90% of the Roadennoù dataset was used for training; the remaining 10%, together with the MCV15-br validation set, was used for validation.
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3.8e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 52
- mixed_precision_training: Native AMP
### Framework versions
- Transformers 4.39.1
- Pytorch 2.0.1+cu117
- Datasets 2.18.0
- Tokenizers 0.15.2