Hi,

Many thanks for extending mHuBERT-147 to Breton! My many French colleagues were super hyped about it! :)
I'm opening this PR to request the addition of "base_model: utter-project/mHuBERT-147" to your README header, so that your fine-tuned model is listed among the "finetunes" of mHuBERT-147 (see here: https://huggingface.co/models?other=base_model:finetune:utter-project/mHuBERT-147).

All the best,

Owner

Absolutely!
I'm still in the process of reviewing different architectures by training models with only 10% of my data, so the Breton mHuBERT-147 will get even better soon!
I've found the performance of mHuBERT-147 to be slightly below wav2vec2-xls-r-300m after fine-tuning on the same Breton dataset, but it is still much more parameter-efficient, and its small size makes it ideal for fast research.
Thank you for the great work!
"Trugarez", in Breton ;)

gweltou changed pull request status to merged

Great to hear! I'll keep an eye out for more mHuBERT-147 Breton models.

To boost your mHuBERT-147 results, you can try tuning the dropout. You can also add some hidden layers before the lm head to give the model a little more capacity. I found that to be very helpful for quick ASR convergence.
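
To make the hidden-layers idea concrete, here is a minimal PyTorch sketch, not the exact code from my setup. The class name `DeeperCTCHead`, the layer count, the ReLU activation, the dropout value and the vocabulary size of 40 are all illustrative assumptions; the hidden size of 768 assumes the base-sized encoder. The random tensor stands in for the encoder output.

```python
import torch
from torch import nn


class DeeperCTCHead(nn.Module):
    """Hypothetical head: a few hidden layers with dropout, then the lm head."""

    def __init__(self, hidden_size=768, vocab_size=40, num_hidden_layers=2, dropout=0.1):
        super().__init__()
        layers = []
        for _ in range(num_hidden_layers):
            # Each extra block keeps the encoder dimension and adds dropout.
            layers += [nn.Linear(hidden_size, hidden_size), nn.ReLU(), nn.Dropout(dropout)]
        self.hidden = nn.Sequential(*layers)
        # Final projection to the character vocabulary (the usual lm head).
        self.lm_head = nn.Linear(hidden_size, vocab_size)

    def forward(self, encoder_states):
        # encoder_states: (batch, frames, hidden_size)
        return self.lm_head(self.hidden(encoder_states))


head = DeeperCTCHead()
frames = torch.randn(2, 50, 768)  # stand-in for encoder output, assumed shape
logits = head(frames)
# nn.CTCLoss expects log-probabilities shaped (frames, batch, vocab).
log_probs = logits.log_softmax(dim=-1).transpose(0, 1)
```

The dropout rate and the number of hidden layers are the two knobs I would sweep first; the values above are placeholders, not recommendations.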

I recently published a blog post with some custom code for ASR fine-tuning; you can check it out here: https://huggingface.co/blog/mzboito/naver-demo-french-slu#1-building-a-french-asr-model-using-mhubert-147

Have a great day! :)
