Improve suggestion

#2
by LukeJacob2023 - opened

Hello, antony66, thanks for contributing this model. I am trying to improve Russian Whisper too. In my opinion, fine-tuning on Common Voice will not improve it much, because I think the original Whisper already collected that dataset and trained on it. I think we need more accurate datasets. Generally, after fine-tuning on such data, the WER can be reduced by about 50%.
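To make the "50% WER reduction" claim concrete, here is a minimal sketch of how WER and a relative reduction could be computed. The function names `wer` and `relative_reduction` and the example sentences are hypothetical, chosen only for illustration; they do not come from any model or dataset mentioned here, and real evaluations usually normalize text (case, punctuation) first.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the words of ref processed
    # so far and the first j words of hyp.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(wer_before: float, wer_after: float) -> float:
    """Fraction by which WER dropped, e.g. 0.5 means a 50% reduction."""
    if wer_before <= 0:
        raise ValueError("wer_before must be positive")
    return (wer_before - wer_after) / wer_before


if __name__ == "__main__":
    # Hypothetical transcripts: one substituted word out of three.
    print(wer("привет как дела", "привет так дела"))
    # Hypothetical scores before and after fine-tuning.
    print(relative_reduction(0.20, 0.10))
```

Note that a 50% reduction here is relative: going from a WER of 0.20 to 0.10, not dropping by 50 percentage points.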

This model improves a lot on its own dataset. I think the original Whisper may not have been trained on it.
https://github.com/sovse/base_rus_whisper_stt

And the original Whisper may not have used this dataset:
https://github.com/snakers4/open_stt

Hey! Yes, this is exactly my thought as well. I had been working on a custom dataset until I had to switch to another task temporarily. Looking forward to returning to this task soon.

LukeJacob2023 changed discussion status to closed
LukeJacob2023 changed discussion status to open
