How to use it like the official model, with word-level timestamps to generate an SRT file?
#2
by
xiaohe384
- opened
Sorry, I couldn't find anything about this in the docs.
You could use the transformers pipeline with the `return_timestamps` argument set to word level.
Check out the transformers docs for the automatic speech recognition pipeline for more details.
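For anyone landing here, this is a rough sketch of how the suggestion above could be turned into an SRT file. It assumes the pipeline returns word-level results under a `"chunks"` key, each with `"text"` and a `(start, end)` `"timestamp"` tuple, which is what the automatic speech recognition pipeline does for Whisper-style models with `return_timestamps="word"`. The helper names (`format_srt_time`, `chunks_to_srt`, `transcribe_to_srt`), the `words_per_cue` grouping, and the model ID placeholder are my own and not part of transformers or this repo.

```python
from typing import Dict, List


def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def chunks_to_srt(chunks: List[Dict], words_per_cue: int = 8) -> str:
    """Group word-level chunks into numbered SRT cues.

    Assumes each chunk looks like {"text": " word", "timestamp": (start, end)}.
    Words are joined with single spaces, which suits space-delimited languages.
    """
    cues = []
    for i in range(0, len(chunks), words_per_cue):
        group = chunks[i:i + words_per_cue]
        start = group[0]["timestamp"][0]
        if start is None:
            start = 0.0
        end = group[-1]["timestamp"][1]
        if end is None:
            # The last word sometimes has no end time; fall back to its start.
            end = group[-1]["timestamp"][0]
            if end is None:
                end = start
        text = " ".join(c["text"].strip() for c in group)
        cues.append(
            f"{len(cues) + 1}\n"
            f"{format_srt_time(start)} --> {format_srt_time(end)}\n"
            f"{text}\n"
        )
    return "\n".join(cues)


def transcribe_to_srt(audio_path: str, model_id: str) -> str:
    """Run the ASR pipeline with word-level timestamps and return SRT text.

    model_id is whatever checkpoint you are using (placeholder, not a real ID).
    """
    from transformers import pipeline  # imported lazily; needs transformers installed

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path, return_timestamps="word")
    return chunks_to_srt(result["chunks"])


if __name__ == "__main__":
    # Hypothetical file name and model ID; replace with your own.
    srt_text = transcribe_to_srt("audio.wav", "your-namespace/your-model")
    with open("audio.srt", "w", encoding="utf-8") as f:
        f.write(srt_text)
```

The conversion is kept separate from the pipeline call so you can test it on a hand-written list of chunks without downloading a model. Depending on your transformers version, long recordings may also need something like `chunk_length_s=30` passed to `pipeline(...)`.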
Hi!
I am trying something similar and I am getting an error when passing `return_timestamps=True` to the pipeline:
ValueError: You are trying to return timestamps, but the generation config is not properly set. Make sure to initialize the generation config with the correct attributes that are needed such as `no_timestamps_token_id`. For more details on how to generate the approtiate config, refer to https://github.com/huggingface/transformers/issues/21878#issuecomment-1451902363
Do you have any idea why this could be happening?
Edit: Sorry, false alarm. I just had to omit the language parameter I was still passing, which is obviously not needed here.