Whisper Large v3 model

#9
by Robis - opened

@aadnk

Hi!

Seems like the new faster-whisper large v3 is available

https://huggingface.co/Purfview/faster-whisper-large-v3-int8

I'm waiting for it to become available in the main repository of faster-whisper.

Though I might try some of the known workarounds for getting large-v3 to run.

Oh OK, I didn't know which one is the official one 😅 I saw this one in a Subtitle Editor issue.

Can you add support for this?
https://github.com/Vaibhavs10/insanely-fast-whisper

Seems like it's possible. The problem is that it doesn't seem to be faster when I use it in conjunction with the VAD.

I'll have to look more into it.

faster-whisper officially supports v3 model.
https://github.com/SYSTRAN/faster-whisper/releases/tag/0.10.0

The most important part is the bit where it says "Change the hub to fetch models from Systran organization".

Right now, it kinda "just works" if you add the model entry below to config.json5:

        {
            "name": "large-v3",
            "url": "Systran/faster-whisper-large-v3",
            "type": "huggingface"
        },
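To check the model outside the web UI, the same repository ID can presumably be passed straight to faster-whisper. This is only a minimal sketch, assuming faster-whisper 0.10.0 or newer is installed. `MODEL_ENTRY`, `resolve_model_id` and `load_model` are hypothetical names for illustration and are not part of the web UI, and the `device` and `compute_type` values are just example settings:

```python
# Hypothetical mirror of the config.json5 entry quoted above.
MODEL_ENTRY = {
    "name": "large-v3",
    "url": "Systran/faster-whisper-large-v3",
    "type": "huggingface",
}


def resolve_model_id(entry):
    """Return the identifier to hand to faster-whisper for a config entry."""
    # Assumption: for "huggingface" entries the "url" field is the Hub repo ID.
    if entry.get("type") == "huggingface":
        return entry["url"]
    return entry["name"]


def load_model(entry, device="auto", compute_type="default"):
    """Load the model with faster-whisper (it is downloaded on first use)."""
    # Imported lazily so the helpers above work without faster-whisper installed.
    from faster_whisper import WhisperModel

    return WhisperModel(
        resolve_model_id(entry), device=device, compute_type=compute_type
    )


if __name__ == "__main__":
    # Only prints the resolved ID; call load_model(MODEL_ENTRY) to actually load it.
    print(resolve_model_id(MODEL_ENTRY))
```

If `load_model(MODEL_ENTRY)` works in the same virtual environment the web UI uses, the installed faster-whisper is probably recent enough and any remaining problem is more likely in the web UI configuration.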

Faster-whisper now officially supports large-v3. 👍

I can't use the large-v3 model in faster-whisper-webui,
but I can use it in whisper-webui.

How can I use the large-v3 model only in the faster-whisper web UI?

@k-ta Have you updated faster-whisper-webui to the latest version? If not, use the command git pull origin if you have checked it out with Git, or download the repository again. In particular, check if config.json5 is the same as on huggingface.co.

@aadnk Yes. The config.json5 file seems to be the latest version, without any issues.
But I got this error:

Loading faster whisper model large-v3 for device None
Deleting source file C:\Users\north\AppData\Local\Temp\gradio\7999992449a827e2d3f1c3ee2f39df8e78c93fd1\V_20191003_094040_vHDR_On.mp4
Traceback (most recent call last):
File "E:\AI\whisper\venv\lib\site-packages\gradio\routes.py", line 442, in run_predict
output = await app.get_blocks().process_api(
File "E:\AI\whisper\venv\lib\site-packages\gradio\blocks.py", line 1389, in process_api
result = await self.call_function(
File "E:\AI\whisper\venv\lib\site-packages\gradio\blocks.py", line 1094, in call_function
prediction = await anyio.to_thread.run_sync(
File "E:\AI\whisper\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "E:\AI\whisper\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "E:\AI\whisper\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "E:\AI\whisper\venv\lib\site-packages\gradio\utils.py", line 703, in wrapper
response = f(*args, **kwargs)
File "E:\AI\whisper\faster-whisper-webui\app.py", line 132, in transcribe_webui_simple_progress
return self.transcribe_webui(modelName, languageName, urlData, multipleFiles, microphoneData, task, vadOptions,
File "E:\AI\whisper\faster-whisper-webui\app.py", line 280, in transcribe_webui
result = self.transcribe_file(model, source.source_path, selectedLanguage, task, vadOptions, scaled_progress_listener, **decodeOptions)
File "E:\AI\whisper\faster-whisper-webui\app.py", line 383, in transcribe_file
result = self.process_vad(audio_path, whisperCallable, self.vad_model, process_gaps, progressListener=progressListener)
File "E:\AI\whisper\faster-whisper-webui\app.py", line 451, in process_vad
return vadModel.transcribe(audio_path, whisperCallable, vadConfig, progressListener=progressListener)
File "E:\AI\whisper\faster-whisper-webui\src\vad.py", line 213, in transcribe
segment_result = whisperCallable.invoke(segment_audio, segment_index, segment_prompt, detected_language, progress_listener=scaled_progress_listener)
File "E:\AI\whisper\faster-whisper-webui\src\whisper\fasterWhisperContainer.py", line 110, in invoke
model: WhisperModel = self.model_container.get_model()
File "E:\AI\whisper\faster-whisper-webui\src\whisper\abstractWhisperContainer.py", line 63, in get_model
self.model = self.cache.get(model_key, self._create_model)
File "E:\AI\whisper\faster-whisper-webui\src\modelCache.py", line 9, in get
result = model_factory()
File "E:\AI\whisper\faster-whisper-webui\src\whisper\fasterWhisperContainer.py", line 57, in _create_model
model = WhisperModel(model_url, device=device, compute_type=self.compute_type)
File "E:\AI\whisper\venv\lib\site-packages\faster_whisper\transcribe.py", line 114, in __init__
model_path = download_model(
File "E:\AI\whisper\venv\lib\site-packages\faster_whisper\utils.py", line 60, in download_model
raise ValueError(
ValueError: Invalid model size 'large-v3', expected one of: tiny.en, tiny, base.en, base, small.en, small, medium.en, medium, large-v1, large-v2

What should I do to resolve this issue?

I was able to resolve it myself. It seems that the error occurred because I hadn't installed cudnn. After installing it, the program is now running smoothly. I apologize for the inconvenience caused.

@k-ta : The error above complains about large-v3 not being a valid model:

File "E:\AI\whisper\venv\lib\site-packages\faster_whisper\utils.py", line 60, in download_model
raise ValueError(
ValueError: Invalid model size 'large-v3', expected one of: tiny.en, tiny, base.en, base, small.en, small, medium.en, medium, large-v1, large-v2

This has since been fixed in the latest version of faster-whisper, so it should be sufficient to just update the repository.
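A quick way to tell whether the installed package is new enough is to compare its version against 0.10.0, the release linked earlier in this thread. This is just a rough sketch using only the standard library. `parse_version`, `supports_large_v3` and `installed_version` are hypothetical helper names, and it assumes the package is installed under the distribution name `faster-whisper`:

```python
from importlib import metadata

# Release mentioned in this thread as adding official large-v3 support.
MIN_VERSION = (0, 10, 0)


def parse_version(text):
    """Turn '0.10.0' into (0, 10, 0); non-numeric suffixes are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def supports_large_v3(version_text):
    """True if the given version string is at least MIN_VERSION."""
    return parse_version(version_text) >= MIN_VERSION


def installed_version(package="faster-whisper"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("faster-whisper is not installed in this environment")
    elif supports_large_v3(version):
        print(f"faster-whisper {version} should accept 'large-v3'")
    else:
        print(f"faster-whisper {version} predates 0.10.0; try upgrading")
```

Run it with the Python interpreter from the same virtual environment that the web UI uses, otherwise it may report a different installation.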

Presumably, you also updated the repository when you installed cudnn, and that is what fixed the issue. Either way, I'm glad it works now. 👍
