Build a demo with Gradio
Now that we’ve fine-tuned a Whisper model for Dhivehi speech recognition, let’s go ahead and build a Gradio demo to showcase it to the community!
The first thing to do is load up the fine-tuned checkpoint using the pipeline()
class - this is very familiar now from
the section on pre-trained models. You can change the model_id
to the namespace of your fine-tuned
model on the Hugging Face Hub, or one of the pre-trained Whisper models
to perform zero-shot speech recognition:
from transformers import pipeline
model_id = "sanchit-gandhi/whisper-small-dv" # update with your model id
pipe = pipeline("automatic-speech-recognition", model=model_id)
Secondly, we’ll define a function that takes the filepath for an audio input and passes it through the pipeline. Here, the pipeline automatically takes care of loading the audio file, resampling it to the correct sampling rate, and running inference with the model. We can then simply return the transcribed text as the output of the function. To ensure our model can handle audio inputs of arbitrary length, we’ll enable chunking as described in the section on pre-trained models:
def transcribe_speech(filepath):
output = pipe(
filepath,
max_new_tokens=256,
generate_kwargs={
"task": "transcribe",
"language": "sinhalese",
}, # update with the language you've fine-tuned on
chunk_length_s=30,
batch_size=8,
)
return output["text"]
We’ll use the Gradio blocks feature to launch two tabs on our demo: one for microphone transcription, and the other for file upload.
import gradio as gr
demo = gr.Blocks()
mic_transcribe = gr.Interface(
fn=transcribe_speech,
inputs=gr.Audio(sources="microphone", type="filepath"),
outputs=gr.components.Textbox(),
)
file_transcribe = gr.Interface(
fn=transcribe_speech,
inputs=gr.Audio(sources="upload", type="filepath"),
outputs=gr.components.Textbox(),
)
Finally, we launch the Gradio demo using the two blocks that we’ve just defined:
with demo:
gr.TabbedInterface(
[mic_transcribe, file_transcribe],
["Transcribe Microphone", "Transcribe Audio File"],
)
demo.launch(debug=True)
This will launch a Gradio demo similar to the one running on the Hugging Face Space:
Should you wish to host your demo on the Hugging Face Hub, you can use this Space as a template for your fine-tuned model.
Click the link to duplicate the template demo to your account: https://huggingface.co/spaces/course-demos/whisper-small?duplicate=true
We recommend giving your space a similar name to your fine-tuned model (e.g. whisper-small-dv-demo) and setting the visibility to “Public”.
Once you’ve duplicated the Space to your account, click “Files and versions” -> “app.py” -> “edit”. Then change the model identifier to your fine-tuned model (line 6). Scroll to the bottom of the page and click “Commit changes to main”. The demo will reboot, this time using your fine-tuned model. You can share this demo with your friends and family so that they can use the model that you’ve trained!
Checkout our video tutorial to get a better understanding of how to duplicate the Space 👉️ YouTube Video
We look forward to seeing your demos on the Hub!
< > Update on GitHub