pere committed on
Commit
02c09bb
1 Parent(s): 3911b06

updated template

Files changed (1)
  1. README.md +26 -0
README.md CHANGED
@@ -230,6 +230,32 @@ $ ./main -l no -m models/nb-large-ggml-model.bin king.wav
  $ ./main -l no -m models/nb-large-ggml-model-q5_0.bin king.wav
  ```
 
+ ### WhisperX and Speaker Diarization
+ Speaker diarization is a technique in natural language processing and automatic speech recognition that identifies and separates the different speakers in an audio recording. It segments the audio according to who is speaking, which improves transcripts of meetings and phone calls. We find that [WhisperX](https://github.com/m-bain/whisperX) is the easiest way to use our models for diarizing speech. WhisperX also uses phoneme-based wav2vec models to improve the alignment of the timestamps, and as of December 2023 it natively supports the nb-wav2vec models. It currently uses [PyAnnote-audio](https://github.com/pyannote/pyannote-audio) for the actual diarization. This package has a fairly strict licence that requires you to agree to its user terms. Follow the instructions below.
+
+ ```bash
+ # Follow the install instructions on https://github.com/m-bain/whisperX
+ # Make sure you have a HuggingFace account and have agreed to the pyannote terms
+
+ # Log in (or supply the HF token on the command line)
+ huggingface-cli login
+
+ # Download a test file
+ wget -N https://github.com/NbAiLab/nb-whisper/raw/main/audio/knuthamsun.mp3
+
+ # Optional. If you get complaints about missing support for Norwegian, reinstall from this commit:
+ pip uninstall whisperx && pip install git+https://github.com/m-bain/whisperx.git@8540ff5985fceee764acbed94f656063d7f56540
+
+ # Transcribe the test file. All transcripts will end up in the directory of the mp3 file
+ whisperx knuthamsun.mp3 --model NbAiLabBeta/nb-whisper-large --language no --diarize
+
+ ```
+
+ You can also run WhisperX from Python. Please take a look at the instructions on the [WhisperX homepage](https://github.com/m-bain/whisperX).
+
+
+
+
  ### API
  Instructions for accessing the models via a simple API are included in the demos under Spaces. Note that these demos are temporary and will only be available for a few weeks.
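
The README text added in this commit describes diarization as segmenting the audio by who is speaking. The snippet below is a minimal Python sketch, not part of the commit, of how such speaker-labelled segments might be merged into readable turns. It assumes each segment is a dictionary with `start`, `end`, `text` and `speaker` keys, modelled on the JSON that WhisperX writes when run with `--diarize`. The sample data and the helper name `merge_speaker_turns` are hypothetical.

```python
from itertools import groupby

# Hypothetical sample data. The field names are an assumption modelled on
# the JSON output of WhisperX with diarization enabled.
segments = [
    {"start": 0.0, "end": 2.1, "text": "God dag.", "speaker": "SPEAKER_00"},
    {"start": 2.1, "end": 4.0, "text": "Hvordan har du det?", "speaker": "SPEAKER_00"},
    {"start": 4.3, "end": 5.2, "text": "Bare bra, takk.", "speaker": "SPEAKER_01"},
    {"start": 5.4, "end": 6.0, "text": "Fint.", "speaker": "SPEAKER_00"},
]


def merge_speaker_turns(segments):
    """Merge consecutive segments from the same speaker into one turn."""
    turns = []
    # groupby only groups adjacent items, which is what we want here:
    # a speaker who returns later starts a new turn.
    for speaker, group in groupby(segments, key=lambda s: s.get("speaker", "UNKNOWN")):
        group = list(group)
        turns.append({
            "speaker": speaker,
            "start": group[0]["start"],
            "end": group[-1]["end"],
            "text": " ".join(s["text"].strip() for s in group),
        })
    return turns


for turn in merge_speaker_turns(segments):
    print(f'[{turn["start"]:6.1f}-{turn["end"]:6.1f}] {turn["speaker"]}: {turn["text"]}')
```

Segments without a `speaker` key are grouped under `UNKNOWN` rather than dropped, since a diarizer may leave some stretches unassigned.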