title: README
emoji: 🚀
colorFrom: yellow
colorTo: green
sdk: static
pinned: false

pyannote.audio is an open-source toolkit for speaker diarization.

Pretrained pipelines reach state-of-the-art performance on most academic benchmarks.

Using it in production?
Consider switching to pyannoteAI for better and faster options.

| Benchmark | v2.1 | v3.1 | pyannoteAI |
| --- | ---: | ---: | ---: |
| AISHELL-4 | 14.1 | 12.2 | 11.2 |
| AliMeeting (channel 1) | 27.4 | 24.4 | 19.3 |
| AMI (IHM) | 18.9 | 18.8 | 15.8 |
| AMI (SDM) | 27.1 | 22.4 | 19.3 |
| AVA-AVD | 66.3 | 50.0 | 44.8 |
| CALLHOME (part 2) | 31.6 | 28.4 | 19.8 |
| DIHARD 3 (full) | 26.9 | 21.7 | 16.8 |
| Earnings21 | 17.0 | 9.4 | 9.1 |
| Ego4D (dev.) | 61.5 | 51.2 | 44.0 |
| MSDWild | 32.8 | 25.3 | 19.8 |
| RAMC | 22.5 | 22.2 | 11.1 |
| REPERE (phase2) | 8.2 | 7.8 | 7.6 |
| VoxConverse (v0.3) | 11.2 | 11.3 | 9.8 |

Diarization error rate (in %)
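
To put the numbers above in perspective, a minimal sketch that computes the relative DER reduction between two columns. The three rows are copied from the table; the `relative_reduction` helper is a hypothetical name used for illustration and is not part of pyannote.audio.

```python
# DER values (in %) copied from the table above for three benchmarks.
der = {
    "AISHELL-4": {"v2.1": 14.1, "v3.1": 12.2, "pyannoteAI": 11.2},
    "DIHARD 3 (full)": {"v2.1": 26.9, "v3.1": 21.7, "pyannoteAI": 16.8},
    "VoxConverse (v0.3)": {"v2.1": 11.2, "v3.1": 11.3, "pyannoteAI": 9.8},
}


def relative_reduction(baseline: float, candidate: float) -> float:
    """Relative DER reduction of `candidate` over `baseline`, in percent.

    Hypothetical helper for illustration. A negative value means the
    candidate has a higher (worse) DER than the baseline.
    """
    return 100.0 * (baseline - candidate) / baseline


for name, row in der.items():
    gain = relative_reduction(row["v3.1"], row["pyannoteAI"])
    print(f"{name}: {gain:.1f}% relative DER reduction over v3.1")
```

A relative figure makes rows with very different absolute error rates easier to compare. It also makes regressions visible: on VoxConverse (v0.3), v3.1 is marginally worse than v2.1.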

Using high-end NVIDIA hardware,

  • v2.1 takes around 1m30s to process 1h of audio
  • v3.1 takes around 1m20s to process 1h of audio
  • On-premise pyannoteAI takes less than 30s to process 1h of audio
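
The timings above can also be read as a real-time factor (processing time divided by audio duration). The sketch below assumes the rounded figures from the list; the pyannoteAI value is an upper bound, since the list only says "less than 30s". `real_time_factor` is a hypothetical helper, not a pyannote.audio function.

```python
AUDIO_SECONDS = 60 * 60  # 1h of audio, as in the list above

# Approximate processing times in seconds, taken from the list above.
# The pyannoteAI entry is an upper bound ("less than 30s").
processing_seconds = {
    "v2.1": 90,
    "v3.1": 80,
    "pyannoteAI (on-premise, upper bound)": 30,
}


def real_time_factor(processing: float, audio: float) -> float:
    """Processing time divided by audio duration (lower is faster)."""
    return processing / audio


for name, seconds in processing_seconds.items():
    rtf = real_time_factor(seconds, AUDIO_SECONDS)
    print(f"{name}: RTF {rtf:.4f}, about {1 / rtf:.0f}x faster than real time")
```

These figures hold only for the high-end NVIDIA hardware mentioned above; other hardware would give different factors.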