pyannote.audio is an open-source toolkit for speaker diarization (also known as the "who speaks when?" problem).
Pretrained pyannote models are among the most popular on the Hugging Face model hub, and they deliver state-of-the-art results in three lines of Python code:
>>> from pyannote.audio import Pipeline
>>> pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization")
>>> who_speaks_when = pipeline("audio.wav")
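To make the "who speaks when?" output more concrete, here is a minimal sketch of how such a result could be summarized. It does not call pyannote.audio: the `turns` list and the `speaking_time` helper are hypothetical stand-ins, and the timestamps and speaker labels are made up for illustration rather than produced by a model.

```python
from collections import defaultdict

# Hypothetical diarization output: (start_seconds, end_seconds, speaker_label).
# These values are illustrative only, not the result of running a real pipeline.
turns = [
    (0.0, 3.5, "SPEAKER_00"),
    (3.5, 7.0, "SPEAKER_01"),
    (7.0, 9.0, "SPEAKER_00"),
]


def speaking_time(turns):
    """Sum the duration of every turn for each speaker."""
    totals = defaultdict(float)
    for start, end, speaker in turns:
        totals[speaker] += end - start
    return dict(totals)


totals = speaking_time(turns)
for speaker, seconds in sorted(totals.items()):
    print(f"{speaker}: {seconds:.1f}s")
```

With a real pipeline result, the same per-speaker summary would be computed from the turns it returns instead of the hard-coded list.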
While the pyannote toolkit will always remain open source, we are considering selling premium models, extensions, or services around the pyannote ecosystem.
Taking two minutes to fill out this survey would mean a lot to us!