Speaker Segmentation
This open-source model performs speaker segmentation and relies on pyannote.audio 2.1. It can serve as a building block for audio processing tasks such as speaker segmentation and speaker diarization. If you are using this model in production, consider pyannoteAI for better and faster options.
Quick Start
Relies on pyannote.audio 2.1: see installation instructions.
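from pyannote.audio import Pipeline

# Load the pretrained pipeline and apply it to an audio file
pipeline = Pipeline.from_pretrained("pyannote/speaker-segmentation")
output = pipeline("audio.wav")
for turn, _, speaker in output.itertracks(yield_label=True):
    # speaker speaks between turn.start and turn.end
    ...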
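As one possible way to consume the turns yielded by the loop above, the sketch below sums the speaking time per speaker label. The helper `speaking_time`, the tuple layout `(start, end, speaker)`, and the example labels are assumptions made for illustration; none of them are part of pyannote.audio.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


# Hypothetical helper for illustration; not part of pyannote.audio.
def speaking_time(turns: Iterable[Tuple[float, float, str]]) -> Dict[str, float]:
    """Sum the duration in seconds of all turns for each speaker label."""
    totals: Dict[str, float] = defaultdict(float)
    for start, end, speaker in turns:
        totals[speaker] += end - start
    return dict(totals)


# Made-up turns standing in for real pipeline output.
example_turns = [
    (0.0, 1.5, "SPEAKER_00"),
    (1.0, 3.0, "SPEAKER_01"),
    (4.0, 4.5, "SPEAKER_00"),
]
print(speaking_time(example_turns))
```

With real output, the same helper could be fed a generator such as `((turn.start, turn.end, speaker) for turn, _, speaker in output.itertracks(yield_label=True))`. Overlapping turns are counted once for each speaker involved, so the totals can add up to more than the length of the audio.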
Important Note
This pipeline does not address speaker diarization.
Documentation
Tags
- pyannote
- pyannote-audio
- pyannote-audio-pipeline
- audio
- voice
- speech
- speaker
- speaker-segmentation
- speaker-diarization
- speaker-change-detection
- voice-activity-detection
- overlapped-speech-detection
- automatic-speech-recognition
Datasets
Gated Access
The collected information will help us better understand the pyannote.audio user base and help its maintainers apply for grants to improve it further. If you are an academic researcher, please cite the relevant papers in your own publications that use the model. If you work for a company, please consider contributing back to pyannote.audio development (e.g. through unrestricted gifts). We also provide scientific consulting services around speaker diarization and machine listening.
| Property | Details |
| --- | --- |
| Company/university | text |
| Website | text |
| I plan to use this model for (task, type of audio data, etc) | text |
License
This project is licensed under the MIT license.
Support
For commercial enquiries and scientific consulting, please contact me.
For technical questions and bug reports, please check the pyannote.audio GitHub repository.
Citation
@inproceedings{Bredin2021,
Title = {{End-to-end speaker segmentation for overlap-aware resegmentation}},
Author = {{Bredin}, Herv{\'e} and {Laurent}, Antoine},
Booktitle = {Proc. Interspeech 2021},
Address = {Brno, Czech Republic},
Month = {August},
Year = {2021},
}
@inproceedings{Bredin2020,
Title = {{pyannote.audio: neural building blocks for speaker diarization}},
Author = {{Bredin}, Herv{\'e} and {Yin}, Ruiqing and {Coria}, Juan Manuel and {Gelly}, Gregory and {Korshunov}, Pavel and {Lavechin}, Marvin and {Fustes}, Diego and {Titeux}, Hadrien and {Bouaziz}, Wassim and {Gill}, Marie-Philippe},
Booktitle = {ICASSP 2020, IEEE International Conference on Acoustics, Speech, and Signal Processing},
Address = {Barcelona, Spain},
Month = {May},
Year = {2020},
}