Speaker Diarization

Developed by pyannote
Speaker diarization model based on pyannote.audio 2.1.1 that automatically detects speaker changes and overlapped speech in audio.
Downloads 910.93k
Release date: 3/2/2022

Model Overview

This model is an end-to-end speaker diarization pipeline. It automatically detects speaker changes, identifies overlapped speech, and segments audio by speaker without requiring the number of speakers to be specified in advance.
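A minimal sketch of how a pipeline like this is typically called through pyannote.audio 2.x. The model identifier pyannote/speaker-diarization@2.1, the access token, and the audio file name are assumptions that this page does not state, and build_speaker_options and diarize are hypothetical helper names introduced only for illustration.

```python
def build_speaker_options(num_speakers=None, min_speakers=None, max_speakers=None):
    """Hypothetical helper: keep only the speaker-count hints that were given."""
    if (min_speakers is not None and max_speakers is not None
            and min_speakers > max_speakers):
        raise ValueError("min_speakers must not exceed max_speakers")
    options = {
        "num_speakers": num_speakers,
        "min_speakers": min_speakers,
        "max_speakers": max_speakers,
    }
    return {name: value for name, value in options.items() if value is not None}


def diarize(audio_path, auth_token,
            model_id="pyannote/speaker-diarization@2.1",  # assumed identifier
            **speaker_options):
    """Run the pretrained pipeline and return (start, end, speaker) tuples."""
    # Imported lazily so the helper above works without pyannote.audio installed.
    from pyannote.audio import Pipeline

    pipeline = Pipeline.from_pretrained(model_id, use_auth_token=auth_token)
    diarization = pipeline(audio_path, **speaker_options)
    return [
        (turn.start, turn.end, speaker)
        for turn, _, speaker in diarization.itertracks(yield_label=True)
    ]


# Fully automatic: no speaker count is passed.
# segments = diarize("audio.wav", "YOUR_TOKEN")
# With a speaker count range:
# segments = diarize("audio.wav", "YOUR_TOKEN",
#                    **build_speaker_options(min_speakers=2, max_speakers=5))
```

With no options the pipeline estimates the number of speakers itself; the optional keywords only narrow that estimate.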

Model Features

Fully automatic processing
Segments audio by speaker without a separate manual voice activity detection step and without the number of speakers being specified
Overlapped speech detection
Identifies and handles speech segments in which several speakers talk at the same time
Speaker count adaptation
Adapts automatically to the number of speakers; a speaker count range can also be specified manually
High performance
Benchmarked on multiple datasets, with a real-time factor (processing time divided by audio duration) of approximately 2.5%
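To make the real-time factor concrete, the sketch below converts it into a processing time. It assumes the usual definition (processing time divided by audio duration); the hardware behind the 2.5% figure is not specified on this page, so actual timings will vary. The function name is hypothetical.

```python
def processing_seconds(audio_seconds, real_time_factor=0.025):
    """Estimated processing time for a recording, given a real-time factor."""
    if audio_seconds < 0 or real_time_factor < 0:
        raise ValueError("duration and real-time factor must be non-negative")
    return audio_seconds * real_time_factor


one_hour = 60 * 60  # seconds of audio
minutes_needed = processing_seconds(one_hour) / 60
print(f"About {minutes_needed:.1f} minutes to process one hour of audio")
```

At a factor of 2.5%, one hour of audio corresponds to about 1.5 minutes of processing.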

Model Capabilities

Speaker diarization
Speaker change detection
Voice activity detection
Overlapped speech detection
Preprocessing support for automatic speech recognition
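Several of these capabilities can be read off a diarization result once it is expressed as (start, end, speaker) tuples. The sketch below is illustrative post-processing in plain Python, not the model's internal method; the tuple layout and both function names are assumptions. A speaker change is defined here simply as the start of a turn whose speaker differs from that of the previous turn.

```python
def overlap_regions(segments):
    """Return (start, end) intervals where two or more speakers are active."""
    events = []
    for start, end, _speaker in segments:
        events.append((start, 1))   # a speaker becomes active
        events.append((end, -1))    # a speaker stops
    # At equal times, process ends before starts so that segments which
    # merely touch are not reported as overlapping.
    events.sort(key=lambda event: (event[0], event[1]))

    regions = []
    active = 0
    overlap_start = None
    for time, delta in events:
        previous = active
        active += delta
        if previous < 2 <= active:
            overlap_start = time
        elif active < 2 <= previous:
            if time > overlap_start:
                regions.append((overlap_start, time))
            overlap_start = None
    return regions


def speaker_change_points(segments):
    """Return start times of turns whose speaker differs from the previous turn."""
    ordered = sorted(segments)
    return [
        current[0]
        for previous, current in zip(ordered, ordered[1:])
        if current[2] != previous[2]
    ]


example = [(0.0, 5.0, "A"), (4.0, 8.0, "B"), (8.0, 10.0, "A")]
print(overlap_regions(example))
print(speaker_change_points(example))
```

In the example, speakers A and B talk simultaneously between 4.0 and 5.0 seconds, and turns change hands at 4.0 and 8.0 seconds.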

Use Cases

Meeting transcription
Meeting transcription speaker diarization
Automatically identifies speech segments from different speakers in meeting recordings
Diarization error rate (DER) of 18.91% on the AMI dataset
Media analysis
Broadcast program speaker analysis
Analyzes speaker changes and overlapped speech in broadcast programs
DER of 20.82% on the This American Life dataset
Speech recognition preprocessing
ASR system preprocessing
Provides speaker diarization information for automatic speech recognition systems
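The DER figures quoted above follow the standard definition: missed speech plus false alarm plus speaker confusion, divided by the total duration of reference speech. The sketch below is a simplified frame-based version for illustration only. It assumes one speaker per frame (so overlapped speech is not scored), assumes hypothesis labels are already mapped to reference labels, and applies no collar; it does not reproduce the evaluation protocol behind the numbers on this page. All names are hypothetical.

```python
def frame_labels(segments, duration, step=0.01):
    """Turn (start, end, speaker) tuples into one label (or None) per frame."""
    frame_count = int(round(duration / step))
    labels = [None] * frame_count
    for start, end, speaker in segments:
        first = int(round(start / step))
        last = min(frame_count, int(round(end / step)))
        for index in range(first, last):
            labels[index] = speaker
    return labels


def diarization_error_rate(reference, hypothesis, duration, step=0.01):
    """Simplified DER: (missed + false alarm + confusion) / reference speech."""
    ref = frame_labels(reference, duration, step)
    hyp = frame_labels(hypothesis, duration, step)
    missed = sum(1 for r, h in zip(ref, hyp) if r is not None and h is None)
    false_alarm = sum(1 for r, h in zip(ref, hyp) if r is None and h is not None)
    confusion = sum(
        1 for r, h in zip(ref, hyp)
        if r is not None and h is not None and r != h
    )
    total = sum(1 for r in ref if r is not None)
    if total == 0:
        raise ValueError("reference contains no speech")
    return (missed + false_alarm + confusion) / total


reference = [(0.0, 10.0, "A")]
hypothesis = [(0.0, 8.0, "A"), (8.0, 9.0, "B")]
der = diarization_error_rate(reference, hypothesis, duration=10.0, step=1.0)
print(f"DER = {der:.0%}")
```

In this toy case one second is attributed to the wrong speaker and one second of speech is missed, so 2 of 10 reference seconds are in error.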