Speaker Diarization 3.0

Developed by pyannote
Speaker diarization pipeline trained with pyannote.audio 3.0.0. It supports automatic voice activity detection, speaker change detection, and overlapping speech detection.
Downloads 463.91k
Release date: 9/22/2023

Model Overview

This model performs speaker diarization: it automatically identifies the different speakers in an audio recording and the time segments in which each one is active. It processes mono audio sampled at 16 kHz.
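
Because the model works on mono audio sampled at 16 kHz, it can be useful to check a recording's format before processing it. The following is a minimal sketch that uses only Python's standard wave module. The helper name is_mono_16k is hypothetical and not part of pyannote.audio, and the check only applies to uncompressed WAV files.

```python
import wave

EXPECTED_CHANNELS = 1      # mono
EXPECTED_RATE_HZ = 16000   # 16 kHz


def is_mono_16k(path):
    """Return True if the WAV file at `path` is mono and sampled at 16 kHz.

    Hypothetical helper for illustration; not part of pyannote.audio.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getnchannels() == EXPECTED_CHANNELS
            and wav.getframerate() == EXPECTED_RATE_HZ
        )
```

A file that fails this check would need to be downmixed or resampled by some other tool before it matches the format described above.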

Model Features

Automatic voice activity detection
No separate voice activity detection step is required; the pipeline identifies speech activity automatically.
Automatic speaker count inference
Infers the number of speakers in the audio automatically; the number can also be specified manually.
Overlapping speech processing
Detects and handles segments in which several speakers talk at the same time.
Multi-dataset training
Trained on a combination of datasets, including AISHELL, AliMeeting, and AMI, for broad applicability.

Model Capabilities

Speaker diarization
Voice activity detection
Speaker change detection
Overlapping speech detection
Automatic speaker counting
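
To make these capabilities more concrete, the sketch below post-processes a diarization result represented as plain (start, end, speaker) tuples, with times in seconds. This representation, the function names, and the sample values are illustrative assumptions and not the pipeline's actual output type. The code only shows how a speaker count and overlapping-speech regions could be derived from such segments, assuming that segments belonging to the same speaker never overlap each other.

```python
def count_speakers(segments):
    """Return the number of distinct speaker labels in the segments."""
    return len({speaker for _, _, speaker in segments})


def overlap_regions(segments):
    """Return (start, end) regions where two or more speakers are active.

    Assumes segments of the same speaker do not overlap each other.
    """
    # Sweep line: +1 when a segment starts, -1 when it ends.
    events = []
    for start, end, _ in segments:
        events.append((start, 1))
        events.append((end, -1))
    # At equal times, process ends (-1) before starts (+1) so that
    # segments that merely touch are not reported as overlapping.
    events.sort(key=lambda event: (event[0], event[1]))

    regions = []
    active = 0
    overlap_start = None
    for time, delta in events:
        previous = active
        active += delta
        if previous < 2 <= active:
            overlap_start = time          # overlap begins
        elif active < 2 <= previous:
            if time > overlap_start:
                regions.append((overlap_start, time))  # overlap ends
    return regions


# Made-up example segments: speakers A and B overlap between 3.0 s and 4.0 s.
segments = [(0.0, 4.0, "A"), (3.0, 6.0, "B"), (7.0, 9.0, "A")]
print(count_speakers(segments))   # 2
print(overlap_regions(segments))  # [(3.0, 4.0)]
```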

Use Cases

Meeting transcription
Speaker diarization for meeting transcription
Automatically identifies the different speakers in a meeting recording and when each one speaks
Diarization error rate (DER): 12.3% on the AISHELL-4 dataset
Speech analysis
Multi-speaker speech analysis
Analyzes audio files that contain multiple speakers and identifies the time segments in which each speaker is active
Diarization error rate (DER): 19.0% on the AMI dataset
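
The DER figures above are diarization error rates. As commonly defined, DER is the sum of false alarm, missed detection, and speaker confusion durations divided by the total duration of reference speech. The sketch below computes it from those four durations; the function name and the example numbers are made up for illustration and are not taken from the evaluation behind the reported results.

```python
def diarization_error_rate(false_alarm, missed_detection, confusion, total_speech):
    """Compute DER from error durations and total reference speech (seconds).

    DER = (false alarm + missed detection + speaker confusion) / total speech
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (false_alarm + missed_detection + confusion) / total_speech


# Hypothetical durations in seconds, chosen only to illustrate the formula.
der = diarization_error_rate(
    false_alarm=2.0,
    missed_detection=5.0,
    confusion=3.0,
    total_speech=100.0,
)
print(f"DER = {der:.1%}")  # DER = 10.0%
```

Because false alarms are counted in the numerator but not in the denominator, DER can exceed 100% on recordings where the system reports much more speech than the reference contains.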