# Speaker diarization

Speaker Diarization 2.5

A speaker diarization model derived from pyannote/speaker-diarization-3.0 that uses speechbrain/spkrec-ecapa-voxceleb for speaker embeddings and is reported to perform better in certain tests.

Speaker Analysis

Segmentation 3.0

This is an audio segmentation model capable of detecting speaker changes, voice activity, and overlapping speech, suitable for audio analysis in multi-speaker scenarios.

Audio Processing

Kotoba Whisper V2.2

A Japanese automatic speech recognition model based on Whisper that integrates speaker diarization and punctuation restoration.

Speech Recognition

Transformers Japanese

Speaker Segmentation Fine Tuned Callhome Jpn

This is a speaker diarization model fine-tuned from the pyannote/segmentation-3.0 base model, specifically optimized for Japanese telephone conversation scenarios.

Speaker Analysis

Pyannote Segmentation 30

This is an audio processing model for speaker diarization, capable of detecting speech activity, overlapping speech, and multiple speakers.

Audio Processing

Speaker Diarization Optimized

The pyannote.audio speaker diarization pipeline, which automatically detects speaker changes in audio and splits the speech into per-speaker segments.

Speaker Analysis
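To make "detecting speaker changes and splitting speech into segments" concrete, here is a minimal sketch in plain Python. It is not the pyannote.audio implementation. It assumes a hypothetical input of one speaker label per fixed-length frame (`None` for silence) and merges runs of identical labels into turns. The function name `frames_to_turns` and the sample labels are illustrative only.

```python
def frames_to_turns(labels, frame_duration):
    """Merge per-frame speaker labels into (start, end, speaker) turns.

    `labels` holds one label per frame, with None meaning "no speech".
    `frame_duration` is the length of one frame in seconds.
    Both are assumptions made for this illustration.
    """
    turns = []
    current, start_index = None, 0
    # A trailing None acts as a sentinel that closes the last open turn.
    for index, label in enumerate(list(labels) + [None]):
        if label != current:
            # The label changed: this frame boundary is a speaker change point.
            if current is not None:
                turns.append(
                    (start_index * frame_duration, index * frame_duration, current)
                )
            current, start_index = label, index
    return turns


example_turns = frames_to_turns(["A", "A", None, "B", "B", "B", "A"], 0.5)
```

Each boundary where the label changes is treated as a speaker change, and silence frames simply end the current turn without opening a new one.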

Segmentation 3.0

This is a powerset-encoded speaker diarization model capable of processing 10-second audio clips to identify multiple speakers and their overlapping speech.

Speaker Analysis
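As a rough illustration of what "powerset-encoded" means, the sketch below enumerates the classes such a model would predict per frame: each class is one allowed set of simultaneously active speakers. The defaults of 3 speakers per clip and at most 2 speaking at once are assumptions taken from how pyannote/segmentation-3.0 is commonly described, not from this listing, and the function names are hypothetical.

```python
from itertools import combinations


def powerset_classes(num_speakers=3, max_simultaneous=2):
    """List every allowed set of simultaneously active speakers."""
    classes = []
    for size in range(max_simultaneous + 1):
        # size 0 is the "no speech" class, size 1 a single speaker,
        # size 2 a pair of overlapping speakers, and so on.
        classes.extend(combinations(range(num_speakers), size))
    return classes


def to_multilabel(class_index, classes, num_speakers=3):
    """Convert one powerset class index into a per-speaker 0/1 activity vector."""
    active = classes[class_index]
    return [1 if speaker in active else 0 for speaker in range(num_speakers)]


classes = powerset_classes()
```

Under these assumptions there are 7 classes: no speech, three single-speaker classes, and three two-speaker overlap classes. Predicting exactly one class per frame lets overlapping speech be handled as ordinary multi-class classification rather than as independent per-speaker decisions.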

Pyannote Segmentation

This is an end-to-end speaker diarization model that supports voice activity detection, overlapped speech detection, and resegmentation.

Speaker Analysis

Pyannote Speaker Diarization Endpoint

A speaker diarization model based on pyannote.audio 2.0 that automatically detects speaker changes and speech activity in audio.

Speaker Analysis

Speaker Diarization

A speaker diarization model based on pyannote.audio 2.1.1 that automatically detects speaker changes and overlapped speech in audio.

Speaker Analysis

Overlapped Speech Detection

A pre-trained model for detecting overlapped speech in audio, capable of identifying time segments where two or more speakers are active simultaneously.

Speaker Analysis
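The notion of "time segments where two or more speakers are active simultaneously" can be sketched without any model. The example below is not the pre-trained detector, which works on raw audio. It assumes speaker turns are already available as hypothetical `(start, end, speaker)` tuples, that a single speaker's own turns do not overlap each other, and it uses a simple sweep over start and end events.

```python
def overlapped_regions(turns):
    """Return (start, end) intervals where at least two speakers are active.

    `turns` is a list of (start, end, speaker) tuples with times in seconds.
    """
    events = []
    for start, end, _speaker in turns:
        events.append((start, 1))   # a speaker becomes active
        events.append((end, -1))    # a speaker becomes inactive
    # Sort by time; at equal times, ends (-1) are processed before starts (+1)
    # so that turns which merely touch do not count as overlap.
    events.sort(key=lambda event: (event[0], event[1]))

    regions, active, overlap_start = [], 0, None
    for time, delta in events:
        previous = active
        active += delta
        if previous < 2 <= active:
            overlap_start = time            # overlap begins
        elif previous >= 2 > active:
            if time > overlap_start:
                regions.append((overlap_start, time))
            overlap_start = None            # overlap ends
    return regions


sample_turns = [(0.0, 5.0, "A"), (4.0, 8.0, "B"), (7.5, 9.0, "A")]
sample_overlaps = overlapped_regions(sample_turns)
```

For the sample turns, the overlapped regions are 4.0 to 5.0 seconds and 7.5 to 8.0 seconds.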

© 2025 AIbase