# Multi-speaker separation

Diar Sortformer 4spk V1
An end-to-end speaker diarization model based on the Sortformer architecture. It resolves the permutation problem in diarization by ordering speech segments according to speaker arrival time, and it supports up to 4 speakers.
Audio Processing
nvidia
385.49k
36
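The arrival-time ordering mentioned in the Diar Sortformer description can be illustrated with a small sketch. It is not the model's training procedure or any NVIDIA API; the function name `relabel_by_arrival` and the segment tuples are hypothetical, chosen only to show why a fixed "first to speak gets index 0" convention removes label ambiguity between two otherwise identical diarization outputs.

```python
from typing import Dict, List, Tuple

# Hypothetical segment format: (start_sec, end_sec, speaker_label)
Segment = Tuple[float, float, str]


def relabel_by_arrival(segments: List[Segment]) -> List[Tuple[float, float, int]]:
    """Map arbitrary speaker labels to 0, 1, 2, ... in order of first appearance."""
    # Sort by start time (then end time) so "arrival" is well defined.
    ordered = sorted(segments, key=lambda s: (s[0], s[1]))
    index_of: Dict[str, int] = {}
    relabeled = []
    for start, end, label in ordered:
        if label not in index_of:
            # The next unseen speaker gets the next free index.
            index_of[label] = len(index_of)
        relabeled.append((start, end, index_of[label]))
    return relabeled


# Two made-up outputs that differ only in how the speakers are named.
hyp_a = [(0.0, 1.5, "x"), (1.2, 3.0, "y"), (3.5, 4.0, "x")]
hyp_b = [(0.0, 1.5, "spk7"), (1.2, 3.0, "spk2"), (3.5, 4.0, "spk7")]

# After relabeling, both describe the same speaker assignment.
print(relabel_by_arrival(hyp_a) == relabel_by_arrival(hyp_b))
```

With a canonical ordering like this, two label assignments can be compared directly instead of searching over every speaker permutation.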
Wsj0 2mix Skim Small Causal
A speech enhancement model trained with the ESPnet framework, designed for speech separation on the wsj0_2mix dataset.
Audio Enhancement English
lichenda
26
1
Sepformer Libri3mix
Apache-2.0
An audio source separation model based on the SepFormer architecture and trained on the Libri3Mix dataset. It separates mixed speech into multiple independent sources.
Sound Separation English
speechbrain
1,511
7
Sepformer Libri2mix
Apache-2.0
An audio source separation model based on the SepFormer architecture and trained on the Libri2Mix dataset. It separates independent sources from mixed audio.
Sound Separation English
speechbrain
783
6
Convtasnet Libri3Mix Sepnoisy
A ConvTasNet model trained with the Asteroid framework on the Libri3Mix dataset for separating noisy audio mixtures.
Sound Separation
mpariente
30
0
Sepformer Wsj03mix
Apache-2.0
An audio source separation model based on the SepFormer architecture and trained on the WSJ0-3Mix dataset. It separates mixed speech into independent speech sources.
Sound Separation English
speechbrain
158
6
Convtasnet Libri2Mix Sepnoisy 16k
A ConvTasNet model trained with the Asteroid framework on the Libri2Mix dataset for noisy speech separation.
Sound Separation
JorisCos
8,407
1
Convtasnet Libri3Mix Sepnoisy 8k
A ConvTasNet model trained with the Asteroid framework to separate 3 independent audio sources from mixed audio, optimized for noisy speech sampled at 8 kHz.
Sound Separation
JorisCos
33
2
Convtasnet Libri3Mix Sepclean 16k
A ConvTasNet model trained with the Asteroid framework on the Libri3Mix dataset for speech separation, taking 16 kHz audio as input.
Sound Separation
JorisCos
48
0