Sepformer Whamr16k

Developed by speechbrain
An audio source separation model based on the SepFormer architecture, trained on the WHAMR! dataset and intended for audio sampled at 16 kHz.
Downloads 4,702
Release Time: 3/2/2022

Model Overview

The model is implemented in SpeechBrain and separates individual audio sources from a mixture, in particular when the mixture also contains environmental noise and reverberation.
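
As a minimal sketch of how such a model is typically loaded and run through SpeechBrain's pretrained-model interface: the Hub identifier `speechbrain/sepformer-whamr16k`, the `savedir` path, and the `output_paths` naming helper are assumptions made for illustration and are not stated on this page. The heavy dependencies are imported inside the function, so the module itself loads without them.

```python
"""Sketch: run a pretrained SepFormer model on a single 16 kHz mixture file."""

MODEL_ID = "speechbrain/sepformer-whamr16k"  # assumed Hugging Face Hub ID
SAMPLE_RATE = 16000  # sampling rate stated for this model


def output_paths(stem, n_sources):
    """Hypothetical naming helper: one output file name per estimated source."""
    return [f"{stem}_source{i + 1}.wav" for i in range(n_sources)]


def separate_to_files(mixture_path, stem="separated"):
    """Separate a mixture file and write each estimated source to disk."""
    # Imported lazily: these packages are only needed when separation runs.
    import torchaudio
    try:
        from speechbrain.inference.separation import SepformerSeparation
    except ImportError:  # older SpeechBrain releases expose it here instead
        from speechbrain.pretrained import SepformerSeparation

    model = SepformerSeparation.from_hparams(
        source=MODEL_ID,
        savedir="pretrained_models/sepformer-whamr16k",  # assumed cache folder
    )
    # separate_file returns a tensor shaped [batch, time, n_sources].
    est_sources = model.separate_file(path=mixture_path)
    paths = output_paths(stem, est_sources.shape[-1])
    for i, out_path in enumerate(paths):
        # Each slice is [1, time], which torchaudio treats as a mono signal.
        torchaudio.save(out_path, est_sources[:, :, i].detach().cpu(), SAMPLE_RATE)
    return paths
```

Calling `separate_to_files("mixture.wav")` would download the checkpoint on first use, so it needs network access plus the `speechbrain` and `torchaudio` packages.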

Model Features

Efficient Audio Separation
Separates individual audio sources from mixtures that contain environmental noise and reverberation.
Transformer-based Architecture
Utilizes the SepFormer architecture, leveraging the self-attention mechanism of Transformers to enhance separation performance.
16kHz Sampling Rate Support
Trained on audio sampled at 16 kHz; audio recorded at a different sampling rate should be resampled to 16 kHz before separation.
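
Because the model targets 16 kHz audio, it can help to check a file's sampling rate before passing it in. The following is a small sketch using only Python's standard `wave` module; it assumes uncompressed PCM WAV input, and the helper names `wav_sample_rate` and `needs_resampling` are made up for this example.

```python
import wave

TARGET_RATE = 16000  # sampling rate the model is described as supporting


def wav_sample_rate(path):
    """Return the sampling rate (Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, target_rate=TARGET_RATE):
    """True when the file's sampling rate differs from the target rate."""
    return wav_sample_rate(path) != target_rate
```

A file for which `needs_resampling` returns `True` would have to be converted to 16 kHz with an audio tool or library of your choice before separation.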

Model Capabilities

Audio Source Separation
Speech Separation
Noise Suppression

Use Cases

Speech Processing
Meeting Recording Separation
Separate individual speech signals of speakers from multi-person meeting recordings.
Achieves 13.5 dB SI-SNRi (scale-invariant signal-to-noise ratio improvement) on the WHAMR! test set.
Speech Enhancement in Noisy Environments
Extract clear speech signals from noisy environments.
Performs well on datasets containing environmental noise and reverberation.
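
The SI-SNRi figure quoted above measures how much the scale-invariant signal-to-noise ratio of the separated output improves over that of the unprocessed mixture. The sketch below follows the commonly used definition of the metric in plain Python; it is illustrative only and is not the evaluation code behind the reported number. The `eps` stabiliser and the function names are assumptions.

```python
import math


def _zero_mean(signal):
    """Remove the mean so the metric ignores any constant offset."""
    mean = sum(signal) / len(signal)
    return [value - mean for value in signal]


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between an estimate and a clean reference."""
    est = _zero_mean(estimate)
    ref = _zero_mean(reference)
    # Project the estimate onto the reference to get the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]
    # Everything left after removing the target component counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise) + eps
    return 10.0 * math.log10(target_energy / noise_energy + eps)


def si_snr_improvement(estimate, mixture, reference):
    """SI-SNRi: SI-SNR of the estimate minus SI-SNR of the raw mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)
```

Rescaling the estimate by a constant factor leaves `si_snr` essentially unchanged, which is what makes the metric scale-invariant.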