
SepFormer WHAMR!

Developed by SpeechBrain
SepFormer is a Transformer-based audio source separation model; this checkpoint is trained on the WHAMR! dataset to separate mixed speech signals.
Downloads 1,692
Release Time: 3/2/2022

Model Overview

This model uses the SepFormer architecture for audio source separation: it isolates individual speech sources from a mixed recording and remains effective in the presence of environmental noise and reverberation.
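
As a quick illustration, the pretrained checkpoint can be loaded through SpeechBrain's pretrained-model interface. The sketch below follows the usual SpeechBrain usage pattern; the speechbrain/sepformer-whamr source name, the 8 kHz mono input file, and the output file names are assumptions to verify against the official model card.

```python
# Minimal sketch: separating a two-speaker mixture with the pretrained
# SepFormer-WHAMR checkpoint via SpeechBrain's pretrained interface.
# Assumes `pip install speechbrain torchaudio` and an 8 kHz mono input file.
import torchaudio
from speechbrain.pretrained import SepformerSeparation as separator

model = separator.from_hparams(
    source="speechbrain/sepformer-whamr",          # HuggingFace model id (assumed)
    savedir="pretrained_models/sepformer-whamr",   # local cache directory
)

# separate_file returns a tensor of shape [batch, time, n_sources]
est_sources = model.separate_file(path="mixture.wav")  # placeholder file name

# Save each estimated source as its own 8 kHz WAV file
for i in range(est_sources.shape[-1]):
    torchaudio.save(
        f"source{i + 1}.wav",
        est_sources[:, :, i].detach().cpu(),
        8000,
    )
```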

Model Features

Transformer-based Separation Architecture
Built on the SepFormer architecture, which applies Transformer self-attention to the speech separation task.
Noise and Reverberation Robustness
Trained on the WHAMR! dataset, whose mixtures include environmental noise and reverberation, so the model remains robust to both.
High Performance Metrics
Achieves 13.7 dB SI-SNRi and 12.7 dB SDRi on the WHAMR! test set (the SI-SNRi metric is sketched just below).
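
For context, SI-SNRi is the improvement in scale-invariant signal-to-noise ratio of the separated estimate over the unprocessed mixture. Below is a minimal sketch of the standard definition; the function names are illustrative and not part of SpeechBrain.

```python
# Sketch of the standard SI-SNR and SI-SNRi definitions used to report
# separation quality. Inputs are 1-D NumPy arrays of equal length.
import numpy as np

def si_snr(estimate: np.ndarray, reference: np.ndarray) -> float:
    """Scale-invariant SNR in dB between an estimate and the clean reference."""
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to get its target component
    target = (np.dot(estimate, reference) / np.dot(reference, reference)) * reference
    noise = estimate - target
    return 10 * np.log10(np.sum(target**2) / np.sum(noise**2))

def si_snri(estimate: np.ndarray, mixture: np.ndarray, reference: np.ndarray) -> float:
    """Improvement of the separated estimate over the raw mixture, in dB."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)
```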

Model Capabilities

Speech Separation
Audio Source Separation
Noise Suppression
Dereverberation

Use Cases

Speech Enhancement
Meeting Recording Separation
Separates individual speakers' voices from multi-speaker meeting recordings
Improves speech clarity and intelligibility
Noisy Environment Speech Separation
Isolates target speech from recordings with background noise (a resampling sketch follows this list)
Enhances speech quality for subsequent processing
Audio Processing
Music Vocal Separation
Separates vocals and accompaniment from music recordings
Facilitates music production and post-processing
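
Real-world recordings rarely arrive at the sample rate the checkpoint expects. The sketch below downmixes to mono and resamples before separation; the 8 kHz target rate is an assumption based on the WHAMR! training data, and the input file name is a placeholder.

```python
# Sketch: preparing an arbitrary noisy recording for the SepFormer-WHAMR
# checkpoint by downmixing to mono and resampling, then running separation.
# The 8 kHz target rate is an assumption based on the WHAMR! training data.
import torchaudio
from speechbrain.pretrained import SepformerSeparation as separator

TARGET_SR = 8000

def load_as_8khz_mono(path: str):
    waveform, sr = torchaudio.load(path)           # [channels, time]
    waveform = waveform.mean(dim=0, keepdim=True)  # downmix to mono
    if sr != TARGET_SR:
        waveform = torchaudio.functional.resample(waveform, sr, TARGET_SR)
    return waveform

model = separator.from_hparams(
    source="speechbrain/sepformer-whamr",
    savedir="pretrained_models/sepformer-whamr",
)

mix = load_as_8khz_mono("noisy_recording.wav")     # placeholder file name
# separate_batch expects a [batch, time] mixture tensor
est_sources = model.separate_batch(mix)
print(est_sources.shape)                           # [1, time, n_sources]
```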