
SepFormer WHAM!

Developed by speechbrain
This is an audio source separation model based on the SepFormer architecture and trained on the WHAM! dataset; it separates individual speech sources from noisy mixed audio.
Downloads 1,828
Release Time: 3/2/2022

Model Overview

The model uses a Transformer-based architecture (SepFormer) for audio source separation and is particularly suited to mixed speech signals that contain environmental noise.
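To make the usage concrete, here is a minimal sketch, assuming SpeechBrain is installed, that loads this checkpoint through SpeechBrain's pretrained-model interface and separates a local file; the input file name noisy_mix.wav is a placeholder for this example, not part of the model card.

```python
# Minimal sketch: separating a noisy mixture with the pretrained
# SepFormer-WHAM! checkpoint via SpeechBrain's pretrained-model interface.
# Assumes `pip install speechbrain` and an 8 kHz input file "noisy_mix.wav".
from speechbrain.pretrained import SepformerSeparation
# Note: newer SpeechBrain releases expose this class under
# speechbrain.inference.separation instead of speechbrain.pretrained.

model = SepformerSeparation.from_hparams(
    source="speechbrain/sepformer-wham",          # Hugging Face model id
    savedir="pretrained_models/sepformer-wham",   # local cache directory
)

# Returns a tensor of shape [batch, time, n_sources] with the estimated sources.
est_sources = model.separate_file(path="noisy_mix.wav")
print(est_sources.shape)
```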

Model Features

High-Performance Separation
Achieves separation performance of 16.3 dB SI-SNRi and 16.7 dB SDRi on the WHAM! test set (the SI-SNR improvement metric is illustrated in the sketch after this list).
Environmental Noise Handling
Specifically optimized for mixed speech signals with environmental noise.
Transformer-Based
Utilizes the advanced SepFormer architecture with attention mechanisms for efficient separation.
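Because the reported numbers are improvement metrics, the following sketch shows how the SI-SNR improvement (SI-SNRi) could be computed for a single utterance; the tensors estimate, target, and mixture are illustrative placeholders, and this is not the official evaluation script.

```python
# Sketch of the SI-SNR improvement (SI-SNRi) metric reported above.
# `estimate`, `target`, and `mixture` are assumed to be 1-D torch tensors
# of equal length; names and data are illustrative only.
import torch

def si_snr(estimate: torch.Tensor, target: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Scale-invariant signal-to-noise ratio in dB."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to isolate the scaled target component.
    s_target = (torch.dot(estimate, target) / (target.pow(2).sum() + eps)) * target
    e_noise = estimate - s_target
    return 10 * torch.log10(s_target.pow(2).sum() / (e_noise.pow(2).sum() + eps))

def si_snr_improvement(estimate, target, mixture):
    # Improvement relative to using the unprocessed mixture as the estimate.
    return si_snr(estimate, target) - si_snr(mixture, target)
```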

Model Capabilities

Audio Source Separation
Speech Separation
Noisy Environment Speech Processing

Use Cases

Speech Processing
Meeting Recording Separation
Separate individual speakers' voices from multi-person meeting recordings.
Improves speech recognition accuracy.
Noisy Environment Speech Enhancement
Extract clear speech from recordings with background noise (see the sketch after this list).
Enhances speech quality and intelligibility.
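As a sketch of the noisy-speech scenario above (assuming the 8 kHz, single-channel input documented for WHAM!-trained checkpoints), the snippet below resamples a recording, separates it, and writes each estimated source to its own WAV file; noisy_meeting.wav and the output file names are placeholders.

```python
# Sketch: apply the separator to an arbitrary noisy recording and save the
# estimated sources. "noisy_meeting.wav" is a placeholder input file.
import torchaudio
from speechbrain.pretrained import SepformerSeparation

model = SepformerSeparation.from_hparams(
    source="speechbrain/sepformer-wham",
    savedir="pretrained_models/sepformer-wham",
)

waveform, sr = torchaudio.load("noisy_meeting.wav")    # [channels, time]
waveform = waveform.mean(dim=0, keepdim=True)          # mix down to mono
if sr != 8000:
    # The WHAM!-trained checkpoint expects 8 kHz single-channel input.
    waveform = torchaudio.functional.resample(waveform, sr, 8000)

# separate_batch expects [batch, time] and returns [batch, time, n_sources].
est_sources = model.separate_batch(waveform)

for i in range(est_sources.shape[-1]):
    source = est_sources[0, :, i].detach().unsqueeze(0)  # [1, time] for saving
    torchaudio.save(f"source_{i + 1}.wav", source, 8000)
```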