Sepformer Wham16k Enhancement

Developed by speechbrain
A speech enhancement model built on the SepFormer architecture. It removes noise and reverberation from audio and was trained on the WHAM! dataset at a 16 kHz sampling rate.
Downloads 5,140
Release Time: 6/30/2022

Model Overview

This model implements SepFormer, a Transformer-based architecture, and is used primarily for speech enhancement: removing environmental noise and reverberation from audio.
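As a rough illustration of how such a model is typically run, the sketch below wraps SpeechBrain's SepformerSeparation interface in a small helper. The model identifier speechbrain/sepformer-wham16k-enhancement is inferred from the title and developer shown on this page, and the helper name enhance_file, the savedir default, and the file paths are hypothetical. It assumes speechbrain and torchaudio are installed and that the checkpoint can be downloaded on first use.

```python
# Assumed Hugging Face identifier, inferred from the model title and developer.
MODEL_SOURCE = "speechbrain/sepformer-wham16k-enhancement"
SAMPLE_RATE = 16000  # sampling rate stated for this model


def enhance_file(in_path, out_path,
                 savedir="pretrained_models/sepformer-wham16k-enhancement"):
    """Enhance one audio file and write the result to out_path (hypothetical helper)."""
    # Imports are deferred so this module loads even without the heavy dependencies.
    import torchaudio
    try:
        # Import location in SpeechBrain 1.x
        from speechbrain.inference.separation import SepformerSeparation
    except ImportError:
        # Import location in older SpeechBrain releases
        from speechbrain.pretrained import SepformerSeparation

    # Download (on first call) and load the pretrained model.
    model = SepformerSeparation.from_hparams(source=MODEL_SOURCE, savedir=savedir)

    # separate_file returns a tensor shaped [batch, time, sources];
    # an enhancement model yields a single source.
    est_sources = model.separate_file(path=in_path)

    # Save the first (only) source as a [channels, time] tensor.
    torchaudio.save(out_path, est_sources[:, :, 0].detach().cpu(), SAMPLE_RATE)
    return out_path
```

A call such as enhance_file("noisy.wav", "enhanced.wav") would then write the enhanced audio next to the input; both file names are placeholders.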

Model Features

Efficient Denoising
Effectively removes environmental noise and reverberation effects from audio.
Transformer-based Architecture
Uses the SepFormer architecture, which applies self-attention to speech separation.
16 kHz Sampling Rate
Processes audio at a 16 kHz sampling rate, matching the rate of its training data.
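Because the model works at 16 kHz, it can help to check a recording's format before enhancement. The sketch below uses only Python's standard wave module. The helper names wav_format and is_model_ready are hypothetical, and the single-channel requirement is an assumption not stated on this page.

```python
import wave

EXPECTED_RATE = 16000  # sampling rate stated for this model


def wav_format(path):
    """Return (sample_rate, channel_count) read from a WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate(), wav.getnchannels()


def is_model_ready(path, expected_rate=EXPECTED_RATE):
    """True if the file is at the expected rate and mono (mono is an assumption)."""
    rate, channels = wav_format(path)
    return rate == expected_rate and channels == 1
```

Files that fail the check would need to be resampled or downmixed with an audio tool of your choice before being passed to the model.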

Model Capabilities

Audio Denoising
Speech Enhancement
Reverberation Removal

Use Cases

Audio Processing
Speech Enhancement
Improves the clarity of speech recorded in environmental noise
Reported results: SI-SNR of 14.3 dB, PESQ of 2.20
Conference Recording Processing
Removes background noise and room reverberation from conference recordings
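The SI-SNR figure quoted above is a scale-invariant signal-to-noise ratio. For readers unfamiliar with the metric, here is a minimal pure-Python sketch of the common definition: remove the mean from both signals, project the estimate onto the clean target, and compare the energy of that projection with the energy of the residual. The function name si_snr and the eps guard are illustrative choices, not the evaluation code used to produce the reported number.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    n = len(target)

    # Remove the mean from both signals.
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]

    # Project the estimate onto the target to get the scaled target component.
    dot = sum(a * b for a, b in zip(e, t))
    target_energy = sum(b * b for b in t)
    scale = dot / (target_energy + eps)
    s_target = [scale * b for b in t]

    # Whatever is left of the estimate counts as noise.
    e_noise = [a - s for a, s in zip(e, s_target)]

    signal_power = sum(s * s for s in s_target)
    noise_power = sum(x * x for x in e_noise)
    return 10 * math.log10((signal_power + eps) / (noise_power + eps))
```

Because of the projection step, multiplying the estimate by a constant gain leaves the score essentially unchanged, which is what makes the metric scale-invariant.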