ConvTasNet WHAM! sep_clean
This is a ConvTasNet model trained with the Asteroid framework on the sep_clean task of the WHAM! dataset, designed for audio source separation.
Release date: 3/2/2022
Model Overview
This model performs audio-to-audio source separation: it recovers the individual sources from a mixed recording, and is particularly suited to speech separation scenarios.
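A minimal usage sketch, assuming the checkpoint is available on the Hugging Face Hub under the commonly used repo id mpariente/ConvTasNet_WHAM!_sepclean (substitute the actual identifier if it differs):

```python
import torch
from asteroid.models import ConvTasNet

# Assumed Hub identifier for this checkpoint.
model = ConvTasNet.from_pretrained("mpariente/ConvTasNet_WHAM!_sepclean")
model.eval()

# WHAM! sep_clean models work on 8 kHz mono audio; random data stands in for a real mixture.
mixture = torch.randn(1, 8000)  # (batch, time) = one second at 8 kHz
with torch.no_grad():
    est_sources = model(mixture)  # (batch, n_src, time); n_src = 2 speakers for sep_clean

print(est_sources.shape)
```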
Model Features
Efficient Audio Separation
Utilizes the ConvTasNet architecture to efficiently separate different sources from mixed audio.
High-Quality Separation Performance
Performs strongly on the WHAM! dataset, reaching an SI-SDR of 16.21 dB (see the evaluation sketch after this list).
Lightweight Design
The ConvTasNet architecture is comparatively compact, which keeps the model practical to deploy.
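As a rough illustration of how the SI-SDR figure above is obtained, the sketch below uses Asteroid's permutation-invariant SI-SDR loss on placeholder tensors; real evaluation would run over the WHAM! test set with the model's estimates and the reference sources.

```python
import torch
from asteroid.losses import PITLossWrapper, pairwise_neg_sisdr

# Permutation-invariant wrapper around the pairwise negative SI-SDR loss.
loss_func = PITLossWrapper(pairwise_neg_sisdr, pit_from="pw_mtx")

# Placeholder tensors; in practice these are model estimates and ground-truth sources.
est_sources = torch.randn(1, 2, 8000)  # (batch, n_src, time)
references = torch.randn(1, 2, 8000)

neg_sisdr = loss_func(est_sources, references)
print(f"SI-SDR: {-neg_sisdr.item():.2f} dB")  # negate: the loss is negative SI-SDR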
Model Capabilities
Audio Separation
Speech Enhancement
Multi-Source Audio Processing
Use Cases
Speech Processing
Meeting Recording Separation
Separate a mixed recording of a multi-person meeting into per-speaker audio tracks (see the file-based sketch after this section).
Reported SI-SDR improvement of 16.21 dB and speech intelligibility (STOI) of 0.96.
Audio Post-Production
Separate vocal parts from background music and sound effects.
SIR of 26.86 dB, indicating strong suppression of interfering sources.
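A file-based sketch of the meeting-separation use case above, assuming the same Hub repo id as before; "meeting_mix.wav" is a hypothetical 8 kHz mono recording of two overlapping speakers.

```python
import soundfile as sf
import torch
from asteroid.models import ConvTasNet

# Assumed Hub identifier; substitute the actual repo id if it differs.
model = ConvTasNet.from_pretrained("mpariente/ConvTasNet_WHAM!_sepclean")
model.eval()

# Hypothetical input file: 8 kHz mono mixture of two speakers.
mixture, rate = sf.read("meeting_mix.wav", dtype="float32")
with torch.no_grad():
    est = model(torch.from_numpy(mixture).unsqueeze(0))  # (1, n_src, time)

# Write one WAV file per estimated speaker track.
for i, source in enumerate(est.squeeze(0)):
    sf.write(f"speaker_{i + 1}.wav", source.numpy(), rate)
```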