
ConvTasNet WHAM sep_clean

Developed by mpariente
A ConvTasNet model trained with the Asteroid framework on the sep_clean task of the WHAM! dataset, designed for audio source separation.
Release date: 3/2/2022

Model Overview

This model performs audio-to-audio source separation: it extracts the individual sources from a mixed recording, and is particularly well suited to speech separation.
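A minimal usage sketch, assuming the Asteroid package is installed and the checkpoint is published on the Hugging Face Hub under the repository id shown below (the exact id should be confirmed on the model page):

```python
from asteroid.models import ConvTasNet

# Download the pretrained sep_clean checkpoint from the Hugging Face Hub.
model = ConvTasNet.from_pretrained("mpariente/ConvTasNet_WHAM!_sepclean")

# Separate a mixture file; in recent Asteroid versions this writes one wav
# file per estimated source next to the input (e.g. mixture_est1.wav, ...).
model.separate("mixture.wav")
```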

Model Features

Efficient Audio Separation
Uses the ConvTasNet architecture to separate the individual sources in a mixture efficiently (see the sketch after this list).
High-Quality Separation Performance
Performs strongly on the WHAM! dataset, reaching an SI-SDR of 16.21 dB.
Lightweight Design
The fully convolutional design keeps the parameter count modest, making the model practical to deploy.
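Separation is a single forward pass of the network. A minimal sketch, assuming `model` is the pretrained instance loaded above and the input is 8 kHz audio (the sampling rate used by the WHAM! sep_clean recipe); since sep_clean mixes two speakers, two sources are estimated:

```python
import torch

model.eval()
mixture = torch.randn(1, 8000)      # (batch, time): one second of dummy audio
with torch.no_grad():
    est_sources = model(mixture)    # (batch, n_src, time)

# sep_clean separates two speakers, so n_src is 2.
print(est_sources.shape)            # expected: torch.Size([1, 2, 8000])
```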

Model Capabilities

Audio Separation
Speech Enhancement
Multi-Source Audio Processing

Use Cases

Speech Processing
Meeting Recording Separation
Separate a mixed recording of a multi-speaker meeting into individual speaker tracks. Reported SI-SDR improvement of 16.21 dB and speech intelligibility (STOI) of 0.96 (see the metric sketch after this list).
Audio Post-Production
Separate vocal parts from background music and sound effects. A reported SIR of 26.86 dB indicates strong suppression of interfering sources.
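The figures above can be reproduced per utterance with standard metric implementations. A minimal sketch using torchmetrics (an assumption; this is not the evaluation script of the Asteroid recipe), with dummy 8 kHz signals standing in for a real estimate, reference, and mixture:

```python
import torch
from torchmetrics.audio import (
    ScaleInvariantSignalDistortionRatio,
    ShortTimeObjectiveIntelligibility,
)

si_sdr = ScaleInvariantSignalDistortionRatio()
stoi = ShortTimeObjectiveIntelligibility(fs=8000)

# Dummy 8 kHz signals standing in for real data.
reference = torch.randn(8000)
estimate = reference + 0.1 * torch.randn(8000)
mixture = reference + torch.randn(8000)

# SI-SDR improvement: SI-SDR of the estimate minus SI-SDR of the raw mixture.
si_sdri = si_sdr(estimate, reference) - si_sdr(mixture, reference)
intelligibility = stoi(estimate, reference)
print(f"SI-SDRi: {si_sdri:.2f} dB, STOI: {intelligibility:.2f}")
```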