Wangyou Zhang Chime4 Enh Train Enh Conv Tasnet Raw

Developed by espnet
A speech enhancement model trained with the ESPnet framework on the CHiME-4 dataset, intended for single-channel speech enhancement tasks.
Downloads 57
Release Time: 4/11/2022

Model Overview

This model uses the Conv-TasNet architecture for speech enhancement: given a noisy single-channel recording, it estimates the clean speech signal.

Model Features

Based on Conv-TasNet Architecture
Uses a temporal convolutional network (TCN) to separate speech from noise, with efficient feature extraction.
End-to-end Training
Learns the mapping from raw audio to target speech directly, without complex feature engineering.
SI-SNR Optimization Objective
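Uses the scale-invariant signal-to-noise ratio (SI-SNR) as the training objective: the loss is the negative SI-SNR between the enhanced signal and the clean reference, so minimizing it raises speech quality.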
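
To make the SI-SNR objective concrete, here is a minimal sketch of the metric in plain Python. It follows the commonly used definition (zero-mean both signals, project the estimate onto the reference, compare the energy of the projection with the energy of the residual). The function names si_snr and si_snr_loss and the eps parameter are illustrative assumptions, not ESPnet's own API, and the model's actual training code may differ in details.

```python
import math
from typing import Sequence


def si_snr(estimate: Sequence[float], target: Sequence[float], eps: float = 1e-8) -> float:
    """Scale-invariant SNR in dB between an estimate and a reference signal.

    Illustrative helper (not part of ESPnet); eps guards against division by zero.
    """
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    n = len(target)

    # Remove the mean so that a DC offset does not affect the score.
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e = [x - mean_e for x in estimate]
    t = [x - mean_t for x in target]

    # Project the estimate onto the reference to get the scaled target part.
    dot = sum(a * b for a, b in zip(e, t))
    t_energy = sum(b * b for b in t) + eps
    scale = dot / t_energy
    s_target = [scale * b for b in t]

    # Whatever is left after removing the projection is treated as noise.
    e_noise = [a - b for a, b in zip(e, s_target)]

    target_energy = sum(x * x for x in s_target)
    noise_energy = sum(x * x for x in e_noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def si_snr_loss(estimate: Sequence[float], target: Sequence[float]) -> float:
    """Negative SI-SNR, so that minimizing the loss maximizes SI-SNR."""
    return -si_snr(estimate, target)
```

Because the estimate is projected onto the reference before energies are compared, multiplying the estimate by a constant gain leaves the score essentially unchanged, which is the "scale-invariant" property the name refers to.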

Model Capabilities

Single-channel Speech Enhancement
Noise Suppression
Speech Separation

Use Cases

Speech Processing
Conference Speech Enhancement
Extracts clear speech signals in noisy conference environments
Improves speech recognition accuracy and intelligibility
Telephone Speech Enhancement
Enhances voice quality in mobile communications
Improves call experience