Wsj0 2mix Skim Small Causal

Developed by lichenda
A speech separation model trained with the ESPnet framework on the wsj0_2mix dataset.
Downloads 26
Release Time: 5/17/2023

Model Overview

The model uses the SkiM (Skipping Memory LSTM) architecture in a causal configuration, which makes it suitable for real-time speech enhancement. It separates the individual speakers' signals from mixed speech.
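To make the causal property concrete, here is a minimal Python sketch. It does not implement SkiM: `CausalSmoother` is a hypothetical stand-in (a running average over the current and a few earlier samples), chosen only because its behaviour is easy to check. It illustrates two consequences of causality: processing chunk by chunk with carried-over state gives the same result as processing the whole signal at once, and changing later samples never alters outputs that were already produced.

```python
from collections import deque


class CausalSmoother:
    """Toy causal filter: each output is the mean of the current sample
    and up to `history` earlier ones. Hypothetical stand-in, not SkiM."""

    def __init__(self, history=3):
        # State carried across chunks; it only ever holds past samples.
        self.buffer = deque(maxlen=history + 1)

    def process_chunk(self, chunk):
        out = []
        for sample in chunk:
            self.buffer.append(sample)
            out.append(sum(self.buffer) / len(self.buffer))
        return out


def run_offline(signal, history=3):
    # Whole signal in a single call.
    return CausalSmoother(history).process_chunk(signal)


def run_streaming(signal, chunk_size, history=3):
    # Same filter, fed one small chunk at a time.
    model = CausalSmoother(history)
    out = []
    for start in range(0, len(signal), chunk_size):
        out.extend(model.process_chunk(signal[start:start + chunk_size]))
    return out


signal = [0.0, 1.0, 0.5, -0.5, -1.0, 0.25, 0.75, 0.0]
offline = run_offline(signal)
streamed = run_streaming(signal, chunk_size=3)

# Changing the last three samples leaves the first five outputs intact.
altered = signal[:5] + [9.0, 9.0, 9.0]
altered_out = run_offline(altered)
```

A non-causal model would fail the second check, because its early outputs would depend on samples that have not arrived yet in a live stream.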

Model Features

Causal processing capability
The causal design uses only past and present input, making the model suitable for real-time speech processing applications
Lightweight architecture
The small SkiM configuration reduces computational cost while maintaining separation performance
Multi-speaker separation
Capable of effectively separating signals from two speakers in mixed speech
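A two-speaker separation model returns two output streams, and the order in which the speakers appear in those streams is typically not fixed. When reference signals are available, for example when scoring on a test set, each output is usually matched to a reference by trying every assignment and keeping the best one. The sketch below shows that matching step under stated assumptions: mean squared error is used as the score, and `mse`, `best_assignment`, and the toy signals are hypothetical names and data, not part of ESPnet.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def best_assignment(references, estimates):
    """Return (order, error), where order[i] is the index of the
    estimate matched to references[i] and error is the summed MSE."""
    best_order, best_err = None, float("inf")
    for order in permutations(range(len(estimates))):
        err = sum(mse(references[i], estimates[j]) for i, j in enumerate(order))
        if err < best_err:
            best_order, best_err = order, err
    return best_order, best_err


ref_a = [1.0, 0.0, -1.0, 0.0]
ref_b = [0.0, 0.5, 0.0, -0.5]
# Estimates arrive in swapped order and are slightly off.
est_0 = [0.0, 0.4, 0.1, -0.5]
est_1 = [0.9, 0.0, -1.0, 0.1]

order, err = best_assignment([ref_a, ref_b], [est_0, est_1])
```

With two speakers there are only two assignments to try, so the exhaustive search is cheap.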

Model Capabilities

Speech enhancement
Speaker separation
Real-time speech processing

Use Cases

Voice communication
Conference speech enhancement
Separating voices of different speakers in multi-person meeting scenarios
Reported results: STOI 94.20, SDR 14.33 dB
Speech recognition preprocessing
ASR front-end processing
Providing cleaner input signals for speech recognition systems
Can improve speech recognition accuracy in noisy environments
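For readers unfamiliar with the SDR figure quoted in the use cases, the following is a minimal sketch of a simplified signal-to-distortion ratio: the energy of the reference signal divided by the energy of the error, expressed in decibels. This is an assumption made for illustration; the reported score may have been computed with a different SDR variant (for example a BSS-eval style one), and `sdr_db` is a hypothetical helper name.

```python
import math


def sdr_db(reference, estimate):
    """Simplified SDR in dB: reference energy over error energy."""
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_energy == 0.0:
        # A perfect estimate has no distortion.
        return math.inf
    return 10.0 * math.log10(signal_energy / error_energy)


reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9, -0.9, 0.9, -0.9]  # every sample is off by 0.1
score = sdr_db(reference, estimate)  # energy ratio of about 100, so about 20 dB
```

Under this simplified definition, an SDR of 14.33 dB corresponds to the reference carrying roughly 27 times the energy of the residual error.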