Mtl Mimic Voicebank

Developed by speechbrain
SpeechBrain-based system for speech enhancement and robust ASR training, using a mimic loss training strategy
Downloads 11.11k
Release Time: 3/2/2022

Model Overview

This model performs speech enhancement and automatic speech recognition (ASR) on 16 kHz single-channel audio. It is trained in three stages: pre-training a perceptual model, training the enhancement model, and fine-tuning the ASR model.
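As a rough illustration of the mimic loss idea behind the three-stage process, the toy sketch below compares the features a frozen perceptual model produces for enhanced audio against those it produces for clean audio. Everything here is an assumption made for illustration: the names `perceptual_features`, `enhance`, and `mimic_loss`, the single-gain "enhancement model", and the squared-error distance are hypothetical stand-ins, not the SpeechBrain recipe's actual code.

```python
from typing import List, Sequence


def perceptual_features(x: Sequence[float]) -> List[float]:
    """Hypothetical stand-in for the frozen, pre-trained perceptual model.

    Uses fixed first differences as "features"; it is never updated.
    """
    return [x[i + 1] - x[i] for i in range(len(x) - 1)]


def enhance(noisy: Sequence[float], gain: float) -> List[float]:
    """Hypothetical stand-in for the enhancement model: one trainable gain."""
    return [gain * s for s in noisy]


def mimic_loss(enhanced: Sequence[float], clean: Sequence[float]) -> float:
    """Mean squared distance between perceptual features of enhanced and clean audio."""
    f_enh = perceptual_features(enhanced)
    f_clean = perceptual_features(clean)
    return sum((a - b) ** 2 for a, b in zip(f_enh, f_clean)) / len(f_clean)


clean = [0.0, 0.5, 1.0, 0.5, 0.0]
noisy = [2.0 * s for s in clean]  # toy corruption: the clean signal scaled by 2

loss_before = mimic_loss(enhance(noisy, gain=1.0), clean)  # untrained gain
loss_after = mimic_loss(enhance(noisy, gain=0.5), clean)   # gain that undoes the scaling
```

In this toy setup the loss falls to zero once the gain undoes the corruption, because the enhanced signal then yields the same perceptual features as the clean one. In a real system, gradients of this loss would update only the enhancement model while the perceptual model stays fixed.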

Model Features

Mimic Loss Training
Adopts a three-stage training strategy in which pre-trained perceptual models guide the learning of the enhancement model
Joint Optimization
The enhancement model and the ASR model can be used independently or jointly, which improves system flexibility
Standardized Processing
Works on 16 kHz single-channel audio; supports automatic resampling and mono conversion of other inputs
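The audio standardization described above (mono conversion followed by resampling to 16 kHz) can be sketched in plain Python. This is a minimal illustration under stated assumptions: the helper names `to_mono` and `resample_linear` are hypothetical, channel averaging is only one possible downmix, and linear interpolation without an anti-aliasing filter is a simplification that a production resampler would not use.

```python
from typing import List

TARGET_RATE = 16_000  # sample rate the model expects, in Hz


def to_mono(channels: List[List[float]]) -> List[float]:
    """Downmix by averaging the samples of all channels frame by frame."""
    n = len(channels)
    return [sum(frame) / n for frame in zip(*channels)]


def resample_linear(
    signal: List[float], src_rate: int, dst_rate: int = TARGET_RATE
) -> List[float]:
    """Resample with linear interpolation (no anti-aliasing; illustration only)."""
    if src_rate == dst_rate or len(signal) < 2:
        return list(signal)
    n_out = int(len(signal) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # fractional index into the source
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)      # clamp at the last sample
        frac = pos - lo
        out.append((1 - frac) * signal[lo] + frac * signal[hi])
    return out


# Toy stereo clip sampled at 32 kHz: two channels of four samples each.
stereo = [[0.0, 1.0, 2.0, 3.0], [2.0, 3.0, 4.0, 5.0]]
mono = to_mono(stereo)
standardized = resample_linear(mono, src_rate=32_000)
```

Halving the rate keeps every second sample of the mono signal here, so the four-sample clip becomes two samples at 16 kHz.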

Model Capabilities

Speech Enhancement
Noise Suppression
Robust Speech Recognition
Audio Feature Extraction

Use Cases

Voice Communication
Speech Enhancement in Noisy Environments
Improves speech clarity in the presence of background noise
PESQ 3.05 / COVL 3.74 (test set)
Speech Recognition
ASR in Noisy Environments
Improves speech recognition accuracy in noisy environments
WER 2.80 (test set)
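For readers unfamiliar with the WER metric quoted above, the sketch below computes word error rate as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It assumes the score is expressed as a percentage, which is the usual convention; the function name `word_error_rate` and the example sentences are illustrative and not taken from the model's evaluation code.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent using word-level Levenshtein distance."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            cur.append(min(
                prev[j] + 1,         # deletion of a reference word
                cur[j - 1] + 1,      # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = cur
    return 100.0 * prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Lower is better: an exact transcript scores 0, and the example above scores two errors out of six reference words.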