Simpleoier Librispeech Asr Train Asr Conformer7 Wavlm Large Raw En Bpe5000 Sp

Developed by espnet
An automatic speech recognition (ASR) model built with the ESPnet toolkit. It combines the Conformer architecture with the WavLM Large pre-trained model and is trained on the LibriSpeech dataset.
Downloads 66
Release Time: 3/2/2022

Model Overview

This is an English automatic speech recognition model that takes raw audio as input and transcribes it into text.
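As a rough illustration of the audio-to-text flow described above, the following Python sketch loads the model through ESPnet's `Speech2Text` inference class and returns the top transcript. Several things here are assumptions rather than details from this page: the model tag `espnet/simpleoier_librispeech_asr_train_asr_conformer7_wavlm_large_raw_en_bpe5000_sp` is inferred from the developer name and model title, the 16 kHz mono input matches LibriSpeech audio, and the helper names `best_text` and `transcribe` are hypothetical. Running `transcribe` also assumes that `espnet`, `espnet_model_zoo`, and `soundfile` are installed.

```python
from typing import Sequence, Tuple

# Assumed model tag, inferred from the developer name and model title.
MODEL_TAG = (
    "espnet/simpleoier_librispeech_asr_train_asr_conformer7_"
    "wavlm_large_raw_en_bpe5000_sp"
)


def best_text(nbest: Sequence[Tuple]) -> str:
    """Return the transcript of the top hypothesis from an n-best list.

    Each entry is assumed to be a (text, tokens, token_ids, hypothesis) tuple,
    which is the shape returned by ESPnet's Speech2Text.
    """
    if not nbest:
        return ""
    return nbest[0][0] or ""


def transcribe(wav_path: str, model_tag: str = MODEL_TAG) -> str:
    """Transcribe one mono 16 kHz audio file (hypothetical helper)."""
    # Imports are deferred so best_text() works without ESPnet installed.
    import soundfile as sf
    from espnet2.bin.asr_inference import Speech2Text

    # Downloads the model on first use (requires espnet_model_zoo).
    speech2text = Speech2Text.from_pretrained(model_tag)

    speech, sample_rate = sf.read(wav_path)
    if sample_rate != 16000:
        # Assumption: the model expects 16 kHz audio, as in LibriSpeech.
        raise ValueError(f"expected 16 kHz audio, got {sample_rate} Hz")

    return best_text(speech2text(speech))
```

Calling `transcribe("meeting.wav")` on a local recording (the file name is only an example) would return the recognized text as a single string.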

Model Features

High-performance architecture
Combines a Conformer model (the conformer7 training configuration) with the WavLM Large pre-trained model
LibriSpeech training
Trained on the widely used LibriSpeech corpus of read English speech, which includes both clean and more challenging recordings
Low error rate
Word error rate (WER) as low as 1.8% on clean test speech and 3.7% on noisy test speech
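For readers unfamiliar with the metric, WER is the number of word substitutions, deletions, and insertions needed to turn the recognized text into the reference transcript, divided by the number of reference words. The sketch below is a minimal, self-contained illustration of that definition in Python; it is not the scoring tool used to produce the figures above, and the function names `edit_distance` and `word_error_rate` are hypothetical.

```python
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance between reference and hypothesis."""
    # prev[j] holds the distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion (reference word missed)
                    curr[j - 1] + 1,     # insertion (extra hypothesis word)
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER for a single utterance, as a fraction of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Benchmark figures are normally computed over a whole test set by summing the errors across all utterances and dividing by the total number of reference words, rather than by averaging per-utterance rates.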

Model Capabilities

English speech recognition
Raw audio processing
Large-scale speech-to-text conversion

Use Cases

Speech transcription
Meeting minutes
Automatically transcribe meeting recordings
Accuracy up to 98.4% on clean test data
Audio caption generation
Generate subtitles for podcasts or video content
Accuracy of 96.7% on noisy speech
Voice assistants
Voice command recognition
Recognize and execute voice commands