ASR Wav2vec2 LibriSpeech

Developed by speechbrain
This is an end-to-end automatic speech recognition (ASR) system trained on the LibriSpeech dataset. It combines a pre-trained wav2vec 2.0 model with connectionist temporal classification (CTC) and targets English speech recognition.
Downloads 1,667
Release Time: 6/5/2022

Model Overview

This model is an English automatic speech recognition system. It combines a pre-trained wav2vec 2.0 model with CTC and is fine-tuned on the LibriSpeech dataset to convert English speech into text.
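
As a rough illustration of the CTC part of the overview, the sketch below shows greedy CTC decoding on a toy input: consecutive repeated labels are merged, then blank labels are dropped. The vocabulary, the blank index, and the function name are hypothetical, and greedy decoding is assumed here for simplicity; the source does not state which decoding strategy the released model uses.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_ids, blank_id=0):
    """Collapse per-frame label ids the way greedy CTC decoding does."""
    # Step 1: merge runs of identical consecutive labels into one label.
    merged = [label for label, _ in groupby(frame_ids)]
    # Step 2: drop the blank label, which only separates real symbols.
    return [label for label in merged if label != blank_id]


# Hypothetical toy vocabulary; index 0 is assumed to be the CTC blank.
vocab = ["<blank>", "c", "a", "t"]
# Pretend these are the arg-max label ids for 11 audio frames.
frames = [0, 1, 1, 0, 2, 2, 2, 0, 0, 3, 3]

decoded = "".join(vocab[i] for i in ctc_greedy_collapse(frames))
print(decoded)  # cat
```

A blank between two identical labels keeps them distinct, so `[1, 0, 1]` decodes to two symbols rather than one.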

Model Features

High-accuracy speech recognition
Achieves a word error rate (WER) of 1.90% on the LibriSpeech test-clean set and 3.96% on the test-other set.
Pre-trained model fine-tuning
Based on the facebook/wav2vec2-large-960h-lv60-self pre-trained model, further fine-tuned on LibriSpeech.
End-to-end system
Includes a complete tokenizer and acoustic model, ready for direct use in speech-to-text tasks.
Easy to use
Provides a simple API that transcribes speech in just a few lines of code.
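
The "few lines of code" claim can be sketched with SpeechBrain's `EncoderASR` interface. This is a minimal sketch, not the authors' reference snippet: the Hub identifier `speechbrain/asr-wav2vec2-librispeech` is inferred from the model name and developer listed above, the `transcribe` helper and the `savedir` path are hypothetical, and the `speechbrain` and `transformers` packages are assumed to be installed.

```python
def transcribe(audio_path, savedir="pretrained_models/asr-wav2vec2-librispeech"):
    """Transcribe one audio file with the pre-trained SpeechBrain model."""
    # Imported lazily so this module loads even without speechbrain installed.
    try:
        from speechbrain.inference.ASR import EncoderASR  # SpeechBrain 1.0+
    except ImportError:
        from speechbrain.pretrained import EncoderASR  # older releases

    # Assumed Hub id; the checkpoint is downloaded and cached in `savedir`.
    asr_model = EncoderASR.from_hparams(
        source="speechbrain/asr-wav2vec2-librispeech",
        savedir=savedir,
    )
    return asr_model.transcribe_file(audio_path)


# Example call (hypothetical file name):
# text = transcribe("meeting_recording.wav")
```

The helper reloads the model on every call to keep the sketch short; for more than one file it would make sense to build the `EncoderASR` object once and reuse it.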

Model Capabilities

English speech recognition
Audio transcription
Automatic speech-to-text
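
The word error rate quoted under Model Features is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is one plain-Python way to compute it; the function name and the example sentences are hypothetical, and published LibriSpeech scores typically come from dedicated scoring tools rather than a snippet like this.

```python
def word_error_rate(reference, hypothesis):
    """Return word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed part of ref
    # and the first j words of hyp.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
print(f"{word_error_rate('the cat sat on the mat', 'the cat sit on mat'):.2%}")  # 33.33%
```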

Use Cases

Speech transcription
Meeting minutes
Automatically convert meeting recordings into text transcripts
Highly accurate transcription results
Voice notes
Convert voice memos into searchable text
Assistive technology
Real-time caption generation
Generate real-time captions for videos or live streams