ASR Wav2vec2 CommonVoice En

Developed by speechbrain
This is an end-to-end automatic speech recognition (ASR) system trained on the CommonVoice English dataset. It combines a pre-trained wav2vec 2.0 model with a CTC decoder.
Downloads 681
Release Time: 3/2/2022

Model Overview

The model performs English speech recognition. It uses wav2vec 2.0 as the acoustic feature extractor and places a CTC decoder on top, and the two are trained end to end.
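To make the CTC decoding step concrete, here is a minimal sketch of greedy CTC decoding: take the best-scoring symbol in each frame, merge consecutive repeats, then drop the blank symbol. The toy character vocabulary, the blank index of 0, and the function name `ctc_greedy_decode` are assumptions made for illustration; they are not taken from the model's actual tokenizer or code.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol (illustrative only)


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    vocab: Sequence[str],
    blank: int = BLANK,
) -> str:
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Index of the highest-scoring symbol in every frame.
    best = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]

    out: List[str] = []
    prev = None
    for idx in best:
        # Emit a symbol only when it differs from the previous frame
        # and is not the blank.
        if idx != prev and idx != blank:
            out.append(vocab[idx])
        prev = idx
    return "".join(out)


# Hypothetical 3-symbol vocabulary and per-frame scores.
vocab = ["<blank>", "h", "i"]
frames = [
    [0.1, 0.8, 0.1],  # h
    [0.1, 0.7, 0.2],  # h (repeat, merged)
    [0.9, 0.0, 0.1],  # blank
    [0.1, 0.1, 0.8],  # i
    [0.2, 0.1, 0.7],  # i (repeat, merged)
    [0.8, 0.1, 0.1],  # blank
    [0.1, 0.2, 0.7],  # i (new symbol, because a blank separates it)
]
decoded = ctc_greedy_decode(frames, vocab)
```

The blank symbol is what lets CTC output a doubled character: two identical symbols separated by a blank are kept as two, while adjacent repeats collapse into one.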

Model Features

End-to-end speech recognition: Combines a pre-trained wav2vec 2.0 model with a CTC decoder to form a complete speech recognition pipeline.
No language model dependency: The system does not rely on an external language model, which simplifies deployment.
Automatic audio preprocessing: Built-in audio normalization, including resampling and mono channel selection.
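The audio normalization mentioned above (mono channel selection followed by resampling) can be sketched in plain Python as follows. This is an illustration of the idea only, not the model's built-in implementation: the function names are hypothetical, the sample rates in the example are arbitrary, and linear interpolation is used for brevity even though production resamplers apply a low-pass filter to avoid aliasing.

```python
from typing import List, Sequence


def to_mono(frames: Sequence[Sequence[float]], channel: int = 0) -> List[float]:
    """Keep a single channel from multi-channel audio.

    `frames` is a sequence of frames, each holding one value per channel.
    """
    return [frame[channel] for frame in frames]


def resample_linear(signal: Sequence[float], src_rate: int, dst_rate: int) -> List[float]:
    """Resample by linear interpolation (no anti-aliasing filter)."""
    if src_rate == dst_rate or len(signal) < 2:
        return list(signal)

    n_out = int(len(signal) * dst_rate / src_rate)
    out: List[float] = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # fractional position in the input
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)  # clamp at the last sample
        frac = pos - lo
        out.append(signal[lo] * (1.0 - frac) + signal[hi] * frac)
    return out


# Toy stereo clip: the first channel ramps up, the second is constant.
stereo = [[0.0, 9.0], [1.0, 9.0], [2.0, 9.0], [3.0, 9.0]]
mono = to_mono(stereo)                                   # first channel only
halved = resample_linear(mono, src_rate=4, dst_rate=2)   # downsample by 2
doubled = resample_linear(halved, src_rate=2, dst_rate=4)  # upsample by 2
```

In practice the target rate would be whatever the model was trained on; that value is not stated here, so check the model's own documentation before preprocessing audio by hand.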

Model Capabilities

English speech recognition
Audio transcription
Batch speech processing
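Batch speech processing generally means padding recordings of different lengths to a common length and keeping track of how much of each row is real audio. The sketch below shows one way to do that, returning each length as a fraction of the longest recording, a convention some speech toolkits use. The helper name `pad_batch` and the zero padding value are assumptions for illustration, not part of this model's documented interface.

```python
from typing import List, Sequence, Tuple


def pad_batch(
    waveforms: Sequence[Sequence[float]],
    pad_value: float = 0.0,
) -> Tuple[List[List[float]], List[float]]:
    """Right-pad waveforms to equal length and return relative lengths."""
    if not waveforms:
        raise ValueError("expected at least one waveform")

    max_len = max(len(w) for w in waveforms)
    if max_len == 0:
        raise ValueError("all waveforms are empty")

    # Pad every waveform on the right up to the longest one.
    padded = [list(w) + [pad_value] * (max_len - len(w)) for w in waveforms]
    # Fraction of each padded row that holds real audio (1.0 = longest).
    rel_lens = [len(w) / max_len for w in waveforms]
    return padded, rel_lens


batch, lengths = pad_batch([[1.0, 2.0, 3.0, 4.0], [5.0, 6.0]])
```

The relative lengths let downstream code ignore the padded tail when computing features or decoding, so short clips are not transcribed as if they ended in silence.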

Use Cases

Speech transcription
Automatic meeting transcription: Automatically converts English meeting recordings into text transcripts. Reported word error rate: 15.69% (on the CommonVoice test set).
Voice note conversion: Converts voice memos into editable text.
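For readers unfamiliar with the word error rate quoted above, it is the word-level edit distance between the reference transcript and the system output (substitutions, deletions, and insertions), divided by the number of reference words. The following is a minimal sketch of that computation; the function name and the example sentences are made up for illustration, and the exact text normalization behind the 15.69% figure is not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            cur.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                cur[j - 1] + 1,      # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = cur
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.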