Wav2vec2 Xls R 300m English
An English automatic speech recognition (ASR) model based on facebook/wav2vec2-xls-r-300m and fine-tuned on the librispeech_asr dataset. It achieves a word error rate (WER) of 12.29% on the LibriSpeech clean test set.
Downloads 21
Release Date: 3/2/2022
Model Overview
This is an automatic speech recognition (ASR) model fine-tuned for English speech-to-text transcription.
Model Features
Evaluated on Multiple Datasets
Evaluated on LibriSpeech, Common Voice, and Robust Speech Events. Reported word error rates range from 12.29% on the LibriSpeech clean test set to 38.8% on the Robust Speech Events test set.
Efficient Training
Trained with techniques such as gradient accumulation and mixed precision to improve training efficiency.
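As an illustration of gradient accumulation only (mixed precision is not shown), the following is a minimal sketch in plain Python. It is not the training code used for this model. The toy one-parameter model and the helper names `mse_grad` and `accumulated_grad` are assumptions made for the example. It shows that averaging the gradients of equally sized micro-batches reproduces the gradient of the full batch, which is why accumulation allows a large effective batch size within limited memory.

```python
def mse_grad(w, xs, ys):
    """Gradient of the mean squared error for the toy 1-D model y = w * x."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Accumulate gradients over micro-batches instead of one large batch.

    Assumes len(xs) is a multiple of micro_batch_size, so every
    micro-batch carries the same weight in the average.
    """
    starts = range(0, len(xs), micro_batch_size)
    num_steps = len(starts)
    total = 0.0
    for start in starts:
        mb_x = xs[start:start + micro_batch_size]
        mb_y = ys[start:start + micro_batch_size]
        # Scale each micro-batch gradient so the sum equals the full-batch mean.
        total += mse_grad(w, mb_x, mb_y) / num_steps
    return total


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
full = mse_grad(0.5, xs, ys)
accumulated = accumulated_grad(0.5, xs, ys, micro_batch_size=2)
print(full, accumulated)
```

In a real training loop, the optimizer step would be taken once after all micro-batches have contributed, rather than after each one.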
Low Word Error Rate
Achieves a word error rate (WER) of 12.29% on the LibriSpeech clean test set.
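For readers unfamiliar with the metric, word error rate is the number of word-level substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of reference words. The sketch below computes it with a standard edit-distance recurrence. The function name `word_error_rate` and the example sentences are illustrative assumptions, and the evaluation behind the reported figures may have applied text normalization that is not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

A WER of 12.29% therefore corresponds to roughly 12 word errors per 100 reference words.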
Model Capabilities
English Speech Recognition
Speech-to-Text
Long Audio Processing
Use Cases
Speech Transcription
Audiobook Transcription
Transcribe audiobook content into text
Word error rate of 12.29% on the LibriSpeech clean test set
Voice Assistants
Voice Command Recognition
Recognize and understand user voice commands
Word error rate of 38.8% on the Robust Speech Events test set