Wav2vec2 Large 960h Lv60 Self 4 Gram
Based on Facebook's Wav2Vec2-Large-960h-lv60-self model, enhanced with an English 4-gram language model to improve speech recognition accuracy.
Downloads 22
Release Time: 4/12/2022
Model Overview
This is an automatic speech recognition (ASR) model for English speech-to-text tasks. It combines the Wav2Vec2 acoustic model with a 4-gram language model during decoding to improve recognition accuracy.
Model Features
4-gram language model integration
Incorporates the official LibriSpeech 4-gram language model, which improves recognition accuracy by favoring word sequences that are plausible in English
High-performance recognition
Achieves word error rates (WER) of 1.84% (clean) and 3.71% (other) on the LibriSpeech test sets
Based on Wav2Vec2 architecture
Builds on Facebook's Wav2Vec2-Large-960h-lv60-self model as the acoustic model
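The idea behind language model integration can be sketched with a toy example. The code below is not the model's actual decoder. It is a minimal illustration, assuming a tiny hand-written corpus, add-one smoothing, and a hypothetical weight alpha, of how an n-gram score can be added to an acoustic score to re-rank candidate transcripts. All function names, scores, and the corpus are assumptions made for illustration.

```python
import math
from collections import Counter


def train_ngram_counts(sentences, n):
    """Count n-grams and their (n-1)-word contexts in a toy corpus."""
    ngrams, contexts = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] * (n - 1) + sentence.split() + ["</s>"]
        for i in range(n - 1, len(tokens)):
            context = tuple(tokens[i - n + 1:i])
            ngrams[context + (tokens[i],)] += 1
            contexts[context] += 1
    return ngrams, contexts


def lm_log_prob(sentence, ngrams, contexts, n, vocab_size):
    """Add-one smoothed n-gram log-probability of a sentence."""
    tokens = ["<s>"] * (n - 1) + sentence.split() + ["</s>"]
    total = 0.0
    for i in range(n - 1, len(tokens)):
        context = tuple(tokens[i - n + 1:i])
        count = ngrams[context + (tokens[i],)]
        total += math.log((count + 1) / (contexts[context] + vocab_size))
    return total


def rescore(hypotheses, ngrams, contexts, n, vocab_size, alpha=0.5):
    """Return the transcript maximizing acoustic score + alpha * LM score.

    `hypotheses` is a list of (transcript, acoustic_log_score) pairs;
    `alpha` is an assumed language model weight.
    """
    def combined(hyp):
        text, acoustic = hyp
        return acoustic + alpha * lm_log_prob(text, ngrams, contexts, n, vocab_size)

    return max(hypotheses, key=combined)[0]


if __name__ == "__main__":
    # Toy corpus and made-up acoustic scores, for illustration only.
    corpus = ["the cat sat on the mat", "the dog sat on the rug"]
    ngrams, contexts = train_ngram_counts(corpus, n=4)
    candidates = [
        ("the cat sat on the mat", -4.2),
        ("the cat sad on the mat", -4.0),  # acoustically slightly preferred
    ]
    print(rescore(candidates, ngrams, contexts, n=4, vocab_size=8, alpha=0.0))
    print(rescore(candidates, ngrams, contexts, n=4, vocab_size=8, alpha=0.5))
```

With alpha set to 0 the acoustically preferred but implausible candidate wins. With a positive weight the language model shifts the choice toward the word sequence seen in the corpus. Production decoders apply this kind of scoring inside a beam search rather than on a finished candidate list.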
Model Capabilities
English speech recognition
High-accuracy speech-to-text conversion
Processes audio sampled at 16 kHz
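Audio recorded at another sampling rate needs to be converted to 16 kHz before recognition. The sketch below shows the idea with plain linear interpolation on a list of mono samples. The function name is hypothetical, and the method has no anti-aliasing filter, so it is only meant to illustrate the rate conversion. A dedicated resampler from an audio library is the better choice in practice.

```python
def resample_linear(samples, src_rate, dst_rate=16_000):
    """Resample a mono signal to dst_rate by linear interpolation.

    Illustrative only: no low-pass filtering is applied, so downsampling
    real recordings this way can introduce aliasing.
    """
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sampling rates must be positive")
    if src_rate == dst_rate or not samples:
        return list(samples)

    n_out = int(len(samples) * dst_rate / src_rate)
    step = src_rate / dst_rate  # source samples advanced per output sample
    out = []
    for i in range(n_out):
        pos = i * step
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


if __name__ == "__main__":
    one_second_at_48k = [0.0] * 48_000
    converted = resample_linear(one_second_at_48k, src_rate=48_000)
    print(len(converted))  # number of samples after conversion
```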
Use Cases
Speech transcription
Audiobook transcription
Automatically transcribes English audiobook content into text
Achieves a word error rate of 1.84% on the LibriSpeech test set (clean)
Meeting minutes
Automatically records English meeting content and generates transcripts
Achieves a word error rate of 3.71% on the more challenging LibriSpeech test set (other)
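For reference, word error rate is the number of word substitutions, deletions, and insertions needed to turn the recognized text into the reference text, divided by the number of reference words. The sketch below computes it for a single sentence pair with a standard edit-distance table. The function name is hypothetical, and benchmark figures such as those quoted above are normally computed over a whole test set rather than one sentence.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sad on the mat")
    print(f"{wer:.2%}")  # one substituted word out of six
```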