Iwslt Asr Wav2vec Large 4500h
Developed by nguyenvulebinh
A large-scale English automatic speech recognition model based on the Wav2Vec2 architecture, fine-tuned on 4500 hours of multi-source speech data, supporting decoding with a language model
Downloads 27
Release Time: 3/23/2022
Model Overview
This model is an English automatic speech recognition system fine-tuned from Facebook's Wav2Vec2 architecture. It integrates a language model to improve transcription accuracy and is suited to speech-to-text tasks across a variety of English accents.
Model Features
Multi-source data training
Trained on 7 different speech datasets with a total duration of over 4500 hours
Language model integration
Provides a processor with a language model, significantly reducing word error rate
High-performance transcription
Achieves a word error rate of 1.1% on free speech test sets (with language model)
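Word error rate (WER), the metric quoted here, is conventionally the word-level edit distance (substitutions + deletions + insertions) between the reference transcript and the model output, divided by the number of reference words. The sketch below illustrates that standard definition only; the function name `word_error_rate` and the lowercase/whitespace normalization are assumptions for illustration, not the evaluation script behind the published figures.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); normalization here is just
    lowercasing and whitespace splitting, which is an assumption.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Under this definition, a WER of 1.1% corresponds to roughly one word error per 90 reference words.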
Model Capabilities
English speech recognition
Speech decoding with language model
Multi-accent English processing
Use Cases
Speech transcription
Meeting minutes
Automatically convert English meeting recordings into text transcripts
Word error rate of 1.1% on free speech test sets (with language model)
Educational content transcription
Convert English teaching videos/audio into text
5.4% word error rate on TED talk data