IWSLT ASR Wav2Vec Large 4500h

Developed by nguyenvulebinh
A large-scale English automatic speech recognition (ASR) model based on the Wav2Vec2 architecture, fine-tuned on 4500 hours of multi-source speech data, with support for language-model decoding.
Downloads 27
Release Time: 3/23/2022

Model Overview

This model is an English automatic speech recognition system fine-tuned from Facebook's Wav2Vec2 architecture. It integrates a language model to improve transcription accuracy and is suited to speech-to-text tasks across a variety of English accents.

Model Features

Multi-source data training
Trained on 7 different speech datasets with a total duration of over 4500 hours
Language model integration
Provides a processor with a language model, significantly reducing word error rate
High-performance transcription
Achieves a word error rate of 1.1% on free speech test sets (with language model)
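
The word error rate (WER) figures quoted on this page count word-level substitutions, deletions, and insertions relative to the number of words in the reference transcript. The following is a minimal sketch of that calculation, not the evaluation script used for this model. The function name `word_error_rate` and the lowercase, whitespace-split normalization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a WER of 1.1% corresponds to roughly one word error per 90 reference words.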

Model Capabilities

English speech recognition
Speech decoding with language model
Multi-accent English processing
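
Decoding with the language model can be sketched as below. This is an illustration under stated assumptions, not the author's documented usage: it assumes `model` is an already-loaded Wav2Vec2 CTC model and `processor` is a Hugging Face `Wav2Vec2ProcessorWithLM`, and that the audio is mono at 16 kHz, as is typical for Wav2Vec2. Loading both objects is left to the caller, since this page does not describe it. The helper name `transcribe` is hypothetical.

```python
def transcribe(model, processor, waveform, sampling_rate=16_000):
    """Return the language-model-decoded transcript for one utterance.

    Assumptions: `model` is a Wav2Vec2 CTC model, `processor` is a
    Wav2Vec2ProcessorWithLM, and `waveform` is a 1-D float array of
    mono audio sampled at `sampling_rate`.
    """
    # Convert raw audio into model inputs (PyTorch tensors).
    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")

    # Forward pass: per-frame scores over the model's vocabulary.
    logits = model(**inputs).logits

    # The LM-backed processor runs beam search over NumPy logits and
    # returns an object whose `.text` is a list with one string per input.
    decoded = processor.batch_decode(logits.detach().cpu().numpy())
    return decoded.text[0]
```

For inference-only use, callers may want to wrap the call in `torch.no_grad()` to avoid tracking gradients.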

Use Cases

Speech transcription
Meeting minutes
Automatically convert English meeting recordings into text transcripts
Word error rate of 1.1% on free speech test sets (with language model)
Educational content transcription
Convert English teaching videos/audio into text
5.4% word error rate on TED talk data