Wav2vec2 Base Timit Demo
A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a 28.25% word error rate (WER) on the TIMIT dataset
Downloads 21
Release Date: 4/20/2022
Model Overview
This is an English speech recognition model, fine-tuned from the pre-trained wav2vec2-base checkpoint and suitable for automatic speech recognition (ASR) tasks
Model Features
Low Word Error Rate
Achieves a 28.25% word error rate (WER) on the TIMIT evaluation set
Based on wav2vec2 Architecture
Fine-tuned from the facebook/wav2vec2-base checkpoint
End-to-End Training
Learns speech representations directly from raw audio without manual feature extraction
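To make "directly from raw audio" concrete, the sketch below estimates how many feature frames the convolutional front end produces for a given number of raw samples. The kernel sizes, strides, and 16 kHz sample rate are assumptions taken from the default wav2vec2-base configuration, not values stated on this page.

```python
# Assumed defaults of the wav2vec2-base convolutional feature encoder
# (not stated on this page): seven 1-D conv layers applied to the raw waveform.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)
SAMPLE_RATE = 16_000  # assumed input sample rate in Hz


def num_output_frames(num_samples: int) -> int:
    """Return the number of feature frames produced for `num_samples` raw samples."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        if length < kernel:
            raise ValueError("input is too short for the convolutional encoder")
        # Output length of an unpadded 1-D convolution.
        length = (length - kernel) // stride + 1
    return length


# One second of audio at the assumed sample rate.
frames_per_second = num_output_frames(SAMPLE_RATE)
```

Under these assumptions each frame covers roughly 20 ms of audio, so no hand-crafted features such as MFCCs are computed before the network sees the signal.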
Model Capabilities
English Speech Recognition
Audio to Text
Automatic Speech Transcription
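A minimal sketch of how a checkpoint like this one might be used for transcription through the Hugging Face transformers pipeline API. The page does not give the model's repository ID, so `model_id` is a required argument and the ID in the usage comment is a hypothetical placeholder. Decoding an audio file from a path typically also requires ffmpeg to be installed.

```python
def transcribe(audio_path: str, model_id: str) -> str:
    """Transcribe one audio file with an ASR checkpoint.

    `model_id` is the Hub repository ID or local path of the fine-tuned
    checkpoint; it is not given on this page and must be supplied by the caller.
    """
    # Imported lazily so the function can be defined without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # returns a dict with a "text" field
    return result["text"]


# Example call (hypothetical repository ID and file name):
# text = transcribe("memo.wav", "your-namespace/wav2vec2-base-timit-demo")
```

Loading the pipeline once and reusing it across files would be preferable when transcribing many recordings; the function above reloads it on each call only to keep the sketch short.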
Use Cases
Speech Transcription
Meeting Minutes
Automatically convert meeting recordings into text transcripts
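Word-level accuracy of approximately 71.75%, corresponding to 100% minus the 28.25% WER reported on TIMIT rather than a figure measured on meeting recordings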
Voice Notes
Convert voice memos into searchable text
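The 28.25% WER and the roughly 71.75% accuracy quoted above appear to be two views of the same measurement, since 100% minus 28.25% is 71.75%. As an illustration of what WER counts, here is a small self-contained sketch that computes it as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name is illustrative, and this is not necessarily the exact script used to evaluate the model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two strings, divided by the
    number of words in the reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over six reference words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
example_accuracy = 1.0 - example_wer
```

Because insertions are counted as errors, WER can exceed 100%, so "100% minus WER" is only a rough accuracy figure. Evaluation tools also usually divide total errors by total reference words across the whole test set instead of averaging per-utterance scores.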