Wav2vec2 Base Repro Timit
This automatic speech recognition model is a version of patrickvonplaten/wav2vec2-base-repro-960h-libri-85k-steps fine-tuned on the TIMIT_ASR dataset.
Downloads 20
Release Time: 3/2/2022
Model Overview
This is an English speech recognition model built on the wav2vec2 architecture and fine-tuned on the TIMIT_ASR dataset. It converts English speech to text.
Model Features
Based on wav2vec2 architecture
Uses Facebook AI's wav2vec2 architecture, which learns speech representations from raw audio through self-supervised pre-training before being fine-tuned for recognition.
Fine-tuned on TIMIT ASR dataset
Fine-tuned on the TIMIT ASR dataset, optimized for English speech recognition.
Gradual performance improvement
Training results show that recognition accuracy improved progressively over 20 epochs of fine-tuning.
Model Capabilities
English speech recognition
Audio to text conversion
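As a rough illustration of the audio-to-text step: wav2vec2 models fine-tuned for recognition are commonly trained with a CTC head, which emits one token id per audio frame. Turning those ids into text means merging consecutive repeats, dropping the blank token, and mapping the word delimiter to a space. The sketch below assumes that setup. The toy vocabulary, the `blank_id` value, and the function name `ctc_greedy_decode` are illustrative assumptions, not taken from this model's actual tokenizer.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Collapse a per-frame sequence of token ids into a text string."""
    # Step 1: merge runs of the same id (one symbol usually spans several frames).
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: drop the blank token, which only separates repeated symbols.
    tokens = [id_to_token[i] for i in collapsed if i != blank_id]
    # Step 3: join characters and turn the word delimiter into a space.
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Hypothetical toy vocabulary (assumption): id 0 is the blank, "|" separates words.
vocab = {0: "<pad>", 1: "|", 2: "h", 3: "e", 4: "l", 5: "o"}

# Per-frame argmax ids as a CTC head might produce them for the word "hello".
frames = [2, 2, 0, 3, 4, 4, 0, 4, 5, 5, 1, 1]
decoded = ctc_greedy_decode(frames, vocab)
print(decoded)
```

The blank between the two runs of `l` is what keeps the double letter from collapsing into one.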
Use Cases
Speech transcription
English speech to text
Convert English speech content into text format
Word Error Rate (WER): 0.5484
Speech assistive technology
Voice command recognition
Recognize simple voice commands
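To put the reported Word Error Rate in context: WER is the word-level edit distance between the reference transcript and the model output (substitutions, deletions, and insertions) divided by the number of reference words, so a value of 0.5484 corresponds to roughly 55 word errors per 100 reference words. The following is a minimal sketch of that calculation. The function name `word_error_rate` and the example sentences are illustrative assumptions, and the evaluation script used for this model may normalize text differently.

```python
def word_error_rate(reference, hypothesis):
    """Return word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative pair: one deleted word ("your") and one substitution ("suit" -> "suits").
wer = word_error_rate("she had your dark suit", "she had dark suits")
print(wer)
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.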