Wav2vec2 Base Repro Timit

Developed by patrickvonplaten
This automatic speech recognition model was fine-tuned on the TIMIT_ASR - NA dataset, starting from the patrickvonplaten/wav2vec2-base-repro-960h-libri-85k-steps checkpoint.
Downloads 20
Release Time: 3/2/2022

Model Overview

This is an English speech recognition model based on the wav2vec2 architecture, fine-tuned on the TIMIT_ASR dataset, and can be used to convert English speech to text.

Model Features

Based on wav2vec2 architecture
Uses the wav2vec2 speech recognition architecture developed by Facebook AI.
Fine-tuned on TIMIT ASR dataset
Fine-tuned on the TIMIT ASR dataset, optimized for English speech recognition.
Gradual performance improvement
Training results show that recognition accuracy improved progressively over 20 epochs of fine-tuning.

Model Capabilities

English speech recognition
Audio to text conversion
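
The capabilities above can be tried with a short script. The following is a minimal sketch, assuming the Hugging Face transformers library is installed and that the model is published on the Hub under the id patrickvonplaten/wav2vec2-base-repro-timit. That id is inferred from the developer name and model title on this page and is not confirmed by it, so check it before use. The helper name transcribe is hypothetical.

```python
# Assumption: Hub id inferred from the developer name and model title.
# Verify it on the Hugging Face Hub before relying on it.
ASSUMED_MODEL_ID = "patrickvonplaten/wav2vec2-base-repro-timit"


def transcribe(audio_path, model_id=ASSUMED_MODEL_ID):
    """Return the text transcript of a single English audio file."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    # The automatic-speech-recognition pipeline loads the model together
    # with its feature extractor and tokenizer.
    asr = pipeline("automatic-speech-recognition", model=model_id)

    # For a file path input, the pipeline decodes the audio (this needs
    # ffmpeg) and returns a dict whose "text" key holds the transcript.
    return asr(audio_path)["text"]
```

Calling transcribe("sample.wav") downloads the weights on first use, where sample.wav is a placeholder for a local recording. Building the pipeline once and reusing it is preferable when transcribing many files.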

Use Cases

Speech transcription
English speech to text
Convert English speech content into text format
Word Error Rate (WER): 0.5484
Speech assistive technology
Voice command recognition
Recognize simple voice commands
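
To put the reported WER in context: word error rate is the number of word substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the number of reference words. A value of 0.5484 therefore corresponds to roughly 55 word errors per 100 reference words. The sketch below computes WER with a standard word-level edit distance. The function name word_error_rate is illustrative, and this is not necessarily the exact script used to produce the figure above.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (one row back) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

For example, comparing the reference "the cat sat on the mat" with the hypothesis "the cat sit on mat" gives one substitution and one deletion over six reference words, a WER of 2/6. Because insertions count as errors, WER can exceed 1.0.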