Wav2vec2 Base Cv 10000
A speech recognition model obtained by fine-tuning wav2vec2-base-cv on the Common Voice dataset, achieving a word error rate of 36.84% on the evaluation set.
Downloads 28
Release Time: 3/8/2022
Model Overview
A speech recognition model based on the wav2vec2 architecture and fine-tuned on the Common Voice dataset, suitable for speech-to-text tasks.
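A minimal sketch of running a wav2vec2 checkpoint like this one for speech-to-text, assuming the Hugging Face transformers library (with a PyTorch backend, and ffmpeg for decoding audio files) is installed. The function name transcribe is illustrative, and model_id is a placeholder: this page does not state the model's repository ID, so the caller has to supply it.

```python
def transcribe(audio_path: str, model_id: str) -> str:
    """Transcribe one audio file with an automatic-speech-recognition pipeline.

    model_id is a placeholder for the checkpoint's repository ID or local path,
    which is not given on this page (assumption: the caller supplies it).
    """
    # Imported lazily so this module loads even where transformers is absent.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # For a file path input, the pipeline decodes the audio and returns a dict
    # whose "text" entry holds the transcription.
    return asr(audio_path)["text"]
```

Building the pipeline inside the function keeps the sketch short; in an application that transcribes many files it would be more sensible to build it once and reuse it.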
Model Features
Low Word Error Rate
Achieved a word error rate (WER) of 36.84% on the evaluation set, i.e. about 37 word-level errors per 100 reference words.
Based on wav2vec2 Architecture
Uses the wav2vec2-base architecture, which learns speech representations directly from raw audio waveforms.
Fine-tuning Optimization
Fine-tuned for 30 epochs on the Common Voice dataset.
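To make the word error rate figure above concrete, here is a sketch of how WER is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. This is the standard definition, not necessarily the exact evaluation script used for this model, and the function name word_error_rate is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 100% when the output contains many extra words.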
Model Capabilities
Speech Recognition
Speech-to-Text
Use Cases
Speech Transcription
Meeting Minutes
Convert meeting speech into text records in real time
Word accuracy of approximately 63.16% (1 − WER, given the 36.84% word error rate)
Voice Notes
Convert voice notes into editable text
Assistive Technology
Voice Control
Provide speech-to-text conversion for voice control applications
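The accuracy figure quoted under Meeting Minutes follows from the word error rate by simple arithmetic. A small sketch, assuming word accuracy is taken as 1 − WER; the helper name word_accuracy is illustrative, and the floor at zero is an added safeguard because WER can exceed 100%.

```python
def word_accuracy(wer: float) -> float:
    """Approximate word accuracy as 1 - WER, floored at 0 since WER can exceed 1."""
    return max(0.0, 1.0 - wer)


# WER of 36.84% reported for the evaluation set.
print(f"{word_accuracy(0.3684):.2%}")  # 63.16%
```

This is a word-level figure for the evaluation set only; accuracy on meeting audio, voice notes, or voice commands may differ.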