Wav2vec2 Final 1 Lm 3
A speech recognition model fine-tuned from facebook/wav2vec2-base. It achieves a word error rate (WER) of 0.4499 on the evaluation set, which drops to 0.126 when decoding with a 4-gram language model.
Downloads 16
Release Time: 6/2/2022
Model Overview
This is an automatic speech recognition (ASR) model based on the wav2vec2 architecture and fine-tuned on a specific dataset. It is suited to speech-to-text tasks.
Model Features
Low Word Error Rate
WER of 0.4499 on the evaluation set without a language model, reduced to 0.126 (roughly a 72% relative reduction) when decoding with a 4-gram language model
Based on wav2vec2 Architecture
Uses facebook/wav2vec2-base as the base model and reuses its pretrained speech feature encoder
Fine-tuning
Fine-tuned for 60 epochs
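The word error rate figures above can be made concrete with a short sketch. The code below is an illustrative implementation of the standard WER definition (word-level edit distance divided by the number of reference words), not the evaluation script used for this model; the function names and the sample sentences are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word deleted
                curr[j - 1] + 1,     # extra word inserted
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline error removed by the improved system."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Hypothetical sentences: one substitution and one deletion in six words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    # Reported figures: 0.4499 without and 0.126 with the 4-gram language model.
    print(round(relative_reduction(0.4499, 0.126), 3))
```

Because insertions count as errors, WER can exceed 1.0, so "1 - WER" is only a rough proxy for accuracy.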
Model Capabilities
Speech Recognition
Audio to Text
Speech Content Analysis
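For the audio-to-text step, wav2vec2 models fine-tuned for ASR typically emit one score vector per audio frame and are decoded with CTC. The sketch below shows greedy CTC decoding (the kind of decoding used when no language model is involved) in plain Python. The toy vocabulary, the blank token at index 0, and the "|" word delimiter are assumptions for illustration and are not taken from this model's tokenizer.

```python
from itertools import groupby
from typing import Sequence


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    vocab: Sequence[str],
    blank_id: int = 0,
    word_delimiter: str = "|",
) -> str:
    """Greedy CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # 1. Pick the highest-scoring token for every frame.
    best_ids = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    # 2. Collapse runs of the same token into a single token.
    collapsed = [token_id for token_id, _ in groupby(best_ids)]
    # 3. Remove blanks; a blank between two equal tokens keeps both of them.
    chars = [vocab[token_id] for token_id in collapsed if token_id != blank_id]
    # 4. Turn the word delimiter into spaces.
    return "".join(chars).replace(word_delimiter, " ").strip()


if __name__ == "__main__":
    # Hypothetical five-token vocabulary; index 0 plays the role of the CTC blank.
    toy_vocab = ["<pad>", "|", "h", "i", "o"]
    frame_ids = [2, 2, 0, 3, 1, 1, 4, 4]
    # One-hot scores stand in for real model logits.
    scores = [[1.0 if k == i else 0.0 for k in range(len(toy_vocab))] for i in frame_ids]
    print(ctc_greedy_decode(scores, toy_vocab))
```

In practice the per-frame scores would come from the model's output logits rather than hand-built one-hot vectors.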
Use Cases
Speech Transcription
Meeting Minutes
Automatically convert meeting recordings into text transcripts
Word accuracy of approximately 55.01% (1 - WER, with WER = 0.4499) without a language model
Voice Notes
Convert voice memos into searchable text
Word accuracy of approximately 87.4% (1 - WER, with WER = 0.126) when using a 4-gram language model
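The gain from the 4-gram language model can be illustrated with a toy rescoring sketch. It assumes the common approach of combining an acoustic score with a weighted language-model score and picking the best candidate; how this model's decoder actually integrates its language model is not stated here. The training sentences, acoustic scores, the add-one smoothing, and the weight alpha are all hypothetical.

```python
import math
from collections import Counter


def train_ngram_counts(sentences, order=4):
    """Count n-grams and their contexts from whitespace-tokenized sentences."""
    counts, context_counts, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        words = ["<s>"] * (order - 1) + sentence.split() + ["</s>"]
        vocab.update(words)
        for i in range(order - 1, len(words)):
            context = tuple(words[i - order + 1:i])
            counts[context + (words[i],)] += 1
            context_counts[context] += 1
    return counts, context_counts, len(vocab)


def lm_log_prob(sentence, lm, order=4):
    """Log-probability of a sentence under the add-one-smoothed n-gram counts."""
    counts, context_counts, vocab_size = lm
    words = ["<s>"] * (order - 1) + sentence.split() + ["</s>"]
    total = 0.0
    for i in range(order - 1, len(words)):
        context = tuple(words[i - order + 1:i])
        numerator = counts[context + (words[i],)] + 1
        denominator = context_counts[context] + vocab_size
        total += math.log(numerator / denominator)
    return total


def rescore(hypotheses, lm, alpha=0.5, order=4):
    """Return the text whose acoustic score plus alpha * LM score is highest."""
    best = max(hypotheses, key=lambda h: h[1] + alpha * lm_log_prob(h[0], lm, order))
    return best[0]


if __name__ == "__main__":
    # Hypothetical training text and candidate transcripts (text, acoustic log-score).
    lm = train_ngram_counts(["recognize speech today", "we recognize speech"])
    candidates = [("wreck a nice beach", -4.0), ("recognize speech", -4.5)]
    print(rescore(candidates, lm, alpha=0.0))  # acoustic score only
    print(rescore(candidates, lm, alpha=0.5))  # acoustic score plus language model
```

With alpha set to 0 the acoustically stronger but implausible candidate wins; with the language-model term included, the candidate that matches the training text is preferred. Real systems usually apply the language model during beam search and use better smoothing than add-one.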