Wav2vec2 Final 1 Lm 3

Developed by chrisvinsen
A speech recognition model fine-tuned from facebook/wav2vec2-base. It achieves a word error rate (WER) of 0.4499 on the evaluation set, which drops to 0.126 when a 4-gram language model is used.
Downloads 16
Release Time: 6/2/2022

Model Overview

This is an automatic speech recognition (ASR) model based on the wav2vec2 architecture and fine-tuned from facebook/wav2vec2-base on an unspecified dataset. It is suitable for speech-to-text tasks.

Model Features

Low Word Error Rate
Base word error rate of 0.4499 on the evaluation set, reduced to 0.126 when a 4-gram language model is used
Based on wav2vec2 Architecture
Uses facebook/wav2vec2-base as the base model for speech feature extraction
Fine-tuning
Fine-tuned for 60 epochs
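
For readers unfamiliar with the metric, the word error rate quoted above is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the model output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the evaluation script used for this model. The function name word_error_rate and the example sentences are illustrative assumptions, and splitting on whitespace is a simplification of real text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); assumes whitespace tokenization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One deleted word out of six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
# Two substitutions plus one insertion against two reference words.
print(word_error_rate("a b", "x y z"))
```

Because insertions count as errors, the value can exceed 1.0, as the second call shows. That is one reason WER is usually reported directly rather than converted to a percentage accuracy.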

Model Capabilities

Speech Recognition
Audio to Text
Speech Content Analysis

Use Cases

Speech Transcription
Meeting Minutes
Automatically convert meeting recordings into text transcripts
Word accuracy of approximately 55.01% (1 - WER, with a word error rate of 0.4499) without a language model
Voice Notes
Convert voice memos into searchable text
Word accuracy of approximately 87.4% (1 - WER, with a word error rate of 0.126) when a 4-gram language model is used
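
The accuracy figures above follow from the reported word error rates by simple arithmetic, which the sketch below reproduces. The helper names are illustrative assumptions. Treating 1 - WER as accuracy is an approximation, since WER also counts inserted words.

```python
def word_accuracy_percent(wer: float) -> float:
    """Approximate word accuracy as (1 - WER), expressed in percent."""
    return (1.0 - wer) * 100.0


def relative_wer_reduction_percent(baseline: float, improved: float) -> float:
    """Relative WER reduction from baseline to improved, in percent."""
    return (baseline - improved) / baseline * 100.0


BASE_WER = 0.4499  # reported WER without a language model
LM_WER = 0.126     # reported WER with the 4-gram language model

print(round(word_accuracy_percent(BASE_WER), 2))  # 55.01
print(round(word_accuracy_percent(LM_WER), 2))    # 87.4
print(round(relative_wer_reduction_percent(BASE_WER, LM_WER), 1))  # 72.0
```

On these numbers, adding the 4-gram language model removes roughly 72% of the word errors relative to the baseline.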