Wav2vec2 Final 1 Lm 4

Developed by chrisvinsen
A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate of 0.4499 on the evaluation set.
Downloads 16
Release Time: 6/2/2022

Model Overview

This is a speech recognition model based on the wav2vec2 architecture, fine-tuned for speech-to-text tasks.

Model Features

Low Word Error Rate
Word error rate of 0.4499 on the evaluation set, dropping to 0.126 when decoding with a 5-gram language model
Based on wav2vec2 Architecture
Utilizes facebook/wav2vec2-base as the foundational model for fine-tuning
Linear Learning Rate Scheduling
Employs a linear learning rate scheduler during training, including 800 warm-up steps
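
A linear schedule with warm-up can be sketched as below. This is a minimal illustration, not the author's training code: only the 800 warm-up steps come from the description above, while `base_lr` and `total_steps` are hypothetical parameters, since the peak learning rate and total step count are not stated here. The sketch assumes the usual shape of such a schedule, a linear ramp from zero followed by a linear decay to zero.

```python
def linear_schedule_lr(
    step: int,
    base_lr: float,
    warmup_steps: int = 800,      # from the model description
    total_steps: int = 10_000,    # hypothetical; not stated in the source
) -> float:
    """Learning rate at `step`: linear warm-up, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warm-up phase.
        return base_lr * step / max(1, warmup_steps)
    # Decay from base_lr at the end of warm-up to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # base_lr=1e-4 is an illustrative value only.
    for s in (0, 400, 800, 5_400, 10_000):
        print(s, linear_schedule_lr(s, base_lr=1e-4))
```

The warm-up phase keeps early updates small while the newly initialized output layer settles, which is a common reason for pairing fine-tuning with this kind of schedule.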

Model Capabilities

Speech-to-Text
Automatic Speech Recognition

Use Cases

Speech Transcription
Meeting Minutes
Automatically convert meeting recordings into written transcripts
Word error rate of 0.4499
Voice Notes
Convert voice memos into searchable text
Word error rate drops to 0.126 when decoding with a 5-gram language model
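
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the model output, divided by the number of reference words. The sketch below shows that standard definition on whitespace-tokenized text. It is an illustration only: the function name is hypothetical, and the evaluation script behind the figures above may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Under this definition, a rate of 0.4499 corresponds to roughly 45 word errors per 100 reference words, and 0.126 to roughly 13.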