Wav2vec2 Final 1 Lm 2

Developed by chrisvinsen
A fine-tuned speech recognition model based on facebook/wav2vec2-base, with a Word Error Rate (WER) of 0.283, or 0.126 when decoding with a 3-gram language model
Downloads 15
Release Time: 6/2/2022

Model Overview

This is a speech recognition model built on the wav2vec2 architecture and fine-tuned from facebook/wav2vec2-base on a dataset that this page does not name.

Model Features

Low Word Error Rate
Word Error Rate on the evaluation set is 0.4499, reduced to 0.126 when decoding with a 3-gram language model
Based on wav2vec2 Architecture
Uses facebook/wav2vec2-base as the base model for fine-tuning
Optimized Training
Trained for 60 epochs with a linear learning rate schedule and a warm-up phase
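
A linear schedule with warm-up, as mentioned under Optimized Training, usually raises the learning rate linearly from zero to a peak and then decays it linearly back to zero. The sketch below shows that shape only. The function name and the peak rate, warm-up length, and total step count in the usage note are hypothetical; this page does not state the values used for this model.

```python
def linear_warmup_lr(step: int, total_steps: int, warmup_steps: int, peak_lr: float) -> float:
    """Learning rate at `step` for a linear warm-up followed by a linear decay.

    All arguments are illustrative; the model card does not give the real values.
    """
    if step < warmup_steps:
        # Warm-up phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: ramp linearly from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)
```

For example, with hypothetical values of 1000 total steps, 100 warm-up steps, and a peak rate of 1e-4, the rate is half the peak at step 50, reaches the peak at step 100, and returns to zero at step 1000.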

Model Capabilities

Speech Recognition
Audio to Text Conversion
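
As a rough illustration of audio to text conversion, the sketch below loads a checkpoint through the Hugging Face transformers pipeline API. The model ID is an assumption inferred from the model name and developer shown on this page, and the helper name transcribe is hypothetical. Decoding a file path also requires ffmpeg, and language model decoding typically needs the pyctcdecode and kenlm packages.

```python
# Assumed Hub ID, inferred from the page title and developer name; not confirmed by the source.
MODEL_ID = "chrisvinsen/wav2vec2-final-1-lm-2"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported here so the module can be loaded without transformers installed.
    from transformers import pipeline

    # Builds an automatic speech recognition pipeline around the checkpoint.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    # The pipeline accepts a file path and returns a dict with a "text" key.
    result = asr(audio_path)
    return result["text"]
```

Calling transcribe("meeting.wav") with a local recording (a hypothetical file name) would download the checkpoint on first use and return the transcript as a string.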

Use Cases

Speech Transcription
Meeting Minutes Transcription
Convert meeting recordings into text transcripts
Word Error Rate 0.283
Voice Command Recognition
Recognize and understand voice commands
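
The Word Error Rate figures quoted on this page can be read as the number of word-level edits (substitutions, insertions, deletions) needed to turn the model output into the reference transcript, divided by the number of reference words. A WER of 0.283 therefore corresponds to roughly 28 errors per 100 reference words. The following is a minimal sketch of that calculation, not the evaluation script used for this model; real evaluations usually also normalize casing and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For instance, comparing the reference "turn on the kitchen light" with the hypothesis "turn on a kitchen light" gives one substitution over five reference words, so the WER is 0.2.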