Wav2vec2 Base Ft Cv3 V3

Developed by danieleV9H
A speech recognition model fine-tuned from facebook/wav2vec2-base on the Common Voice 3.0 English dataset. It achieves a word error rate (WER) of 0.247 on the test set.
Downloads 120
Release Time: 6/25/2022

Model Overview

A fine-tuned model for English speech recognition, based on the wav2vec2 architecture and trained on the Common Voice dataset.

Model Features

Reported Word Error Rate
Achieves a word error rate (WER) of 0.247 on the Common Voice 3.0 English test set, which corresponds to roughly one word-level error for every four reference words.
Based on wav2vec2 Architecture
Fine-tuned from facebook/wav2vec2-base, whose encoder is pretrained with self-supervision on unlabeled speech and provides the speech features used for recognition.
Linear Learning Rate Scheduling
Uses a linear learning rate schedule during training, which helps the model converge stably.
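
As a rough illustration of what a linear learning rate schedule looks like, the sketch below computes the learning rate at a given training step. The function name linear_lr, the optional warmup phase, and all numeric values are assumptions chosen for illustration; the listing does not state the actual training hyperparameters.

```python
def linear_lr(step: int, total_steps: int, peak_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: optional linear warmup, then linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        # Warmup phase (an assumption; the listing does not say whether warmup was used).
        return peak_lr * step / warmup_steps
    # Decay phase: the rate falls linearly from peak_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical values, not taken from the listing.
for s in (0, 50, 100, 550, 1000):
    print(s, linear_lr(s, total_steps=1000, peak_lr=1e-4, warmup_steps=100))
```

With warmup_steps left at 0, the schedule reduces to a plain linear decay from the peak rate to zero.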

Model Capabilities

English Speech Recognition
Audio-to-Text Conversion
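
A minimal sketch of how such a model might be called for audio-to-text conversion, assuming the Hugging Face transformers library is installed and that the model is published under the repository name danieleV9H/wav2vec2-base-ft-cv3-v3. That name is inferred from the title and developer shown here and is not confirmed by this listing; the helper transcribe is hypothetical.

```python
# Assumed repository name, inferred from the title and developer; not stated in the listing.
MODEL_ID = "danieleV9H/wav2vec2-base-ft-cv3-v3"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcription of one audio file (hypothetical helper)."""
    # Imported lazily so that defining this function does not require the library.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # The pipeline returns a dict whose "text" entry holds the transcription.
    return asr(audio_path)["text"]
```

Calling transcribe("memo.wav") would download the model on first use; passing a file path also relies on ffmpeg being available to decode the audio.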

Use Cases

Speech Transcription
Voice Memo Transcription
Automatically converts user voice memos into text
Approximately 75.3% word-level accuracy (computed as 1 - WER, with WER = 0.247)
Meeting Minutes
Automatically generates text versions of meeting audio recordings
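
The accuracy figure quoted for voice memo transcription is derived from the word error rate. The sketch below shows one common way to compute WER, as word-level edit distance divided by the number of reference words, and then the 1 - WER conversion. The function name word_error_rate and the example sentences are illustrative assumptions, not the evaluation code used for this model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance (substitutions, deletions, insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences: one substitution and one deletion over six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(round(wer, 3))        # 0.333
print(round(1 - 0.247, 3))  # 0.753, the accuracy figure quoted above
```

Because insertions count as errors, WER can exceed 1, so 1 - WER is only an approximate accuracy measure.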