Wav2vec2 Bert CV16 En

Developed by hf-audio
An automatic speech recognition (ASR) model based on w2v-bert-2.0 and fine-tuned on the Common Voice 16.0 English dataset
Downloads 1,700
Release Time: 1/5/2024

Model Overview

This model is an automatic speech recognition system for English. It is fine-tuned on the Common Voice 16.0 English dataset and converts English speech into text.

Model Features

Efficient Speech Recognition
Fine-tuned on the Common Voice 16.0 English dataset for English speech-to-text transcription
Low Word Error Rate
Achieves a word error rate (WER) of 14.55% and a character error rate (CER) of 5.8% on the evaluation set
Multi-GPU Training Optimization
Trained with distributed multi-GPU training, using the Adam optimizer and a linear learning rate schedule
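
The WER and CER figures above are edit-distance ratios: the number of substitutions, deletions, and insertions needed to turn the model output into the reference transcript, divided by the reference length in words (WER) or characters (CER). The sketch below shows one minimal way to compute them in pure Python. It is not the evaluation script used for this model. The function names are illustrative, spaces are counted as characters for CER, and no text normalization (casing, punctuation) is applied, all of which are assumptions that affect the resulting numbers.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference token missing in output)
                curr[j - 1] + 1,     # insertion (extra token in output)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edits divided by reference length.

    Assumption: spaces count as characters.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sit on mat"  # one substitution, one deleted word
    print(f"WER: {wer(ref, hyp):.3f}")            # 2 edits over 6 words
    print(f"Word accuracy: {1 - wer(ref, hyp):.3f}")
    print(f"CER: {cer(ref, hyp):.3f}")
```

Word-level accuracy quoted as 1 - WER follows directly from this ratio, so a WER of 14.55% corresponds to roughly 85.45%. Because insertions also count as errors, WER can exceed 100% on very poor output, in which case 1 - WER is no longer a meaningful accuracy.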

Model Capabilities

English Speech Recognition
Speech-to-Text
Automatic Speech Transcription
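
As a rough illustration of these capabilities, the sketch below transcribes an audio file with the Hugging Face transformers ASR pipeline. The repository id is an assumption inferred from the model name and developer shown on this page and should be checked against the model hub before use. The helper name transcribe and the file name in the usage note are hypothetical. Decoding an audio file from a path also assumes ffmpeg is installed.

```python
# Assumed Hub repository id (not confirmed by this page); verify before use.
MODEL_ID = "hf-audio/wav2vec2-bert-CV16-en"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the English transcript of an audio file.

    Hypothetical helper: downloads the model on first call and needs the
    transformers package with a PyTorch backend.
    """
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # the pipeline returns a dict with a "text" key
    return result["text"]
```

A call such as transcribe("memo.wav") would return the recognized text as a single string. For repeated calls, building the pipeline once and reusing it avoids reloading the model each time.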

Use Cases

Speech Transcription
Voice Memo Transcription
Automatically converts English voice memos into text
Approximately 85.45% word-level accuracy (1 - WER, derived from the 14.55% WER on the evaluation set)
Meeting Minutes Automation
Automatically generates text records of English meetings
Assistive Technology
Real-time Caption Generation
Generates real-time captions for English video content