wav2vec2-2-rnd: Open-Source Automatic Speech Recognition Model for English Speech-to-Text

Developed by sanchit-gandhi

An automatic speech recognition model trained on the LibriSpeech ASR dataset, designed to convert English speech into text.

Speech Recognition

Transformers

#High-precision speech-to-text #Low word error rate #English speech recognition

Downloads 16

Release date: 3/6/2022

Model Overview

This automatic speech recognition (ASR) model takes English speech audio as input and produces the corresponding text transcript.

Model Features

High accuracy

Achieved a word error rate (WER) of 0.1442 (14.42%) on the LibriSpeech evaluation set at the last logged checkpoint (step 16,500).
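
For reference, word error rate is the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal illustration of that definition, not the evaluation script used for this model; the function name and the example sentences are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    # Illustrative sentences only; not taken from LibriSpeech.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    print(word_error_rate("hello", "hello there big world"))
```

Because insertions count as errors, WER can exceed 1, which is why the first two rows of the training results table report values above 1.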

Optimized training process

Trained with the Adam optimizer and a linear learning rate scheduler; validation loss decreases at every logged checkpoint, from 6.0870 at step 1,500 to 0.9599 at step 16,500.
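
To make the scheduler concrete, the sketch below shows what a linear learning rate schedule computes at each step: an optional linear warmup followed by a linear decay to zero. The base learning rate, total step count, and warmup length used here are hypothetical placeholders, since this page does not list the actual hyperparameters.

```python
def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if warmup_steps > 0 and step < warmup_steps:
        # Ramp up linearly from 0 to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr (at the end of warmup) to 0 (at total_steps).
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Hypothetical values chosen only to show the shape of the schedule.
    for step in (0, 500, 1000, 5000, 10000):
        print(step, linear_lr(step, total_steps=10000, base_lr=1e-4, warmup_steps=1000))
```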

Mixed-precision training

Uses native AMP (automatic mixed precision) during training, which typically lowers memory use and shortens training time.
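
A single mixed-precision training step can be sketched as follows, assuming "native AMP" refers to PyTorch's built-in automatic mixed precision. The helper name amp_train_step and its arguments are hypothetical; this is not the training code used for this model.

```python
def amp_train_step(model, inputs, targets, loss_fn, optimizer, scaler, device_type="cuda"):
    """Run one optimizer step with PyTorch automatic mixed precision."""
    # Imported here so that defining the helper does not itself require PyTorch.
    import torch

    optimizer.zero_grad(set_to_none=True)
    # Inside autocast, eligible operations run in reduced precision.
    with torch.autocast(device_type=device_type, enabled=scaler.is_enabled()):
        loss = loss_fn(model(inputs), targets)
    # The scaler multiplies the loss to avoid gradient underflow in float16,
    # then unscales the gradients before the optimizer step.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```

On a GPU the scaler would typically be created with torch.cuda.amp.GradScaler(); passing enabled=False turns the step into ordinary full-precision training, which is useful for comparison.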

Model Capabilities

English speech recognition

Speech-to-text
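
A minimal transcription sketch using the Transformers library is shown below. The repository id sanchit-gandhi/wav2vec2-2-rnd is an assumption inferred from the developer and model names on this page, and the audio path in the usage note is a placeholder; check both before use.

```python
def transcribe(audio_path: str, model_id: str = "sanchit-gandhi/wav2vec2-2-rnd") -> str:
    """Transcribe an English audio file and return the recognized text."""
    # Imported here so that defining the helper does not itself require transformers.
    from transformers import pipeline

    # The model id default is an assumption, not confirmed by this page.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]
```

Calling transcribe("meeting.wav") downloads the model on first use and returns the transcript as a plain string; "meeting.wav" is a hypothetical file name.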

Use Cases

Speech transcription

Meeting minutes

Automatically convert meeting recordings into text transcripts.

Reduces manual transcription effort; at the reported 14.42% WER, transcripts still need proofreading.

Subtitle generation

Automatically generate English subtitles for video content.

Provides a first-draft transcript that speeds up subtitle production.
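
As an illustration of the subtitle workflow, the sketch below turns timed transcript segments into SubRip (SRT) text. It assumes the segments, each a (start seconds, end seconds, text) tuple, are already available from a separate segmentation or alignment step; this page does not state that the model produces timestamps itself, and the helper names are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Build SRT text from an iterable of (start, end, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical segments for illustration only.
    print(to_srt([(0.0, 2.4, "welcome to the show"), (2.4, 5.0, "today we talk about speech")]))
```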

Voice assistants

Voice command recognition

Recognize spoken English commands in voice assistants.

Converts each spoken command to text that the assistant can then interpret.

Training results

Training Loss	Epoch	Step	Validation Loss	WER
6.1431	1.68	1500	6.0870	1.4277
5.498	3.36	3000	5.5505	1.6318
3.575	5.04	4500	3.7856	0.6683
1.7532	6.73	6000	2.4603	0.3576
1.6379	8.41	7500	1.8847	0.2932
1.3145	10.09	9000	1.5027	0.2222
0.8389	11.77	10500	1.2637	0.1855
0.9239	13.45	12000	1.1424	0.1683
0.6666	15.13	13500	1.0562	0.1593
0.5258	16.82	15000	0.9911	0.1489
0.4733	18.5	16500	0.9599	0.1442
