Open-source speech recognition model wav2vec2-final-1-lm-4 - Accurately recognize speech content with low error rate

Wav2vec2 Final 1 Lm 4

Developed by chrisvinsen

A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate (WER) of 0.4499 on the evaluation set

Speech Recognition

Transformers

Open Source License: Apache-2.0

#Speech Recognition #Low Word Error Rate #Fine-tuned wav2vec2

Downloads: 16

Release Time: 6/2/2022

Model Overview

This is a speech recognition model based on the wav2vec2 architecture, fine-tuned for speech-to-text tasks.

Model Features

Low Word Error Rate

Achieves a word error rate (WER) of 0.4499 on the evaluation set; decoding with a 5-gram language model reduces it to 0.126
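For readers unfamiliar with the metric, the sketch below shows one common way to compute WER: the word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. The function names `word_error_rate` and `relative_reduction` are illustrative and not part of the model's own evaluation code, which the source does not show.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` lowers `baseline`."""
    return (baseline - improved) / baseline


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")

# Relative improvement from the reported 0.4499 to 0.126 with the 5-gram LM.
lm_gain = relative_reduction(0.4499, 0.126)
```

Applied to the two reported figures, `relative_reduction` gives roughly 0.72, meaning the language model removes about 72% of the word errors measured without it.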

Based on wav2vec2 Architecture

Uses facebook/wav2vec2-base as the base model for fine-tuning

Linear Learning Rate Scheduling

Trained with a linear learning rate scheduler that includes 800 warm-up steps
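A linear schedule with warm-up is usually understood as a ramp from zero to the peak learning rate, followed by a linear decay back to zero. The following is a minimal sketch of that shape, not the training code itself. Only the 800 warm-up steps come from the source; the peak learning rate `PEAK_LR` and the total step count `TOTAL_STEPS` are hypothetical values chosen for illustration.

```python
WARMUP_STEPS = 800   # from the model description
TOTAL_STEPS = 8760   # hypothetical: the source does not state the total
PEAK_LR = 1e-4       # hypothetical: the source does not state the peak rate


def linear_schedule_lr(step: int,
                       peak_lr: float = PEAK_LR,
                       warmup_steps: int = WARMUP_STEPS,
                       total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at `step` for linear warm-up followed by linear decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to peak_lr over the warm-up phase.
        return peak_lr * step / warmup_steps
    # Decay linearly from peak_lr at the end of warm-up to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


# Sample the schedule at a few points to see its shape.
for s in (0, 400, 800, 4000, 8760):
    print(s, linear_schedule_lr(s))
```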

Model Capabilities

Speech-to-Text

Automatic Speech Recognition
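As a rough illustration of speech-to-text usage, the sketch below loads the model through the Hugging Face Transformers `pipeline` API. The Hub identifier `chrisvinsen/wav2vec2-final-1-lm-4` is an assumption inferred from the developer and model names on this page, and `transcribe` is an illustrative helper name. Decoding a file path also assumes `ffmpeg` is installed, and if the repository ships a language-model decoder, the `pyctcdecode` and `kenlm` packages may be needed as well.

```python
# Assumed Hub ID, inferred from the developer and model names on this page.
MODEL_ID = "chrisvinsen/wav2vec2-final-1-lm-4"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import pipeline

    # Downloads the model on first use; requires network access.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)  # dict with a "text" key
    return result["text"]


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        # Pass the path to a local recording as the first argument.
        print(transcribe(sys.argv[1]))
```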

Use Cases

Speech Transcription

Meeting Minutes

Automatically convert meeting recordings into written transcripts

Word error rate of 0.4499 on the evaluation set without a language model

Voice Notes

Convert voice memos into searchable text

Word error rate drops to 0.126 when using a 5-gram language model

Training Results

Training Loss	Epoch	Step	Validation Loss	WER
3.4816	2.74	400	1.0717	0.8927
0.751	5.48	800	0.7155	0.7533
0.517	8.22	1200	0.7039	0.6675
0.3988	10.96	1600	0.5935	0.6149
0.3179	13.7	2000	0.6477	0.5999
0.2755	16.44	2400	0.5549	0.5798
0.2343	19.18	2800	0.6626	0.5798
0.2103	21.92	3200	0.6488	0.5674
0.1877	24.66	3600	0.5874	0.5339
0.1719	27.4	4000	0.6354	0.5389
0.1603	30.14	4400	0.6612	0.5210
0.1401	32.88	4800	0.6676	0.5131
0.1286	35.62	5200	0.6366	0.5075
0.1159	38.36	5600	0.6064	0.4977
0.1084	41.1	6000	0.6530	0.4835
0.0974	43.84	6400	0.6118	0.4853
0.0879	46.58	6800	0.6316	0.4770
0.0815	49.32	7200	0.6125	0.4664
0.0708	52.05	7600	0.6449	0.4683
0.0651	54.79	8000	0.6068	0.4571
0.0555	57.53	8400	0.6305	0.4499
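One way to read the log above is to compare where validation loss and WER each reach their minimum. The sketch below copies the step, validation loss, and WER columns from the table and finds both checkpoints; the variable names are illustrative.

```python
# (step, validation_loss, wer), copied from the training log above.
ROWS = [
    (400, 1.0717, 0.8927), (800, 0.7155, 0.7533), (1200, 0.7039, 0.6675),
    (1600, 0.5935, 0.6149), (2000, 0.6477, 0.5999), (2400, 0.5549, 0.5798),
    (2800, 0.6626, 0.5798), (3200, 0.6488, 0.5674), (3600, 0.5874, 0.5339),
    (4000, 0.6354, 0.5389), (4400, 0.6612, 0.5210), (4800, 0.6676, 0.5131),
    (5200, 0.6366, 0.5075), (5600, 0.6064, 0.4977), (6000, 0.6530, 0.4835),
    (6400, 0.6118, 0.4853), (6800, 0.6316, 0.4770), (7200, 0.6125, 0.4664),
    (7600, 0.6449, 0.4683), (8000, 0.6068, 0.4571), (8400, 0.6305, 0.4499),
]

# Checkpoint with the lowest validation loss.
best_loss_step, best_loss, _ = min(ROWS, key=lambda row: row[1])

# Checkpoint with the lowest word error rate.
best_wer_step, _, best_wer = min(ROWS, key=lambda row: row[2])

print(f"lowest validation loss {best_loss} at step {best_loss_step}")
print(f"lowest WER {best_wer} at step {best_wer_step}")
```

In this log the validation loss is lowest at step 2400, while WER keeps falling until the last logged step, 8400. Selecting a checkpoint by loss alone would therefore pick a model with a noticeably higher WER.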
