wav2vec2-7: Open-Source Speech Recognition Model, Word Error Rate of 0.52 on the Evaluation Set

wav2vec2-7

Developed by chrisvinsen

This model is a fine-tuned speech recognition model based on facebook/wav2vec2-base, achieving a word error rate of 0.52 on the evaluation set.

Downloads 20

Release Time: 5/23/2022

Model Overview

wav2vec2-7 is a speech recognition model based on the wav2vec2 architecture, primarily used for converting speech to text.
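A minimal sketch of speech-to-text inference with the Hugging Face transformers library. The Hub ID "chrisvinsen/wav2vec2-7" is an assumption pieced together from the developer and model names on this page, and transcribe is a hypothetical helper name. Running it needs transformers, a PyTorch backend, and ffmpeg for decoding audio files.

```python
# Assumed Hub ID (developer name + model name); not stated on this page.
MODEL_ID = "chrisvinsen/wav2vec2-7"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcript of one audio file (hypothetical helper)."""
    # Imported here so this module can be loaded without transformers installed.
    from transformers import pipeline

    # Builds an automatic-speech-recognition pipeline and downloads the
    # checkpoint on first use.
    asr = pipeline("automatic-speech-recognition", model=model_id)

    # For a file path, the pipeline decodes the audio with ffmpeg and
    # returns a dict whose "text" key holds the transcript.
    result = asr(audio_path)
    return result["text"]
```

A call would look like transcribe("meeting.wav"), where meeting.wav is a placeholder file name.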

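Word Error Rate

Reached a word error rate (WER) of 0.52 on the evaluation set at the last logged checkpoint (step 3800), down from 1.0 at step 200. A WER of 0.52 means the number of word-level errors is about 52% of the number of reference words.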
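For reference, WER is the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript. The sketch below is a plain-Python illustration of that definition, not the evaluation code used for this model; word_error_rate is a hypothetical name and the sentences are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference, which is how a value such as 1.0209 can occur early in training.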

Based on wav2vec2 Architecture

Fine-tuned from facebook/wav2vec2-base, reusing its pretrained speech feature encoder.

Linear Learning Rate Scheduling

Trained with a linear learning rate schedule with warm-up steps: the learning rate ramps up from zero during warm-up and then decays linearly for the rest of training.
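The shape of such a schedule can be sketched in a few lines. The base learning rate (1e-4) and warm-up length (500 steps) below are placeholders, since this page does not state them, and the total of 3840 steps is only an estimate from the training log (about 128 steps per epoch over 30 epochs). linear_schedule_lr is a hypothetical name.

```python
def linear_schedule_lr(step: int, base_lr: float,
                       warmup_steps: int, total_steps: int) -> float:
    """Learning rate at `step` for linear warm-up followed by linear decay."""
    if step < warmup_steps:
        # Warm-up: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Placeholder hyperparameters (assumptions, not taken from this page).
BASE_LR = 1e-4
WARMUP_STEPS = 500
TOTAL_STEPS = 3840

for step in (0, 250, 500, 2170, 3840):
    lr = linear_schedule_lr(step, BASE_LR, WARMUP_STEPS, TOTAL_STEPS)
    print(step, lr)
```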

Speech Recognition

Audio to Text Conversion

Speech Transcription

Meeting Minutes

Convert meeting recordings into text transcripts

Word error rate of 0.52 on the evaluation set

Voice Assistant

Used as the speech recognition module for voice assistants

Training Loss	Epoch	Step	Validation Loss	WER
5.1311	1.56	200	2.9839	1.0
2.5727	3.12	400	1.4962	1.0209
1.0187	4.69	600	0.7562	0.7859
0.637	6.25	800	0.6529	0.6960
0.4847	7.81	1000	0.6609	0.6745
0.3952	9.38	1200	0.5808	0.6220
0.3343	10.94	1400	0.5622	0.6004
0.2897	12.5	1600	0.8842	0.5980
0.2549	14.06	1800	0.6047	0.5765
0.2334	15.62	2000	0.6436	0.5699
0.2144	17.19	2200	0.5831	0.5593
0.1982	18.75	2400	0.6327	0.5620
0.1817	20.31	2600	0.8790	0.5456
0.1713	21.88	2800	0.9603	0.5362
0.163	23.44	3000	0.5940	0.5384
0.1539	25.0	3200	0.6058	0.5311
0.1392	26.56	3400	0.6131	0.5221
0.1386	28.12	3600	0.6066	0.5258
0.1351	29.69	3800	0.6017	0.5200
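The log above can be read programmatically to compare checkpoints. The sketch below parses the same rows and picks out the step with the lowest WER and the step with the lowest validation loss; the variable and function names are illustrative only.

```python
# Columns: training loss, epoch, step, validation loss, WER (copied from the log).
LOG = """
5.1311 1.56 200 2.9839 1.0
2.5727 3.12 400 1.4962 1.0209
1.0187 4.69 600 0.7562 0.7859
0.637 6.25 800 0.6529 0.6960
0.4847 7.81 1000 0.6609 0.6745
0.3952 9.38 1200 0.5808 0.6220
0.3343 10.94 1400 0.5622 0.6004
0.2897 12.5 1600 0.8842 0.5980
0.2549 14.06 1800 0.6047 0.5765
0.2334 15.62 2000 0.6436 0.5699
0.2144 17.19 2200 0.5831 0.5593
0.1982 18.75 2400 0.6327 0.5620
0.1817 20.31 2600 0.8790 0.5456
0.1713 21.88 2800 0.9603 0.5362
0.163 23.44 3000 0.5940 0.5384
0.1539 25.0 3200 0.6058 0.5311
0.1392 26.56 3400 0.6131 0.5221
0.1386 28.12 3600 0.6066 0.5258
0.1351 29.69 3800 0.6017 0.5200
"""


def parse_log(text: str) -> list[dict]:
    """Turn whitespace-separated log rows into dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, wer = line.split()
        rows.append({
            "train_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "val_loss": float(val_loss),
            "wer": float(wer),
        })
    return rows


rows = parse_log(LOG)
best_wer = min(rows, key=lambda r: r["wer"])
best_val = min(rows, key=lambda r: r["val_loss"])

print("lowest WER:", best_wer["step"], best_wer["wer"])
print("lowest validation loss:", best_val["step"], best_val["val_loss"])
```

In this log the two criteria disagree: validation loss is lowest at step 1400, while WER keeps improving until step 3800, so the choice of checkpoint depends on which metric matters for the application.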
