wav2vec2-base-checkpoint-12: Open Source Speech Recognition Model

Wav2vec2 Base Checkpoint 12

Developed by jiobiala24

This model is a fine-tuned version of wav2vec2-base-checkpoint-11.1, trained on the Common Voice dataset and intended primarily for speech recognition tasks.

Downloads 16

Release Time: 3/2/2022

Model Overview

wav2vec2-base-checkpoint-12 is a speech recognition model based on the wav2vec2 architecture, fine-tuned on the Common Voice dataset.

Efficient Fine-tuning

Starts from wav2vec2-base-checkpoint-11.1 and continues fine-tuning on the Common Voice dataset to improve speech recognition accuracy.

Low Word Error Rate

Reached a word error rate (WER) of 0.3452 on the evaluation set at the final logged step (18000). The lowest WER in the training log is 0.3430, at step 17000.
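To make the metric concrete, here is a minimal sketch of how a corpus-level WER can be computed: the word-level edit distance (substitutions, deletions, insertions) summed over all utterances, divided by the total number of reference words. The page does not say which tool produced the 0.3452 figure, so the function names and the whitespace tokenization below are assumptions for illustration only.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance between a reference and a hypothesis."""
    # prev[j] holds the distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Corpus-level WER: total word errors / total reference words."""
    total_errors = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()  # assumption: plain whitespace tokenization
        total_errors += edit_distance(ref_words, hyp.split())
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return total_errors / total_words


# Illustrative transcripts (not taken from Common Voice).
refs = ["the cat sat on the mat", "hello world"]
hyps = ["the cat sat on mat", "hello word"]
print(word_error_rate(refs, hyps))  # 2 errors over 8 reference words -> 0.25
```

Read this way, a WER of 0.3452 corresponds to roughly 34.5 word errors per 100 reference words.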

Mixed Precision Training

Trained with native AMP (automatic mixed precision), which typically reduces memory use and training time.

Speech Recognition

Audio to Text

Speech Transcription

Speech to Text

Convert speech audio into text content
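A minimal sketch of the speech-to-text use case with the Hugging Face `transformers` pipeline is shown below. The Hub identifier `jiobiala24/wav2vec2-base-checkpoint-12` is an assumption built from the developer and model names on this page, and `transcribe` is a hypothetical helper, not part of any library.

```python
# Assumed Hub identifier ("<developer>/<model name>"); verify before use.
MODEL_ID = "jiobiala24/wav2vec2-base-checkpoint-12"


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the transcription of a single audio file as a string."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    # Downloads the model on first use; requires network access.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    # For a file path input, the pipeline returns a dict with a "text" field.
    return asr(audio_path)["text"]
```

A call would look like `transcribe("sample.wav")`, where `sample.wav` is a placeholder for your own recording. Decoding audio files through the pipeline generally requires ffmpeg to be installed on the system.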

Word error rate 0.3452

Training Loss	Epoch	Step	Validation Loss	WER
0.2793	1.64	1000	0.5692	0.3518
0.2206	3.28	2000	0.6127	0.3460
0.1733	4.93	3000	0.6622	0.3580
0.1391	6.57	4000	0.6768	0.3519
0.1193	8.21	5000	0.7559	0.3540
0.1053	9.85	6000	0.7873	0.3562
0.093	11.49	7000	0.8170	0.3612
0.0833	13.14	8000	0.8682	0.3579
0.0753	14.78	9000	0.8317	0.3573
0.0698	16.42	10000	0.9213	0.3525
0.0623	18.06	11000	0.9746	0.3531
0.0594	19.7	12000	1.0027	0.3502
0.0538	21.35	13000	1.0045	0.3545
0.0504	22.99	14000	0.9821	0.3523
0.0461	24.63	15000	1.0818	0.3462
0.0439	26.27	16000	1.0995	0.3495
0.0421	27.91	17000	1.0533	0.3430
0.0415	29.56	18000	1.0795	0.3452
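The log above can be summarized with a short script. The sketch below copies the table values into a list and reports which logged step had the lowest WER and which had the lowest validation loss. The `LogRow` type and variable names are illustrative assumptions. The page does not state which step the published weights come from, although the reported WER of 0.3452 matches the final row.

```python
from typing import NamedTuple


class LogRow(NamedTuple):
    step: int
    train_loss: float
    val_loss: float
    wer: float


# Values copied from the training log table (Epoch column omitted).
LOG = [
    LogRow(1000, 0.2793, 0.5692, 0.3518),
    LogRow(2000, 0.2206, 0.6127, 0.3460),
    LogRow(3000, 0.1733, 0.6622, 0.3580),
    LogRow(4000, 0.1391, 0.6768, 0.3519),
    LogRow(5000, 0.1193, 0.7559, 0.3540),
    LogRow(6000, 0.1053, 0.7873, 0.3562),
    LogRow(7000, 0.0930, 0.8170, 0.3612),
    LogRow(8000, 0.0833, 0.8682, 0.3579),
    LogRow(9000, 0.0753, 0.8317, 0.3573),
    LogRow(10000, 0.0698, 0.9213, 0.3525),
    LogRow(11000, 0.0623, 0.9746, 0.3531),
    LogRow(12000, 0.0594, 1.0027, 0.3502),
    LogRow(13000, 0.0538, 1.0045, 0.3545),
    LogRow(14000, 0.0504, 0.9821, 0.3523),
    LogRow(15000, 0.0461, 1.0818, 0.3462),
    LogRow(16000, 0.0439, 1.0995, 0.3495),
    LogRow(17000, 0.0421, 1.0533, 0.3430),
    LogRow(18000, 0.0415, 1.0795, 0.3452),
]

best_by_wer = min(LOG, key=lambda row: row.wer)
best_by_val_loss = min(LOG, key=lambda row: row.val_loss)
final = LOG[-1]

print(f"lowest WER: {best_by_wer.wer:.4f} at step {best_by_wer.step}")
print(f"lowest validation loss: {best_by_val_loss.val_loss:.4f} at step {best_by_val_loss.step}")
print(f"final row: WER {final.wer:.4f} at step {final.step}")
```

Training loss falls steadily while validation loss rises from 0.5692 to above 1.0 and WER stays between 0.3430 and 0.3612. This pattern is often a sign of overfitting, so picking a checkpoint by evaluation WER rather than by the last step may be worth considering.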

Property	Details
Model Type	Fine-tuned wav2vec2 model
Training Data	common_voice dataset
