wav2vec2-base-vios-commonvoice-1: Open-Source Model for Automatic Speech Recognition

Wav2vec2 Base Vios Commonvoice 1

Developed by tclong

This speech recognition model is a fine-tuned version of facebook/wav2vec2-xls-r-300m, trained on the Common Voice dataset for automatic speech recognition (ASR).

Speech Recognition

Transformers

Open Source License: Apache-2.0 #Speech Recognition #Low Word Error Rate #Multilingual Support

Downloads 21

Release Time: 6/10/2022

Model Overview

This is a speech recognition model based on the wav2vec2 architecture, fine-tuned for converting speech to text.

Model Features

Based on wav2vec2 architecture

Builds on the wav2vec2 architecture through the pretrained facebook/wav2vec2-xls-r-300m checkpoint

Fine-tuning optimization

Fine-tuned on the Common Voice dataset to optimize recognition performance

Reported Word Error Rate

Reached a word error rate (WER) of 0.3621 on the evaluation set at the end of training; the lowest WER logged during training was 0.2847, at step 14000
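
For readers unfamiliar with the metric, WER is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The following is a minimal pure-Python sketch of that definition. It is illustrative only: the function name is hypothetical, and the tooling actually used to compute the reported figures is not stated in this card.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
example = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Because insertions count as errors, WER can exceed 1.0 when the output contains more wrong words than the reference has words, which is consistent with the 1.0008 logged at step 1000 in the training results.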

Model Capabilities

Speech Recognition

Audio to Text Conversion

Use Cases

Speech Transcription

Speech-to-Text Service

Convert speech content into text transcripts

Word error rate 0.3621

Assistive Technology

Real-time Caption Generation

Generate real-time captions for video or live streaming content

🚀 wav2vec2-base-vios-commonvoice-1

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m for speech recognition. The card records the training dataset as None (the dataset name was apparently not captured when the card was generated); the overview above identifies it as Common Voice.

🚀 Quick Start

The model achieves the following results on the evaluation set:

Loss: 0.8913
Wer: 0.3621
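
The card does not include usage code, so the following is a minimal sketch of transcribing an audio file with the Transformers `pipeline` API. The Hub identifier `tclong/wav2vec2-base-vios-commonvoice-1` is an assumption pieced together from the developer and model names on this page, and `sample.wav` is a hypothetical file. Passing a file path to the pipeline relies on ffmpeg being installed for decoding.

```python
# Assumed Hub id (developer name + model name from this page); verify before use.
MODEL_ID = "tclong/wav2vec2-base-vios-commonvoice-1"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcript of one audio file.

    The import is done lazily so that defining this function does not
    require transformers to be installed or the model to be downloaded.
    """
    from transformers import pipeline

    # The pipeline downloads the model on first use and decodes the file
    # at the sampling rate expected by the model's feature extractor.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr(audio_path)["text"]


# Example call (hypothetical file name):
#     text = transcribe("sample.wav")
```

Building the pipeline inside the function keeps the sketch short; for repeated calls it would be cheaper to build it once and reuse it.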

📚 Documentation

Training and Evaluation

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 1000
num_epochs: 30
mixed_precision_training: Native AMP
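
Two of these settings follow from simple arithmetic, sketched below. The total batch size is the per-step batch size times the gradient accumulation steps, and a linear scheduler with warmup ramps the learning rate from 0 to its peak over the warmup steps, then decays it linearly to 0. The total step count of 27,180 is an assumption, estimated from the training results table (step 27000 at epoch 29.8, so roughly 906 steps per epoch over 30 epochs); the function name is hypothetical.

```python
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 2

# 8 samples per step x 2 accumulated steps = 16 samples per optimizer update.
total_train_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS


def linear_schedule_lr(
    step: int,
    base_lr: float = 5e-05,
    warmup_steps: int = 1000,
    total_steps: int = 27_180,  # assumption: estimated, not stated in the card
) -> float:
    """Learning rate at a given optimizer step for linear warmup then linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: fall linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


lr_mid_warmup = linear_schedule_lr(500)    # halfway through warmup
lr_peak = linear_schedule_lr(1000)         # end of warmup, peak rate
lr_end = linear_schedule_lr(27_180)        # end of the assumed schedule
```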

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
3.4706	0.55	500	3.4725	1.0
3.202	1.1	1000	2.7555	1.0008
1.0507	1.66	1500	1.0481	0.6196
0.7325	2.21	2000	0.8120	0.4958
0.599	2.76	2500	0.7035	0.4447
0.5224	3.31	3000	0.6761	0.4078
0.4844	3.86	3500	0.6688	0.4011
0.4234	4.42	4000	0.6080	0.3729
0.4237	4.97	4500	0.5953	0.3556
0.3986	5.52	5000	0.6054	0.3478
0.3554	6.07	5500	0.6193	0.3479
0.3446	6.62	6000	0.5809	0.3302
0.3104	7.17	6500	0.5713	0.3283
0.3166	7.73	7000	0.5593	0.3133
0.2938	8.28	7500	0.5645	0.3081
0.3061	8.83	8000	0.5508	0.3020
0.2986	9.38	8500	0.5462	0.3024
0.2939	9.93	9000	0.5544	0.3028
0.2633	10.49	9500	0.5496	0.3024
0.2683	11.04	10000	0.5439	0.2946
0.2714	11.59	10500	0.5524	0.2947
0.2354	12.14	11000	0.5267	0.2918
0.2488	12.69	11500	0.5728	0.2938
0.2479	13.25	12000	0.5802	0.2951
0.245	13.8	12500	0.5571	0.2890
0.2422	14.35	13000	0.5531	0.2871
0.2369	14.9	13500	0.5453	0.2860
0.2345	15.45	14000	0.5452	0.2847
0.2507	16.0	14500	0.5536	0.2884
0.2454	16.56	15000	0.5577	0.2871
0.2729	17.11	15500	0.6019	0.2931
0.2743	17.66	16000	0.5619	0.2905
0.3031	18.21	16500	0.6401	0.3006
0.315	18.76	17000	0.6044	0.2990
0.4025	19.32	17500	0.6739	0.3304
0.4915	19.87	18000	0.7267	0.3472
0.5539	20.42	18500	0.8078	0.3483
0.7138	20.97	19000	0.9362	0.3765
0.5766	21.52	19500	0.7921	0.3392
0.688	22.08	20000	0.8833	0.3693
0.6964	22.63	20500	0.9137	0.3469
0.7389	23.18	21000	0.9379	0.3460
0.7851	23.73	21500	1.0438	0.3653
0.7619	24.28	22000	0.9313	0.3873
0.7175	24.83	22500	0.8668	0.3789
0.6842	25.39	23000	0.8243	0.3761
0.6941	25.94	23500	0.8557	0.3804
0.7167	26.49	24000	0.8618	0.3875
0.721	27.04	24500	0.8686	0.3764
0.6949	27.59	25000	0.8773	0.3690
0.727	28.15	25500	0.8769	0.3666
0.7363	28.7	26000	0.8867	0.3634
0.7157	29.25	26500	0.8895	0.3626
0.7385	29.8	27000	0.8913	0.3621
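
The table shows validation loss and WER reaching their lowest values mid-run and then rising again toward the end of training. A short sketch for locating the best rows follows. It uses only an excerpt of rows copied from the table above (the excerpt includes the overall minima), and the names `EvalRow` and `parse_rows` are hypothetical. The card does not state whether checkpoints from those steps were kept.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    wer: float


# Excerpt of rows copied from the training results table:
# Training Loss, Epoch, Step, Validation Loss, Wer
EXCERPT = """
3.4706 0.55 500 3.4725 1.0
0.3986 5.52 5000 0.6054 0.3478
0.2354 12.14 11000 0.5267 0.2918
0.2345 15.45 14000 0.5452 0.2847
0.4915 19.87 18000 0.7267 0.3472
0.7619 24.28 22000 0.9313 0.3873
0.7385 29.8 27000 0.8913 0.3621
"""


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace-separated table rows into typed records."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, wer = line.split()
        rows.append(
            EvalRow(float(train_loss), float(epoch), int(step), float(val_loss), float(wer))
        )
    return rows


rows = parse_rows(EXCERPT)
best_wer_row = min(rows, key=lambda r: r.wer)        # lowest WER
best_loss_row = min(rows, key=lambda r: r.val_loss)  # lowest validation loss
final_row = rows[-1]                                 # last logged evaluation

print(f"lowest WER {best_wer_row.wer} at step {best_wer_row.step}")
print(f"lowest validation loss {best_loss_row.val_loss} at step {best_loss_row.step}")
print(f"final WER {final_row.wer} at step {final_row.step}")
```

The final metrics reported in the Quick Start section correspond to the last row, not to the best row.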

Framework versions

Transformers 4.19.3
PyTorch 1.11.0+cu113
Datasets 2.2.2
Tokenizers 0.12.1

📄 License

This project is licensed under the Apache-2.0 license.
