# ai-light-dance_singing2_ft_wav2vec2-large-xlsr-53
This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the /WORKSPACE/ASANTE/AI-LIGHT-DANCE_DATASETS/AI_LIGHT_DANCE.PY - ONSET-SINGING2 dataset. It is intended for automatic speech recognition on singing audio; its evaluation results are reported below.
## Quick Start
This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the /WORKSPACE/ASANTE/AI-LIGHT-DANCE_DATASETS/AI_LIGHT_DANCE.PY - ONSET-SINGING2 dataset.
It achieves the following results on the evaluation set (final epoch; see the training results table below):
- Loss: 1.7583
- Wer: 0.9386
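A minimal inference sketch is shown below. It assumes the checkpoint is published under this repository name and that `transformers` and `torch` are installed; the `transcribe` helper and the 16 kHz mono-waveform input are illustrative assumptions, not part of the released training code.

```python
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# Assumed repository id for this checkpoint (replace with the actual hub path).
MODEL_ID = "ai-light-dance_singing2_ft_wav2vec2-large-xlsr-53"


def transcribe(waveform, sampling_rate=16_000, model_id=MODEL_ID):
    """Transcribe a 1-D float waveform (expected at 16 kHz) to text via CTC decoding."""
    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Audio sampled at a different rate should be resampled to 16 kHz first, since wav2vec2-large-xlsr-53 was pretrained on 16 kHz speech.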
## Documentation
### Model description
More information needed
### Intended uses & limitations
More information needed
### Training and evaluation data
More information needed
### Training procedure
#### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 4e-06
- train_batch_size: 10
- eval_batch_size: 10
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 160
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 30.0
- mixed_precision_training: Native AMP
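The `total_train_batch_size` above is not an independent setting: it is the per-device batch size multiplied by the gradient accumulation steps. A quick sketch of that arithmetic, using the values from the list:

```python
# Effective (total) train batch size = per-device batch size x accumulation steps.
train_batch_size = 10
gradient_accumulation_steps = 16

total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 160, matching the hyperparameter list
```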
#### Training results

| Training Loss | Epoch | Step | Validation Loss | Wer    |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 27.4755       | 1.0   | 112  | 23.2618         | 1.0    |
| 5.5145        | 2.0   | 224  | 5.2213          | 1.0    |
| 4.2211        | 3.0   | 336  | 4.1673          | 1.0    |
| 3.8386        | 4.0   | 448  | 3.8253          | 1.0    |
| 3.5531        | 5.0   | 560  | 3.6286          | 1.0    |
| 3.5215        | 6.0   | 672  | 3.4762          | 0.9864 |
| 3.3493        | 7.0   | 784  | 3.3549          | 0.9847 |
| 3.1264        | 8.0   | 896  | 3.1797          | 0.9759 |
| 2.7557        | 9.0   | 1008 | 2.8703          | 0.9865 |
| 2.6345        | 10.0  | 1120 | 2.6736          | 0.9970 |
| 2.4297        | 11.0  | 1232 | 2.5638          | 1.0337 |
| 2.3057        | 12.0  | 1344 | 2.3680          | 0.9839 |
| 2.1436        | 13.0  | 1456 | 2.2367          | 0.9648 |
| 2.0856        | 14.0  | 1568 | 2.1635          | 0.9586 |
| 2.0035        | 15.0  | 1680 | 2.0945          | 0.9645 |
| 1.9134        | 16.0  | 1792 | 2.0395          | 0.9630 |
| 1.9443        | 17.0  | 1904 | 2.0017          | 0.9401 |
| 1.8988        | 18.0  | 2016 | 1.9514          | 0.9493 |
| 1.8141        | 19.0  | 2128 | 1.9111          | 0.9475 |
| 1.8344        | 20.0  | 2240 | 1.8790          | 0.9395 |
| 1.7775        | 21.0  | 2352 | 1.8616          | 0.9503 |
| 1.7517        | 22.0  | 2464 | 1.8333          | 0.9433 |
| 1.7037        | 23.0  | 2576 | 1.8156          | 0.9372 |
| 1.7158        | 24.0  | 2688 | 1.7961          | 0.9482 |
| 1.7111        | 25.0  | 2800 | 1.7817          | 0.9422 |
| 1.69          | 26.0  | 2912 | 1.7819          | 0.9430 |
| 1.6889        | 27.0  | 3024 | 1.7721          | 0.9386 |
| 1.6546        | 28.0  | 3136 | 1.7647          | 0.9453 |
| 1.6542        | 29.0  | 3248 | 1.7653          | 0.9375 |
| 1.647         | 30.0  | 3360 | 1.7583          | 0.9386 |
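The Wer column is word error rate: the word-level edit distance (substitutions, insertions, deletions) between the predicted and reference transcripts, divided by the number of reference words. A minimal pure-Python sketch of that computation, for illustration only (the training run itself would typically use a library such as `jiwer` or `evaluate`):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / number of reference words.

    Assumes a non-empty reference; whitespace tokenization is an illustrative choice.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    # One-row dynamic-programming edit distance over words.
    d = list(range(len(hyp) + 1))  # row for the empty reference prefix
    for i, r in enumerate(ref, 1):
        prev = d[0]  # d[i-1][0]
        d[0] = i
        for j, h in enumerate(hyp, 1):
            temp = d[j]  # d[i-1][j]
            d[j] = min(temp + 1,              # deletion
                       d[j - 1] + 1,          # insertion
                       prev + (r != h))       # substitution (free if words match)
            prev = temp
    return d[len(hyp)] / len(ref)


print(wer("a b c", "a x c"))  # 0.333... (one substitution out of three words)
```

A WER above 1.0, as in epoch 11, is possible because insertions can outnumber the reference words.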
### Framework versions
- Transformers 4.21.0.dev0
- Pytorch 1.9.1+cu102
- Datasets 2.3.3.dev0
- Tokenizers 0.12.1
## License
This model is licensed under the Apache-2.0 license.