# 🚀 wav2vec2-large-xls-r-300m-bas-v1
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - BAS dataset. It is built for automatic speech recognition (ASR) in the Basaa (bas) language; see the training results below for its evaluation metrics.
## ✨ Features
- Language Support: Basaa (bas).
- Fine-Tuned Model: Based on the pre-trained facebook/wav2vec2-xls-r-300m model, fine-tuned on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - BAS dataset.
- Performance Metrics: Reports word error rate (WER) and character error rate (CER) on the evaluation set; the final validation WER is 0.3870 (see Training results below).
## 📦 Installation
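The original card does not list installation steps. As a minimal sketch, assuming the standard Hugging Face Transformers workflow (package versions as listed under Framework versions below), the following should be sufficient:

```bash
pip install transformers torch datasets
```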
## 💻 Usage Examples
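The original card does not include usage examples. The sketch below loads the model through the Hugging Face Transformers ASR pipeline; the model ID is taken from the evaluation command in the Documentation section, and the audio file path is a placeholder. Note that wav2vec2 XLS-R models expect 16 kHz audio.

```python
from transformers import pipeline

# Load the fine-tuned Basaa ASR model from the Hugging Face Hub.
asr = pipeline(
    "automatic-speech-recognition",
    model="DrishtiSharma/wav2vec2-large-xls-r-300m-bas-v1",
)

# Transcribe a local audio file (placeholder path; 16 kHz mono recommended).
result = asr("sample_basaa.wav")
print(result["text"])
```

For lower-level control over preprocessing and decoding, the model can also be loaded directly with Wav2Vec2ForCTC and Wav2Vec2Processor.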
## 📚 Documentation
### Evaluation Commands
- To evaluate on mozilla-foundation/common_voice_8_0 with the test split:

```bash
python eval.py --model_id DrishtiSharma/wav2vec2-large-xls-r-300m-bas-v1 --dataset mozilla-foundation/common_voice_8_0 --config bas --split test --log_outputs
```
- To evaluate on speech-recognition-community-v2/dev_data: the Basaa (bas) language is not available in speech-recognition-community-v2/dev_data, so no evaluation command applies.
### Training hyperparameters
The following hyperparameters were used during training:
| Property | Details |
|----------|---------|
| learning_rate | 0.000111 |
| train_batch_size | 16 |
| eval_batch_size | 8 |
| seed | 42 |
| gradient_accumulation_steps | 2 |
| total_train_batch_size | 32 |
| optimizer | Adam with betas=(0.9,0.999) and epsilon=1e-08 |
| lr_scheduler_type | linear |
| lr_scheduler_warmup_steps | 500 |
| num_epochs | 100 |
| mixed_precision_training | Native AMP |
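These values map directly onto Hugging Face TrainingArguments. The snippet below is a hypothetical reconstruction for readers who want to set up a comparable run, not the author's actual training script; output_dir is a placeholder.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./wav2vec2-large-xls-r-300m-bas-v1",  # placeholder path
    learning_rate=0.000111,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # 16 x 2 = total train batch size of 32
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=100,
    fp16=True,  # native AMP mixed-precision training
)
```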
### Training results
| Training Loss | Epoch | Step | Validation Loss | Wer |
|---------------|-------|------|-----------------|-----|
| 12.7076 | 5.26 | 200 | 3.6361 | 1.0 |
| 3.1657 | 10.52 | 400 | 3.0101 | 1.0 |
| 2.3987 | 15.78 | 600 | 0.9125 | 0.6774 |
| 1.0079 | 21.05 | 800 | 0.6477 | 0.5352 |
| 0.7392 | 26.31 | 1000 | 0.5432 | 0.4929 |
| 0.6114 | 31.57 | 1200 | 0.5498 | 0.4639 |
| 0.5222 | 36.83 | 1400 | 0.5220 | 0.4561 |
| 0.4648 | 42.1 | 1600 | 0.5586 | 0.4289 |
| 0.4103 | 47.36 | 1800 | 0.5337 | 0.4082 |
| 0.3692 | 52.62 | 2000 | 0.5421 | 0.3861 |
| 0.3403 | 57.88 | 2200 | 0.5549 | 0.4096 |
| 0.3011 | 63.16 | 2400 | 0.5833 | 0.3925 |
| 0.2932 | 68.42 | 2600 | 0.5674 | 0.3815 |
| 0.2696 | 73.68 | 2800 | 0.5734 | 0.3889 |
| 0.2496 | 78.94 | 3000 | 0.5968 | 0.3985 |
| 0.2289 | 84.21 | 3200 | 0.5888 | 0.3893 |
| 0.2091 | 89.47 | 3400 | 0.5849 | 0.3852 |
| 0.2005 | 94.73 | 3600 | 0.5938 | 0.3875 |
| 0.1876 | 99.99 | 3800 | 0.5997 | 0.3870 |
### Framework versions
- Transformers 4.16.1
- Pytorch 1.10.0+cu111
- Datasets 1.18.2
- Tokenizers 0.11.0
## 🔧 Technical Details
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - BAS dataset and is intended for automatic speech recognition. It was trained with the hyperparameters listed above and evaluated with word error rate (WER) and character error rate (CER).
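For reference, WER counts word-level substitutions, deletions, and insertions against the number of reference words, and CER does the same at the character level. A minimal sketch using the jiwer package (the transcripts below are made-up placeholders, not real Basaa output):

```python
from jiwer import cer, wer  # pip install jiwer

reference = "hello world example"  # placeholder ground-truth transcript
hypothesis = "hello word example"  # placeholder model output

print(f"WER: {wer(reference, hypothesis):.4f}")  # 1 substituted word / 3 words
print(f"CER: {cer(reference, hypothesis):.4f}")  # 1 deleted char / 19 chars
```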
## 📄 License
The model is released under the Apache-2.0 license.