# Wav2Vec2-XLS-R-300M for Punjabi (pa-IN)
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the Punjabi (pa-IN) subset of the Mozilla Foundation's Common Voice 8.0 dataset. It provides automatic speech recognition for the Punjabi language.
## Quick Start
### Evaluation

To evaluate the model, run the following command.

Evaluate on `mozilla-foundation/common_voice_8_0` with the test split:

```bash
python eval.py --model_id DrishtiSharma/wav2vec2-xls-r-300m-pa-IN-r5 --dataset mozilla-foundation/common_voice_8_0 --config pa-IN --split test --log_outputs
```

Evaluation on `speech-recognition-community-v2/dev_data` is not possible, because the Punjabi language is not available in that dataset.
## Features
- Fine-tuned for Punjabi: trained specifically on the Punjabi (pa-IN) subset of the Common Voice 8.0 dataset, for better performance on Punjabi speech recognition.
- Strong results: achieves a test WER of 0.4187 and a test CER of 0.1330 on Common Voice 8.0.
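WER and CER are word- and character-level edit-distance error rates. As a point of reference, here is a minimal sketch of how they are computed (not the scorer that `eval.py` actually uses):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (single-row DP)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            # min of deletion, insertion, and substitution/match
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (r != h))
    return dp[-1]

def wer(reference, hypothesis):
    """Word error rate: word-level edits / reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    """Character error rate: character-level edits / reference length."""
    return edit_distance(reference, hypothesis) / len(reference)

print(wer("a b c d", "a x c d"))  # → 0.25 (one substitution out of four words)
```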
## Installation

No specific installation steps are provided in the original README.
## Usage Examples

### Basic Usage

Run the evaluation commands shown above to test the model's performance.
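For direct transcription, the model can be loaded with the standard `transformers` CTC API. This is a sketch using greedy (argmax) decoding; the silent dummy clip below is a stand-in for real 16 kHz mono Punjabi audio:

```python
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "DrishtiSharma/wav2vec2-xls-r-300m-pa-IN-r5"
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# Replace this one-second silent clip with a real 16 kHz mono waveform.
speech = torch.zeros(16000).numpy()

inputs = processor(speech, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding; a language model could further improve results.
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)[0]
print(transcription)
```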
## Documentation

### Model Information

| Property | Details |
|----------|---------|
| Model Type | Fine-tuned wav2vec2-xls-r-300m for Punjabi (pa-IN) |
| Training Data | mozilla-foundation/common_voice_8_0 (pa-IN subset) |
### Evaluation Results

This model achieves the following results on the evaluation set:

- Test WER: 0.4187
- Test CER: 0.1330
### Training Hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.000111
- train_batch_size: 16
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 2000
- num_epochs: 200.0
- mixed_precision_training: Native AMP
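The `linear` scheduler ramps the learning rate up from 0 over the 2,000 warmup steps and then decays it linearly back to 0. A minimal sketch, assuming 5,000 total optimizer steps (the last step reported in the training results):

```python
def linear_lr(step, base_lr=1.11e-4, warmup_steps=2000, total_steps=5000):
    """Linear warmup followed by linear decay (Hugging Face `linear` schedule)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0.0, (total_steps - step) / (total_steps - warmup_steps))
    return base_lr * remaining

print(linear_lr(1000))  # halfway through warmup → half the base learning rate
```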
### Training Results

| Training Loss | Epoch | Step | Validation Loss | WER |
|---------------|-------|------|-----------------|--------|
| 10.695 | 18.52 | 500 | 3.5681 | 1.0 |
| 3.2718 | 37.04 | 1000 | 2.3081 | 0.9643 |
| 0.8727 | 55.56 | 1500 | 0.7227 | 0.5147 |
| 0.3349 | 74.07 | 2000 | 0.7498 | 0.4959 |
| 0.2134 | 92.59 | 2500 | 0.7779 | 0.4720 |
| 0.1445 | 111.11 | 3000 | 0.8120 | 0.4594 |
| 0.1057 | 129.63 | 3500 | 0.8225 | 0.4610 |
| 0.0826 | 148.15 | 4000 | 0.8307 | 0.4351 |
| 0.0639 | 166.67 | 4500 | 0.8967 | 0.4316 |
| 0.0528 | 185.19 | 5000 | 0.8875 | 0.4238 |
### Framework Versions
- Transformers 4.17.0.dev0
- Pytorch 1.10.2+cu102
- Datasets 1.18.2.dev0
- Tokenizers 0.11.0
## License

This model is licensed under the Apache 2.0 license.