Akashpb13/xlsr_kurmanji_kurdish
This model performs automatic speech recognition for the Kurmanji Kurdish language. It transcribes Kurmanji speech using a wav2vec2 XLS-R model that was fine-tuned on Common Voice data.
Quick Start
Evaluation
To evaluate the model on mozilla-foundation/common_voice_8_0 with the test split, you can use the following command:
python eval.py --model_id Akashpb13/xlsr_kurmanji_kurdish --dataset mozilla-foundation/common_voice_8_0 --config kmr --split test
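Besides running the evaluation script, the checkpoint can presumably be loaded for plain transcription. The following is a minimal sketch, not taken from the original README: it assumes `transformers` and `torch` are installed, that the repository ships the processor files the pipeline needs, and that ffmpeg is available for decoding audio files. The helper name `transcribe` and the file name `sample.wav` are hypothetical.

```python
# Sketch only: `transcribe` and "sample.wav" are illustrative names,
# not part of the original model card.
MODEL_ID = "Akashpb13/xlsr_kurmanji_kurdish"


def transcribe(audio_path: str) -> str:
    """Return the model's transcription of a single audio file."""
    # Imported lazily so this module loads even without the dependency.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    # Given a file path, the pipeline decodes the audio and resamples it
    # to the rate the model expects (this step relies on ffmpeg).
    return asr(audio_path)["text"]


# Example call (downloads the model on first use):
# print(transcribe("sample.wav"))
```

The pipeline is rebuilt on every call here for brevity; when transcribing many files it would be cheaper to create it once and reuse it.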
⨠Features
- Fine-Tuned Model: It is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the Kurmanji Kurdish portion of the MOZILLA-FOUNDATION/COMMON_VOICE_7_0 dataset.
- Language Coverage: Tagged with the language codes kmr (Kurmanji Kurdish) and ku (Kurdish).
- Reported Performance: Achieves a WER of about 0.331 and a CER of about 0.080 on the Common Voice 8 (kmr) test split.
Installation
No specific installation steps are provided in the original README.
Documentation
Model description
The base model "facebook/wav2vec2-xls-r-300m" was fine-tuned to adapt to the Kurmanji Kurdish language.
Intended uses & limitations
More information about the intended uses and limitations is yet to be provided.
Training and evaluation data
The training data consists of the Common Voice 7.0 Kurmanji Kurdish files train.tsv, dev.tsv, invalidated.tsv, reported.tsv, and other.tsv. These files were concatenated, only data points with more upvotes than downvotes were kept, and duplicates were removed.
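The filtering rule described above can be sketched as follows. This is an illustration, not the author's preprocessing script: it assumes the rows have already been read from the TSV files into dictionaries, that votes live in the Common Voice columns `up_votes` and `down_votes`, and that a duplicate is a row whose `path` value has been seen before (the card does not say which key was used). The function name `filter_clips` is hypothetical.

```python
from typing import Dict, Iterable, List


def filter_clips(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep clips with more upvotes than downvotes, dropping duplicates."""
    seen = set()
    kept = []
    for row in rows:
        # TSV values arrive as text; treat missing or blank counts as 0.
        up = int(row.get("up_votes") or 0)
        down = int(row.get("down_votes") or 0)
        if up <= down:
            continue
        # Assumption: a clip is identified by its "path" value.
        if row["path"] in seen:
            continue
        seen.add(row["path"])
        kept.append(row)
    return kept


# Toy rows standing in for the concatenated TSV files.
sample = [
    {"path": "a.mp3", "up_votes": "2", "down_votes": "0"},
    {"path": "a.mp3", "up_votes": "2", "down_votes": "0"},  # duplicate
    {"path": "b.mp3", "up_votes": "1", "down_votes": "1"},  # tie, dropped
    {"path": "c.mp3", "up_votes": "3", "down_votes": "1"},
]
print([row["path"] for row in filter_clips(sample)])  # ['a.mp3', 'c.mp3']
```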
Training procedure
To create the training dataset, all available Common Voice files were concatenated, and the result was divided with a 90-10 split.
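A 90-10 split of this kind might look like the sketch below. The card does not say whether the rows were shuffled or which seed was used, so the seeded shuffle (reusing the training seed 13) and the function name `split_90_10` are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_90_10(items: Sequence[T], seed: int = 13) -> Tuple[List[T], List[T]]:
    """Shuffle the items and return (90% portion, 10% portion)."""
    shuffled = list(items)
    # Assumption: a seeded shuffle; the card does not describe the ordering.
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(shuffled) * 9 // 10
    return shuffled[:cut], shuffled[cut:]


train_part, held_out_part = split_90_10(range(100))
print(len(train_part), len(held_out_part))  # 90 10
```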
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.000096
- train_batch_size: 16
- eval_batch_size: 16
- seed: 13
- gradient_accumulation_steps: 16
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_steps: 200
- num_epochs: 100
- mixed_precision_training: Native AMP
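To make two of these settings concrete, the sketch below computes the effective batch size and a learning-rate curve with linear warmup followed by cosine decay with restarts. It is a plain-Python approximation of that schedule shape, not the trainer's own code. The single-device setup, the `total_steps` argument, and the default of one cycle are assumptions, since the card does not state them.

```python
import math

BASE_LR = 0.000096
WARMUP_STEPS = 200
PER_DEVICE_BATCH = 16
GRAD_ACCUM_STEPS = 16

# Examples consumed per optimizer update, assuming a single device.
EFFECTIVE_BATCH = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS  # 256


def lr_at(step: int, total_steps: int, num_cycles: int = 1) -> float:
    """Approximate learning rate after `step` optimizer updates.

    `total_steps` and `num_cycles` are hypothetical inputs.
    """
    if step < WARMUP_STEPS:
        # Linear warmup from 0 up to BASE_LR.
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    if progress >= 1.0:
        return 0.0
    # Cosine decay that jumps back to BASE_LR at the start of each cycle.
    phase = (num_cycles * progress) % 1.0
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * phase)))
```

With a single cycle the curve is an ordinary cosine decay; restarts only appear when `num_cycles` is greater than one.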
Training results
| Step | Training Loss | Validation Loss | WER |
|------|---------------|-----------------|-----|
| 200 | 4.382500 | 3.183725 | 1.000000 |
| 400 | 2.870200 | 0.996664 | 0.781117 |
| 600 | 0.609900 | 0.333755 | 0.445052 |
| 800 | 0.326800 | 0.305729 | 0.403157 |
| 1000 | 0.255000 | 0.290734 | 0.391621 |
| 1200 | 0.226300 | 0.292389 | 0.388585 |
Framework versions
- Transformers 4.16.0.dev0
- Pytorch 1.10.0+cu102
- Datasets 1.18.1
- Tokenizers 0.10.3
Model Index
| Task | Dataset | Metrics (WER) | Metrics (CER) |
|------|---------|---------------|---------------|
| Automatic Speech Recognition | Common Voice 8 (kmr) | 0.33073206986250464 | 0.08035244447163924 |
| Automatic Speech Recognition | Robust Speech Event - Dev Data (kmr) | 0.33073206986250464 | 0.08035244447163924 |
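For readers unfamiliar with the two metrics, WER and CER are edit distances normalised by the reference length, counted over words and characters respectively. The sketch below shows the per-utterance definition on placeholder strings. It is not necessarily how eval.py computes the reported figures, which are likely aggregated over the whole test set and may involve text normalisation.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate for one non-empty reference transcript."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate for one non-empty reference transcript."""
    return edit_distance(reference, hypothesis) / len(reference)


# Placeholder tokens, not real Kurmanji text.
print(wer("a b c d", "a x c d"))  # 0.25
print(cer("abcd", "abxd"))        # 0.25
```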
License
This model is licensed under the Apache 2.0 license.