Open-source Automatic Speech Recognition Model for xlsr_kurmanji_kurdish - Accurately Recognize Kurmanji Kurdish Speech

Xlsr Kurmanji Kurdish

Developed by Akashpb13

This model is an automatic speech recognition model fine-tuned on the Kurmanji Kurdish dataset based on facebook/wav2vec2-xls-r-300m.

Speech Recognition

Transformers

Open Source License: Apache-2.0

Tags: #Kurmanji dialect ASR #Low CER speech recognition #Multi-dialect robustness

Downloads 60

Release Time : 3/2/2022

Model Overview

This is an automatic speech recognition model for Kurmanji Kurdish, fine-tuned from the facebook/wav2vec2-xls-r-300m checkpoint. It reaches a WER of 0.3307 on the Common Voice 8 (kmr) test set.

Model Features

Kurmanji Dialect Support

Speech recognition capabilities specifically optimized for the Kurmanji Kurdish dialect

Efficient Training

Trained with native AMP mixed precision and a cosine-with-restarts learning rate schedule

Multi-dataset Integration

Combines the train, dev, invalidated, reported, and other subsets of Common Voice 7.0 for training, which is intended to make the model more robust

Model Capabilities

Kurmanji Kurdish speech recognition

Automatic speech-to-text

Multi-dialect support

Use Cases

Speech Transcription

Kurdish Speech Transcription

Convert speech content in the Kurmanji dialect to text

WER of 0.3307 on the Common Voice 8 (kmr) test set

Voice Assistants

Kurdish Voice Interaction

Provide voice control interfaces for Kurdish-speaking users

🚀 Akashpb13/xlsr_kurmanji_kurdish

This model performs automatic speech recognition for Kurmanji Kurdish. It was fine-tuned on Common Voice data to transcribe speech in this language.
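The original README does not include an inference example. The following is a minimal sketch, assuming the checkpoint loads through the Transformers `automatic-speech-recognition` pipeline and that `transformers`, a PyTorch backend, and `ffmpeg` (for decoding audio files) are installed. The function name `transcribe` and the file name in the usage comment are hypothetical.

```python
MODEL_ID = "Akashpb13/xlsr_kurmanji_kurdish"


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the transcript of one audio file.

    The model is downloaded from the Hugging Face Hub on first use.
    """
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # For a single file path the pipeline returns a dict with a "text" key.
    return asr(audio_path)["text"]


# Usage (hypothetical file name):
# print(transcribe("sample_kmr.wav"))
```

Building the pipeline inside the function keeps the sketch short. For repeated calls, create the pipeline once and reuse it.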

🚀 Quick Start

Evaluation

To evaluate the model on the test split of mozilla-foundation/common_voice_8_0, use the following command:

python eval.py --model_id Akashpb13/xlsr_kurmanji_kurdish --dataset mozilla-foundation/common_voice_8_0 --config kmr --split test

✨ Features

Fine-Tuned Model: A fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the Kurmanji Kurdish (kmr) portion of the mozilla-foundation/common_voice_7_0 dataset.
Language Tags: The model is tagged with the language codes kmr (Kurmanji Kurdish) and ku (Kurdish).
Performance: Achieves a WER of 0.3307 and a CER of 0.0804 on the Common Voice 8 (kmr) test set.

📦 Installation

No specific installation steps are provided in the original README.

📚 Documentation

Model description

The base model "facebook/wav2vec2-xls-r-300m" was fine-tuned to adapt to the Kurmanji Kurdish language.

Intended uses & limitations

More information about the intended uses and limitations is yet to be provided.

Training and evaluation data

The training data consists of the Common Voice 7.0 Kurmanji Kurdish files train.tsv, dev.tsv, invalidated.tsv, reported.tsv, and other.tsv. These files were concatenated, only data points with more upvotes than downvotes were kept, and duplicates were then removed.
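The filtering rule above can be sketched as follows. This is an illustration only, not the author's script: it assumes the Common Voice column names `path`, `up_votes`, and `down_votes`, and it assumes that a duplicate means a repeated `path` value, which the README does not specify. The sample rows are made up.

```python
import csv


def load_tsv(tsv_path):
    """Read one Common Voice TSV file into a list of dict rows."""
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def filter_and_dedupe(rows):
    """Keep rows with more upvotes than downvotes, then drop repeated clips."""
    seen_paths = set()
    kept = []
    for row in rows:
        # Rule from the README: upvotes must exceed downvotes.
        if int(row["up_votes"]) <= int(row["down_votes"]):
            continue
        # Assumption: a duplicate is a clip path that was already kept.
        if row["path"] in seen_paths:
            continue
        seen_paths.add(row["path"])
        kept.append(row)
    return kept


# Made-up rows standing in for the concatenated TSV files.
rows = [
    {"path": "a.mp3", "sentence": "sentence one", "up_votes": "2", "down_votes": "0"},
    {"path": "b.mp3", "sentence": "sentence two", "up_votes": "1", "down_votes": "1"},
    {"path": "a.mp3", "sentence": "sentence one", "up_votes": "2", "down_votes": "0"},
    {"path": "c.mp3", "sentence": "sentence three", "up_votes": "3", "down_votes": "1"},
    {"path": "d.mp3", "sentence": "sentence four", "up_votes": "0", "down_votes": "2"},
]
cleaned = filter_and_dedupe(rows)
print([row["path"] for row in cleaned])
```

With real data, the input would be the concatenation of `load_tsv(...)` results for each of the five files.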

Training procedure

To create the training dataset, all available subsets were concatenated, and a 90-10 split was used.
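A 90-10 split of this kind can be sketched as below. The README does not say how the split was drawn, so the shuffle and the reuse of the training seed (13) here are assumptions, and `split_90_10` is a hypothetical helper name.

```python
import random


def split_90_10(items, seed=13):
    """Shuffle a copy of the items and return (first 90%, remaining 10%)."""
    shuffled = list(items)
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.9)
    return shuffled[:cut], shuffled[cut:]


train_items, held_out_items = split_90_10(range(100))
print(len(train_items), len(held_out_items))
```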

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.000096
train_batch_size: 16
eval_batch_size: 16
seed: 13
gradient_accumulation_steps: 16
lr_scheduler_type: cosine_with_restarts
lr_scheduler_warmup_steps: 200
num_epochs: 100
mixed_precision_training: Native AMP
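Two consequences of these settings can be worked out directly. With a per-device batch of 16 and 16 gradient accumulation steps, each optimizer update sees 256 examples per device. The schedule below is a sketch modeled on the cosine-with-hard-restarts formula with linear warmup. The total number of optimizer steps and the number of cycles are not given in the README, so `total_steps` and `num_cycles` are assumptions supplied by the caller.

```python
import math

LEARNING_RATE = 0.000096
WARMUP_STEPS = 200
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 16


def effective_batch_size(per_device=TRAIN_BATCH_SIZE, accum=GRAD_ACCUM_STEPS, devices=1):
    """Examples contributing to one optimizer update."""
    return per_device * accum * devices


def lr_at(step, total_steps, num_cycles=1):
    """Learning rate at an optimizer step: linear warmup, then cosine decay
    that restarts num_cycles times (assumed values, see the lead-in)."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    if progress >= 1.0:
        return 0.0
    cosine = 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0)))
    return LEARNING_RATE * max(0.0, cosine)


print(effective_batch_size())
# total_steps=1200 is a placeholder, not a value from the README.
print(lr_at(100, total_steps=1200), lr_at(200, total_steps=1200))
```

The learning rate rises linearly to its peak of 0.000096 at step 200 and decays from there.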

Training results

Step	Training Loss	Validation Loss	WER
200	4.382500	3.183725	1.000000
400	2.870200	0.996664	0.781117
600	0.609900	0.333755	0.445052
800	0.326800	0.305729	0.403157
1000	0.255000	0.290734	0.391621
1200	0.226300	0.292389	0.388585

Framework versions

Transformers 4.16.0.dev0
Pytorch 1.10.0+cu102
Datasets 1.18.1
Tokenizers 0.10.3

Model Index

Task	Dataset	Metrics (WER)	Metrics (CER)
Automatic Speech Recognition	Common Voice 8 (kmr)	0.33073206986250464	0.08035244447163924
Automatic Speech Recognition	Robust Speech Event - Dev Data (kmr)	0.33073206986250464	0.08035244447163924
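WER and CER in the table are edit-distance ratios: the number of word-level (or character-level) substitutions, insertions, and deletions divided by the length of the reference. The sketch below shows the calculation for a single utterance in plain Python. It is not the code used by `eval.py`, which may normalize text differently and aggregate over the whole test set.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate for one utterance; the reference must be non-empty."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate for one utterance; the reference must be non-empty."""
    return edit_distance(reference, hypothesis) / len(reference)


# Placeholder strings, not real model output.
print(wer("a b c d", "a x c d"), cer("abcd", "abxd"))
```

On this scale, the reported WER of about 0.33 means roughly one word error for every three reference words, and the CER of about 0.08 means roughly eight character errors per hundred reference characters.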

📄 License

This model is licensed under the Apache 2.0 license.
