A Robust Transformer Model for Arabic Dialect Identification (ADI) in Speech
This project presents an accurate and robust Transformer-based model for Arabic Dialect Identification (ADI) in speech. Fine-tuned from the pre-trained MMS model on diverse Arabic TV broadcast speech, it identifies Modern Standard Arabic and four major Arabic dialects. You can try the model via this Hugging Face space.
Quick Start
The model can identify the following Arabic dialects/varieties:
- Modern Standard Arabic (MSA)
- Egyptian Arabic (Masri and Sudani)
- Gulf Arabic (Khleeji, Iraqi, and Yemeni)
- Levantine Arabic (Shami)
- Maghrebi Arabic (Dialects of Al-Maghreb Al-Arabi in North Africa)
Features
- High Accuracy: The model was evaluated on several datasets that pose challenges such as background noise, channel mismatch, and emotional tone in speech, and it performed well on them.
- Versatile Use: It can serve in a large-scale speech data collection pipeline to build resources for different Arabic dialects, and it can filter speech data for Modern Standard Arabic when developing text-to-speech (TTS) systems.
Usage Examples
Basic Usage
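from transformers import pipeline

# Model checkpoint on the Hugging Face Hub
model_id = "badrex/mms-300m-arabic-dialect-identifier"

# Build an audio-classification pipeline that runs on the CPU
adi5_classifier = pipeline(
    "audio-classification",
    model=model_id,
    device='cpu'
)

# Classify a local audio file
audio_path = "./samples/arabic_audio_sample.mp3"
predictions = adi5_classifier(audio_path)

# Print each predicted dialect with its score
for pred in predictions:
    print(f"Dialect: {pred['label']:<10} Confidence: {pred['score']:.4f}")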
Documentation
Info
| Property | Details |
|---|---|
| Developed by | Badr M. Abdullah and Matthew Baas |
| Model Type | wav2vec 2.0 architecture |
| Language | Arabic (and its varieties) |
| License | Creative Commons Attribution 4.0 (CC BY 4.0) |
| Finetuned from model | MMS-300m [https://huggingface.co/facebook/mms-300m] |
Training Data
TV Broadcast speech (news, interviews, discussion, TV shows, etc.)
Evaluation
The model has been tested on datasets with challenges such as background noise, channel mismatch, and emotional tone in speech. It performed well and is expected to be robust to real-world speech samples.
Uses
The model can be used as a component in a large-scale speech data collection pipeline to create resources for different Arabic dialects. It can also be used to filter speech data for Modern Standard Arabic for text-to-speech (TTS) system development.
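The filtering use case above could be sketched roughly as follows. This is a minimal illustration, assuming predictions arrive as the list of `label`/`score` dictionaries that the audio-classification pipeline returns. The helper name `filter_by_dialect`, the label string `"MSA"`, and the 0.8 threshold are hypothetical choices, not part of the model card; check the model's configuration for the actual label names.

```python
from typing import Any, Callable, Dict, Iterable, List

Prediction = Dict[str, Any]  # expected shape: {"label": str, "score": float}


def filter_by_dialect(
    paths: Iterable[str],
    classify: Callable[[str], List[Prediction]],
    target_label: str = "MSA",  # assumption: verify against the model's label names
    min_score: float = 0.8,     # assumption: arbitrary cut-off, tune on your own data
) -> List[str]:
    """Keep the paths whose top prediction matches target_label with enough score."""
    kept = []
    for path in paths:
        predictions = classify(path)
        if not predictions:
            continue  # nothing to decide on for this file
        top = max(predictions, key=lambda p: p["score"])
        if top["label"] == target_label and top["score"] >= min_score:
            kept.append(path)
    return kept
```

In practice `classify` could be the pipeline object itself, since it accepts a file path and returns such a list. Passing it in as a callable keeps the filtering logic testable without downloading the model.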
Out-of-Scope Use
The model should not be used to:
- Assess fluency or nativeness of speech
- Determine whether the speaker uses a formal or informal register
- Make judgments about a speaker's origin, education level, or socioeconomic status
- Filter or discriminate against speakers based on dialect
Bias, Risks, and Limitations
Some Arabic varieties are not well-represented in the training data. The model may not work well for some dialects like Yemeni Arabic, Iraqi Arabic, and Saharan Arabic. Additional limitations include:
- Very short audio samples (< 2 seconds) may not provide enough information for accurate classification.
- Code-switching between dialects (especially mixing with MSA) may result in less reliable classifications.
- Speakers who have lived in multiple dialect regions may exhibit mixed features.
- Speech from non-typical speakers such as children and people with speech disorders might be challenging for the model.
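The short-audio limitation above lends itself to a simple guard before classification. The sketch below assumes the clip's sample count and sample rate are already known (for example, from whatever audio loader is in use). The function names are hypothetical, and the 2-second default is taken from the limitation listed above.

```python
MIN_SECONDS = 2.0  # clips shorter than this are listed above as unreliable


def duration_seconds(num_samples: int, sample_rate: int) -> float:
    """Return the clip length in seconds."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    return num_samples / sample_rate


def is_long_enough(num_samples: int, sample_rate: int,
                   min_seconds: float = MIN_SECONDS) -> bool:
    """True when the clip meets the minimum duration."""
    return duration_seconds(num_samples, sample_rate) >= min_seconds
```

Clips that fail the check can be skipped or routed to manual inspection rather than classified.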
Recommendations
Important Note
Some Arabic varieties are not well-represented in the training data, and the model may have limitations for certain dialects.
Usage Tip
- For optimal results, use audio segments of at least 5-10 seconds.
- Confidence scores may not always be informative.
- For critical applications, consider human verification of model predictions.
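The tips above can be turned into two small helpers: one that cuts a long recording into windows of the recommended length, and one that flags predictions for human verification. This is only a sketch under stated assumptions. The function names, the 10-second window, the 5-second minimum for a trailing window, and the score and margin thresholds are all illustrative. Since confidence scores may not always be informative, the flag is a rough triage aid rather than a reliability guarantee.

```python
from typing import Any, Dict, List, Tuple


def window_bounds(num_samples: int, sample_rate: int,
                  window_seconds: float = 10.0,
                  min_seconds: float = 5.0) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of consecutive windows.

    A trailing window shorter than min_seconds is dropped, unless it is
    the only window (the whole clip is short).
    """
    window = int(window_seconds * sample_rate)
    if window <= 0:
        raise ValueError("window must cover at least one sample")
    minimum = int(min_seconds * sample_rate)
    bounds = []
    for start in range(0, num_samples, window):
        end = min(start + window, num_samples)
        if end - start >= minimum or not bounds:
            bounds.append((start, end))
    return bounds


def needs_review(predictions: List[Dict[str, Any]],
                 min_score: float = 0.6,     # assumption: illustrative threshold
                 min_margin: float = 0.2) -> bool:  # assumption: illustrative margin
    """Flag a prediction list whose top score is low or close to the runner-up."""
    ranked = sorted((p["score"] for p in predictions), reverse=True)
    if not ranked:
        return True  # no prediction at all: always send to a human
    top = ranked[0]
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    return top < min_score or (top - runner_up) < min_margin
```

Each window can then be classified separately, with flagged windows sent to a human annotator.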
Technical Details
The model is based on the wav2vec 2.0 architecture and is fine-tuned from the pre-trained MMS-300m model. It is trained on diverse Arabic TV broadcast speech data to identify different Arabic dialects.
License
The model is licensed under Creative Commons Attribution 4.0 (CC BY 4.0).
Citation
BibTeX:
@misc{abdullah2025arabicadi,
author = {Abdullah, Badr M. and Baas, Matthew},
title = {A Robust Transformer Model for Arabic Dialect Identification in Speech},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/badrex/mms-300m-arabic-dialect-identifier}}
}
APA:
Abdullah, B. M., & Baas, M. (2025). A Robust Transformer Model for Arabic Dialect Identification in Speech. Retrieved from https://huggingface.co/badrex/mms-300m-arabic-dialect-identifier/
Model Card Contact
If you have any questions, please do not hesitate to write an email to badr dot nlp at gmail dot com.