ASR wav2vec2 DVoice Darija

Developed by speechbrain
An automatic speech recognition (ASR) model for Darija, the Moroccan Arabic dialect. It is based on the wav2vec 2.0 architecture and fine-tuned on the DVoice dataset.
Downloads: 120
Release date: 6/9/2022

Model Overview

This model provides end-to-end transcription of Darija speech. It starts from a pre-trained wav2vec 2.0 model, adds DNN layers on top, and fine-tunes the result on the Darija dataset. A CTC greedy decoder then converts the network output into text.
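The last step of the pipeline, CTC greedy decoding, can be sketched in plain Python. This is a minimal illustration, not the model's actual decoder: the three-symbol vocabulary, the per-frame scores, and the use of index 0 as the blank symbol are all assumptions made for the example.

```python
from itertools import groupby


def ctc_greedy_decode(frame_scores, vocab, blank_index=0):
    """Greedy CTC decoding: best symbol per frame, merge repeats, drop blanks.

    frame_scores: list of per-frame score lists, one score per vocabulary entry.
    vocab: list of output symbols; vocab[blank_index] is the CTC blank
           (index 0 is an assumption of this sketch).
    """
    # 1. Pick the highest-scoring symbol index in every frame.
    best_path = [
        max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores
    ]
    # 2. Collapse runs of the same index into a single occurrence.
    collapsed = [index for index, _ in groupby(best_path)]
    # 3. Remove blanks; a blank between two equal symbols keeps them distinct.
    return "".join(vocab[index] for index in collapsed if index != blank_index)


# Toy input (hypothetical values): 5 frames over the vocabulary [blank, a, b].
toy_vocab = ["<blank>", "a", "b"]
toy_scores = [
    [0.10, 0.80, 0.10],  # a
    [0.20, 0.70, 0.10],  # a (repeat, merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.80, 0.10],  # a (kept, because a blank separates it)
    [0.10, 0.10, 0.80],  # b
]
decoded = ctc_greedy_decode(toy_scores, toy_vocab)
print(decoded)  # aab
```

Greedy decoding takes only the single best symbol per frame, so it is cheap but cannot recover from a locally wrong choice the way a beam search with a language model could.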

Model Features

Support for low-resource languages: optimized specifically for the resource-scarce Darija dialect, using transfer learning to compensate for the shortage of training data.
Community-driven data: trained on real community recordings collected through the DVoice platform, so the data reflects how the language is actually used.
Efficient fine-tuning architecture: built on the pre-trained wav2vec2-large-xlsr-53 model, with only two DNN layers added for fine-tuning, which keeps training efficient.

Model Capabilities

Moroccan Arabic dialect speech recognition
16kHz mono audio processing
Automatic audio normalization (resampling and mono-channel selection)
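To make the normalization step concrete, here is a rough sketch of what mono selection and resampling to 16 kHz involve. It is not the library's implementation: the function names are hypothetical, keeping the first channel is an assumption about what "mono selection" means, and plain linear interpolation stands in for the filtered resampling a real audio library would use.

```python
def select_mono(frames):
    """Keep only the first channel of each multi-channel frame (assumption)."""
    return [frame[0] for frame in frames]


def resample_linear(samples, source_rate, target_rate=16000):
    """Resample by linear interpolation; illustrative only, no anti-alias filter."""
    if source_rate == target_rate or len(samples) < 2:
        return list(samples)
    last = len(samples) - 1
    output_length = int(len(samples) * target_rate / source_rate)
    step = source_rate / target_rate  # source samples advanced per output sample
    resampled = []
    for i in range(output_length):
        position = i * step
        left = min(int(position), last)
        right = min(left + 1, last)
        fraction = position - left
        # Weighted average of the two neighbouring source samples.
        resampled.append(samples[left] * (1 - fraction) + samples[right] * fraction)
    return resampled


# Hypothetical input: 8 stereo frames recorded at 32 kHz.
stereo_32k = [(float(n), 9.0) for n in range(8)]
mono_32k = select_mono(stereo_32k)
mono_16k = resample_linear(mono_32k, source_rate=32000)
print(mono_16k)  # [0.0, 2.0, 4.0, 6.0]
```

Halving the sample rate here simply keeps every second sample; for real recordings a low-pass filter before downsampling is needed to avoid aliasing.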

Use Cases

Speech transcription
Transcription of dialect media content: automatically converts podcasts, videos, and similar content in Moroccan dialect into text. Reported accuracy: WER 18.28%, CER 5.85% on the test set.
Voice assistant
Recognition of dialect voice commands: lets a voice assistant offer dialect interaction to users in Morocco.
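The WER and CER figures quoted for the speech transcription use case are both edit-distance ratios: the number of substitutions, insertions, and deletions needed to turn the model output into the reference, divided by the reference length in words (WER) or characters (CER). The sketch below shows the usual computation on a made-up English sentence pair. It is not the evaluation script behind the reported numbers, and counting spaces as characters in CER is an assumption of this example.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (words or characters)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_item == hyp_item else 1
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous = current
    return previous[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    reference_words = reference.split()
    return edit_distance(reference_words, hypothesis.split()) / len(reference_words)


def character_error_rate(reference, hypothesis):
    """Character-level edit distance divided by the reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Made-up sentence pair: one wrong word out of four, one wrong character out of 16.
toy_reference = "the cat sat down"
toy_hypothesis = "the cat sat town"
print(word_error_rate(toy_reference, toy_hypothesis))       # 0.25
print(character_error_rate(toy_reference, toy_hypothesis))  # 0.0625
```

A single wrong character makes the whole word count as an error, which is why CER is typically much lower than WER for the same output.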