Wav2vec2 Large Xls R 300m Ur
Urdu speech recognition model based on the wav2vec2-large-xls-r-300m architecture, fine-tuned on the Common Voice dataset
Downloads 20
Release Time: 3/2/2022
Model Overview
This model is an automatic speech recognition (ASR) system optimized for Urdu, based on Facebook's wav2vec2 architecture and fine-tuned on the Common Voice dataset.
Model Features
Large-scale pre-training
Built on the 300M-parameter wav2vec2-large-xls-r architecture, whose pre-trained encoder supplies the speech features used for recognition
Urdu optimization
Fine-tuned specifically for Urdu to adapt to the language's phonetic characteristics
Open-source license
Released under the Apache 2.0 license, which permits both commercial and research use
Model Capabilities
Urdu speech-to-text
Continuous speech recognition
Voice activity detection
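The speech-to-text capability listed above could be exercised roughly as follows. This is a minimal sketch, not the publisher's documented usage: it assumes the model is loadable through the Hugging Face `transformers` ASR pipeline, `MODEL_ID` is a hypothetical placeholder because this page does not give the repository ID, and the 30-second chunk length is an arbitrary choice for long recordings.

```python
from typing import Any, Callable, Optional

# Hypothetical placeholder: the page does not state the actual repository ID.
MODEL_ID = "your-namespace/wav2vec2-large-xls-r-300m-ur"


def build_asr(model_id: str = MODEL_ID) -> Callable[..., Any]:
    """Create a speech recognition pipeline (downloads the model on first use)."""
    # Imported lazily so the rest of the module works without transformers installed.
    from transformers import pipeline

    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(
    audio_path: str,
    asr: Optional[Callable[..., Any]] = None,
    chunk_length_s: float = 30.0,
) -> str:
    """Return the transcript of one audio file as plain text.

    `asr` can be any callable that mimics the pipeline interface; when it is
    omitted, a real pipeline is built from MODEL_ID.
    """
    if asr is None:
        asr = build_asr()
    # Chunking lets the pipeline process recordings longer than a few seconds.
    result = asr(audio_path, chunk_length_s=chunk_length_s)
    return result["text"].strip()
```

With a real repository ID substituted for `MODEL_ID`, calling `transcribe("episode.wav")` would download the model and return the Urdu transcript; decoding audio from a file path typically also requires ffmpeg to be installed.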
Use Cases
Speech transcription
Urdu media content transcription
Automatically transcribe Urdu podcasts, videos, and other content into text
Achieved a word error rate (WER) of 0.7328 (about 73.3%) on the evaluation set
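To make the word error rate figure concrete: WER is the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The sketch below is an illustrative pure-Python implementation for a single sentence pair; the function name is hypothetical, and the page does not say which tool or text normalization produced the reported figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, a four-word reference transcribed with one wrong word and one missing word scores 2 / 4 = 0.5. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.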
Assistive technology
Voice-controlled applications
Develop voice-controlled interfaces for Urdu-speaking users