Wav2vec2 Large Xls R 300m Ur

Developed by anuragshas
Urdu speech recognition model based on the wav2vec2-large-xls-r-300m architecture, fine-tuned on the Common Voice dataset
Downloads 20
Release Time: 3/2/2022

Model Overview

This model is an automatic speech recognition (ASR) system for Urdu. It is based on Facebook's wav2vec2-large-xls-r-300m model and was fine-tuned on the Common Voice dataset.

Model Features

Large-scale pre-training
Based on the 300M-parameter wav2vec2-large-xls-r model, a cross-lingual speech representation model pre-trained on unlabeled multilingual audio
Urdu optimization
Fine-tuned on Urdu speech from Common Voice to adapt to the language's phonetic characteristics
Open-source license
Released under Apache 2.0 license, allowing both commercial and research use

Model Capabilities

Urdu speech-to-text
Continuous speech recognition
Voice activity detection
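
A minimal sketch of Urdu speech-to-text with this model, assuming it is published on the Hugging Face Hub under the ID anuragshas/wav2vec2-large-xls-r-300m-ur (inferred from the model name and developer on this page, not confirmed here) and that the transformers library and ffmpeg are installed. The function name transcribe and the file name in the usage note are hypothetical.

```python
# Assumed Hub ID, inferred from the model name and developer listed on this page.
MODEL_ID = "anuragshas/wav2vec2-large-xls-r-300m-ur"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcript of one audio file as a string."""
    # Imported lazily so that defining this function does not require
    # transformers to be installed.
    from transformers import pipeline

    # Builds an ASR pipeline; the first call downloads the model weights.
    asr = pipeline("automatic-speech-recognition", model=model_id)

    # When given a file path, the pipeline decodes the audio with ffmpeg.
    result = asr(audio_path)
    return result["text"]
```

Calling transcribe("sample_urdu.wav") on a local recording would return the recognized text. For many files, it is better to build the pipeline once and reuse it rather than reloading the model on every call.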

Use Cases

Speech transcription
Urdu media content transcription
Automatically transcribe Urdu podcasts, videos, and other content into text
Achieved a word error rate (WER) of 0.7328 (about 73.3%) on the evaluation set
Assistive technology
Voice-controlled applications
Develop voice-controlled interfaces for Urdu-speaking users
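
To make the word error rate quoted above easier to interpret, here is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model output, divided by the number of reference words. This is an illustration under that standard definition, not the evaluation script used for this model; the function name word_error_rate is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution (b -> x) and one deletion (d) over four reference words.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

Under this definition, a WER of 0.7328 corresponds to roughly 73 word-level errors per 100 reference words. WER can exceed 1.0 when the hypothesis contains many insertions.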