Open-source Urdu Speech Recognition Model wav2vec2-large-xls-r-300m-ur: Accurate Recognition for Smooth Communication

Wav2vec2 Large Xls R 300m Ur

Developed by anuragshas

Urdu speech recognition model based on the wav2vec2-large-xls-r-300m architecture, fine-tuned on the Common Voice dataset

Speech Recognition

Transformers

Open Source License: Apache-2.0

#Urdu speech recognition #Large model fine-tuning #Low-resource language

Downloads 20

Release Time: 3/2/2022

Model Overview

This model is an automatic speech recognition (ASR) system for Urdu. It is based on Facebook's wav2vec2-large-xls-r-300m model and fine-tuned on the Common Voice dataset.
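A minimal sketch of how such a model is typically loaded for transcription with the Hugging Face Transformers `pipeline` API. The Hub identifier below is an assumption formed from the developer name and model name listed on this page, and `transcribe` is a hypothetical helper, not part of the model's own documentation. Decoding audio from a file path also assumes `ffmpeg` is installed.

```python
# Assumed Hub id: "<developer>/<model name>" as listed on this page.
MODEL_ID = "anuragshas/wav2vec2-large-xls-r-300m-ur"


def transcribe(audio_path: str, chunk_length_s: float = 30.0) -> str:
    """Transcribe an Urdu audio file and return the recognized text.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so this module can be loaded without transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    # chunk_length_s makes the pipeline process long recordings in windows
    # instead of feeding the whole file to the model at once.
    result = asr(audio_path, chunk_length_s=chunk_length_s)
    return result["text"]


# Example call (the file name is a placeholder):
# text = transcribe("sample_ur.wav")
```

The first call downloads the model weights, so it needs network access and enough memory for a 300M-parameter model.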

Model Features

Large-scale pre-training

Based on the 300M-parameter wav2vec2-large-xls-r architecture with powerful speech feature extraction capabilities

Urdu optimization

Specially fine-tuned for Urdu to adapt to the language's specific phonetic characteristics

Open-source license

Released under Apache 2.0 license, allowing both commercial and research use

Model Capabilities

Urdu speech-to-text

Continuous speech recognition

Voice activity detection

Use Cases

Speech transcription

Urdu media content transcription

Automatically transcribe Urdu podcasts, videos, and other content into text

Achieved a word error rate (WER) of 0.7328 (73.28%) on the evaluation set
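For readers unfamiliar with the metric, here is a short sketch of how word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. This is the standard definition, not necessarily the exact tool used to produce the score above, and `word_error_rate` is a name chosen for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a 4-word reference.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

Under this definition, a score of 0.7328 corresponds to roughly 73 word-level errors per 100 reference words.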

Assistive technology

Voice-controlled applications

Develop voice-controlled interfaces for Urdu-speaking users

Training Loss	Epoch	Step	Validation Loss	WER
0.0719	66.67	400	1.8510	0.7432
0.0284	133.33	800	2.0088	0.7415
0.014	200.0	1200	2.0508	0.7328
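The table can be read programmatically as well. The sketch below copies the three rows shown above, derives the number of optimizer steps per epoch from the Step and Epoch columns, and picks the checkpoint with the lowest WER. The `Row` type and variable names are illustrative, not from the original model card.

```python
from typing import NamedTuple


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    wer: float


# Values copied from the training results table above.
rows = [
    Row(0.0719, 66.67, 400, 1.8510, 0.7432),
    Row(0.0284, 133.33, 800, 2.0088, 0.7415),
    Row(0.014, 200.0, 1200, 2.0508, 0.7328),
]

# Steps per epoch = step / epoch; every row rounds to the same value.
steps_per_epoch = {round(r.step / r.epoch) for r in rows}
print(steps_per_epoch)  # {6}

# The checkpoint with the lowest WER is the reported final result.
best = min(rows, key=lambda r: r.wer)
print(best.step, best.wer)  # 1200 0.7328
```

In these rows the training loss falls and the validation loss rises from one checkpoint to the next, while WER improves only slightly.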
