
Wav2vec2 Large Xlsr 53 Persian

Developed by jonatasgrosman
Persian speech recognition model, fine-tuned from the facebook/wav2vec2-large-xlsr-53 (XLSR-53 large) checkpoint
Downloads 257.76k
Release Time: 3/2/2022

Model Overview

This model is a Persian speech recognition system fine-tuned from the pre-trained XLSR-53 model on the Persian portion of the Common Voice 6.1 dataset. It is suited to Persian speech-to-text tasks.

Model Features

High-performance Persian recognition
Achieves a 30.12% word error rate (WER) and a 7.37% character error rate (CER) on the Common Voice Persian test set
Based on XLSR-53 architecture
Fine-tuned from the large-scale, self-supervised pre-trained XLSR-53 model
16kHz sampling rate support
Expects speech input sampled at 16 kHz
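
The word and character error rates quoted above are edit-distance metrics. The following is a minimal sketch of how such scores can be computed for a single utterance, not the evaluation script used for this model. The function names are illustrative, and counting spaces as characters in the CER is an assumption, since conventions differ between toolkits.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance divided by the reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance divided by the reference length.

    Assumption: spaces are counted as characters.
    """
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

For a whole test set, a common choice is to sum the edit distances over all utterances and divide by the total number of reference words or characters, rather than averaging per-utterance scores.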

Model Capabilities

Persian speech recognition
Speech-to-text
Audio transcription
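
A transcription call for this model could look roughly like the sketch below, which uses the Hugging Face `transformers` wav2vec2 classes together with `librosa` and `torch`. The Hub identifier in `MODEL_ID` is an assumption inferred from the developer and model name on this page, and `transcribe` is an illustrative helper rather than part of any library.

```python
# Assumed Hub identifier, inferred from the developer and model name; verify before use.
MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-persian"
TARGET_SAMPLE_RATE = 16_000  # the model expects 16 kHz input


def transcribe(paths):
    """Return one transcript per audio file path (illustrative helper)."""
    # Heavy dependencies are imported lazily so the module loads without them.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
    model.eval()

    # librosa.load resamples to the requested rate and returns mono audio.
    speech = [librosa.load(p, sr=TARGET_SAMPLE_RATE)[0] for p in paths]
    inputs = processor(speech, sampling_rate=TARGET_SAMPLE_RATE,
                       return_tensors="pt", padding=True)

    with torch.no_grad():
        logits = model(inputs.input_values,
                       attention_mask=inputs.attention_mask).logits

    # Greedy CTC decoding: take the most likely token at each frame.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)
```

Calling `transcribe(["sample.wav"])` with a hypothetical local file downloads the model weights on first use and returns a list containing one Persian transcript.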

Use Cases

Speech transcription
Persian speech-to-text
Converts Persian speech into text
Reported result: 30.12% word error rate on the Common Voice Persian test set
Voice assistants
Persian voice command recognition
Recognizes spoken commands for Persian voice assistants
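
Because transcripts can contain character-level errors, a voice assistant would typically match them against known commands with some tolerance. The sketch below shows one simple way to do that with the standard library `difflib`. The command phrases, action names, and the 0.8 cutoff are all hypothetical and would need tuning on real data.

```python
import difflib

# Hypothetical command table: Persian phrase -> action name.
COMMANDS = {
    "چراغ را روشن کن": "light_on",     # "turn on the light"
    "چراغ را خاموش کن": "light_off",   # "turn off the light"
}


def match_command(transcript, cutoff=0.8):
    """Return the action for the closest known phrase, or None if nothing is close enough."""
    candidates = difflib.get_close_matches(
        transcript.strip(), list(COMMANDS), n=1, cutoff=cutoff
    )
    return COMMANDS[candidates[0]] if candidates else None
```

An exact transcript such as `"چراغ را روشن کن"` maps to `light_on`, while text that resembles no known phrase yields `None`, so the assistant can ask the user to repeat the command.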