Asr Whisper Large V2 Commonvoice Fa

Developed by speechbrain
This is an automatic speech recognition model based on the whisper-large-v2 architecture, specifically fine-tuned for Persian on the CommonVoice dataset.
Downloads 103
Release Time: 1/30/2023

Model Overview

This model transcribes Persian speech to text. It uses the Whisper encoder-decoder architecture and was fine-tuned on the Persian portion of the CommonVoice dataset.

Model Features

High-performance Persian recognition
Achieves 31.75% Word Error Rate (WER) and 9.38% Character Error Rate (CER) on the CommonVoice Persian test set
Based on pre-trained model
Uses the pre-trained whisper-large-v2 model as the base, with the encoder kept frozen during fine-tuning
End-to-end training
The entire system is trained end-to-end, simplifying the speech recognition process
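
The WER and CER figures quoted above are both edit-distance ratios: the number of word-level (or character-level) substitutions, insertions, and deletions needed to turn the model output into the reference, divided by the reference length. The following is a minimal sketch of that calculation, not the evaluation script used for this model. The function names are illustrative, and details such as text normalization or whether spaces count as characters are assumptions that vary between toolkits.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance as a percentage of the reference length."""
    if not ref_units:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_units, hyp_units) / len(ref_units)


def wer(reference, hypothesis):
    """Word Error Rate: units are whitespace-separated tokens."""
    return error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    """Character Error Rate: units are characters, spaces included (an assumption)."""
    return error_rate(list(reference), list(hypothesis))
```

Because the denominator is the reference length, a single wrong word in a short sentence moves WER much more than CER, which is one reason the two reported numbers differ so widely.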

Model Capabilities

Persian speech recognition
16kHz audio processing
Automatic audio normalization
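
The source does not spell out what "automatic audio normalization" covers. Assuming it means converting the input to a single channel at 16 kHz, the idea can be sketched in plain Python as below. This is an illustration only: the function names are hypothetical, and the linear-interpolation resampler has no anti-aliasing filter, so a real pipeline would use a proper resampling routine instead.

```python
TARGET_RATE = 16_000  # sample rate the model expects, per the capability list


def to_mono(frames):
    """Average the channels of each frame: [[left, right], ...] -> [mono, ...]."""
    return [sum(frame) / len(frame) for frame in frames]


def resample_linear(samples, src_rate, dst_rate=TARGET_RATE):
    """Resample by linear interpolation (sketch only, no anti-aliasing)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # input samples advanced per output sample
    last = len(samples) - 1
    out = []
    for k in range(n_out):
        pos = min(k * step, last)  # clamp to the final input sample
        i = int(pos)
        frac = pos - i
        nxt = samples[min(i + 1, last)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out


def normalize_audio(frames, src_rate):
    """Mix down to mono, then resample to TARGET_RATE."""
    return resample_linear(to_mono(frames), src_rate)
```

For example, one second of 48 kHz stereo audio passed to `normalize_audio(frames, 48_000)` comes back as 16,000 mono samples.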

Use Cases

Speech transcription
Persian speech transcription
Convert Persian speech content into text
Achieves 31.75% word error rate on the test set
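
A transcription call might look like the sketch below. Several things here are assumptions rather than details from this page: the Hugging Face Hub identifier is inferred from the model name and developer, the `savedir` path is arbitrary, and the `WhisperASR` import location differs between SpeechBrain releases. The `transcribe` wrapper and its `model` parameter are hypothetical conveniences so that the model is loaded once and can be reused.

```python
# Assumed Hub identifier, inferred from the model name and developer.
MODEL_SOURCE = "speechbrain/asr-whisper-large-v2-commonvoice-fa"


def load_model(source=MODEL_SOURCE, savedir="pretrained_models/asr-whisper-fa"):
    """Download and load the pretrained model (needs `pip install speechbrain`)."""
    try:
        from speechbrain.inference.ASR import WhisperASR  # SpeechBrain 1.0 and later
    except ImportError:
        from speechbrain.pretrained import WhisperASR  # older releases
    return WhisperASR.from_hparams(source=source, savedir=savedir)


def transcribe(path, model=None):
    """Transcribe one audio file, loading the model only if none is supplied."""
    if model is None:
        model = load_model()
    # The return format depends on the installed SpeechBrain version.
    return model.transcribe_file(path)
```

Typical use would be `model = load_model()` once, followed by `transcribe("recording.wav", model=model)` for each file, where `recording.wav` is a placeholder path.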