Whisper Large v3 - Persian (Common Voice 17)
This model is a fine-tuned version of Whisper Large v3 on the Common Voice 17 dataset, significantly improving the accuracy of Persian automatic speech recognition.
Quick Start
Whisper Large v3 has been fine-tuned on Common Voice 17, which provides over 250,000 Persian audio samples, roughly three times the 83,000 samples available to earlier models trained on Common Voice 11. This larger dataset has resulted in a lower Word Error Rate (WER), improving the model's accuracy and robustness on Persian speech.
This update marks a major step forward in Persian ASR, and we hope it benefits the Persian-speaking community by making high-quality speech recognition more accessible and reliable.
Usage Examples
Basic Usage
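from transformers import pipeline

# Load the fine-tuned checkpoint; long audio is processed in 30-second chunks.
asr_pipe = pipeline(
    "automatic-speech-recognition",
    model="MohammadGholizadeh/whisper-large-v3-persian-common-voice-17",
    chunk_length_s=30,
)

# Replace "your_file" with the path to an audio file.
text = asr_pipe("your_file")["text"]
print(text)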
Documentation
| Property | Details |
| --- | --- |
| Model Name | Whisper Large v3 - Persian (Common Voice 17) |
| Base Model | Whisper Large v3 |
| Language | Persian (Farsi) |
| Dataset | Mozilla Common Voice 17 (Persian subset) |
| Hardware Used | NVIDIA A100 GPU |
| Batch Size | 16 |
| Training Steps | 5000 |
| WER (Word Error Rate) | 21.43 |
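For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript. The sketch below is a minimal pure-Python illustration of that standard definition. The function name `word_error_rate` is hypothetical, the example sentences are made up, and this README does not state which tool or text normalization produced the reported score, so treat this as an illustration rather than the evaluation script.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; it splits on whitespace only and
    applies no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution ("sat" -> "sit") and one deletion ("the")
# against a six-word reference.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {100 * wer:.2f}")
```

Scores are often reported multiplied by 100, as in the print statement above.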
License
The model is licensed under the Apache-2.0 license.
Notes
Important Note
Because the fine-tuning data did not include timestamps, the model cannot return them. Requesting timestamps raises an error.
The workaround is to split audio files into smaller chunks. Further fine-tuning would likely improve the model's accuracy. We are currently looking for hardware sponsorships and ASR dataset collaborations.
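The chunking workaround can be sketched as follows. This is a minimal illustration, not the author's method: `split_into_chunks` is a hypothetical helper, and it assumes a mono waveform that is already loaded into memory at a known sample rate (16 kHz in the toy example). Returning each chunk's start time gives a coarse substitute for the timestamps the model cannot produce.

```python
from typing import List, Sequence, Tuple


def split_into_chunks(
    samples: Sequence[float],
    sample_rate: int,
    chunk_length_s: float = 30.0,
) -> List[Tuple[float, Sequence[float]]]:
    """Split a mono waveform into consecutive fixed-length chunks.

    Returns (start_time_in_seconds, chunk) pairs. The last chunk may be
    shorter than chunk_length_s. Hypothetical helper for illustration.
    """
    if sample_rate <= 0 or chunk_length_s <= 0:
        raise ValueError("sample_rate and chunk_length_s must be positive")
    chunk_size = int(sample_rate * chunk_length_s)
    if chunk_size == 0:
        raise ValueError("chunk_length_s is too short for this sample_rate")

    chunks = []
    for start in range(0, len(samples), chunk_size):
        chunks.append((start / sample_rate, samples[start:start + chunk_size]))
    return chunks


# Toy example: 70 seconds of silence at 16 kHz splits into 30 s + 30 s + 10 s.
waveform = [0.0] * (16_000 * 70)
for start_s, chunk in split_into_chunks(waveform, sample_rate=16_000):
    print(f"{start_s:.1f}s -> {len(chunk)} samples")
```

The helper only relies on `len` and slicing, so it should also work on a one-dimensional NumPy array. Each chunk could then be transcribed separately, for example by passing it to the pipeline as a dictionary with `raw` (a NumPy array) and `sampling_rate` keys, and the start times used to label the resulting text segments. Note that cutting at fixed offsets can split a word across two chunks.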
BibTeX Citation
@misc{whisper_persian_cv17,
author = {Mohammad Sadegh Gholizadeh},
title = {Whisper Large v3 - Persian (Common Voice 17)},
year = {2025},
url = {https://huggingface.co/msghol/whisper-large-v3-persian-common-voice-17}
}