
ASR Whisper Medium CommonVoice Fa

Developed by speechbrain
A Whisper medium model fine-tuned on the Persian portion of the CommonVoice 14.0 dataset for Persian automatic speech recognition (ASR).
Downloads 21
Release Time: 7/20/2023

Model Overview

This model is an automatic speech recognition system built on the whisper-medium architecture and fine-tuned for Persian. It converts Persian speech audio into text.

Model Features

Fine-tuning of a pre-trained model: Starts from the pre-trained whisper-medium model and fine-tunes it on Persian data, retaining the feature extraction ability of the original model.
Efficient training: The pre-trained Whisper encoder is frozen and only the decoder is fine-tuned, which reduces the number of trainable parameters and improves training efficiency.
Automatic audio processing: Input audio is normalized automatically, including resampling and mono channel selection.
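
The audio normalization mentioned above (mono selection followed by resampling) can be illustrated with a minimal pure-Python sketch. This is not the model's or SpeechBrain's actual implementation: the function names `select_mono`, `resample_linear`, and `normalize` are hypothetical, and linear interpolation stands in for a proper band-limited resampler. The 16 kHz target reflects the input rate that Whisper models expect.

```python
from typing import List, Sequence

TARGET_RATE = 16_000  # Whisper models expect 16 kHz input audio


def select_mono(frames: Sequence[Sequence[float]], channel: int = 0) -> List[float]:
    """Keep a single channel; each frame is a tuple like (left, right)."""
    return [frame[channel] for frame in frames]


def resample_linear(samples: Sequence[float], src_rate: int,
                    dst_rate: int = TARGET_RATE) -> List[float]:
    """Resample by linear interpolation.

    Illustrative only: a production resampler low-pass filters
    before downsampling to avoid aliasing.
    """
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate  # source samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = min(i * step, last)   # fractional position in the source signal
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out


def normalize(frames: Sequence[Sequence[float]], src_rate: int) -> List[float]:
    """Hypothetical helper: pick the first channel, then resample to 16 kHz."""
    return resample_linear(select_mono(frames), src_rate)
```

For example, eight stereo frames recorded at 32 kHz, `[(0.0, -0.0), (1.0, -1.0), ..., (7.0, -7.0)]`, reduce to the four mono samples `[0.0, 2.0, 4.0, 6.0]` at 16 kHz.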

Model Capabilities

Persian speech recognition
Audio transcription
Speech-to-text

Use Cases

Speech transcription
Persian speech-to-text: Converts Persian audio files into text. The model achieved a word error rate (WER) of 35.48% on the CommonVoice test set.
Voice assistant
Persian voice command recognition: Serves as the base recognition module for building a Persian voice assistant.
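
To make the 35.48% word error rate quoted above concrete: WER is the word-level edit distance (substitutions, deletions, and insertions) between the model's output and the reference transcript, divided by the number of reference words. The sketch below is a generic implementation with whitespace tokenization assumed. It is not the evaluation script used to produce the reported figure, and the function name `word_error_rate` is hypothetical.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref: List[str] = reference.split()
    hyp: List[str] = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

With a four-word reference `"a b c d"` and the hypothesis `"a x c"`, there is one substitution and one deletion, so the WER is 2 / 4 = 0.5. A WER of 35.48% therefore corresponds to roughly 35 word errors per 100 reference words.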