Wav2vec2 Large Xlsr 53 Kalmyk
This is a Kalmyk automatic speech recognition model based on the Wav2Vec2 architecture, pre-trained and fine-tuned to support Kalmyk speech-to-text tasks.
Downloads 79
Release Time: 3/2/2022
Model Overview
The model was first pre-trained on 500 hours of Kalmyk TV recordings and 1000 hours of Mongolian speech data, then fine-tuned on 300 hours of synthetic Kalmyk speech data, making it suitable for Kalmyk speech recognition.
Model Features
Multi-stage Training
Pre-trained on extensive Kalmyk and Mongolian data first, then fine-tuned with synthetic speech data to improve recognition performance.
Synthetic Data Augmentation
Fine-tuned using 300 hours of Kalmyk synthetic speech data to enhance the model's recognition capability for Kalmyk.
Cross-lingual Transfer
Pre-training on Mongolian data alongside Kalmyk may help recognition performance, since the two languages are related.
Model Capabilities
Kalmyk speech recognition
Speech-to-text
Use Cases
Speech Transcription
Kalmyk TV Program Transcription
Automatically transcribing Kalmyk TV program content into text
Word Error Rate (WER) of 50% on a private test set
Clear Speech Recognition
Recognizing clearly pronounced Kalmyk speech
Word Error Rate (WER) is expected to be significantly lower than 50%
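To make the WER figures above concrete, here is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model output, divided by the number of reference words. The function name word_error_rate and the English example sentences are illustrative assumptions, not part of the model or its private test set, and the model's own evaluation may have used different text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration; whitespace tokenization is an
    assumption, and no casing or punctuation normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# Placeholder sentences (not Kalmyk data): one substitution and one
# deletion against a four-word reference give 2 / 4 = 0.5.
example_wer = word_error_rate("the quick brown fox", "the quick red")
print(f"WER: {example_wer:.0%}")
```

Under this definition, a WER of 50% means roughly one error for every two reference words. Because insertions count as errors, WER can exceed 100% when the output is much longer than the reference.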