
Wav2vec2 Large Xlsr 53 Kalmyk

Developed by tugstugi
This is a Kalmyk automatic speech recognition model based on the Wav2Vec2 architecture, pre-trained and fine-tuned to support Kalmyk speech-to-text tasks.
Downloads 79
Release Time: 3/2/2022

Model Overview

The model was first pre-trained on 500 hours of Kalmyk TV recordings and 1,000 hours of Mongolian speech data, then fine-tuned on 300 hours of synthetic Kalmyk speech data for Kalmyk speech recognition.

Model Features

Multi-stage Training
Pre-trained on 500 hours of Kalmyk TV recordings and 1,000 hours of Mongolian speech first, then fine-tuned with synthetic speech data to improve recognition performance.
Synthetic Data Augmentation
Fine-tuned using 300 hours of Kalmyk synthetic speech data to enhance the model's recognition capability for Kalmyk.
Cross-lingual Transfer
Kalmyk and Mongolian are related languages (both Mongolic), so pre-training on Mongolian speech may improve recognition of Kalmyk.

Model Capabilities

Kalmyk speech recognition
Speech-to-text
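
A minimal sketch of transcribing one recording with this model. It assumes the checkpoint is published on the Hugging Face Hub under the ID `tugstugi/wav2vec2-large-xlsr-53-kalmyk` (inferred from the developer and model names on this page, not confirmed here) and that the `transformers` package with a PyTorch backend is installed. The helper names `load_asr` and `transcribe` are illustrative, not part of any library.

```python
# Assumed Hub ID, inferred from the developer and model name; verify before use.
MODEL_ID = "tugstugi/wav2vec2-large-xlsr-53-kalmyk"


def load_asr(model_id=MODEL_ID):
    """Build a speech-recognition pipeline (downloads the checkpoint on first call)."""
    from transformers import pipeline  # imported lazily: heavy dependency

    return pipeline("automatic-speech-recognition", model=model_id)


def transcribe(audio_path, asr):
    """Run `asr` on one audio file and return the stripped transcript text."""
    result = asr(audio_path)  # the pipeline returns a dict with a "text" key
    return result["text"].strip()


# Example (needs network access and a local audio file):
# asr = load_asr()
# print(transcribe("recording.wav", asr))
```

Passing a file path to the pipeline relies on ffmpeg being available to decode the audio. Keeping `asr` as a parameter means the pipeline is built once and reused across many files.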

Use Cases

Speech Transcription
Kalmyk TV Program Transcription
Automatically transcribing Kalmyk TV program content into text
Word Error Rate (WER) of 50% on a private test set
Clear Speech Recognition
Recognizing clearly pronounced Kalmyk speech
Word Error Rate is expected to be significantly lower than 50%
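
To make the WER figures above concrete, here is a small sketch of the standard definition: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model output, divided by the number of reference words. The function name `word_error_rate` and the example strings are placeholders, not data from the private test set.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words (Levenshtein, row by row).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Placeholder strings: one substitution plus one deletion over four words.
print(word_error_rate("one two three four", "one too three"))  # 0.5
```

A WER of 50% therefore means the number of word-level errors equals half the number of words in the reference. Because insertions count as errors, WER can exceed 100% on very noisy output.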