wav2vec2-large-xlsr-53-kalmyk: Open-Source Model for Kalmyk Speech-to-Text

Developed by tugstugi

This is a Kalmyk automatic speech recognition model based on the Wav2Vec2 architecture, pre-trained and fine-tuned to support Kalmyk speech-to-text tasks.

Speech Recognition

Transformers

Open Source License: Apache-2.0

Tags: #Kalmyk ASR #Synthetic speech fine-tuning #Low-resource language

Downloads 79

Release Time: 3/2/2022

Model Overview

The model was first pre-trained on 500 hours of Kalmyk TV recordings and 1,000 hours of Mongolian speech data, then fine-tuned on 300 hours of synthetic Kalmyk speech for Kalmyk speech recognition.
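A minimal sketch of how a checkpoint like this is typically loaded and run with the Transformers library. The Hub id `tugstugi/wav2vec2-large-xlsr-53-kalmyk` is an assumption pieced together from the developer and model names on this page, and the 16 kHz input rate and CTC head are assumptions based on how XLSR-53 checkpoints are usually fine-tuned. The helper name `transcribe` is hypothetical.

```python
# Assumed Hub id, built from the developer and model names on this page.
MODEL_ID = "tugstugi/wav2vec2-large-xlsr-53-kalmyk"
# Assumed input rate; XLSR-53 checkpoints generally expect 16 kHz mono audio.
SAMPLE_RATE = 16_000


def transcribe(waveform):
    """Transcribe a 1-D float waveform (mono, 16 kHz) into text.

    Hypothetical helper: greedy CTC decoding, no language model.
    """
    # Imported lazily so this file can be loaded without the heavy dependencies.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
    model.eval()

    # Normalize the audio and wrap it in a batch of size 1.
    inputs = processor(waveform, sampling_rate=SAMPLE_RATE, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits

    # Pick the most likely token per frame; batch_decode collapses CTC repeats.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

The waveform can come from any audio loader, as long as it is converted to mono and resampled to 16 kHz before being passed to `transcribe`.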

Model Features

Multi-stage Training

Pre-trained first on 500 hours of Kalmyk TV recordings and 1,000 hours of Mongolian speech, then fine-tuned on synthetic Kalmyk speech to improve recognition performance.

Synthetic Data Augmentation

Fine-tuned using 300 hours of Kalmyk synthetic speech data to enhance the model's recognition capability for Kalmyk.

Cross-lingual Transfer

Pre-training on Mongolian, a language closely related to Kalmyk, may help improve Kalmyk recognition performance.

Model Capabilities

Kalmyk speech recognition

Speech-to-text

Use Cases

Speech Transcription

Kalmyk TV Program Transcription

Automatically transcribing Kalmyk TV program content into text

Word Error Rate (WER) of 50% on a private test set

Clear Speech Recognition

Recognizing clearly pronounced Kalmyk speech

WER is expected to be significantly lower than 50%
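For readers unfamiliar with the metric behind the figures above, here is a minimal sketch of how word error rate is usually computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the placeholder tokens are illustrative only, and this is not necessarily the scoring script used for the private test set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens, not real Kalmyk text: one substitution and one
# deletion against a four-word reference.
print(word_error_rate("w1 w2 w3 w4", "w1 x w3"))  # 0.5
```

A WER of 50% therefore means roughly one word-level error for every two reference words. Because insertions count as errors, WER can exceed 100%.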
