Wav2vec2 Large 100k Voxpopuli Ft Common Voice Plus TTS Dataset Plus Data Augmentation Russian

Developed by Edresson
A Russian speech recognition model fine-tuned from Facebook's Wav2vec2 Large 100k Voxpopuli checkpoint on the Common Voice 7.0 and M-AILABS datasets, with additional data augmentation.
Downloads: 23
Release Time: 3/2/2022

Model Overview

This model is an automatic speech recognition (ASR) system fine-tuned for Russian. It converts Russian speech into text.
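As an illustration of the speech-to-text step, here is a minimal sketch using the Hugging Face `transformers` Wav2Vec2 classes. Several things are assumptions rather than facts from this page: the repository name in `MODEL_ID` is inferred from the model title and should be verified before use, the 16 kHz input rate is the usual Wav2vec2 convention, and `transcribe` is a hypothetical helper name. Greedy CTC decoding is used here for simplicity; the author's own inference setup may differ.

```python
# Assumed Hub repository name, inferred from the model title; verify before use.
MODEL_ID = (
    "Edresson/wav2vec2-large-100k-voxpopuli-ft-Common-Voice_plus_TTS-"
    "Dataset_plus_data_augmentation-russian"
)
# Wav2vec2 checkpoints are normally trained on 16 kHz audio (assumed here).
EXPECTED_RATE = 16_000


def transcribe(waveform, sampling_rate, model_id=MODEL_ID):
    """Return a greedy CTC transcription for one mono waveform.

    `waveform` is a 1-D sequence of float samples (list or NumPy array).
    """
    if sampling_rate != EXPECTED_RATE:
        raise ValueError(
            f"expected {EXPECTED_RATE} Hz audio, got {sampling_rate} Hz; "
            "resample the input first"
        )

    # Heavy dependencies are imported lazily so the check above stays cheap.
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # The processor normalizes the audio and builds the model inputs.
    inputs = processor(
        waveform,
        sampling_rate=sampling_rate,
        return_tensors="pt",
        padding=True,
    )

    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy decoding: pick the most likely token per frame, then let the
    # processor collapse repeats and remove CTC blank tokens.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe(samples, 16_000)` downloads the checkpoint on first use, so it needs network access plus the `torch` and `transformers` packages. Audio recorded at another rate has to be resampled to 16 kHz beforehand.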

Model Features

Multi-dataset fine-tuning
Fine-tuned on both the Common Voice 7.0 and M-AILABS datasets to improve recognition accuracy.
Data augmentation techniques
Uses data augmentation based on text-to-speech (TTS) and voice conversion to improve generalization.
Russian optimization
Fine-tuned specifically on Russian speech and intended for Russian recognition tasks.

Model Capabilities

Russian speech recognition
Speech-to-text
Automatic speech recognition

Use Cases

Speech transcription
Russian speech transcription
Automatically converts Russian speech content into text
Achieves a 19.46% word error rate on the Common Voice 7.0 test set
Voice assistants
Russian voice command recognition
Can be used to recognize spoken commands in Russian voice assistants
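For context on the word error rate quoted for the Common Voice 7.0 test set: WER is the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the model output, divided by the number of reference words. The sketch below computes it for a single utterance in plain Python. The function name `word_error_rate` is hypothetical, and the evaluation script and text normalization behind the reported figure are not described on this page, so this is not a reproduction of that measurement.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # previous_row[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current_row = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current_row.append(
                min(
                    previous_row[j] + 1,  # deletion of a reference word
                    current_row[j - 1] + 1,  # insertion of a hypothesis word
                    previous_row[j - 1] + substitution_cost,  # match or substitution
                )
            )
        previous_row = current_row

    return previous_row[-1] / len(ref_words)
```

A corpus-level WER is usually computed by summing edit distances and reference word counts over all utterances before dividing, rather than by averaging per-utterance values.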