Wav2vec2 Large 100k Voxpopuli Ft Common Voice Plus TTS Dataset Plus Data Augmentation Russian
Developed by Edresson
A Russian speech recognition model fine-tuned from Facebook's Wav2vec2 Large 100k Voxpopuli model on the Common Voice 7.0 and M-AILABS datasets, with additional data augmentation.
Downloads 23
Release Time: 3/2/2022
Model Overview
This model is an automatic speech recognition (ASR) system specifically optimized for Russian, capable of converting Russian speech into text.
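A minimal sketch of how a fine-tuned Wav2vec2 checkpoint like this one is typically run with the Hugging Face `transformers` library. The helper name `transcribe` is hypothetical, and the `model_id` argument is deliberately left for the caller to supply, since the exact hub repository id is not given here. The 16 kHz sampling rate is an assumption based on the usual Wav2vec2 convention, not something stated on this page.

```python
def transcribe(audio_path: str, model_id: str) -> str:
    """Transcribe one audio file with a Wav2vec2 CTC checkpoint.

    model_id is the hub repository id of the model (assumption: the
    caller looks it up; it is not hard-coded here).
    """
    # Heavy dependencies are imported lazily so the module loads without them.
    import torch
    import torchaudio
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # Load the audio, mix down to mono, and resample to 16 kHz
    # (assumed input rate for Wav2vec2 models).
    waveform, sample_rate = torchaudio.load(audio_path)
    waveform = waveform.mean(dim=0)
    if sample_rate != 16000:
        waveform = torchaudio.functional.resample(waveform, sample_rate, 16000)

    inputs = processor(
        waveform.numpy(), sampling_rate=16000, return_tensors="pt", padding=True
    )
    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy CTC decoding: take the most likely token per frame,
    # then let the processor collapse repeats and blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe("sample.wav", model_id)` would return the recognized Russian text as a string; the file name here is only a placeholder.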
Model Features
Multi-dataset fine-tuning
Fine-tuned on the Common Voice 7.0 and M-AILABS datasets to improve recognition accuracy.
Data augmentation techniques
Uses TTS-based and voice-conversion-based data augmentation to improve generalization.
Russian optimization
Fine-tuned specifically for Russian speech and evaluated on the Common Voice 7.0 test set.
Model Capabilities
Russian speech recognition
Speech-to-text
Automatic speech recognition
Use Cases
Speech transcription
Russian speech transcription
Automatically converts Russian speech content into text
Achieves a 19.46% word error rate on the Common Voice 7.0 test set
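For readers unfamiliar with the metric, word error rate is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. The sketch below is a plain-Python illustration of that definition; the function name `word_error_rate` is hypothetical, and it is not necessarily the evaluation script used to obtain the reported figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One of four reference words is missing from the hypothesis.
print(word_error_rate("я люблю слушать музыку", "я люблю музыку"))  # 0.25
```

Over a whole test set, the usual convention is to sum the errors across all utterances and divide by the total number of reference words, rather than averaging per-utterance rates.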
Voice assistants
Russian voice command recognition
Used for voice command recognition in Russian voice assistants