
Wav2vec2 Large Ru Golos

Developed by bond005
A Russian speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Sberdevices Golos dataset, expecting audio input sampled at 16 kHz
Downloads 1,182
Release Time: 6/21/2022

Model Overview

This model is an automatic speech recognition (ASR) system for Russian. Its training used audio augmentation (pitch shifting, speed adjustment, and reverberation) to improve recognition accuracy across varied Russian speech conditions

Model Features

Russian Optimization
Fine-tuned specifically on Russian speech, reaching a WER of 10.144% on the Golos crowd test set and 20.353% in far-field scenarios
Audio Enhancement
Trained with augmentation techniques such as pitch shifting, speed adjustment, and reverberation to make the model more robust
Multi-Scenario Adaptation
Performs well in both close-range (crowd) and far-field speech scenarios
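As a rough illustration of the speed adjustment mentioned under Audio Enhancement, the sketch below resamples a waveform by linear interpolation. The page does not describe the author's actual augmentation pipeline, so the function name change_speed and the interpolation approach are assumptions made for illustration only. Note that this simple form of speed change also shifts pitch.

```python
import math


def change_speed(samples, factor):
    """Return `samples` resampled so that playback is `factor` times faster.

    Hypothetical helper for illustration; not the author's augmentation code.
    Uses linear interpolation between neighbouring samples.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    last = len(samples) - 1
    n_out = max(1, round(len(samples) / factor))
    out = []
    for i in range(n_out):
        pos = i * factor                 # fractional position in the input
        lo = min(int(pos), last)         # sample at or before that position
        hi = min(lo + 1, last)           # next sample, clamped at the end
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


# One second of a 440 Hz tone at 16 kHz, sped up by 10%.
tone = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
faster = change_speed(tone, 1.1)
```

A factor above 1 shortens the clip and a factor below 1 lengthens it. Training pipelines typically draw the factor at random for each utterance.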

Model Capabilities

Russian speech-to-text
16kHz audio processing
Far-field speech recognition
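
The capabilities above could be exercised with a sketch like the following, which uses the Hugging Face transformers library with greedy CTC decoding. The model id bond005/wav2vec2-large-ru-golos is an assumption inferred from the model name and developer shown on this page, and the helper name transcribe is hypothetical. librosa is used here only to load the file and resample it to the 16 kHz the model expects.

```python
# Assumed Hugging Face model id, inferred from the page title and developer name.
MODEL_ID = "bond005/wav2vec2-large-ru-golos"
TARGET_SAMPLE_RATE = 16_000  # the model expects 16 kHz input


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one Russian audio file (hypothetical helper, greedy decoding)."""
    # Imported lazily so the heavy dependencies are only needed when called.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # Load as mono and resample to 16 kHz regardless of the file's native rate.
    speech, _ = librosa.load(audio_path, sr=TARGET_SAMPLE_RATE, mono=True)

    inputs = processor(
        speech,
        sampling_rate=TARGET_SAMPLE_RATE,
        return_tensors="pt",
        padding=True,
    )
    with torch.no_grad():
        logits = model(**inputs).logits

    # Greedy CTC decoding: pick the most likely token at each frame.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling transcribe("example.wav") on a local recording would return the recognized text as a string. The first call downloads the model weights.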

Use Cases

Speech Transcription
Russian Speech Transcription
Convert Russian speech content into text
Achieves a word error rate (WER) of 10.144% on the Golos crowd test set
Smart Assistants
Russian Voice Command Recognition
Used for voice command recognition in Russian smart home devices
Achieves a WER of 20.353% in far-field scenarios
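
For context on the WER figures above: word error rate is the word-level edit distance between the reference transcript and the model output, divided by the number of reference words. A minimal sketch follows. The function name word_error_rate is hypothetical, and the page does not say what text normalization was applied before scoring, so none is applied here.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of four reference words.
score = word_error_rate("включи свет на кухне", "включи свет в кухне")
```

Multiplying the result by 100 gives a percentage comparable in form to the figures quoted on this page.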