Wav2vec2 Base Sl Voxpopuli V2
This is a speech model based on Facebook's Wav2Vec2 architecture, specifically pretrained for Slovenian (sl) using 11.3k hours of unlabeled data from the VoxPopuli corpus.
Downloads 31
Release Time: 3/2/2022
Model Overview
This is a pretrained speech representation model for Slovenian. It learns features directly from raw audio through self-supervised learning and does not produce transcriptions on its own; instead, it serves as a base model to be fine-tuned for speech recognition tasks.
Model Features
Specialized for Slovenian
Pretrained on Slovenian audio, so the speech features it learns are tailored to this language
Self-supervised learning
Uses 11.3k hours of unlabeled speech data for self-supervised pretraining
16 kHz audio input
The model expects audio sampled at 16 kHz; make sure input audio uses this sampling rate
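Because the model expects 16 kHz input, it can help to verify a recording's sampling rate before passing it on. The following is a minimal sketch using Python's standard `wave` module. The function names `has_expected_rate` and `write_silence` and the file names are hypothetical, and the check only covers uncompressed WAV files; audio at another rate would still need to be resampled with a tool of your choice.

```python
import os
import tempfile
import wave

EXPECTED_RATE_HZ = 16_000  # sampling rate the model card asks for


def has_expected_rate(path, expected_hz=EXPECTED_RATE_HZ):
    """Return True if the WAV file at `path` is sampled at `expected_hz`."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate() == expected_hz


def write_silence(path, rate_hz, seconds=1):
    """Write a mono 16-bit WAV file of silence (hypothetical demo helper)."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(rate_hz)
        wav_file.writeframes(b"\x00\x00" * rate_hz * seconds)


# Demo: one file at the expected rate, one at 8 kHz.
with tempfile.TemporaryDirectory() as tmp_dir:
    good_path = os.path.join(tmp_dir, "speech_16k.wav")
    bad_path = os.path.join(tmp_dir, "speech_8k.wav")
    write_silence(good_path, 16_000)
    write_silence(bad_path, 8_000)
    good_ok = has_expected_rate(good_path)
    bad_ok = has_expected_rate(bad_path)

print(good_ok, bad_ok)
```

A check like this is cheap to run at the start of a data pipeline, where a silent sampling-rate mismatch would otherwise degrade the extracted features.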
Model Capabilities
Speech feature extraction
Speech recognition base model
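To get a feel for the shape of the extracted features, the sketch below estimates how many feature frames the convolutional front end produces for a given number of 16 kHz samples. It assumes the standard Wav2Vec2 base configuration (seven unpadded convolutional layers with kernel sizes 10, 3, 3, 3, 3, 2, 2 and strides 5, 2, 2, 2, 2, 2, 2). These values are not stated on this page, so check the model's configuration before relying on them. The function name `num_feature_frames` is hypothetical.

```python
# Assumed defaults for the Wav2Vec2 base feature encoder (not from this page).
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def num_feature_frames(num_samples, kernels=CONV_KERNELS, strides=CONV_STRIDES):
    """Estimate the number of output frames for `num_samples` input samples.

    Applies the unpadded 1-D convolution length formula layer by layer.
    Assumes the input is at least as long as the receptive field.
    """
    length = num_samples
    for kernel, stride in zip(kernels, strides):
        length = (length - kernel) // stride + 1
    return length


one_second = num_feature_frames(16_000)  # one second of 16 kHz audio
two_seconds = num_feature_frames(32_000)
print(one_second, two_seconds)
```

Under these assumptions the strides multiply to 320 samples, so consecutive frames are about 20 ms apart at 16 kHz.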
Use Cases
Speech technology
Slovenian speech recognition system
Can serve as a base model for building Slovenian speech recognition systems through fine-tuning
Requires fine-tuning on labeled Slovenian speech before it can produce transcriptions
Speech feature analysis
Can be used to extract feature representations from Slovenian speech
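Fine-tuning for recognition needs an output vocabulary derived from the labeled transcripts, since pretraining used only unlabeled audio. The sketch below shows one way to build a character-level vocabulary for CTC-style fine-tuning. The `[UNK]` and `[PAD]` tokens, the `|` word delimiter, the function name `build_char_vocab`, and the sample transcripts are all illustrative assumptions (they follow a convention common in Wav2Vec2 fine-tuning tutorials), not something specified on this page.

```python
import json


def build_char_vocab(transcripts):
    """Map every character seen in `transcripts` to an integer id.

    Spaces are stored as "|" so word boundaries stay visible, and two
    special tokens are appended for unknown characters and padding.
    """
    chars = sorted(set("".join(text.lower() for text in transcripts)))
    vocab = {}
    for char in chars:
        token = "|" if char == " " else char
        vocab[token] = len(vocab)
    vocab["[UNK]"] = len(vocab)
    vocab["[PAD]"] = len(vocab)
    return vocab


# Hypothetical transcripts standing in for a labeled Slovenian dataset.
sample_transcripts = ["dober dan", "hvala lepa", "lahko noč"]
vocab = build_char_vocab(sample_transcripts)
print(json.dumps(vocab, ensure_ascii=False))
```

A dictionary like this would typically be saved as JSON and handed to a tokenizer; how it is wired into training depends on the fine-tuning framework you choose.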