Viwav2vec2 Base 3k
This is a Wav2Vec2 base model pre-trained on 3,000 hours of Vietnamese speech data. It is intended as a starting point for Vietnamese speech recognition and must be fine-tuned on a downstream task before use.
Downloads 41
Release Time: 5/3/2022
Model Overview
The pre-training data consists of 3,000 hours of Vietnamese speech, including spontaneous conversations, read speech, and broadcast audio. The model must be fine-tuned on a downstream task, such as Vietnamese automatic speech recognition, to achieve optimal performance.
Model Features
Large-scale Vietnamese pre-training
Pre-trained on 3,000 hours of Vietnamese speech data, covering various speech types
16 kHz sampling rate support
Optimized for speech sampled at 16 kHz; make sure input audio uses this sampling rate, and resample it if it does not
Requires downstream fine-tuning
The model needs fine-tuning on downstream tasks (e.g., speech recognition) to achieve optimal performance
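The 16 kHz requirement above can be checked before any audio reaches the model. The following is a minimal sketch that reads the sampling rate from a WAV header using Python's standard `wave` module. The helper names `wav_sample_rate` and `needs_resampling` are hypothetical and only illustrate the check; the resampling step itself is left to an audio library of your choice.

```python
import wave

EXPECTED_RATE_HZ = 16_000  # sampling rate stated in the model description


def wav_sample_rate(path):
    """Return the sampling rate (in Hz) recorded in a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, expected_rate=EXPECTED_RATE_HZ):
    """Return True when the file's sampling rate differs from the expected one."""
    return wav_sample_rate(path) != expected_rate
```

A file for which `needs_resampling` returns `True` should be converted to 16 kHz before feature extraction or fine-tuning.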
Model Capabilities
Vietnamese speech feature extraction
Speech representation learning
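To give a sense of what feature extraction produces, the sketch below estimates how many feature vectors the model emits for a clip of a given length. It assumes this model uses the standard Wav2Vec2 base convolutional encoder (seven unpadded 1-D convolutions with the kernel sizes and strides listed in the code); the description here does not confirm the exact configuration, so treat the numbers as an assumption.

```python
# Kernel sizes and strides of the convolutional feature encoder in the
# standard Wav2Vec2 base configuration (assumed to apply to this model).
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def num_feature_frames(num_samples):
    """Number of feature vectors produced for `num_samples` audio samples."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        # Output length of an unpadded 1-D convolution.
        length = (length - kernel) // stride + 1
    return max(length, 0)


# One second of 16 kHz audio.
print(num_feature_frames(16_000))  # prints 49
```

Under these assumptions the encoder emits roughly one vector every 20 ms (a total stride of 320 samples), and clips shorter than 400 samples yield no frames at all.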
Use Cases
Speech technology
Vietnamese speech recognition system
Build a Vietnamese automatic speech recognition system by fine-tuning the model
Speech analysis applications
Used for Vietnamese speech content analysis
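Wav2Vec2 models are commonly fine-tuned for speech recognition with a CTC output layer, although the description here does not say which objective to use. Assuming CTC, the per-frame predictions of a fine-tuned model are turned into text by collapsing repeated labels and dropping the blank label. The sketch below shows that greedy decoding step in plain Python; the vocabulary, the frame labels, and the choice of `blank_id=0` are hypothetical toy values.

```python
def ctc_greedy_decode(frame_ids, id_to_char, blank_id=0):
    """Collapse repeated labels, then drop blanks (greedy CTC decoding)."""
    chars = []
    previous = None
    for label in frame_ids:
        # Emit a character only when the label changes and is not the blank.
        if label != previous and label != blank_id:
            chars.append(id_to_char[label])
        previous = label
    return "".join(chars)


# Hypothetical toy vocabulary and per-frame argmax labels.
vocab = {0: "<blank>", 1: "x", 2: "i", 3: "n", 4: " ",
         5: "c", 6: "h", 7: "à", 8: "o"}
frames = [1, 1, 0, 2, 3, 3, 0, 4, 5, 6, 6, 7, 0, 8, 8]
print(ctc_greedy_decode(frames, vocab))  # prints "xin chào"
```

In a real system the frame labels would come from the argmax over the fine-tuned model's output logits, and the vocabulary would be built from the characters in the Vietnamese training transcripts.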