Viwav2vec2 Base 1.5k
This model is pretrained on 1.5k hours of Vietnamese speech data and is suited to Vietnamese speech recognition tasks. It must be fine-tuned before use.
Downloads 38
Release Time: 5/3/2022
Model Overview
A pretrained Vietnamese speech model based on the Wav2Vec2 architecture, trained on 1.5k hours of read and broadcast speech data. It expects speech input sampled at 16kHz.
Model Features
Large-scale Vietnamese pretraining
Pretrained on 1.5k hours of Vietnamese speech data, covering read and broadcast speech
16kHz sampling support
Optimized for speech sampled at 16kHz; make sure input speech matches this sampling rate
Requires fine-tuning
The model must be fine-tuned on a downstream task (e.g. Vietnamese ASR) before it can be used for that task
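The 16kHz requirement can be checked before any audio is passed to the model. The following is a minimal sketch that uses only Python's standard `wave` module. The helper names (`wav_sample_rate`, `matches_model_rate`, `make_silent_wav`) are hypothetical, and the in-memory silent clips stand in for real recordings.

```python
import io
import wave

EXPECTED_RATE = 16_000  # sampling rate stated in the model description


def wav_sample_rate(source):
    """Return the sampling rate in Hz of a WAV file path or file-like object."""
    with wave.open(source, "rb") as wav:
        return wav.getframerate()


def matches_model_rate(source, expected=EXPECTED_RATE):
    """True if the WAV data is sampled at the rate the model expects."""
    return wav_sample_rate(source) == expected


def make_silent_wav(rate, seconds=0.1):
    """Build a short in-memory mono 16-bit WAV clip (demo data only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(rate * seconds))
    buf.seek(0)  # rewind so the clip can be read back
    return buf


ok_clip = make_silent_wav(16_000)
bad_clip = make_silent_wav(8_000)
print(matches_model_rate(ok_clip))
print(matches_model_rate(bad_clip))
```

A clip that fails this check would need to be resampled to 16kHz with an audio library before use; resampling itself is outside this sketch.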
Model Capabilities
Vietnamese speech feature extraction
Speech representation learning
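To give a sense of what feature extraction produces, the sketch below estimates how many feature frames the convolutional encoder emits for a given number of input samples. It assumes this model uses the standard wav2vec 2.0 Base encoder layout (kernel sizes 10, 3, 3, 3, 3, 2, 2 and strides 5, 2, 2, 2, 2, 2, 2); the model description does not confirm this, and `num_feature_frames` is a hypothetical helper.

```python
# Assumption: kernel sizes and strides of the standard wav2vec 2.0 Base
# feature encoder. The model description does not list them.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def num_feature_frames(num_samples):
    """Number of feature frames produced for a raw waveform of num_samples."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        if length < kernel:
            return 0  # input too short for this convolution layer
        # Output length of an unpadded 1-D convolution.
        length = (length - kernel) // stride + 1
    return length


# One second of 16kHz audio.
print(num_feature_frames(16_000))  # 49
```

Under this assumption each frame advances by 320 samples (the product of the strides), which is 20 ms at 16kHz.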
Use Cases
Speech technology
Vietnamese speech recognition system
Build a Vietnamese ASR system by fine-tuning the model
Speech analysis
For Vietnamese speech feature analysis and representation learning
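Fine-tuning for ASR is commonly done with a CTC output layer over a character vocabulary built from the training transcripts. The sketch below shows one way that vocabulary could be built. The two transcripts are made-up examples, `build_ctc_vocab` is a hypothetical helper, and the `|` word delimiter and `[UNK]`/`[PAD]` tokens follow a common CTC tokenizer convention rather than anything the model description specifies.

```python
import unicodedata


def build_ctc_vocab(transcripts):
    """Map each character seen in the transcripts to an integer id."""
    chars = set()
    for text in transcripts:
        # NFC keeps each accented Vietnamese letter as a single code point.
        chars.update(unicodedata.normalize("NFC", text.lower()))
    chars.discard(" ")  # spaces are represented by the "|" delimiter instead
    chars.discard("|")  # avoid clashing with the delimiter token
    vocab = {ch: idx for idx, ch in enumerate(sorted(chars))}
    # Special tokens (assumed convention): word delimiter, unknown, padding.
    for token in ("|", "[UNK]", "[PAD]"):
        vocab[token] = len(vocab)
    return vocab


# Hypothetical transcripts, for illustration only.
transcripts = ["xin chào", "cảm ơn bạn"]
vocab = build_ctc_vocab(transcripts)
print(len(vocab))
```

The resulting mapping can be saved as JSON and used to configure a CTC tokenizer in the fine-tuning framework of choice.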