Fine Tune XLSR Wav2Vec2 Speech2Text Vietnamese
Developed by leduytan93
This is a Vietnamese automatic speech recognition (ASR) model based on the XLSR Wav2Vec2 architecture, fine-tuned for Vietnamese speech recognition tasks.
Downloads 25
Release Time: 3/2/2022
Model Overview
This model performs Vietnamese automatic speech recognition, converting Vietnamese speech into text. It was fine-tuned on the Common Voice Vietnamese dataset and achieves a word error rate (WER) of 25.2%.
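For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, and insertions) between the model's transcript and the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that definition, not the evaluation script used for this model. The function name `word_error_rate` and the whitespace tokenization are assumptions made for the example; for Vietnamese, whitespace splitting counts each syllable as one token.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokens are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of five reference words.
print(word_error_rate("tôi đang học tiếng việt", "tôi đang học tiếng anh"))  # 0.2
```

Under this definition, a WER of 25.2% means roughly one word-level error for every four reference words, averaged over the evaluation set.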
Model Features
Vietnamese speech recognition
Speech recognition capabilities optimized specifically for Vietnamese
Based on XLSR Wav2Vec2 architecture
Utilizes the XLSR Wav2Vec2 model architecture for speech recognition tasks
Fine-tuned on Common Voice
Fine-tuned using the Common Voice Vietnamese dataset
Model Capabilities
Vietnamese speech recognition
Speech-to-text
Use Cases
Speech transcription
Vietnamese speech transcription
Convert Vietnamese speech content into text
Word error rate 25.2%
Voice assistants
Vietnamese voice assistant
Used for building Vietnamese voice assistant systems