Fine Tune XLSR Wav2Vec2 Speech2Text Vietnamese

Developed by leduytan93
This is a Vietnamese automatic speech recognition (ASR) model based on the XLSR Wav2Vec2 architecture named in its title, fine-tuned for Vietnamese speech-to-text.
Downloads 25
Release date: 3/2/2022

Model Overview

This model transcribes Vietnamese speech into text. It was fine-tuned on the Common Voice Vietnamese dataset and reaches a word error rate (WER) of 25.2%.
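To make the reported figure concrete: WER is the number of word-level substitutions, deletions, and insertions needed to turn the model's output into the reference transcript, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not the evaluation script used for this model; the function name word_error_rate and the sample sentences are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name), not the model author's
    evaluation code.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# Made-up example: one wrong word out of five gives a WER of 20%.
example_wer = word_error_rate("tôi đang học tiếng việt", "tôi đang học tiếng anh")
```

By this definition, a WER of 25.2% corresponds to roughly one word-level error for every four reference words.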

Model Features

Vietnamese speech recognition: recognition tuned specifically for Vietnamese.
XLSR Wav2Vec2 architecture: uses the XLSR Wav2Vec2 model architecture named in the title for speech recognition.
Fine-tuned on Common Voice: trained further on the Common Voice Vietnamese dataset.

Model Capabilities

Vietnamese speech recognition
Speech-to-text
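
As a rough sketch of how speech-to-text with a model like this is typically run, the snippet below uses the Hugging Face transformers ASR pipeline. This page does not give a repository identifier, so the MODEL_ID value is an assumption pieced together from the developer name and title; check it against the actual model page before use. The function name transcribe and the audio path are likewise hypothetical. Decoding an audio file from a path also requires ffmpeg to be installed.

```python
# ASSUMPTION: this repository id is inferred from the developer name and
# model title on this page; it is not confirmed by the source.
MODEL_ID = "leduytan93/Fine-Tune-XLSR-Wav2Vec2-Speech2Text-Vietnamese"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the text transcript of one audio file (hypothetical helper)."""
    # Imported here so that defining the function does not require
    # transformers to be installed.
    from transformers import pipeline

    # Builds an automatic-speech-recognition pipeline; the model weights
    # are downloaded on first use.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]


# Usage (not executed here; "sample_vi.wav" is a placeholder path):
# print(transcribe("sample_vi.wav"))
```

Loading the pipeline once and reusing it across files would avoid repeated model initialization; the single-call form here is kept short for illustration.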

Use Cases

Speech transcription
Vietnamese speech transcription: converts Vietnamese speech content into text (word error rate 25.2%).
Voice assistants
Vietnamese voice assistant: used for building Vietnamese voice assistant systems.