Wav2vec2 Base Vietnamese 160h

Developed by khanhld
Vietnamese speech recognition model based on Wav2vec2, fine-tuned on 160 hours of Vietnamese speech data
Downloads 356
Release Time: 5/7/2022

Model Overview

This is a Vietnamese automatic speech recognition (ASR) model based on the Wav2vec2 architecture. It was fine-tuned on approximately 160 hours of Vietnamese speech data and converts spoken Vietnamese to text.

Model Features

Multi-dataset training
The model was trained on multiple Vietnamese speech datasets, including VIVOS, Common Voice, FOSD, and VLSP
Works without a language model
Achieves good recognition results even when no language model is integrated into decoding
Open-source implementation
Provides complete pre-training and fine-tuning code, so the model can be trained on custom datasets
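
To illustrate what decoding without a language model involves, here is a minimal sketch of greedy CTC decoding. It assumes the fine-tuned model has a CTC output head, as is typical for fine-tuned Wav2vec2 models. The toy vocabulary, the blank id of 0, and the "|" word delimiter are illustrative assumptions, not the model's actual configuration.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Turn per-frame argmax token ids into text without any language model."""
    # Step 1: collapse runs of identical consecutive ids into a single id.
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: drop the CTC blank symbol.
    tokens = [id_to_token[i] for i in collapsed if i != blank_id]
    # Step 3: join characters and turn the word delimiter into spaces.
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Hypothetical toy vocabulary (NOT the model's real vocabulary).
vocab = {0: "<pad>", 1: "|", 2: "x", 3: "i", 4: "n", 5: "c", 6: "h", 7: "à", 8: "o"}

# Per-frame ids as they might come out of an argmax over the model's logits.
frames = [2, 2, 0, 3, 4, 4, 1, 1, 0, 5, 6, 0, 7, 7, 8]

print(ctc_greedy_decode(frames, vocab))  # prints "xin chào"
```

Because each frame is decoded independently, this approach needs no external text corpus. Adding a language model would typically replace the argmax step with a beam search that rescores candidate transcripts.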

Model Capabilities

Vietnamese speech recognition
Audio-to-text conversion
Speech transcription

Use Cases

Speech transcription
Vietnamese speech transcription
Convert Vietnamese speech content into text
Reported result: a word error rate (WER) of 10.78% on the Common Voice Vietnamese test set
Voice assistants
Vietnamese voice command recognition
Used as the front-end speech recognition module for Vietnamese voice assistants
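
For readers unfamiliar with the WER metric quoted above, the following sketch shows how it is conventionally computed: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The function name and the example sentences are illustrative and not taken from the model's evaluation code.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion of a reference word
                current[j - 1] + 1,      # insertion of a hypothesis word
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


# Illustrative sentences: one substituted word out of five.
reference = "tôi đang học tiếng việt"
hypothesis = "tôi đang học tiến việt"
print(f"WER = {word_error_rate(reference, hypothesis):.2%}")  # prints "WER = 20.00%"
```

Over a whole test set, WER is usually computed by summing the errors across all utterances and dividing by the total number of reference words, rather than by averaging per-utterance scores.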