Wav2vec2 Base Vn 270h

Developed by dragonSwing
A speech recognition model fine-tuned on approximately 270 hours of annotated Vietnamese speech, intended for Vietnamese automatic speech recognition (ASR) tasks.
Downloads 202
Release Time: 3/2/2022

Model Overview

This model is a Vietnamese automatic speech recognition (ASR) model based on the Wav2Vec2 architecture, fine-tuned using annotated speech data from datasets such as Common Voice, VIVOS, and VLSP2020, totaling approximately 270 hours.

Model Features

Multi-dataset Training
Trained on several Vietnamese speech datasets, including Common Voice, VIVOS, and VLSP2020
Low Word Error Rate
Achieves a word error rate (WER) of 3.70% on the VIVOS test set
Language Model Support
Can be used with a 4-gram language model to significantly improve recognition accuracy
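
For readers unfamiliar with the WER metric quoted above: it is the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model's output, divided by the number of words in the reference. The following is a minimal sketch in plain Python. The function name `word_error_rate` and the sample sentences are illustrative assumptions, not part of the model's published evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev_row[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr_row = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr_row[j] = min(
                prev_row[j] + 1,                      # deletion
                curr_row[j - 1] + 1,                  # insertion
                prev_row[j - 1] + substitution_cost,  # match or substitution
            )
        prev_row = curr_row
    return prev_row[-1] / len(ref_words)


# Hypothetical example: one of four reference words is missing.
print(f"{word_error_rate('xin chào các bạn', 'xin chào bạn'):.2%}")  # 25.00%
```

Benchmark figures are normally computed over a whole test set (total errors divided by total reference words) rather than by averaging per-sentence scores, so this per-sentence helper is only a starting point.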

Model Capabilities

Vietnamese speech recognition
Audio-to-text conversion
16kHz sampling rate speech processing
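
Because the model processes 16kHz speech, it can help to verify a recording's sampling rate before transcription. The sketch below uses only Python's standard `wave` module and assumes uncompressed WAV input. The helper names `wav_sample_rate` and `is_16khz` are hypothetical and not part of any model tooling.

```python
import wave

EXPECTED_RATE_HZ = 16_000  # sampling rate listed under Model Capabilities


def wav_sample_rate(path: str) -> int:
    """Return the sampling rate (in Hz) stored in a WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_16khz(path: str, expected_rate_hz: int = EXPECTED_RATE_HZ) -> bool:
    """True if the WAV file already matches the expected sampling rate."""
    return wav_sample_rate(path) == expected_rate_hz
```

A call such as `is_16khz("meeting.wav")` (the file name is a placeholder) returns `False` for, say, a 44.1kHz recording, which would then need to be resampled with an audio tool of your choice before being passed to the model.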

Use Cases

Speech Transcription
Vietnamese Meeting Minutes
Automatically convert Vietnamese meeting recordings into text transcripts
Accuracy exceeds 90%
Voice Assistant
Provide speech recognition capabilities for Vietnamese voice assistants