Xls Asr Vi 40h

Developed by geninhu
This speech recognition model is facebook/wav2vec2-xls-r-300m fine-tuned on the Common Voice 7.0 Vietnamese dataset and private datasets.
Downloads 14
Release Time: 3/2/2022

Model Overview

This is an automatic speech recognition (ASR) model for Vietnamese speech-to-text. It was fine-tuned on the Common Voice 7.0 Vietnamese dataset together with private datasets.

Model Features

Based on XLS-R architecture
Built on Facebook's pre-trained wav2vec2-xls-r-300m checkpoint, which provides the speech feature extraction layers.
Optimized for Vietnamese
Specially fine-tuned for Vietnamese, making it suitable for Vietnamese speech recognition tasks.
Trained on mixed datasets
Trained on the public Common Voice 7.0 dataset combined with private datasets, which may improve the model's generalization.

Model Capabilities

Vietnamese speech recognition
Automatic speech-to-text
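
A minimal sketch of how a fine-tuned wav2vec2 checkpoint like this one is typically used for transcription with the Hugging Face transformers pipeline. The Hub id "geninhu/xls-asr-vi-40h" is an assumption inferred from the developer and model name on this page, and the helper name transcribe is hypothetical. Decoding a file path also assumes ffmpeg is installed.

```python
# Assumed Hub id, inferred from the developer and model name; verify before use.
MODEL_ID = "geninhu/xls-asr-vi-40h"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Return the transcript of one audio file (hypothetical helper)."""
    # Imported lazily so that defining this function needs no heavy dependencies.
    from transformers import pipeline

    # The ASR pipeline loads the model and its processor, decodes the audio
    # file and resamples it to the rate the model expects.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]
```

Calling transcribe("sample_vi.wav") with a Vietnamese recording of your own would download the checkpoint on first use. The file name here is only a placeholder.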

Use Cases

Speech transcription
Vietnamese speech transcription
Converts Vietnamese speech content into text format
Reported WER of 56.57 (with language model) on the Common Voice 7.0 test set
Voice assistants
Vietnamese voice command recognition
Used for the voice command recognition module in Vietnamese voice assistants
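
For readers unfamiliar with the WER figure quoted above, the following sketch shows how word error rate is commonly computed: the word-level edit distance between the reference and the model output, divided by the number of reference words. This is a generic illustration, not the evaluation script used for this model, and the function name word_error_rate is hypothetical. It returns a fraction; if the 56.57 figure follows the usual convention of a percentage, it would correspond to roughly 0.57 here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One of four reference words is missing from the hypothesis.
print(word_error_rate("xin chào các bạn", "xin chào bạn"))  # 0.25
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.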