Fine-Tune-XLSR-Wav2Vec2-Speech2Text-Vietnamese Open-Source Model - Vietnamese Speech Recognition

Fine Tune XLSR Wav2Vec2 Speech2Text Vietnamese

Developed by leduytan93

This is a Vietnamese automatic speech recognition (ASR) model based on the XLSR Wav2Vec2 architecture, fine-tuned for Vietnamese speech recognition tasks.

Speech Recognition, Other
Open Source License: Apache-2.0
Tags: #Vietnamese speech recognition, #XLSR fine-tuning, #Low word error rate

Downloads 25

Release Time: 3/2/2022

Model Overview

This model performs Vietnamese automatic speech recognition, converting Vietnamese speech into text. It was fine-tuned on the Common Voice Vietnamese dataset and achieves a word error rate (WER) of 25.2%.
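
To make the WER figure concrete, the following is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer's output, divided by the number of reference words. The function name word_error_rate and the sample sentences are illustrative assumptions and are not taken from the model's evaluation code; the reported 25.2% may have been computed with different text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one word out of four loses its tone mark.
print(word_error_rate("xin chào các bạn", "xin chào cac bạn"))  # 0.25
```

Under this definition, a WER of 25.2% means roughly one word-level error for every four reference words.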

Model Features

Vietnamese speech recognition

Speech recognition capabilities optimized specifically for Vietnamese

Based on XLSR Wav2Vec2 architecture

Utilizes the XLSR Wav2Vec2 model architecture for speech recognition tasks

Fine-tuned on Common Voice

Fine-tuned using the Common Voice Vietnamese dataset

Model Capabilities

Vietnamese speech recognition

Speech-to-text

Use Cases

Speech transcription

Vietnamese speech transcription

Convert Vietnamese speech content into text

Word error rate 25.2%

Voice assistants

Vietnamese voice assistant

Used for building Vietnamese voice assistant systems

Property	Details
Language	Vietnamese
Datasets	common_voice; FOSD (https://data.mendeley.com/datasets/k9sxg2twv4/4)
Metrics	wer
Tags	language-modeling, audio, automatic-speech-recognition, speech, xlsr-fine-tuning-week
