WavLM-VLSP-vi Open-source Vietnamese Automatic Speech Recognition Model

Developed by phongdtd

A Vietnamese automatic speech recognition model, fine-tuned from microsoft/wavlm-base-plus on the PHONGDTD/VINDATAVLSP - NA dataset

Downloads 21

Release Time: 3/2/2022

Model Overview

This model targets Vietnamese automatic speech recognition (ASR) and is fine-tuned from a pretrained WavLM model

Vietnamese optimization

Specifically fine-tuned for Vietnamese speech recognition tasks

Based on WavLM architecture

Uses Microsoft's WavLM-base-plus, a self-supervised pretrained speech representation model, as the base model

Multi-GPU training

Trained with distributed multi-GPU training to shorten training time

Vietnamese speech-to-text

Continuous speech recognition

Speech transcription

Vietnamese meeting minutes

Convert Vietnamese meeting recordings into text transcripts

Voice assistant

Provide speech recognition capabilities for Vietnamese voice assistants
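For the use cases above, transcription with the Hugging Face transformers library might look like the sketch below. It is a minimal illustration, not the author's documented usage. The Hub ID in MODEL_ID is an assumption inferred from the developer and model names on this page, and the transcribe helper is a hypothetical name introduced here.

```python
# Assumed Hub ID, inferred from the developer and model names on this page.
# Verify it before use.
MODEL_ID = "phongdtd/wavLM-VLSP-vi"


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the transcript of one audio file as a string."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # Given a file path, the pipeline decodes the audio (ffmpeg must be
    # installed) and resamples it to the rate the feature extractor expects.
    result = asr(audio_path)
    return result["text"]
```

Calling transcribe("meeting.wav"), where the path is a placeholder, would download the checkpoint on first use and return the recognized text. Output quality depends on the checkpoint, so check the reported WER and CER before relying on it.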

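Training Results

Training Loss	Epoch	Step	Validation Loss	WER	CER
3.4482	9.41	40000	3.4480	0.9999	0.9974
3.4619	18.81	80000	3.4514	0.9999	0.9974
3.7961	28.22	120000	3.8732	0.9999	0.9974
24.3843	37.62	160000	22.5457	0.9999	0.9973
48.5691	47.03	200000	45.8892	0.9999	0.9973

WER and CER stay close to 1.0 at every checkpoint, and both losses rise sharply after step 120000, which suggests this training run did not converge.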
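The last two columns of the table above are word error rate and character error rate. Both are the edit distance between the reference transcript and the model output, divided by the reference length, so a value near 1.0 means almost nothing was recognized correctly. The sketch below shows the standard definition in plain Python. It is not the evaluation script used for this model, and counting spaces as characters in CER is an assumption, since conventions vary.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitute/insert/delete)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate; spaces count as characters (an assumption)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# One of four reference words is missing from the hypothesis.
print(wer("xin chào các bạn", "xin chào bạn"))  # 0.25
```

An empty hypothesis gives a WER of exactly 1.0, which is roughly the situation the table reports.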

Property	Details
Model Type	Fine-tuned version of microsoft/wavlm-base-plus on the PHONGDTD/VINDATAVLSP - NA dataset
Training Data	More information needed
