Wav2vec2 Large Xlsr 53 Th Cv8 Newmm

Developed by wannaphong
A Thai automatic speech recognition (ASR) model trained on the CommonVoice V8 dataset. It uses the wav2vec2-large-xlsr-53 architecture, the newmm tokenizer, and an integrated language model to improve Thai speech recognition accuracy.
Downloads 6,486
Release Time: 6/6/2022

Model Overview

This model is optimized for Thai speech recognition. It combines training on the CommonVoice V8 dataset with a language model to lower both Word Error Rate (WER) and Character Error Rate (CER); the reported WER on the CommonVoice V8 test set is 12.58% with newmm tokenization.

Model Features

Improved Dataset
Uses the CommonVoice V8 dataset, which contains more data than V7 and yields better training results.
Optimized Tokenization
Pre-tokenizes text with the newmm tokenizer, which is suited to Thai, a language written without spaces between words.
Language Model Integration
Incorporates a language model to further enhance recognition accuracy.
Multi-Metric Evaluation
Evaluates both Word Error Rate (WER) and Character Error Rate (CER) to comprehensively measure model performance.
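
The two metrics can be illustrated with a minimal sketch. This is not the author's evaluation script: it assumes the standard definitions (edit distance divided by reference length), and it assumes the text has already been split into words, for example by newmm. The function names and the hand-segmented sample sentence below are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference token missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis token)
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by the reference length."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(ref_words: Sequence[str], hyp_words: Sequence[str]) -> float:
    """Word Error Rate over pre-tokenized word lists."""
    return error_rate(ref_words, hyp_words)


def cer(ref_text: str, hyp_text: str) -> float:
    """Character Error Rate over Unicode code points.

    Thai vowel and tone marks are separate code points,
    so each one counts as a character here.
    """
    return error_rate(list(ref_text), list(hyp_text))


# Hand-segmented sample (assumption: a word tokenizer such as newmm
# would produce a similar split; it is not called here).
reference = ["ฉัน", "กิน", "ข้าว"]
hypothesis = ["ฉัน", "กิน", "ขาว"]  # tone mark dropped in the last word

print(f"WER = {wer(reference, hypothesis):.3f}")
print(f"CER = {cer(''.join(reference), ''.join(hypothesis)):.3f}")
```

Because Thai is written without spaces, WER depends on the word tokenizer used for both reference and hypothesis, which is why a WER figure is usually reported together with the tokenizer name. CER does not have this dependency.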

Model Capabilities

Thai Speech Recognition
Speech-to-Text
Multi-Metric Performance Evaluation

Use Cases

Speech Transcription
Thai Speech Transcription
Converts Thai speech content into text
Achieved 12.58% WER (newmm tokenizer) on the CommonVoice V8 test set.
Voice Assistants
Thai Voice Command Recognition
Used for Thai voice assistants or smart device command recognition