Wav2vec2 Large Xlsr 53 Th Cv8 Newmm
This is a Thai automatic speech recognition model based on wav2vec2-large-xlsr-53 and trained on the CommonVoice V8 dataset. It uses the newmm tokenizer for pre-tokenization and an integrated language model to improve recognition accuracy.
Downloads 6,486
Release Time: 6/6/2022
Model Overview
This model is optimized for Thai speech recognition. It combines training on the CommonVoice V8 dataset with a language model to lower both Word Error Rate (WER) and Character Error Rate (CER).
Model Features
Improved Dataset
Uses the CommonVoice V8 dataset, which is larger than V7 and gives better training results.
Optimized Tokenization
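Pre-tokenizes text with the newmm tokenizer, which segments Thai into words. Thai is written without spaces between words, so word boundaries must be determined before word-level training and scoring.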
Language Model Integration
Incorporates a language model to further enhance recognition accuracy.
Multi-Metric Evaluation
The model is evaluated on both Word Error Rate (WER) and Character Error Rate (CER), so performance is measured at the word level and the character level.
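The two metrics can be illustrated with a minimal sketch. It assumes the standard definitions: the edit distance (substitutions + deletions + insertions) between reference and hypothesis, divided by the reference length, counted over words for WER and over characters for CER. The function names `edit_distance`, `wer`, and `cer` are illustrative and are not taken from the model's own evaluation script. Words are passed in already tokenized, and characters are counted as Unicode code points, which is an assumption of this sketch.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference item missing from hypothesis)
                curr[j - 1] + 1,     # insertion (extra item in hypothesis)
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(ref_words: Sequence[str], hyp_words: Sequence[str]) -> float:
    """Word Error Rate over pre-tokenized word lists."""
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


def cer(ref_text: str, hyp_text: str) -> float:
    """Character Error Rate over raw strings (one unit per code point)."""
    if not ref_text:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_text, hyp_text) / len(ref_text)


if __name__ == "__main__":
    # Hypothetical example: the hypothesis drops one tone mark in the last word.
    ref_words = ["ฉัน", "กิน", "ข้าว"]
    hyp_words = ["ฉัน", "กิน", "ขาว"]
    print("WER:", wer(ref_words, hyp_words))
    print("CER:", cer("".join(ref_words), "".join(hyp_words)))
```

Because WER depends on where word boundaries fall, the choice of tokenizer changes the score. In practice the word lists could come from PyThaiNLP's `word_tokenize(text, engine="newmm")`, which is left out here to keep the sketch dependency-free.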
Model Capabilities
Thai Speech Recognition
Speech-to-Text
Multi-Metric Performance Evaluation
Use Cases
Speech Transcription
Thai Speech Transcription
Converts Thai speech content into text
Achieved 12.58% WER (newmm tokenizer) on the CommonVoice V8 test set.
Voice Assistants
Thai Voice Command Recognition
Used for Thai voice assistants or smart device command recognition
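For the transcription use case, a minimal sketch of how such a model might be called through the Hugging Face `transformers` ASR pipeline is shown here. It assumes the model is published on the Hugging Face Hub; the repository ID is not stated on this page, so `model_id` is left as a parameter the caller must supply. The helper name `transcribe` is illustrative. Whether the pipeline applies the bundled language model during decoding depends on the repository contents and on optional packages (`pyctcdecode`, `kenlm`) being installed, which is an assumption and not something confirmed here.

```python
def transcribe(audio_path: str, model_id: str) -> str:
    """Transcribe one audio file with a Hugging Face ASR pipeline.

    model_id is the Hub repository ID of the model (not given on this page).
    """
    # Imported lazily so this module loads even where transformers is absent.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # Passing a file path requires ffmpeg to decode the audio.
    result = asr(audio_path)
    return result["text"]
```

A call would look like `transcribe("command.wav", model_id)`, where `command.wav` is a hypothetical recording of Thai speech.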