Wav2Vec2 XLS-R 300M zh-HK v2

Developed by w11wo
Cantonese automatic speech recognition model based on the XLS-R architecture and fine-tuned on the Common Voice Cantonese dataset
Downloads 23
Release Time: 3/2/2022

Model Overview

This is an automatic speech recognition model for Cantonese (zh-HK), fine-tuned from Facebook's Wav2Vec2-XLS-R-300M model and intended for Cantonese speech-to-text tasks.

Model Features

Cantonese Optimization
Specially optimized for Cantonese speech recognition
Large-scale Pretraining
Built on the 300M-parameter XLS-R architecture, which provides strong speech feature extraction
Multi-dataset Validation
Evaluated on multiple datasets including Common Voice and Robust Speech Challenge

Model Capabilities

Cantonese speech recognition
Speech-to-text
Automatic speech recognition
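
The capabilities above could be exercised with a short script along the following lines. This is a minimal sketch, not the developer's documented usage: it assumes the model is published on the Hugging Face Hub under the id `w11wo/wav2vec2-xls-r-300m-zh-HK-v2` (inferred from the developer and model name on this page), that the `transformers` library and `ffmpeg` are installed, and that `sample.wav` is a hypothetical local recording.

```python
# Assumed Hub id, inferred from the developer and model name; verify before use.
MODEL_ID = "w11wo/wav2vec2-xls-r-300m-zh-HK-v2"


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the transcript of one audio file as a string."""
    # Imported lazily so the module loads even without transformers installed.
    from transformers import pipeline

    # The ASR pipeline decodes the file and resamples it to the rate
    # the model's feature extractor expects.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr(audio_path)["text"]


# Usage (hypothetical file name):
# text = transcribe("sample.wav")
# print(text)
```

For transcribing many files, building the pipeline once and reusing it would avoid reloading the model on every call.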

Use Cases

Speech Transcription
Cantonese Speech Transcription
Convert Cantonese speech content into text
Character error rate (CER) of 23.02% on the Common Voice 8 test set
Voice Assistants
Cantonese Voice Command Recognition
Recognizes spoken commands for Cantonese voice assistants or smart home devices
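
The CER figure quoted for Common Voice 8 is a character-level edit distance divided by the reference length. The sketch below shows that calculation for a single reference and hypothesis pair. It is illustrative only: it applies no text normalization (punctuation or whitespace handling is left out), and the evaluation script actually used for this model may differ. The function names are chosen here for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance over the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# One substituted character out of six gives a CER of about 0.167.
score = cer("今日天氣好好", "今日天氣很好")
```

A corpus-level CER is usually computed by summing edit distances and reference lengths over all utterances before dividing, rather than averaging per-utterance scores.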