Wav2vec2 Large Xlsr Cantonese

Developed by ctl
A Cantonese speech recognition model fine-tuned from Facebook's wav2vec2-large-xlsr-53 model. It expects audio input sampled at 16 kHz.
Downloads 1,010
Release Time: 3/2/2022

Model Overview

This is an Automatic Speech Recognition (ASR) model optimized for Cantonese, based on Facebook's wav2vec2-large-xlsr-53 architecture and fine-tuned using the Common Voice Cantonese dataset.

Model Features

Cantonese Optimization: Fine-tuned specifically on Cantonese speech to improve recognition accuracy.
No Language Model Required: Can be used directly, without an additional language model.
16kHz Sampling Rate Support: Accepts audio input sampled at the standard 16 kHz rate.
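
Running without a language model usually means the per-frame predictions are decoded greedily. The sketch below assumes the model has a CTC output head, as wav2vec2 fine-tunes commonly do; this page does not confirm it. The function name, the toy vocabulary, and the blank index are all hypothetical, chosen only to illustrate the collapse-then-drop-blanks step.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0):
    """Greedy CTC decoding: merge consecutive repeats, then drop blanks."""
    # groupby yields one key per run of identical consecutive ids.
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    return "".join(id_to_token[i] for i in collapsed if i != blank_id)


# Hypothetical toy vocabulary; the real model ships its own token table.
vocab = {0: "<pad>", 1: "你", 2: "好"}

# One predicted id per audio frame (the argmax over the model's logits).
frames = [0, 1, 1, 0, 0, 2, 2, 2, 0]
print(ctc_greedy_decode(frames, vocab))
```

A blank between two identical ids keeps them as separate characters, which is how a repeated character survives the collapse step.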

Model Capabilities

Cantonese Speech Recognition
Automatic Speech-to-Text
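
Because the model expects 16 kHz input, it can help to check a recording's sample rate before transcribing it. This is a minimal sketch using only Python's standard `wave` module. It assumes uncompressed WAV files, and the helper names are hypothetical. Actual resampling is left to whichever audio library you use.

```python
import wave

EXPECTED_RATE = 16_000  # sample rate stated for this model, in Hz


def wav_sample_rate(path):
    """Return the sample rate stored in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path, expected_rate=EXPECTED_RATE):
    """True when the file's rate differs from the rate the model expects."""
    return wav_sample_rate(path) != expected_rate
```

Files for which `needs_resampling` returns `True` should be resampled to 16 kHz before they are passed to the model.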

Use Cases

Speech Transcription
Cantonese Speech-to-Text: Converts Cantonese speech into text. Reported test character error rate (CER): 15.36%.
Voice Assistant
Cantonese Voice Interaction: Provides voice interaction for Cantonese users.
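
The CER figure quoted above is a character error rate: the character-level edit distance between the model's output and the reference transcript, divided by the reference length. The sketch below shows that calculation in plain Python. It is not the evaluation script behind the reported number, which may normalize text (for example, by stripping punctuation) in ways this page does not describe. The example sentences are made up.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_char in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hyp, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Made-up example: one substituted character out of five.
print(cer("今日天氣好", "今日天汽好"))
```

Over a test set, CER is normally computed from the total edit distance and the total reference length, not by averaging per-sentence rates.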