Wav2vec2 Large Xlsr 53 Cantonese

Developed by CAiRE
A Cantonese speech recognition model, fine-tuned from facebook/wav2vec2-large-xlsr-53 on version 8.0 of the Common Voice corpus
Downloads 1,214
Release Time: 4/9/2022

Model Overview

This automatic speech recognition (ASR) model is built on the Wav2Vec2 architecture and fine-tuned for Cantonese, making it suitable for Cantonese speech-to-text tasks.

Model Features

Cantonese Optimization
Fine-tuned on Cantonese speech to improve recognition accuracy on Cantonese audio
Based on Wav2Vec2 Architecture
Built on Wav2Vec2-Large-XLSR-53, a cross-lingual model pretrained on speech from 53 languages, which supplies the speech feature extraction
No Language Model Required
Transcribes audio directly, with no external language model needed for decoding
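
Decoding without a language model usually means greedy CTC decoding: take the most likely token for each audio frame, collapse consecutive repeats, and drop the blank token. The sketch below illustrates that step in plain Python. It assumes the model emits frame-level CTC predictions, as Wav2Vec2 fine-tunes typically do. The toy vocabulary and the blank ID of 0 are hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence


def ctc_greedy_decode(
    frame_ids: Sequence[int],
    id_to_char: Dict[int, str],
    blank_id: int = 0,  # hypothetical blank/pad ID, not taken from the model
) -> str:
    """Collapse repeated frame predictions and remove blanks."""
    chars = []
    previous = None
    for token_id in frame_ids:
        # Emit a character only when the prediction changes and is not blank.
        if token_id != previous and token_id != blank_id:
            chars.append(id_to_char[token_id])
        previous = token_id
    return "".join(chars)


# Toy vocabulary and per-frame argmax IDs (illustrative values only).
toy_vocab = {0: "<pad>", 1: "你", 2: "好"}
toy_frames = [0, 1, 1, 0, 0, 2, 2, 2, 0]
print(ctc_greedy_decode(toy_frames, toy_vocab))  # 你好
```

A blank between two identical IDs keeps both characters, which is how CTC represents a genuinely repeated character.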

Model Capabilities

Cantonese speech recognition
Audio-to-text conversion
Automatic speech transcription
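
A minimal sketch of audio-to-text conversion with the Hugging Face transformers library follows. The Hub ID CAiRE/wav2vec2-large-xlsr-53-cantonese is an assumption inferred from the developer and model name above, and loading audio with librosa is one choice among several. Wav2Vec2 models expect 16 kHz mono input, so the audio is resampled on load.

```python
# Assumed Hub ID, inferred from the developer and model name; verify before use.
MODEL_ID = "CAiRE/wav2vec2-large-xlsr-53-cantonese"
SAMPLE_RATE = 16_000  # Wav2Vec2 models expect 16 kHz audio


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file with greedy decoding (no language model)."""
    # Heavy dependencies are imported here so the module loads without them.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    # Load as mono and resample to the rate the model expects.
    speech, _ = librosa.load(audio_path, sr=SAMPLE_RATE, mono=True)
    inputs = processor(
        speech, sampling_rate=SAMPLE_RATE, return_tensors="pt", padding=True
    )

    with torch.no_grad():
        logits = model(**inputs).logits

    # Pick the most likely token per frame; the processor collapses CTC output.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Calling transcribe("meeting.wav") on a local recording (the file name is hypothetical) would return the transcript as a string. For long recordings, splitting the audio into shorter segments first keeps memory use manageable.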

Use Cases

Speech Transcription
Cantonese Meeting Minutes
Automatically transcribe Cantonese meeting recordings into text records
Character error rate (CER): 18.55%
Cantonese Media Subtitle Generation
Automatically generate subtitles for Cantonese video content
Voice Assistants
Cantonese Voice Command Recognition
Recognizes spoken commands in Cantonese voice assistant systems
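
The character error rate quoted above is conventionally computed as the character-level edit distance between the reference transcript and the model output, divided by the reference length. A CER of 18.55% therefore corresponds to roughly 19 character edits per 100 reference characters. The sketch below shows that calculation. The example strings are made up, and this is not necessarily the exact evaluation script behind the reported figure.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance counted in characters."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (0 if ref_char == hyp_char else 1)
            deletion = previous[j] + 1      # reference character is missing
            insertion = current[j - 1] + 1  # extra character in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of reference characters."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Made-up example: one of six reference characters is dropped.
print(round(character_error_rate("今日天氣好好", "今日天氣好"), 4))  # 0.1667
```

Character-level scoring suits Cantonese because written Chinese has no spaces between words, so word error rate would depend on an extra segmentation step.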