Wav2vec2 Large Xlsr 53 Hk

Developed by voidful
A Cantonese speech recognition model, fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice dataset
Downloads 26
Release Time: 3/2/2022

Model Overview

This is an automatic speech recognition model for Cantonese (Hong Kong), built on the Wav2Vec2 architecture. It converts Cantonese speech to text.

Model Features

Cantonese Optimization
Specially fine-tuned for the Cantonese (Hong Kong) dialect to improve recognition accuracy
Based on XLSR Model
Built on wav2vec2-large-xlsr-53, a Wav2Vec2 model pretrained on speech in 53 languages, which supplies the speech feature extraction
16kHz Sampling Rate Support
Expects speech input sampled at 16 kHz; audio recorded at other rates should be resampled first

Model Capabilities

Cantonese Speech Recognition
Speech-to-Text
Audio Content Transcription
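The capabilities above could be exercised roughly as follows. This is a sketch, not the author's published usage: it assumes the `torch`, `torchaudio`, and `transformers` packages are installed, and that the model is hosted on the Hugging Face Hub under the id `voidful/wav2vec2-large-xlsr-53-hk`, which is inferred from the developer and model name and not stated on this page. The function name `transcribe` is hypothetical.

```python
# Assumed Hub id, inferred from the developer and model name on this page.
MODEL_ID = "voidful/wav2vec2-large-xlsr-53-hk"


def transcribe(path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file using greedy CTC decoding."""
    # Heavy dependencies are imported lazily, only when the function is called.
    import torch
    import torchaudio
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    model.eval()

    waveform, sample_rate = torchaudio.load(path)  # shape: (channels, samples)
    waveform = waveform.mean(dim=0)                # downmix to mono
    if sample_rate != 16_000:
        # Resample to the 16 kHz rate the model expects.
        waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

    inputs = processor(
        waveform.numpy(), sampling_rate=16_000, return_tensors="pt", padding=True
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    predicted_ids = torch.argmax(logits, dim=-1)   # most likely token per frame
    return processor.batch_decode(predicted_ids)[0]
```

Calling `transcribe("meeting.wav")` on a hypothetical recording would download the model on first use and return the decoded text as a string.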

Use Cases

Speech Transcription
Cantonese Meeting Minutes
Automatically convert Cantonese meeting recordings into text transcripts
Reported character error rate (CER): 16.41
Media Content Subtitle Generation
Automatically generate subtitles for Cantonese video content
Voice Assistants
Cantonese Voice Command Recognition
Supports Cantonese voice control in smart devices
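The CER figure quoted under Speech Transcription is a character error rate. As a minimal sketch of how such a figure is commonly computed, the code below divides the total character-level edit distance by the total number of reference characters and expresses the result as a percentage. Whether the 16.41 on this page was computed exactly this way is an assumption, and the function names are hypothetical.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, ref_char in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hyp, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER in percent: total edits / total reference characters."""
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return 100.0 * total_edits / total_chars
```

For example, `cer(["我哋去食飯"], ["我地去食飯"])` gives 20.0, since one of the five reference characters was substituted.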