Wav2vec2 Large Xlsr 53 Tw Gpt

Developed by voidful
A speech recognition model for Taiwan Mandarin (zh-TW), fine-tuned from facebook/wav2vec2-large-xlsr-53. It expects audio input sampled at 16 kHz.
Downloads 47
Release Time: 3/2/2022

Model Overview

This is an automatic speech recognition (ASR) model optimized for Taiwan Mandarin. It is fine-tuned from Facebook's wav2vec2-large-xlsr-53 model on the Common Voice zh-TW dataset.

Model Features

Taiwan Mandarin Optimization
Specifically fine-tuned for the phonetic characteristics of Taiwan Mandarin
Language Model Fusion Support
Can be combined with language models like GPT or BERT to improve recognition accuracy
Efficient Inference
Achieves a character error rate (CER) of 18.36% on the Common Voice test set when decoded with GPT + beam search, with relatively fast inference speed
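
For readers unfamiliar with the metric quoted above, CER is usually the character-level edit distance between the reference transcript and the model output, divided by the number of reference characters. The sketch below is a minimal pure-Python version under that assumption. The function names edit_distance and cer are illustrative, and the page does not say which scoring tool or text normalization produced the 18.36% figure.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute (or match when cost is 0)
            ))
        prev = curr
    return prev[-1]


def cer(references, hypotheses) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return total_edits / total_chars


if __name__ == "__main__":
    # One wrong character out of six in a made-up example sentence.
    print(cer(["今天天氣很好"], ["今天天汽很好"]))
```

Because Mandarin text is not whitespace-delimited, scoring per character avoids the need for a word segmenter, which is likely why the page reports CER rather than a word error rate.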

Model Capabilities

Taiwan Mandarin speech recognition
Processes audio sampled at 16 kHz
Can be combined with language models
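
Since the model expects 16 kHz input, recordings captured at other rates (44.1 kHz or 48 kHz are common) need to be resampled first. The sketch below shows the idea with plain linear interpolation in pure Python. The function name resample_linear is hypothetical, and the method is a simplification: it applies no anti-aliasing filter, so a dedicated audio resampler (for example torchaudio.transforms.Resample) would be a better choice for real transcription work.

```python
def resample_linear(samples, src_rate, dst_rate=16000):
    """Resample a mono signal by linear interpolation (no anti-aliasing)."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if src_rate == dst_rate:
        return list(samples)

    n_out = int(len(samples) * dst_rate / src_rate)
    step = src_rate / dst_rate  # source samples advanced per output sample
    last = len(samples) - 1
    out = []
    for k in range(n_out):
        pos = k * step
        i = min(int(pos), last)       # index of the left neighbour
        frac = pos - i                # distance past the left neighbour
        right = samples[min(i + 1, last)]
        out.append(samples[i] * (1.0 - frac) + right * frac)
    return out


if __name__ == "__main__":
    one_second_48k = [0.0] * 48000
    print(len(resample_linear(one_second_48k, 48000)))
```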

Use Cases

Speech to Text
Taiwan Mandarin Speech Transcription
Convert Taiwan Mandarin speech content into text
CER 18.36% (evaluated using GPT+beam search)
Voice Assistant
Taiwan Region Voice Command Recognition
Used to recognize voice commands in Taiwan Mandarin
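
The "GPT+beam search" evaluation mentioned above suggests that candidate transcripts from beam search are re-ranked with a language model. The page does not describe the exact fusion method, so the following is only a sketch of one common approach: n-best rescoring, where each candidate's acoustic log-probability is combined with a weighted language-model log-probability. The names rescore, make_toy_lm and lm_weight, the scores, and the toy bigram scorer are all hypothetical stand-ins for a real GPT-style model.

```python
import math


def make_toy_lm(known_text: str):
    """Build a stand-in LM that favours character bigrams seen in known_text.

    Hypothetical: a real setup would return a GPT log-probability instead.
    """
    known = {known_text[i:i + 2] for i in range(len(known_text) - 1)}

    def lm_score(text: str) -> float:
        score = 0.0
        for i in range(len(text) - 1):
            # Seen bigrams get a mild penalty, unseen ones a heavy penalty.
            score += math.log(0.5) if text[i:i + 2] in known else math.log(0.01)
        return score

    return lm_score


def rescore(hypotheses, lm_score, lm_weight=0.5):
    """Return the best text among (text, acoustic_log_prob) candidates."""
    best_text, _ = max(
        ((text, am + lm_weight * lm_score(text)) for text, am in hypotheses),
        key=lambda pair: pair[1],
    )
    return best_text


if __name__ == "__main__":
    lm = make_toy_lm("今天天氣很好")
    # Made-up beam output: the homophone error scores slightly higher acoustically.
    beams = [("今天天汽很好", -4.0), ("今天天氣很好", -4.2)]
    print(rescore(beams, lm))
```

The weight lm_weight trades off acoustic evidence against linguistic plausibility. With a weight of zero the acoustically preferred candidate wins regardless of how implausible the text is.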