Whisper Finetune Teochew

Developed by panlr
A Teochew (Chaoshan) speech recognition model fine-tuned from Whisper-medium that transcribes speech in several regional accents into standard orthography
Downloads 20
Release Time: 3/17/2025

Model Overview

This model is designed for automatic speech recognition of the Teochew (Chaoshan) dialect. Its transcripts use Dai Kan orthography, an annotation scheme chosen to avoid homophone ambiguity.

Model Features

Multi-dialect support
Covers various accents including Teochew prefectural city, Shantou urban area, southern Chao'an, Chenghai, and Rongjiang pronunciations
Dai Kan orthography
Employs an innovative annotation scheme to resolve homophone ambiguity (e.g., using 【介】 instead of the easily confused 【个】)
Field recording data
Trained on 18.9 hours of real-world recordings containing 12,500 annotated samples

Model Capabilities

Teochew speech-to-text
Multi-accent recognition
Orthographic transcription
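
The capabilities above could be exercised with a sketch like the following, which uses the Hugging Face transformers speech recognition pipeline. The repository ID panlr/whisper-finetune-teochew is an assumption inferred from the developer and model names on this page and is not confirmed here; substitute the actual ID or a local checkpoint path.

```python
# ASSUMPTION: this repository ID is inferred from the page, not confirmed by it.
MODEL_ID = "panlr/whisper-finetune-teochew"


def transcribe(audio_path, model_id=MODEL_ID):
    """Return the transcript of one audio file as a string."""
    # Imported lazily so the module loads even without transformers installed.
    from transformers import pipeline

    # Build a speech recognition pipeline around the fine-tuned checkpoint.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    # The pipeline returns a dict whose "text" field holds the transcript.
    result = asr(audio_path)
    return result["text"]
```

Calling transcribe("recording.wav"), where the file name is hypothetical, would download the checkpoint on first use and return the transcribed text. For many files it would be cheaper to build the pipeline once and reuse it.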

Use Cases

Dialect preservation
Teochew speech archiving
Converting orally transmitted Teochew recordings into standardized written records
CER 12.254% (test set)
Voice interaction
Dialect voice assistant
Supporting Teochew voice input for smart device interaction
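
For readers unfamiliar with the CER figure reported above, the following is a minimal sketch of the common definition of character error rate: total character-level edit distance divided by total reference characters. This page does not say how the 12.254% was computed, so the corpus-level pooling and the absence of text normalization here are assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: summed edits over summed reference length."""
    total_edits = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars
```

Because the metric counts characters rather than words, it suits scripts without word boundaries, where a word error rate would depend on the segmentation used.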