# Short audio processing
Whisper Large V3 Broad Accent
BSD-3-Clause
An English broad-accent classification model based on Whisper-Large-v3 that recognizes accents from the British Isles, North America, and three other categories of English accents.
Audio Classification
Safetensors English
tiantiaf
156
1
Gemma 3 4b It Speech
Gemma-3-MM is a multimodal instruction model that extends Gemma-3-4b-it with speech processing, accepting text, image, and audio inputs and generating text outputs.
Audio-to-Text
Transformers

junnei
383
12
Teochew Whisper Medium
MIT
A Teochew (Chaozhou dialect) speech recognition model fine-tuned from the Whisper medium model, designed to recognize the Teochew dialect of the Min Nan language family in southern China.
Speech Recognition
Transformers

efficient-nlp
194
31
Wavlm Basic S F O 8batch 10sec 0.0001lr Unfrozen
A speech processing model fine-tuned from microsoft/wavlm-large, achieving 80% accuracy and an F1 score of 79.57% on the evaluation set.
Audio Classification
Transformers

reralle
14
0
Wavlm Basic S R 5c 8batch 5sec 0.0001lr Unfrozen
A speech processing model fine-tuned from microsoft/wavlm-large, achieving 75% accuracy on the evaluation set.
Audio Classification
Transformers

reralle
16
0
Wavlm Basic N F N 8batch 5sec 0.0001lr Unfrozen
A speech processing model fine-tuned from microsoft/wavlm-large, achieving 73.33% accuracy on the evaluation set.
Audio Classification
Transformers

reralle
14
0