
Phi 4 Multimodal Instruct Commonvoice Zh Tw

Developed by JacobLinCool
A Taiwanese Mandarin speech recognition model fine-tuned from microsoft/Phi-4-multimodal-instruct on the Taiwanese Mandarin (zh-TW) subset of the Common Voice 19.0 dataset
Downloads 28
Release Time: 3/13/2025

Model Overview

An automatic speech recognition model optimized for Taiwanese Mandarin (zh-TW), capable of converting Taiwanese Mandarin speech into Traditional Chinese text

Model Features

Taiwanese Mandarin Optimization
Specifically optimized for Taiwanese Mandarin speech patterns and vocabulary
Multimodal Capabilities
Based on a multimodal foundation model with the ability to process audio input
Efficient Fine-tuning
Uses LoRA adapters for efficient fine-tuning, preserving the base model's capabilities while optimizing speech recognition performance
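
To make the LoRA idea concrete, here is a minimal pure-Python sketch of a linear layer with a low-rank adapter. It is an illustration of the general technique only, not the training code used for this model; the class name LoRALinear, the rank and alpha values, and the initialization constants are all assumptions chosen for the example. The base weight stays frozen, and only the two small matrices A and B would be trained.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen weight W (out x in) plus a low-rank update scale * (B @ A).

    Hypothetical illustration class; names and defaults are assumptions.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight          # frozen base weights, never updated
        self.scale = alpha / rank     # usual LoRA scaling factor
        # A starts small and random, B starts at zero, so the adapter
        # contributes nothing until training changes B.
        self.A = [[rng.gauss(0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]

    def forward(self, x):
        base = matvec(self.weight, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Usage: an 8x8 identity layer with a rank-2 adapter.
identity = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]
layer = LoRALinear(identity, rank=2)
print(layer.forward([1.0] * 8))      # identical to the base layer before training
print(layer.trainable_params(), "trainable vs", 8 * 8, "frozen")
```

Because B is initialized to zero, the adapted layer reproduces the base layer exactly before any training step, which is one way to see why this style of fine-tuning starts from the base model's behavior.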

Model Capabilities

Taiwanese Mandarin speech recognition
Audio-to-text conversion
Automatic subtitle generation
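
As a rough illustration of the subtitle use case, the sketch below turns timed transcript segments into SubRip (SRT) text. The listing does not say whether the model itself emits timestamps, so the (start, end, text) segments are assumed to come from a separate, hypothetical step such as splitting the audio into chunks and transcribing each one. The function names are invented for this example.

```python
def format_timestamp(seconds):
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    ms_total = int(round(seconds * 1000))
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Build SRT text from an iterable of (start_sec, end_sec, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}"
        )
    # SRT cues are separated by a blank line.
    return "\n\n".join(blocks) + "\n"


# Usage with made-up segments (not real model output).
example = [(0.0, 2.5, "大家好"), (2.5, 5.0, "今天開會")]
print(to_srt(example))
```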

Use Cases

Speech-to-text
Meeting Minutes
Convert Taiwanese Mandarin meeting recordings into text transcripts
Character error rate (CER) 6.67%, word error rate (WER) 31.18%
Content Subtitles
Generate automatic subtitles for Taiwanese Mandarin video content
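
For readers unfamiliar with the CER and WER figures quoted above, the following sketch shows how these metrics are conventionally computed: the edit distance between reference and hypothesis, divided by the reference length, counted over characters for CER and over words for WER. The listing does not describe the evaluation script or how Chinese text was split into words, so the whitespace-based word splitting here is an assumption, and references are assumed to be non-empty.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate; spaces are ignored."""
    ref = list(reference.replace(" ", ""))
    hyp = list(hypothesis.replace(" ", ""))
    return edit_distance(ref, hyp) / len(ref)


def wer(reference, hypothesis):
    """Word error rate; words are assumed to be separated by whitespace."""
    ref = reference.split()
    hyp = hypothesis.split()
    return edit_distance(ref, hyp) / len(ref)


# Usage: one wrong character out of six, which also spoils one word out of three.
print(cer("今天 天氣 很好", "今天 天氣 真好"))
print(wer("今天 天氣 很好", "今天 天氣 真好"))
```

In this toy example a single character error costs one sixth at the character level but one third at the word level, which illustrates in general terms why WER tends to be higher than CER for the same transcript.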