Whisper Large V2 Mix Jp
An automatic speech recognition (ASR) model based on OpenAI Whisper-large-v2 and fine-tuned on Japanese speech datasets
Downloads 93
Release Time: 12/19/2022
Model Overview
This model is a version of Whisper-large-v2 fine-tuned specifically for Japanese speech recognition. On its test sets it reports a Word Error Rate (WER) of 7.65% and a Character Error Rate (CER) of 4.72%.
Model Features
Japanese Optimization
Specifically fine-tuned on JSUT, JSSS, CSS10, and Common Voice Japanese datasets to optimize Japanese speech recognition performance
Low Error Rate
Achieves a Word Error Rate (WER) of 7.65% and a Character Error Rate (CER) of 4.72% on test sets
Efficient Training
Utilizes mixed-precision training and gradient accumulation techniques to optimize training efficiency
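The WER and CER figures listed under Low Error Rate are both edit-distance ratios: the number of substitutions, deletions, and insertions needed to turn the model output into the reference, divided by the reference length. The sketch below shows that calculation in plain Python. It is an illustration only: the source does not say which evaluation script, text normalization, or word segmenter produced the reported numbers. Because Japanese text has no whitespace between words, WER depends on the segmenter used, so the example assumes the words are already split. The function names are hypothetical.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by the reference length."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(ref_words: Sequence[str], hyp_words: Sequence[str]) -> float:
    """Word error rate over pre-segmented words (segmenter is an assumption)."""
    return error_rate(ref_words, hyp_words)


def cer(ref_text: str, hyp_text: str) -> float:
    """Character error rate over raw characters."""
    return error_rate(list(ref_text), list(hyp_text))


if __name__ == "__main__":
    # Made-up sentences for illustration, not taken from any test set.
    ref_words = ["今日", "は", "晴れ", "です"]
    hyp_words = ["今日", "は", "雨", "です"]
    print(f"WER: {wer(ref_words, hyp_words):.4f}")
    print(f"CER: {cer(''.join(ref_words), ''.join(hyp_words)):.4f}")
```

CER is often preferred for Japanese precisely because it does not depend on a word segmenter, which may be why the listing reports both metrics.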
Model Capabilities
Japanese speech-to-text
High-accuracy speech recognition
Long audio processing
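Whisper models take 16 kHz audio in windows of up to 30 seconds, so long recordings are usually handled by splitting them into overlapping windows, transcribing each window, and merging the results. The source does not describe how this model's long audio processing is implemented, so the following is only a sketch of the windowing step. The 5-second overlap and the function name are assumptions chosen for illustration.

```python
from typing import List, Tuple


def chunk_windows(
    num_samples: int,
    sample_rate: int = 16_000,  # Whisper expects 16 kHz input
    chunk_s: float = 30.0,      # Whisper's input window length
    overlap_s: float = 5.0,     # assumed overlap between neighbouring windows
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of overlapping windows covering the audio."""
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    chunk = int(chunk_s * sample_rate)
    step = chunk - int(overlap_s * sample_rate)
    if chunk <= 0 or step <= 0:
        raise ValueError("chunk_s must be positive and larger than overlap_s")

    windows: List[Tuple[int, int]] = []
    start = 0
    while start < num_samples:
        end = min(start + chunk, num_samples)
        windows.append((start, end))
        if end == num_samples:  # last window reached the end of the audio
            break
        start += step
    return windows


if __name__ == "__main__":
    # A 70-second recording at 16 kHz.
    for start, end in chunk_windows(70 * 16_000):
        print(f"{start / 16_000:6.1f}s -> {end / 16_000:6.1f}s")
```

The overlap gives the merging step some shared context at each boundary, so a word cut in half by one window is seen whole in the next. Libraries such as Hugging Face Transformers expose comparable behaviour through the `chunk_length_s` option of the speech recognition pipeline.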
Use Cases
Speech Transcription
Japanese Meeting Minutes
Automatically convert Japanese meeting recordings into text transcripts
Word-level accuracy of approximately 92.35% (computed as 1 - WER from the reported 7.65% test-set WER)
Japanese Media Subtitle Generation
Automatically generate subtitles for Japanese video content
Voice Assistants
Japanese Voice Command Recognition
Used for voice command understanding in Japanese voice assistant systems
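For the subtitle generation use case listed above, the transcription output has to be converted into a subtitle file. The sketch below formats timestamped segments as SubRip (SRT) text. It assumes the recognizer has already produced (start seconds, end seconds, text) tuples; that segment shape, the sample sentences, and the function names are hypothetical and are not taken from the model's documentation.

```python
from typing import Iterable, Tuple

Segment = Tuple[float, float, str]  # (start seconds, end seconds, text)


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as SRT cues: index, time range, text, separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Made-up segments for illustration.
    demo = [
        (0.0, 2.5, "こんにちは"),
        (2.5, 4.0, "ありがとうございます"),
    ]
    print(to_srt(demo))
```

Writing the result to a file with UTF-8 encoding keeps the Japanese text intact in most video players.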