Kotoba Whisper V2.0
Kotoba-Whisper is a distilled Japanese automatic speech recognition (ASR) model developed by Asahi Ushio in collaboration with Kotoba Technologies. It is distilled from Whisper large-v3 and runs inference 6.3x faster than that model.
Downloads 8,108
Release Time: 9/17/2024
Model Overview
A Japanese automatic speech recognition model produced by knowledge distillation from Whisper large-v3. Distillation makes inference significantly faster while keeping error rates comparable to the original model.
Model Features
Efficient inference
6.3x faster inference than the original Whisper large-v3
High performance
Lower character error rate (CER) and word error rate (WER) than the original Whisper large-v3 on Japanese datasets such as ReazonSpeech
Large-scale training
Trained on over 7.2 million Japanese speech-text pairs
Model Capabilities
Japanese speech-to-text
Long audio segmentation processing
Supports Flash Attention 2 acceleration
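The three capabilities above can be sketched with the Hugging Face `transformers` speech recognition pipeline. This is a minimal sketch under assumptions, not the authors' reference usage: the model ID `kotoba-tech/kotoba-whisper-v2.0`, the 15-second chunk length, the batch size, and the helper names `build_pipeline_config` and `transcribe` are all illustrative and do not come from this page. Flash Attention 2 also requires a supported GPU and the separate `flash-attn` package.

```python
from typing import Any, Dict

# Assumed Hugging Face model ID; this page does not state it.
MODEL_ID = "kotoba-tech/kotoba-whisper-v2.0"


def build_pipeline_config(
    has_gpu: bool,
    use_flash_attention_2: bool = False,
    chunk_length_s: int = 15,  # assumed chunk length for long audio
    batch_size: int = 8,       # assumed batch size
) -> Dict[str, Any]:
    """Collect settings for Japanese transcription, chunking, and attention."""
    model_kwargs: Dict[str, Any] = {}
    if has_gpu:
        # Use Flash Attention 2 only when requested; otherwise fall back to
        # PyTorch's built-in scaled dot-product attention.
        model_kwargs["attn_implementation"] = (
            "flash_attention_2" if use_flash_attention_2 else "sdpa"
        )
    return {
        "model": MODEL_ID,
        "device": "cuda:0" if has_gpu else "cpu",
        "model_kwargs": model_kwargs,
        # Long audio is split into fixed-length chunks transcribed in batches.
        "chunk_length_s": chunk_length_s,
        "batch_size": batch_size,
        # Whisper decoding options: force Japanese transcription.
        "generate_kwargs": {"language": "ja", "task": "transcribe"},
    }


def transcribe(audio_path: str, use_flash_attention_2: bool = False) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so build_pipeline_config works without torch installed.
    import torch
    from transformers import pipeline

    has_gpu = torch.cuda.is_available()
    cfg = build_pipeline_config(has_gpu, use_flash_attention_2)
    asr = pipeline(
        "automatic-speech-recognition",
        model=cfg["model"],
        torch_dtype=torch.bfloat16 if has_gpu else torch.float32,
        device=cfg["device"],
        model_kwargs=cfg["model_kwargs"],
    )
    result = asr(
        audio_path,
        chunk_length_s=cfg["chunk_length_s"],
        batch_size=cfg["batch_size"],
        generate_kwargs=cfg["generate_kwargs"],
    )
    return result["text"]
```

Calling `transcribe("sample.wav")` with a hypothetical local file downloads the model on first use and needs `ffmpeg` available to decode the audio.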
Use Cases
Speech transcription
TV program subtitle generation
Processes audio from Japanese TV programs to generate accurate subtitles
CER 11.6 / WER 55.6 on the ReazonSpeech test set
Voice assistant
Provides fast and accurate speech recognition for Japanese voice assistants
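The CER and WER figures quoted for ReazonSpeech are edit-distance metrics: the number of insertions, deletions, and substitutions needed to turn the model output into the reference, divided by the reference length. The sketch below shows that calculation as a percentage. It is an illustration, not the evaluation script behind the reported scores: it applies no text normalization, and the function names are hypothetical. Because Japanese text has no spaces, WER depends on how the text is segmented into words, so `error_rate` takes pre-tokenized lists.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(ref_tokens: Sequence[str], hyp_tokens: Sequence[str]) -> float:
    """Error rate in percent, relative to the reference length."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: every character is one token."""
    return error_rate(list(reference), list(hypothesis))


# One substitution and one deletion against a 7-character reference.
print(round(cer("今日は晴れです", "今日は雨です"), 1))  # 28.6

# WER over a hand-segmented example: one wrong word out of four.
print(error_rate(["今日", "は", "晴れ", "です"], ["今日", "は", "雨", "です"]))  # 25.0
```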