Kotoba Whisper V1.0
Kotoba-Whisper is a collection of distilled Whisper models for Japanese automatic speech recognition (ASR), jointly developed by Asahi Ushio and Kotoba Technologies. It runs 6.3 times faster than the original Whisper large-v3 while maintaining a similarly low error rate.
Downloads 2,397
Release date: 4/14/2024
Model Overview
A Japanese automatic speech recognition model distilled from Whisper large-v3 and focused on Japanese speech transcription.
Model Features
Efficient inference
6.3 times faster than the original Whisper large-v3
High accuracy
Character error rate (CER) and word error rate (WER) on multiple Japanese test sets are close to or better than those of the original model
Japanese-focused optimization
Specially trained and optimized for Japanese speech characteristics
Long audio support
Supports both sequential and chunked algorithms for long audio transcription
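To make the chunked approach concrete: the audio is cut into fixed-length windows that overlap slightly, each window is transcribed on its own, and the pieces are merged afterwards. The sketch below only computes the window boundaries. The function name, the 15-second window, and the 2-second overlap are illustrative assumptions, not values taken from this page.

```python
def chunk_bounds(total_s: float, chunk_s: float = 15.0, overlap_s: float = 2.0):
    """Return (start, end) times in seconds for overlapping windows.

    chunk_s and overlap_s are assumed example values, not model settings.
    """
    if chunk_s <= 0 or not 0 <= overlap_s < chunk_s:
        raise ValueError("need chunk_s > 0 and 0 <= overlap_s < chunk_s")
    step = chunk_s - overlap_s  # how far each new window advances
    bounds = []
    start = 0.0
    while start < total_s:
        end = min(start + chunk_s, total_s)
        bounds.append((start, end))
        if end >= total_s:  # the last window reached the end of the audio
            break
        start += step
    return bounds


if __name__ == "__main__":
    for start, end in chunk_bounds(40.0):
        print(f"{start:5.1f}s .. {end:5.1f}s")
```

Because the windows are independent, they can be batched, which is typically why chunked transcription is faster; a sequential pass instead walks through the audio one window at a time.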
Model Capabilities
Japanese speech recognition
Short audio transcription
Long audio transcription
Timestamped transcription
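The capabilities above map onto a small number of call options when the model is run through the Hugging Face `transformers` ASR pipeline. This is a minimal sketch under assumptions: the model id `kotoba-tech/kotoba-whisper-v1.0`, the helper names, and the 15-second chunk length are not stated on this page, and `transformers` plus a backend such as PyTorch must be installed for `transcribe` to run.

```python
MODEL_ID = "kotoba-tech/kotoba-whisper-v1.0"  # assumed Hugging Face model id


def build_call_kwargs(long_audio: bool = False, timestamps: bool = False) -> dict:
    """Collect per-call options for the ASR pipeline (hypothetical helper)."""
    kwargs = {"generate_kwargs": {"language": "japanese", "task": "transcribe"}}
    if long_audio:
        kwargs["chunk_length_s"] = 15  # assumed window; enables chunked long-form mode
    if timestamps:
        kwargs["return_timestamps"] = True  # adds a "chunks" list with time spans
    return kwargs


def transcribe(audio_path: str, long_audio: bool = False, timestamps: bool = False) -> dict:
    """Transcribe one audio file; the model is downloaded on first use."""
    from transformers import pipeline  # imported lazily so the helper above has no dependency

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    return asr(audio_path, **build_call_kwargs(long_audio, timestamps))


if __name__ == "__main__":
    print(build_call_kwargs(long_audio=True, timestamps=True))
```

With `timestamps=True`, the pipeline result is expected to contain the full `text` plus a `chunks` list whose entries carry a `timestamp` pair and the text for that span.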
Use Cases
Speech transcription
Japanese meeting minutes
Automatically transcribe Japanese meeting recordings into text
Reported error rates on Japanese test sets: CER 9.4-12.2, WER 56.6-64.3
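For readers comparing the CER and WER figures above against their own recordings, both metrics are an edit distance divided by the reference length, counted over characters or over words. The sketch below is a plain-Python illustration, not the evaluation script behind the reported numbers. It assumes text normalization has already been applied and, for WER, that the Japanese text is already split into words by spaces; since Japanese is written without spaces, WER depends on the segmenter used.

```python
def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens) -> float:
    """Edit distance as a percentage of the reference length."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def cer(reference: str, hypothesis: str) -> float:
    return error_rate(list(reference), list(hypothesis))


def wer(reference: str, hypothesis: str) -> float:
    # Assumes words are already separated by spaces.
    return error_rate(reference.split(), hypothesis.split())


if __name__ == "__main__":
    print(cer("こんにちは", "こんばんは"))
    print(wer("今日 は 晴れ", "今日 は 雨"))
```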
Japanese podcast subtitle generation
Automatically generate subtitles for Japanese podcast content
Supports long audio transcription and can generate subtitles with timestamps
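Timestamped output can be turned into a subtitle file with a small formatter. The sketch below writes SubRip (SRT) text and assumes each segment is a dict with a `timestamp` pair of start and end seconds and a `text` string; that input shape and the function names are assumptions for illustration, and both timestamps are assumed to be present.

```python
def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(chunks) -> str:
    """Build SRT text from segments shaped like {"timestamp": (start, end), "text": ...}."""
    blocks = []
    for idx, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        text = chunk["text"].strip()
        blocks.append(f"{idx}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(blocks)  # a blank line separates consecutive cues


if __name__ == "__main__":
    demo = [
        {"timestamp": (0.0, 2.5), "text": " こんにちは"},
        {"timestamp": (2.5, 5.0), "text": " 今日のポッドキャストへようこそ"},
    ]
    print(to_srt(demo))
```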
Speech data annotation
Japanese speech dataset annotation
Used to assist in the annotation of Japanese speech datasets
Can serve as a pre-annotation tool to improve annotation efficiency
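One way to use the model as a pre-annotation tool is to store its draft transcripts in a manifest that human annotators then correct. The sketch below emits JSON Lines records; the field names (`draft_text`, `reviewed`, and so on) and the input shape are hypothetical choices for illustration, not a format defined by the model or by any annotation tool.

```python
import json


def to_preannotation_records(audio_id: str, chunks) -> list:
    """Turn timestamped segments into draft records awaiting human review."""
    records = []
    for i, chunk in enumerate(chunks):
        start, end = chunk["timestamp"]
        records.append({
            "audio_id": audio_id,
            "segment": i,
            "start_s": start,
            "end_s": end,
            "draft_text": chunk["text"].strip(),  # model output, not yet verified
            "reviewed": False,                    # flipped by the annotator
        })
    return records


def to_jsonl(records) -> str:
    """Serialize records one per line, keeping Japanese text readable."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


if __name__ == "__main__":
    demo = [{"timestamp": (0.0, 1.8), "text": " おはようございます"}]
    print(to_jsonl(to_preannotation_records("rec_001", demo)))
```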
© 2025 AIbase