Kotoba Whisper V2.0

Developed by kotoba-tech
Kotoba-Whisper is a distilled Japanese automatic speech recognition (ASR) model developed by Asahi Ushio in collaboration with Kotoba Technologies. It is distilled from Whisper large-v3 and runs inference 6.3x faster than that model.
Downloads 8,108
Release Time: 9/17/2024

Model Overview

A Japanese automatic speech recognition model distilled from Whisper large-v3. Knowledge distillation significantly improves inference speed while keeping error rates comparable to the original model.

Model Features

Efficient inference
6.3x faster inference than the original Whisper large-v3
High performance
Lower CER/WER than the original model on Japanese datasets such as ReazonSpeech
Large-scale training
Trained on over 7.2 million Japanese speech-text pairs

Model Capabilities

Japanese speech-to-text
Chunked processing of long audio
Flash Attention 2 acceleration
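
The capabilities listed above can be sketched with the Hugging Face `transformers` ASR pipeline. This is a minimal sketch, not the authors' reference code: the repository id `kotoba-tech/kotoba-whisper-v2.0` is assumed from the developer and model names on this page, the helper names `build_pipeline_kwargs` and `transcribe` are hypothetical, and the 15-second chunk length is an illustrative value. Flash Attention 2 is only requested when a CUDA GPU is selected, since it also needs the separate `flash-attn` package and half precision.

```python
from typing import Any, Dict

# Assumed Hugging Face repository id (not stated on this page).
MODEL_ID = "kotoba-tech/kotoba-whisper-v2.0"


def build_pipeline_kwargs(use_gpu: bool, use_flash_attention_2: bool = False) -> Dict[str, Any]:
    """Collect keyword arguments for transformers.pipeline (hypothetical helper)."""
    model_kwargs: Dict[str, Any] = {}
    if use_gpu:
        # Flash Attention 2 needs a CUDA GPU and the flash-attn package;
        # otherwise fall back to PyTorch scaled-dot-product attention.
        model_kwargs["attn_implementation"] = (
            "flash_attention_2" if use_flash_attention_2 else "sdpa"
        )
    return {
        "task": "automatic-speech-recognition",
        "model": MODEL_ID,
        "device": "cuda:0" if use_gpu else "cpu",
        "model_kwargs": model_kwargs,
    }


def transcribe(audio_path: str, use_gpu: bool = False, use_flash_attention_2: bool = False) -> str:
    """Transcribe a Japanese audio file, splitting long audio into chunks."""
    # Heavy imports are kept local so the kwargs helper works without them.
    import torch
    from transformers import pipeline

    kwargs = build_pipeline_kwargs(use_gpu, use_flash_attention_2)
    dtype = torch.float16 if use_gpu else torch.float32
    pipe = pipeline(torch_dtype=dtype, **kwargs)
    result = pipe(
        audio_path,
        chunk_length_s=15,       # illustrative chunk size for long audio
        return_timestamps=True,  # segment-level timestamps
        generate_kwargs={"language": "ja", "task": "transcribe"},
    )
    return result["text"]
```

Calling `transcribe("sample.wav")` downloads the model on first use and runs on CPU; passing `use_gpu=True, use_flash_attention_2=True` requests the accelerated attention path.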

Use Cases

Speech transcription
TV program subtitle generation
Process Japanese TV program audio to generate accurate subtitles
CER 11.6 / WER 55.6 on the ReazonSpeech test set
Voice assistant
Provides fast and accurate speech recognition for Japanese voice assistants
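
The CER and WER figures quoted above are edit-distance metrics: the number of character (or word) substitutions, insertions, and deletions needed to turn the model output into the reference, divided by the reference length. The sketch below shows the standard definition only. How the figures on this page were normalized and tokenized is not stated here, and because Japanese text has no spaces between words, WER in particular depends on the word segmentation used. The function names are illustrative.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens: Sequence, hyp_tokens: Sequence) -> float:
    """Edit distance normalized by the reference length."""
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate, ignoring spaces."""
    return error_rate(list(reference.replace(" ", "")),
                      list(hypothesis.replace(" ", "")))


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate over whitespace-separated tokens (pre-segmented text)."""
    return error_rate(reference.split(), hypothesis.split())


# One substituted character out of eight gives a CER of 0.125 (12.5%).
print(cer("今日は天気がいい", "今日は天気はいい"))
```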