Kotoba Whisper V1.0

Developed by kotoba-tech
Kotoba-Whisper is a collection of distilled Whisper models for Japanese automatic speech recognition, developed jointly by Asahi Ushio and Kotoba Technologies. It runs 6.3 times faster than the original Whisper large-v3 while keeping a similarly low error rate.
Downloads 2,397
Release Time: 4/14/2024

Model Overview

A Japanese automatic speech recognition model distilled from Whisper large-v3 and focused on Japanese speech transcription.

Model Features

Efficient inference
6.3 times faster than the original Whisper large-v3
High accuracy
CER and WER on multiple Japanese test sets are close to or better than those of the original model
Japanese-focused optimization
Trained and optimized specifically for Japanese speech
Long audio support
Supports both sequential and chunked algorithms for long audio transcription
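
To make the chunked long-audio idea concrete, here is a minimal sketch of how a long recording could be cut into overlapping windows before each window is transcribed. It illustrates the windowing step only, not the model's actual merging logic. The function name `chunk_windows`, the 15-second window, and the 2.5-second overlap are assumptions chosen for illustration and are not stated on this page.

```python
from typing import List, Tuple


def chunk_windows(total_s: float,
                  chunk_s: float = 15.0,      # assumed window length in seconds
                  overlap_s: float = 2.5      # assumed overlap between windows
                  ) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds covering a recording of total_s seconds."""
    if chunk_s <= 0 or not (0 <= overlap_s < chunk_s):
        raise ValueError("need chunk_s > 0 and 0 <= overlap_s < chunk_s")

    step = chunk_s - overlap_s  # how far each new window advances
    windows: List[Tuple[float, float]] = []
    start = 0.0
    while start < total_s:
        end = min(start + chunk_s, total_s)  # clip the last window to the audio length
        windows.append((start, end))
        if end >= total_s:
            break
        start += step
    return windows


if __name__ == "__main__":
    for window in chunk_windows(40.0):
        print(window)
```

A sequential algorithm would instead walk through the audio one window at a time and let each decoded segment decide where the next window starts, so it cannot be precomputed this way.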

Model Capabilities

Japanese speech recognition
Short audio transcription
Long audio transcription
Timestamped transcription
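
The capabilities above could be exercised roughly as follows. This is a hedged sketch, assuming the model is published on the Hugging Face Hub under the id `kotoba-tech/kotoba-whisper-v1.0` and is loaded through the `transformers` automatic-speech-recognition pipeline. The helper name `transcribe` and the audio path are hypothetical, and `torch` and `transformers` must be installed for the function to run.

```python
from typing import Any, Dict

# Assumed Hugging Face Hub id; this page itself does not give one.
MODEL_ID = "kotoba-tech/kotoba-whisper-v1.0"

# Ask the Whisper decoder for Japanese transcription (not translation).
GENERATE_KWARGS = {"language": "japanese", "task": "transcribe"}


def transcribe(audio_path: str, with_timestamps: bool = False) -> Dict[str, Any]:
    """Transcribe one audio file; hypothetical helper for illustration."""
    # Imported lazily so the module can be imported without the heavy dependencies.
    import torch
    from transformers import pipeline

    use_cuda = torch.cuda.is_available()
    pipe = pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        torch_dtype=torch.float16 if use_cuda else torch.float32,
        device="cuda:0" if use_cuda else "cpu",
    )
    # With return_timestamps=True the result also carries a "chunks" list.
    return pipe(
        audio_path,
        return_timestamps=with_timestamps,
        generate_kwargs=GENERATE_KWARGS,
    )


if __name__ == "__main__":
    result = transcribe("meeting.wav", with_timestamps=True)  # hypothetical file
    print(result["text"])
```

Building the pipeline once and reusing it across files would avoid reloading the weights on every call; it is rebuilt here only to keep the sketch short.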

Use Cases

Speech transcription
Japanese meeting minutes
Automatically transcribe Japanese meeting recordings into text
CER 9.4-12.2, WER 56.6-64.3
Japanese podcast subtitle generation
Automatically generate subtitles for Japanese podcast content
Supports long audio transcription and can generate subtitles with timestamps
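
For the subtitle use case, timestamped segments still have to be written in a subtitle format. The sketch below converts segments to SubRip (SRT) text. It assumes each segment is a dict of the form `{"timestamp": (start_s, end_s), "text": ...}`, which mirrors what the Hugging Face ASR pipeline returns with timestamps enabled; the function names are hypothetical.

```python
from typing import Dict, Iterable, List


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT time code HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, rem = divmod(millis, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks: Iterable[Dict]) -> str:
    """Turn timestamped segments into SRT text (assumed segment layout, see above)."""
    blocks: List[str] = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:  # a final segment may lack an end time; fall back to its start
            end = start
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{chunk['text'].strip()}"
        )
    # SRT cues are separated by a blank line.
    return "\n\n".join(blocks) + "\n" if blocks else ""


if __name__ == "__main__":
    demo = [
        {"timestamp": (0.0, 2.4), "text": " こんにちは"},
        {"timestamp": (2.4, 5.0), "text": " 今日の会議を始めます"},
    ]
    print(chunks_to_srt(demo))
```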
Speech data annotation
Japanese speech dataset annotation
Used to assist in the annotation of Japanese speech datasets
Can serve as a pre-annotation tool to improve annotation efficiency
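
One way to gauge how much correction a pre-annotation needed is the character error rate (CER), the same kind of metric quoted in this card. The sketch below computes CER as the character-level edit distance divided by the reference length, expressed as a percentage. Treating the card's figures as percentages, stripping spaces before comparison, and the function names are all assumptions; the evaluation behind the published numbers may normalize text differently.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute = 1)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent; spaces are ignored (an assumption)."""
    ref = reference.replace(" ", "")
    hyp = hypothesis.replace(" ", "")
    if not ref:
        raise ValueError("reference must contain at least one character")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    corrected = "こんにちは"      # transcript after human correction
    pre_annotation = "こんばんは"  # hypothetical model output
    print(cer(corrected, pre_annotation))
```

Word error rate follows the same recipe over word tokens, which for Japanese requires choosing a word segmenter first.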