Kotoba-Whisper-v1.0 Open-source Model - Free Deployment, 6.3 Times Faster Japanese Speech Recognition with Low Error Rate

Kotoba Whisper V1.0

Developed by kotoba-tech

Kotoba-Whisper is a collection of distilled Whisper models for Japanese automatic speech recognition (ASR), developed jointly by Asahi Ushio and Kotoba Technologies. It runs 6.3 times faster than the original Whisper large-v3 while maintaining a similarly low error rate.

Speech Recognition

Transformers

Language: Japanese

Open Source License: Apache-2.0

Tags: #Japanese speech recognition #Low-latency inference #Distilled model

Downloads 2,397

Release date: April 14, 2024

Model Overview

A Japanese automatic speech recognition model distilled from Whisper large-v3 and focused on Japanese speech transcription.

Model Features

Efficient inference

6.3 times faster than the original Whisper large-v3

High accuracy

Character error rate (CER) and word error rate (WER) on multiple Japanese test sets are close to those of the original model, and lower on the ReazonSpeech held-out test set

Japanese-focused optimization

Specially trained and optimized for Japanese speech characteristics

Long audio support

Supports sequential and chunked long audio transcription algorithms
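The chunked approach can be pictured as cutting a recording into fixed-length windows that overlap slightly, transcribing each window independently, and merging the text at the overlaps. The sketch below only plans the windows. `plan_chunks` is a hypothetical helper, and the 15-second window and 2.5-second overlap are illustrative assumptions rather than values stated for this model.

```python
def plan_chunks(duration_s, chunk_s=15.0, overlap_s=2.5):
    """Return (start, end) windows in seconds that cover the whole recording.

    Consecutive windows share `overlap_s` seconds so that words cut at a
    boundary appear complete in at least one window.
    """
    if chunk_s <= overlap_s:
        raise ValueError("chunk_s must be larger than overlap_s")
    step = chunk_s - overlap_s  # how far each new window advances
    windows = []
    start = 0.0
    while True:
        end = min(start + chunk_s, duration_s)
        windows.append((start, end))
        if end >= duration_s:  # the last window reaches the end of the audio
            break
        start += step
    return windows


if __name__ == "__main__":
    # A 40-second recording, split with the illustrative defaults above.
    for window in plan_chunks(40.0):
        print(window)
```

Because each window is independent, windows can be batched and decoded in parallel, which is where the chunked approach typically gains speed. The sequential approach instead slides through the audio one window at a time.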

Model Capabilities

Japanese speech recognition

Short audio transcription

Long audio transcription

Timestamped transcription
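These capabilities can be exercised through the Hugging Face Transformers `pipeline` API. The following is a minimal sketch, not the official quick start: `build_call_kwargs` and `transcribe` are hypothetical helper names, and the 15-second chunk length is an assumed setting. Running `transcribe` requires `transformers` and `torch` to be installed, downloads the model weights on first use, and needs ffmpeg to decode an audio file given as a path.

```python
MODEL_ID = "kotoba-tech/kotoba-whisper-v1.0"


def build_call_kwargs(with_timestamps=True, chunk_length_s=15):
    """Assemble the keyword arguments passed when calling the ASR pipeline."""
    kwargs = {
        # Pin language and task so Whisper does not try to auto-detect them.
        "generate_kwargs": {"language": "japanese", "task": "transcribe"},
    }
    if chunk_length_s:
        # Enables chunked long-form transcription (assumed value).
        kwargs["chunk_length_s"] = chunk_length_s
    if with_timestamps:
        # Adds a "chunks" list with (start, end) times to the result.
        kwargs["return_timestamps"] = True
    return kwargs


def transcribe(audio_path, with_timestamps=True, chunk_length_s=15):
    """Transcribe one audio file and return the pipeline's result dict."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    return asr(audio_path, **build_call_kwargs(with_timestamps, chunk_length_s))


# Example call ("meeting.wav" is a placeholder path):
#   result = transcribe("meeting.wav")
#   print(result["text"])
```

Passing `chunk_length_s=None` leaves chunking off, which suits short clips that fit in a single window.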

Use Cases

Speech transcription

Japanese meeting minutes

Automatically transcribe Japanese meeting recordings into text

Reported error rates: CER 9.4-12.2, WER 56.6-64.3

Japanese podcast subtitle generation

Automatically generate subtitles for Japanese podcast content

Supports long audio transcription and can generate subtitles with timestamps
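Turning timestamped output into subtitles is mostly a formatting step. The sketch below converts a list of segments into SRT text. It assumes the segment layout that the Transformers ASR pipeline returns when timestamps are requested (a list of dicts with a `timestamp` pair in seconds and a `text` string). `format_srt_time` and `chunks_to_srt` are hypothetical helper names.

```python
def format_srt_time(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunks_to_srt(chunks):
    """Convert timestamped segments into the text of an SRT subtitle file."""
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]
        if end is None:
            # The final segment may have no end time; fall back to its start.
            end = start
        blocks.append(
            f"{index}\n"
            f"{format_srt_time(start)} --> {format_srt_time(end)}\n"
            f"{chunk['text'].strip()}\n"
        )
    # Each block already ends with a newline, so this leaves a blank line between cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"timestamp": (0.0, 2.5), "text": " こんにちは"},
        {"timestamp": (2.5, 4.0), "text": "世界"},
    ]
    print(chunks_to_srt(demo))
```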

Speech data annotation

Japanese speech dataset annotation

Used to assist in the annotation of Japanese speech datasets

Can serve as a pre-annotation tool to improve annotation efficiency

CER Comparison (lower is better)

Model	CommonVoice 8 (Japanese test set)	JSUT Basic 5000	ReazonSpeech (held out test set)
kotoba-tech/kotoba-whisper-v2.0	9.2	8.4	11.6
kotoba-tech/kotoba-whisper-v1.0	9.4	8.5	12.2
openai/whisper-large-v3	8.5	7.1	14.9
openai/whisper-large-v2	9.7	8.2	28.1
openai/whisper-large	10	8.9	34.1
openai/whisper-medium	11.5	10	33.2
openai/whisper-base	28.6	24.9	70.4
openai/whisper-small	15.1	14.2	41.5
openai/whisper-tiny	53.7	36.5	137.9
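For readers unfamiliar with the metric, the sketch below computes CER under its usual definition: the character-level edit distance between the reference and the model output, divided by the reference length and expressed as a percentage. This is an illustration only. The text normalization applied before scoring the numbers above is not described here, and `edit_distance` and `cer` are hypothetical helper names.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, using one rolling row."""
    prev = list(range(len(hyp) + 1))  # distance from the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate in percent, relative to the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Two of five characters differ, so the CER is 40 percent.
    print(cer("こんにちは", "こんばんは"))
```

Because insertions count as errors, the rate can exceed 100 when the output is much longer than the reference, as in the 137.9 reported for whisper-tiny on ReazonSpeech. WER is computed the same way over words instead of characters.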

WER Comparison (lower is better)

Model	CommonVoice 8 (Japanese test set)	JSUT Basic 5000	ReazonSpeech (held out test set)
kotoba-tech/kotoba-whisper-v2.0	58.8	63.7	55.6
kotoba-tech/kotoba-whisper-v1.0	59.2	64.3	56.4
openai/whisper-large-v3	55.1	59.2	60.2
openai/whisper-large-v2	59.3	63.2	74.1
openai/whisper-large	61.1	66.4	74.9
openai/whisper-medium	63.4	69.5	...

