Wav2vec2 Large Xlsr 53 Japanese

Developed by jonatasgrosman
Japanese speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53. It expects audio input sampled at 16 kHz.
Downloads 2.9M
Release Time: 3/2/2022

Model Overview

This model is the XLSR-53 large checkpoint fine-tuned for Japanese speech recognition (speech-to-text). It was trained on the Common Voice 6.1, CSS10, and JSUT datasets.
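A minimal inference sketch may help make the overview concrete. It assumes the model is published on the Hugging Face Hub under the ID `jonatasgrosman/wav2vec2-large-xlsr-53-japanese` (inferred from the developer and model name, not stated on this page) and that `transformers`, `torch`, and `librosa` are installed. The function name `transcribe` and any file paths passed to it are hypothetical.

```python
# Assumed Hub ID, inferred from the developer and model name on this page.
MODEL_ID = "jonatasgrosman/wav2vec2-large-xlsr-53-japanese"
TARGET_SAMPLE_RATE = 16_000  # the model expects 16 kHz audio


def transcribe(audio_paths):
    """Return one greedy (no language model) transcription per audio file."""
    # Heavy dependencies are imported here so that this module can be
    # imported even where they are not installed.
    import librosa
    import torch
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
    model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)
    model.eval()

    # librosa.load resamples each file to the requested rate while loading.
    waveforms = [
        librosa.load(path, sr=TARGET_SAMPLE_RATE)[0] for path in audio_paths
    ]

    # Pad the batch to a common length and build the attention mask.
    inputs = processor(
        waveforms,
        sampling_rate=TARGET_SAMPLE_RATE,
        return_tensors="pt",
        padding=True,
    )

    with torch.no_grad():
        logits = model(
            inputs.input_values, attention_mask=inputs.attention_mask
        ).logits

    # Greedy decoding: pick the most likely token for every audio frame.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)
```

Calling `transcribe(["sample.wav"])` with a local recording (the path here is a placeholder) would return a list containing one transcription string. The first call downloads the model weights.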

Model Features

Multi-dataset Training
Trained jointly on three Japanese datasets (Common Voice 6.1, CSS10, and JSUT) to improve generalization
No Language Model Required
Can transcribe speech directly, without an additional language model
16 kHz Sampling Rate Support
Designed for audio input sampled at 16 kHz
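The "No Language Model Required" feature above can be illustrated with greedy CTC decoding, the usual way a wav2vec2 model fine-tuned for recognition turns per-frame predictions into text without a language model. The sketch below is a simplified, assumption-laden version: the toy vocabulary, the blank ID of 0, and the function name `ctc_greedy_decode` are hypothetical, and a real tokenizer also handles special tokens that are omitted here.

```python
def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0):
    """Collapse repeated frame predictions, then drop CTC blank tokens."""
    decoded = []
    previous = None
    for token_id in frame_ids:
        # Emit a token only when it differs from the previous frame
        # and is not the blank symbol.
        if token_id != previous and token_id != blank_id:
            decoded.append(id_to_token[token_id])
        previous = token_id
    return "".join(decoded)


# Hypothetical vocabulary; ID 0 plays the role of the CTC blank.
toy_vocab = {0: "<pad>", 1: "こ", 2: "ん", 3: "に", 4: "ち", 5: "は"}

# One predicted ID per audio frame, e.g. the argmax over each frame's logits.
frames = [1, 1, 0, 2, 2, 3, 0, 0, 4, 5, 5]
print(ctc_greedy_decode(frames, toy_vocab))  # prints こんにちは
```

A blank between two identical IDs keeps both characters, so `[1, 0, 1]` decodes to `ここ` rather than `こ`.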

Model Capabilities

Japanese Speech Recognition
Audio-to-Text Conversion
Automatic Speech Transcription

Use Cases

Speech Transcription
Japanese Speech-to-Text
Converts Japanese speech into text
CER 20.16%, WER 81.80% on the Common Voice Japanese test set
Voice Assistants
Japanese Voice Command Recognition
Recognizes voice commands for Japanese voice assistants or control systems
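The CER and WER figures listed under Speech Transcription above are edit-distance metrics. As a rough sketch of how such metrics are commonly defined (not necessarily how the reported numbers were produced), the following computes both from a Levenshtein distance. The helper names and the example sentences are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two sequences (unit costs)."""
    previous_row = list(range(len(hypothesis) + 1))
    for i, ref_item in enumerate(reference, start=1):
        current_row = [i]
        for j, hyp_item in enumerate(hypothesis, start=1):
            substitution = previous_row[j - 1] + (ref_item != hyp_item)
            insertion = current_row[j - 1] + 1
            deletion = previous_row[j] + 1
            current_row.append(min(substitution, insertion, deletion))
        previous_row = current_row
    return previous_row[-1]


def error_rate(reference_units, hypothesis_units):
    """Edit distance divided by the number of reference units."""
    if not reference_units:
        raise ValueError("reference must not be empty")
    return edit_distance(reference_units, hypothesis_units) / len(reference_units)


def cer(reference, hypothesis):
    """Character error rate: units are individual characters."""
    return error_rate(list(reference), list(hypothesis))


def wer(reference, hypothesis):
    """Word error rate: units are whitespace-separated tokens."""
    return error_rate(reference.split(), hypothesis.split())


reference = "こんにちは"
hypothesis = "こんばんは"
print(cer(reference, hypothesis))  # 2 substitutions over 5 characters
print(wer(reference, hypothesis))  # the whole sentence is one "word"
```

Because written Japanese usually has no spaces, a whitespace-based WER treats the whole sentence as a single word, so one wrong character counts as a full word error. This is one possible reason WER can look much worse than CER on Japanese text; how the figures above were computed is not stated on this page.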