Reazonspeech-Nemo-v2: Open-Source Japanese Speech Recognition Model with Support for Long Audio Inference

Developed by reazon-research

A Japanese automatic speech recognition (ASR) model trained on the ReazonSpeech v2.0 corpus, with support for long audio inference.

Downloads 3,897

Release date: January 30, 2024

Model Overview

This model is an automatic speech recognition system optimized for Japanese, capable of processing continuous speech inputs lasting several hours.

Long audio processing capability

Supports continuous recognition of long Japanese audio recordings lasting several hours.

Efficient attention mechanism

Uses a Longformer-style attention mechanism with a local context size of 256, plus global tokens.
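
To illustrate the idea of local attention combined with global tokens, here is a minimal sketch that builds a boolean attention mask in plain Python. It is not the model's actual implementation. The function name `local_global_mask`, the choice of position 0 as the global token, and the reading of "context" as a number of frames on each side are all assumptions made for illustration; the toy sizes are far smaller than the 256 quoted above.

```python
def local_global_mask(seq_len, context, global_positions=(0,)):
    """Return mask[i][j] == True when position i may attend to position j.

    Assumptions (not from the model card): `context` counts positions on
    each side, and `global_positions` marks the global tokens.
    """
    global_set = set(global_positions)
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            within_window = abs(i - j) <= context          # local band
            touches_global = i in global_set or j in global_set
            row.append(within_window or touches_global)
        mask.append(row)
    return mask


if __name__ == "__main__":
    # Toy example: 8 positions, 2 neighbours per side, one global token.
    for row in local_global_mask(seq_len=8, context=2):
        print("".join("x" if allowed else "." for allowed in row))
```

Because each non-global row allows only a fixed-width band plus the global columns, the number of allowed pairs grows roughly linearly with sequence length rather than quadratically, which is what makes hours-long inputs tractable.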

Optimized training

Trained for 1 million steps using the AdamW optimizer and a Noam annealing schedule.
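
For readers unfamiliar with the Noam schedule, the sketch below shows its usual form: a linear warmup followed by inverse-square-root decay. The values of `d_model`, `warmup_steps`, and `scale` are placeholders chosen for illustration; the listing does not state the ones used to train this model.

```python
def noam_lr(step, d_model=512, warmup_steps=10_000, scale=1.0):
    """Learning rate under the standard Noam schedule.

    d_model, warmup_steps and scale are illustrative assumptions,
    not values taken from the model card.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    warmup_term = step * warmup_steps ** -1.5   # rises linearly during warmup
    decay_term = step ** -0.5                   # decays as 1/sqrt(step) afterwards
    return scale * d_model ** -0.5 * min(warmup_term, decay_term)


if __name__ == "__main__":
    for step in (1, 5_000, 10_000, 100_000, 1_000_000):
        print(step, noam_lr(step))
```

The rate peaks at `step == warmup_steps` and shrinks steadily after that, so by the end of a 1-million-step run it is a small fraction of the peak.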

Japanese speech recognition

Long audio processing

Continuous speech transcription

Speech transcription

Automatic meeting minutes generation

Automatically converts long business meeting recordings into text transcripts

Media content subtitle generation

Automatically generates subtitles for Japanese podcasts, videos, and other content
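
As a rough sketch of the subtitle use case, the snippet below turns timestamped transcript segments into SubRip (SRT) text. The `(start_seconds, end_seconds, text)` tuple layout and the helper names are hypothetical; how segments are obtained from the recognizer is outside the scope of this example.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments):
    """Build SRT text from hypothetical (start, end, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Made-up segments, for illustration only.
    example = [(0.0, 1.5, "こんにちは"), (1.5, 4.0, "今日の会議を始めます")]
    print(to_srt(example))
```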
