A Japanese speech synthesis model based on the Orpheus-TTS architecture. By pruning 43% of the original layers, it achieves efficient inference while maintaining high-quality speech generation.
Model Features
- Efficient Inference: reduced from 28 layers to 16 (a 43% reduction), significantly lowering memory requirements and improving inference speed
- Multi-Voice Support: offers 14 different voices, including 8 high-quality voices (⭐⭐⭐ rating)
- Japanese Optimization: trained and optimized specifically for Japanese speech characteristics
- Real-Time Processing: inherits the low-latency characteristics of the original Orpheus model, suitable for streaming
Model Capabilities
- Japanese text-to-speech
- Multi-voice speech synthesis
- Real-time speech generation
Use Cases
- Entertainment
  - Game Character Voice Acting: generates real-time voices for Japanese game characters, with 14 character voice options
  - Audio Content Creation: automatically generates Japanese podcast or audiobook content, with support for switching between narrator voices
- Assistive Technology
  - Voice Assistants: provides natural speech output for Japanese voice assistants; low latency suits interactive scenarios
🚀 Slim-Orpheus 3B Japanese
Slim-Orpheus 3B Japanese is a text-to-speech model. It prunes the original weights to speed up inference and reduce memory requirements, and is trained on 14 Japanese voices.
✨ Features
Pruned the original weights from 28 down to 16 layers (a 43% reduction) to speed up inference and reduce memory requirements.
Trained in Japanese on 14 voices.
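The card does not document which 12 layers were removed, so the following is only a hypothetical sketch of the layer-count arithmetic, using an even-spacing heuristic to pick which decoder layers to keep:

```python
# Hypothetical sketch of depth pruning: 28 decoder layers -> 16.
# The actual Slim-Orpheus pruning recipe is not documented on this card;
# this only illustrates the 43% reduction with an even-spacing heuristic.

def prune_evenly(layers, keep):
    """Return `keep` layers spaced evenly across the original stack."""
    n = len(layers)
    indices = sorted({round(i * (n - 1) / (keep - 1)) for i in range(keep)})
    return [layers[i] for i in indices]

original = [f"decoder_layer_{i}" for i in range(28)]
slim = prune_evenly(original, keep=16)

reduction = (len(original) - len(slim)) / len(original)
print(len(slim), f"{reduction:.0%}")  # 16 layers, ~43% reduction
```

Evenly spaced retention keeps the first and last layers intact, which is a common heuristic in depth-pruning work, but again it is only an assumption here.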
💻 Voices
Below are sample outputs for each voice with quality indicators:
⭐⭐⭐ Good quality
⭐⭐ Okay quality
⭐ Poor quality
⚠️ Unstable
Lyney ⭐⭐⭐
Cyno ⭐⭐⭐
Tighnari ⭐⭐⭐
Kaeya ⭐⭐⭐
Neuvillette ⭐⭐⭐
Kaveh ⭐⭐⭐
Dehya ⭐⭐⭐
Yae Miko ⭐⭐⭐
Layla ⭐⭐
Yoimiya ⭐⭐
Alhaitham ⭐⭐
Zhongli ⭐⭐
Furina ⭐
Arataki Itto ⚠️
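Orpheus-style models condition on a voice by prefixing the text with the voice name. Assuming Slim-Orpheus follows the upstream `voice: text` prompt convention (an assumption; this card does not state the prompt format), voice selection can be sketched as:

```python
# Voice ratings transcribed from the list above (3 stars = good,
# 2 = okay, 1 = poor, 0 = unstable).
VOICES = {
    "Lyney": 3, "Cyno": 3, "Tighnari": 3, "Kaeya": 3,
    "Neuvillette": 3, "Kaveh": 3, "Dehya": 3, "Yae Miko": 3,
    "Layla": 2, "Yoimiya": 2, "Alhaitham": 2, "Zhongli": 2,
    "Furina": 1, "Arataki Itto": 0,
}

def format_prompt(voice: str, text: str) -> str:
    """Build a '{voice}: {text}' prompt (assumed upstream convention)."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice}")
    return f"{voice}: {text}"

good = [v for v, stars in VOICES.items() if stars == 3]
print(len(good))  # 8 high-quality voices
print(format_prompt("Lyney", "こんにちは、元気ですか？"))
```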
🔧 Technical Details
Limitations
Japanese Only: This model was trained specifically for the Japanese language and cannot speak English or other languages
No Emote Support: Not trained on the emote/emotional-cue tags that were available in the original model
Reduced Parameter Count: While offering faster inference, the reduction from 28 to 16 layers may impact some of the nuanced capabilities of the original Orpheus model
Voice Quality Varies: As noted in the voice quality ratings, some voices perform better than others
Orpheus-TTS Model Details
Code is available on GitHub: [CanopyAI/Orpheus-TTS](https://github.com/canopyai/Orpheus-TTS)
Orpheus TTS is a state-of-the-art, Llama-based Speech-LLM designed for high-quality, empathetic text-to-speech generation. This model has been finetuned to deliver human-level speech synthesis, achieving exceptional clarity, expressiveness, and real-time streaming performance.
Model Capabilities
Human-Like Speech: Natural intonation, emotion, and rhythm that is superior to SOTA closed-source models
Low Latency: ~200 ms streaming latency for real-time applications, reducible to ~100 ms with input streaming
Check out the Orpheus Colab ([link to Colab](https://colab.research.google.com/drive/1KhXT56UePPUHhqitJNUxq63k-pQomz3N?usp=sharing)) or GitHub ([link to GitHub](https://github.com/canopyai/Orpheus-TTS)) to see how to run easy inference on our finetuned models.
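The latency figures above come from consuming audio chunks as soon as they are generated rather than waiting for the full waveform. A minimal sketch of that pattern, with a stub standing in for the model's streaming generator (the real generator comes from the Orpheus inference code linked above):

```python
import time

def stream_chunks():
    """Stub for a streaming TTS generator; yields audio as it is produced."""
    for _ in range(5):
        time.sleep(0.01)      # simulate per-chunk generation time
        yield b"\x00" * 2048  # fake PCM audio bytes

start = time.perf_counter()
first_chunk_latency = None
audio = bytearray()
for chunk in stream_chunks():
    if first_chunk_latency is None:
        # Time-to-first-chunk is what the ~200 ms figure measures.
        first_chunk_latency = time.perf_counter() - start
    audio.extend(chunk)  # in a real app: play or forward immediately

print(f"first chunk after {first_chunk_latency * 1000:.0f} ms")
```

With real models, playback can begin as soon as the first chunk arrives, which is why time-to-first-chunk, not total synthesis time, determines perceived latency.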
📄 License
The model is under the apache-2.0 license.
📚 Documentation
Model Misuse
Do not use our models for impersonation without consent, misinformation or deception (including fake news or fraudulent calls), or any illegal or harmful activity. By using this model, you agree to follow all applicable laws and ethical guidelines. We disclaim responsibility for any use.