Orpheus-3B-0.1-FT Open Source Text-to-Speech Model - Supports Emotional Control and Voice Cloning

Orpheus 3b 0.1 Ft

Developed by chutesai

High-quality text-to-speech model based on Llama architecture, supporting emotion control and voice cloning

Speech Synthesis

Transformers

Language: English

Open Source License: Apache-2.0

#Realistic Voice Synthesis #Zero-shot Voice Cloning #Emotion-controllable TTS

Downloads 2,686

Release Date: 3/24/2025

Model Overview

Orpheus TTS is a large speech model built on the Llama architecture and fine-tuned for human-level speech synthesis. It excels in clarity, expressiveness, and real-time streaming.

Model Features

Realistic Voice

Natural intonation, emotion, and rhythm, claimed to surpass current state-of-the-art closed-source models

Zero-shot Voice Cloning

Clone target voices without prior fine-tuning

Controllable Emotion and Tone

Control the emotion and tone of the generated speech with simple tags
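To make tag-based control concrete, here is a minimal sketch of how a prompt might be assembled before it is sent to the model. The tag names, the voice name "tara", and the "voice: text" layout are all assumptions made for illustration; this page does not specify them, so check the model's own documentation before relying on any of them.

```python
from typing import Optional

# Assumed tag vocabulary, for illustration only (not confirmed by this page).
ASSUMED_TAGS = {"laugh", "sigh", "gasp"}


def tag_prompt(text: str, voice: str, tag: Optional[str] = None) -> str:
    """Prefix a voice name and optionally append one emotion tag.

    The "voice: text <tag>" layout is a hypothetical prompt format.
    """
    if tag is not None:
        if tag not in ASSUMED_TAGS:
            raise ValueError(f"unknown tag: {tag!r}")
        text = f"{text} <{tag}>"
    return f"{voice}: {text}"


print(tag_prompt("That was unexpected.", voice="tara", tag="laugh"))
```

Validating tags before generation keeps a typo from being read aloud as literal text, which is the main reason to wrap the string formatting in a helper.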

Low-latency Processing

About 200 ms streaming latency in real-time scenarios, reducible to about 100 ms with input streaming
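Streaming latency figures like these are usually read as the time until the first audio chunk arrives. The sketch below shows one way to measure that, assuming the synthesis call returns an iterable of byte chunks. The `fake_audio_stream` generator is a hypothetical stand-in for a real streaming call, and the numbers it produces say nothing about the model itself.

```python
import time
from typing import Iterable, Iterator, Tuple


def fake_audio_stream(n_chunks: int = 3, delay_s: float = 0.01) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS call; yields dummy chunks."""
    for _ in range(n_chunks):
        time.sleep(delay_s)  # simulate generation time per chunk
        yield b"\x00" * 1024


def time_to_first_chunk(stream: Iterable[bytes]) -> Tuple[float, int]:
    """Return (seconds until the first chunk arrived, total bytes received)."""
    start = time.perf_counter()
    first = None
    total = 0
    for chunk in stream:
        if first is None:
            first = time.perf_counter() - start
        total += len(chunk)
    if first is None:
        raise ValueError("stream produced no audio")
    return first, total


latency, size = time_to_first_chunk(fake_audio_stream())
print(f"first chunk after {latency * 1000:.0f} ms, {size} bytes total")
```

Replacing `fake_audio_stream()` with a real streaming generator would give a comparable time-to-first-chunk number, though network and hardware conditions will affect it.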

Model Capabilities

High-quality Voice Synthesis

Emotional Voice Generation

Voice Cloning

Streaming Voice Output

Use Cases

Voice Interaction

Virtual Assistants

Provide natural and fluent voice output for virtual assistants

Improve the user experience and make interactions feel more natural

Audiobooks

Automatically generate expressive audiobooks

Reduce content production costs

Assistive Technology

Voice Assistance

Provide high-quality voice output for visually impaired individuals

Improve the usability of assistive technologies
