Llama OuteTTS 1.0 1B

Developed by unsloth
OuteTTS 1.0 is a multilingual text-to-speech model based on the Llama architecture. It supports 23 languages and offers high-quality speech synthesis and voice cloning.
Downloads 233
Release Time: 5/15/2025

Model Overview

This 1B-parameter text-to-speech model uses the DAC audio encoder for high-quality speech synthesis and supports one-shot voice cloning and automatic text alignment.

Model Features

Multilingual support
Supports text-to-speech in 23 languages, including major European and Asian languages
High-quality speech synthesis
Uses DAC audio encoder for high-fidelity speech output
One-shot voice cloning
Generates accurate voice representations with only about 10 seconds of reference audio
Automatic text alignment
Automatically handles word alignment without requiring text preprocessing
Efficient inference
Runs 1.5x faster and uses 58% less memory under the Unsloth framework
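
To make the efficiency figures concrete, the sketch below applies the stated 1.5x speedup and 58% memory reduction to a baseline. The function name and the baseline values (60 seconds, 10 GB) are hypothetical and chosen only for illustration; they are not measurements of this model.

```python
def estimate_with_unsloth(baseline_seconds, baseline_memory_gb,
                          speedup=1.5, memory_reduction=0.58):
    """Scale a baseline run by the speedup and memory reduction quoted above.

    The defaults mirror the listing's figures; the baseline inputs are
    whatever the caller measured without Unsloth (hypothetical here).
    """
    seconds = baseline_seconds / speedup                        # 1.5x faster
    memory_gb = baseline_memory_gb * (1.0 - memory_reduction)   # 58% less memory
    return seconds, memory_gb


# Hypothetical baseline: a 60-second run that peaks at 10 GB of memory.
seconds, memory_gb = estimate_with_unsloth(baseline_seconds=60.0,
                                           baseline_memory_gb=10.0)
print(f"{seconds:.1f} s, {memory_gb:.1f} GB")  # prints "40.0 s, 4.2 GB"
```

Under these assumptions, a run that would otherwise need 10 GB fits in roughly 4.2 GB, which is the practical difference between needing a larger accelerator and fitting on a smaller one.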

Model Capabilities

Text-to-speech
Voice cloning
Multilingual synthesis
Automatic text alignment
High-quality audio generation

Use Cases

Speech synthesis
Audiobook generation
Converts text content into natural speech
High-quality, natural-sounding speech output
Voice assistants
Provides multilingual voice support for virtual assistants
Supports voice interaction in 23 languages
Voice cloning
Personalized speech synthesis
Clones a specific speaker's voice based on a small sample
Generates similar-sounding speech from only about 10 seconds of reference audio
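
Because voice cloning needs only about 10 seconds of reference audio, a longer recording can be trimmed before it is passed to the model. The following is a minimal sketch using only Python's standard `wave` module; the helper name `trim_reference_wav`, the file paths, and the 10-second default are assumptions for illustration and are not part of the OuteTTS API.

```python
import wave


def trim_reference_wav(src_path, dst_path, max_seconds=10.0):
    """Copy at most `max_seconds` of audio from a PCM WAV file.

    Returns the duration (in seconds) actually written to `dst_path`.
    Hypothetical helper: it only prepares the clip and does not call the model.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frame_rate = src.getframerate()
        # Keep the whole file if it is already shorter than the limit.
        frames_to_keep = min(src.getnframes(), int(max_seconds * frame_rate))
        frames = src.readframes(frames_to_keep)

    with wave.open(dst_path, "wb") as dst:
        # Same channels, sample width, and rate; the frame count in the
        # header is corrected by the wave module when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return frames_to_keep / frame_rate
```

For example, `trim_reference_wav("speaker.wav", "speaker_10s.wav")` would keep the first 10 seconds of a hypothetical `speaker.wav`. Choosing a clean, continuous stretch of speech by hand is likely to work better than always taking the opening seconds.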