Llama OuteTTS 1.0 1B GPTQ 8bit

Developed by adriabama06
OuteTTS 1.0 is a 1B-parameter text-to-speech model supporting multilingual speech synthesis and voice cloning.
Downloads: 15
Release Time: 4/7/2025

Model Overview

A speech synthesis model based on the Llama 3.2 architecture. It uses a DAC audio encoder for high-fidelity audio reconstruction and supports text-to-speech and voice cloning in 23 languages.

Model Features

Native multilingual support
Accepts text input in 23 languages directly, without preprocessing such as romanization
Efficient voice cloning
Clones a speaker's voice from about 10 seconds of reference audio
Intelligent text alignment
Automatically handles word alignment for languages without clear word boundaries (e.g., Japanese and Chinese)
DAC audio encoder
Uses the DAC audio encoder from IBM Research, a high-fidelity dual-codebook architecture, for significantly improved audio quality
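
Since voice cloning is described as needing only about 10 seconds of reference audio, it can help to trim a longer recording before passing it to the model. The following is a minimal sketch using only Python's standard wave module. The function name trim_reference and the file paths are hypothetical and are not part of OuteTTS. It assumes an uncompressed PCM WAV file as input.

```python
import wave


def trim_reference(src_path, dst_path, max_seconds=10.0):
    """Copy at most max_seconds of audio from src_path to dst_path.

    Returns the duration (in seconds) that was actually written.
    Hypothetical helper, not an OuteTTS API; assumes PCM WAV input.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        # Never read past the end of a recording shorter than the limit.
        frames_to_keep = min(
            src.getnframes(), int(max_seconds * src.getframerate())
        )
        audio = src.readframes(frames_to_keep)

    with wave.open(dst_path, "wb") as dst:
        # Reuse channels, sample width and rate; the frame count in the
        # header is corrected automatically when the file is closed.
        dst.setparams(params)
        dst.writeframes(audio)

    return frames_to_keep / params.framerate
```

For example, trim_reference("speaker_full.wav", "speaker_10s.wav") would write the first 10 seconds of a longer recording. Which 10 seconds give the best clone is not stated in this listing, so taking the start of the file is only a default.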

Model Capabilities

Text-to-speech synthesis
Cross-language voice conversion
Voice feature cloning
Emotional speech generation
Long-form speech synthesis (up to 42 seconds)
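
Because a single synthesis is listed as covering up to 42 seconds, longer texts have to be split before generation. The sketch below shows one simple way to do that: group whole sentences into chunks whose estimated spoken length stays under the limit. The speaking rate of 2.5 words per second is an assumption made for illustration, not a property of the model, and the word count only works for languages written with spaces. The function name chunk_text is hypothetical.

```python
import re

MAX_SECONDS = 42.0       # per-synthesis limit given in this listing
WORDS_PER_SECOND = 2.5   # assumed speaking rate, for illustration only


def chunk_text(text, max_seconds=MAX_SECONDS,
               words_per_second=WORDS_PER_SECOND):
    """Group sentences into chunks estimated to fit within max_seconds."""
    max_words = int(max_seconds * words_per_second)
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        # A single sentence longer than the budget is kept whole.
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each returned chunk can then be synthesized separately and the audio concatenated. Splitting on sentence boundaries keeps pauses natural, at the cost of chunks that are somewhat shorter than the limit.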

Use Cases

Assistive technology: accessible reading
Converts text content into speech for visually impaired users
Supports natural speech output in multiple languages
Content creation: audio content production
Quickly generates podcast and video voiceovers
Can clone specific host voices
Educational technology: language learning tool
Generates multilingual pronunciation examples
Supports native pronunciation in 23 languages