Llasa 3B

Developed by unsloth
Llasa is a LLaMA-based text-to-speech (TTS) system that extends the language model with speech tokens and supports speech generation in Chinese and English.
Downloads 55
Release Time: 5/15/2025

Model Overview

Llasa is a text-to-speech (TTS) system that extends the text-based LLaMA language model by adding 65,536 speech tokens from the XCodec2 codebook to its vocabulary. The model can generate speech from input text alone or conditioned on a given voice prompt.
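
To make the vocabulary extension concrete, here is a minimal sketch of how codec indices might be mapped to speech-token strings and back. The `<|s_N|>` spelling and both function names are assumptions made for illustration; only the codebook size of 65,536 comes from the description above, and the actual tokenizer may name its tokens differently.

```python
import re

CODEBOOK_SIZE = 65_536  # number of XCodec2 speech tokens stated in the overview

# Assumed token spelling for illustration; not confirmed by this page.
_SPEECH_TOKEN = re.compile(r"<\|s_(\d+)\|>")


def ids_to_speech_tokens(ids):
    """Turn codec indices into speech-token strings, validating the range."""
    tokens = []
    for i in ids:
        if not 0 <= i < CODEBOOK_SIZE:
            raise ValueError(f"codec index {i} is outside [0, {CODEBOOK_SIZE})")
        tokens.append(f"<|s_{i}|>")
    return tokens


def speech_tokens_to_ids(tokens):
    """Recover codec indices, skipping anything that is not a speech token."""
    ids = []
    for token in tokens:
        match = _SPEECH_TOKEN.fullmatch(token)
        if match:
            ids.append(int(match.group(1)))
    return ids
```

In a pipeline of this shape, the language model would emit speech-token strings, the indices recovered by `speech_tokens_to_ids` would be passed to the codec decoder to produce a waveform, and any non-speech tokens (such as an end marker) would be ignored.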

Model Features

Train-Time and Inference-Time Compute Scaling
Supports scaling compute during both training and inference to improve model performance.
Multilingual Support
Supports speech generation in both Chinese and English.
Voice Prompt Generation
Capable of generating speech using given voice prompts.
Efficient Training
Training the TTS model is similar to training an LLM, so existing compression, acceleration, and fine-tuning methods for LLMs can be reused.
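
The voice prompt feature listed above can be sketched as plain sequence construction, a common pattern in LLM-based TTS: the prompt audio's transcript is joined with the target text, and the prompt audio's speech tokens are appended as a prefix for the model to continue. The marker strings, the `<|s_N|>` token spelling, and the function name below are all hypothetical placeholders, not the model's actual prompt template.

```python
# Hypothetical section markers; the real model's special tokens may differ.
TEXT_START = "<text>"
TEXT_END = "</text>"
SPEECH_START = "<speech>"


def build_voice_prompt(prompt_transcript, target_text, prompt_speech_ids):
    """Assemble the input sequence for text-only or voice-prompted generation.

    With an empty transcript and no speech ids, this reduces to plain
    text-to-speech: the model starts generating right after SPEECH_START.
    """
    text = f"{prompt_transcript} {target_text}".strip()
    # Speech tokens of the prompt audio act as a prefix to be continued.
    speech_prefix = "".join(f"<|s_{i}|>" for i in prompt_speech_ids)
    return f"{TEXT_START}{text}{TEXT_END}{SPEECH_START}{speech_prefix}"
```

Because the prefix already holds speech tokens from the reference speaker, a model continuing the sequence would be expected to produce the target text in a similar voice; the prefix tokens are then stripped from the output before decoding.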

Model Capabilities

Text-to-Speech
Voice Prompt Generation
Chinese-English Speech Synthesis

Use Cases

Speech Synthesis
Voice Assistants
Generating natural speech for virtual assistants.
Produces high-quality speech output.
Audiobooks
Converting text content into speech.
Generates natural and fluent speech.
Voice Prompt Applications
Voice Style Transfer
Generating speech with a similar style based on given voice prompts.
Maintains consistency in voice style.