
XTTS V2

Developed by Coqui
ⓍTTS is a voice generation model that performs cross-lingual voice cloning from a 6-second audio clip and supports 17 languages.
Downloads 1.7M
Release Date: 10/31/2023

Model Overview

ⓍTTS is a speech synthesis model that clones voices from short audio samples of about 6 seconds and supports multilingual speech synthesis with emotion and style transfer.

Model Features

Rapid Voice Cloning
Clones a target voice from just 6 seconds of audio
Cross-lingual Support
Supports speech synthesis in 17 languages
Emotion and Style Transfer
Transfers the emotion and speaking style of the reference audio through cloning
High-Quality Output
Delivers high-fidelity audio at a 24 kHz sampling rate
Multi-Reference Voice Fusion
Supports blending features from multiple reference voices
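
The 6-second reference clip and multi-reference features above can be sketched in a few lines. This is a minimal sketch, not the model's documented workflow: it assumes the Coqui `TTS` Python package and the model name `tts_models/multilingual/multi-dataset/xtts_v2`, neither of which appears on this page. The helper names `clip_seconds` and `clone_to_file` are hypothetical, and treating 6 seconds as a hard minimum is an assumption made for illustration.

```python
import wave

# Assumption: treat the 6 seconds quoted on this page as a minimum length.
MIN_REFERENCE_SECONDS = 6.0


def clip_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def clone_to_file(text, reference_wavs, language, out_path):
    """Synthesize `text` in the voice of one or more reference clips.

    Assumes the Coqui `TTS` package is installed. The import sits inside
    the function so the duration check works without the package.
    """
    for ref in reference_wavs:
        if clip_seconds(ref) < MIN_REFERENCE_SECONDS:
            raise ValueError(f"{ref} is shorter than {MIN_REFERENCE_SECONDS} s")

    from TTS.api import TTS  # assumption: Coqui TTS Python API

    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    # A list of clips is how several reference voices are supplied here.
    tts.tts_to_file(
        text=text,
        speaker_wav=reference_wavs,
        language=language,
        file_path=out_path,
    )
    return out_path
```

A call such as `clone_to_file("Hello there.", ["ref_a.wav", "ref_b.wav"], "en", "hello.wav")` would then check both clips before loading the model; the file names are placeholders.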

Model Capabilities

Text-to-Speech
Voice Cloning
Cross-lingual Speech Synthesis
Emotion and Style Transfer
Multilingual Support

Use Cases

Speech Synthesis
Personalized Voice Assistants
Creates personalized voices for voice assistants, enabling natural, personalized voice interaction
Multilingual Audio Content Creation
Generates speech in different languages using the same voice, simplifying multilingual content production
Entertainment Applications
Game Character Voiceovers
Quickly generates personalized voices for game characters, reducing voiceover costs
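
The multilingual use case above, one voice reused across several languages, could be organized as a list of synthesis jobs that share a single reference clip. The sketch below only plans the jobs and does not call any model. The function name `plan_jobs`, the file names, and the language codes `en`, `es`, and `fr` are assumptions for illustration and are not taken from this page.

```python
from pathlib import Path


def plan_jobs(text_by_language, reference_wav, out_dir):
    """Build one synthesis job per language, all sharing one reference voice."""
    out_dir = Path(out_dir)
    return [
        {
            "text": text,
            "language": lang,
            "speaker_wav": reference_wav,  # same clip for every language
            "file_path": str(out_dir / f"line_{lang}.wav"),
        }
        for lang, text in sorted(text_by_language.items())
    ]


jobs = plan_jobs(
    {"en": "Welcome back.", "es": "Bienvenido de nuevo.", "fr": "Bon retour."},
    reference_wav="narrator.wav",
    out_dir="out",
)
for job in jobs:
    print(job["language"], "->", job["file_path"])
```

The dictionary keys are chosen to match the keyword arguments of `tts_to_file` in the Coqui `TTS` package, so, assuming that package, each job could be run as `tts.tts_to_file(**job)`.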