🚀 ⓍTTS
ⓍTTS is a voice generation model that clones voices across languages from just a 6-second audio clip. Built on Tortoise, it adds significant model improvements that make cross-language voice cloning and multilingual speech generation straightforward, without needing many hours of training data. This model powers Coqui Studio and the Coqui API, with optimizations for faster performance and streaming inference.
🚀 Quick Start
The current implementation supports inference and fine-tuning.
✨ Features
- Supports 14 languages.
- Voice cloning with just a 6-second audio clip.
- Emotion and style transfer by cloning.
- Cross-language voice cloning.
- Multilingual speech generation.
- 24 kHz sampling rate.
💻 Usage Examples
Basic Usage
from TTS.api import TTS

# Load the XTTS v1 model (gpu=True runs it on CUDA)
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v1", gpu=True)

# Generate speech by cloning a voice using default settings
tts.tts_to_file(text="It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.",
                file_path="output.wav",
                speaker_wav="/path/to/target/speaker.wav",
                language="en")

# Generate speech by cloning a voice using custom settings
# (this call writes to the same output.wav and overwrites the first result)
tts.tts_to_file(text="It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.",
                file_path="output.wav",
                speaker_wav="/path/to/target/speaker.wav",
                language="en",
                decoder_iterations=30)
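Since cloning relies on a short reference clip passed as `speaker_wav`, it can help to check the clip's length before synthesis. The following is a minimal sketch, not part of 🐸TTS: the helper names `clip_duration_seconds` and `is_long_enough` are hypothetical, the 6-second threshold is only inferred from the feature description, and the code assumes the clip is an uncompressed PCM WAV file.

```python
import wave

# Assumption: threshold taken from the "6-second audio clip" description,
# not from any documented hard requirement of the model.
MIN_REFERENCE_SECONDS = 6.0


def clip_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
    return frames / float(rate)


def is_long_enough(path, min_seconds=MIN_REFERENCE_SECONDS):
    """Return True if the reference clip lasts at least min_seconds."""
    return clip_duration_seconds(path) >= min_seconds
```

A caller could run `is_long_enough("/path/to/target/speaker.wav")` and warn the user before calling `tts.tts_to_file` when it returns `False`.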
Advanced Usage
from TTS.tts.configs.xtts_config import XttsConfig
from TTS.tts.models.xtts import Xtts

# Load the model configuration from disk
config = XttsConfig()
config.load_json("/path/to/xtts/config.json")

# Build the model, load its weights in eval mode, and move it to the GPU
model = Xtts.init_from_config(config)
model.load_checkpoint(config, checkpoint_dir="/path/to/xtts/", eval=True)
model.cuda()

# Synthesize speech conditioned on a reference speaker clip
outputs = model.synthesize(
    "It took me quite a long time to develop a voice and now that I have it I am not going to be silent.",
    config,
    speaker_wav="/data/TTS-public/_refclips/3.wav",
    gpt_cond_len=3,
    language="en",
)
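The snippet above stops at `outputs` without writing audio to disk. As a rough sketch, the helper below writes a mono waveform to a 16-bit WAV file at the model's 24 kHz sampling rate using only the standard library. The helper name `save_wav` is hypothetical, and it is an assumption (not stated in this document) that the waveform is available as float samples in the range [-1.0, 1.0], for example under a key such as `outputs["wav"]`.

```python
import struct
import wave

SAMPLE_RATE = 24000  # 24 kHz, as listed in the feature list


def save_wav(samples, path, sample_rate=SAMPLE_RATE):
    """Write mono float samples in [-1.0, 1.0] as 16-bit PCM WAV."""
    # Clip out-of-range values so they cannot overflow the int16 range.
    clipped = [max(-1.0, min(1.0, float(s))) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    pcm = struct.pack("<%dh" % len(ints), *ints)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
```

Under the assumption above, usage would look like `save_wav(outputs["wav"], "xtts.wav")`; check the actual structure of `outputs` in your installed version before relying on this.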
📚 Documentation
Languages
As of now, XTTS-v1 (v1.1) supports 14 languages: English, Spanish, French, German, Italian, Portuguese, Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, and Japanese.
Stay tuned as we continue to add support for more languages. If you have any language requests, please feel free to reach out!
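When wrapping the model in an application, validating the language code up front gives a clearer error than a failure deep inside synthesis. Below is a small sketch of such a check. Only `en` and `tr` appear as codes in this document; every other code in the table is an assumption based on common ISO 639-1 usage (including `zh-cn` for Chinese) and should be verified against the model's configuration. The names `SUPPORTED_LANGUAGES` and `validate_language` are hypothetical.

```python
# Language codes for the 14 languages listed above.
# Assumption: only "en" and "tr" are confirmed by the examples in this
# document; the remaining codes are guesses to verify against the model.
SUPPORTED_LANGUAGES = {
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pt": "Portuguese",
    "pl": "Polish",
    "tr": "Turkish",
    "ru": "Russian",
    "nl": "Dutch",
    "cs": "Czech",
    "ar": "Arabic",
    "zh-cn": "Chinese",
    "ja": "Japanese",
}


def validate_language(code):
    """Return the normalized language code, or raise ValueError."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        known = ", ".join(sorted(SUPPORTED_LANGUAGES))
        raise ValueError(
            "Unsupported language %r; expected one of: %s" % (code, known)
        )
    return normalized
```

The returned code can then be passed as the `language` argument in the Python examples or as `--language_idx` on the command line.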
Using 🐸TTS Command line
tts --model_name tts_models/multilingual/multi-dataset/xtts_v1 \
    --text "Bugün okula gitmek istemiyorum." \
    --speaker_wav /path/to/target/speaker.wav \
    --language_idx tr \
    --use_cuda true
📄 License
This model is licensed under the Coqui Public Model License (CPML). There's a lot that goes into a license for generative models, and you can read more about the origin story of CPML here.
📞 Contact
Come and join our 🐸Community. We're active on Discord and Twitter. You can also email us at info@coqui.ai.
⚠️ Important Note
The ⓍTTS V2 model is out here: XTTS V2