🚀 XTTS v2 Fine-Tuned on Hindi Datasets
This is a fine-tuned version of the XTTS v2 model, specifically optimized for generating natural and accurate Hindi speech. It supports features like voice cloning and multilingual speech generation.
🚀 Quick Start
The Colab notebook used to fine-tune the XTTS v2 model on Hindi datasets is available at this Colab Notebook Link. Follow it to replicate the process.
✨ Features
- Languages: Supports 17 languages including Hindi (hi).
- Voice Cloning: Clone voices with just a 6-second audio clip.
- Emotion and Style Transfer: Achieve emotion and style transfer by cloning.
- Cross-Language Voice Cloning: Supports voice cloning across different languages.
- Sampling Rate: 24kHz sampling rate for high-quality audio.
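Before cloning a voice, it can help to confirm that the reference clip is long enough. The following is a minimal sketch using only the Python standard library; the helper names `describe_reference_clip` and `is_long_enough` are hypothetical and not part of the Coqui TTS API, and the 6-second threshold is taken from the feature list above. It assumes an uncompressed PCM WAV file.

```python
import wave

# Values taken from the feature list above.
MIN_REFERENCE_SECONDS = 6.0
MODEL_SAMPLE_RATE = 24_000  # output rate of the model, shown for reference only


def describe_reference_clip(path):
    """Return (duration_in_seconds, sample_rate) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
    return frames / float(rate), rate


def is_long_enough(path, min_seconds=MIN_REFERENCE_SECONDS):
    """Check whether the clip is at least `min_seconds` long."""
    duration, _ = describe_reference_clip(path)
    return duration >= min_seconds
```

A clip that fails this check may still work, but a longer recording is a reasonable first thing to try if the cloned voice sounds off.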
🔧 Technical Details
Updates over XTTS-v1
- New Languages: Added support for Hungarian and Korean.
- Architectural Improvements: Enhanced speaker conditioning and interpolation.
- Stability Improvements: Better overall stability and performance.
- Audio Quality: Improved prosody and audio quality.
Languages
The XTTS-v2 model supports 17 languages including:
- English (en)
- Spanish (es)
- French (fr)
- German (de)
- Italian (it)
- Portuguese (pt)
- Polish (pl)
- Turkish (tr)
- Russian (ru)
- Dutch (nl)
- Czech (cs)
- Arabic (ar)
- Chinese (zh-cn)
- Japanese (ja)
- Hungarian (hu)
- Korean (ko)
- Hindi (hi)
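The language codes listed above are the values passed as the `language` argument at inference time. As a small illustration, the sketch below validates a user-supplied code against that list before synthesis. The `SUPPORTED_LANGUAGES` mapping and the `normalize_language` helper are hypothetical names written for this example, not part of the library.

```python
# Language codes copied from the list above.
SUPPORTED_LANGUAGES = {
    "en": "English", "es": "Spanish", "fr": "French", "de": "German",
    "it": "Italian", "pt": "Portuguese", "pl": "Polish", "tr": "Turkish",
    "ru": "Russian", "nl": "Dutch", "cs": "Czech", "ar": "Arabic",
    "zh-cn": "Chinese", "ja": "Japanese", "hu": "Hungarian",
    "ko": "Korean", "hi": "Hindi",
}


def normalize_language(code):
    """Normalize a language code and verify that it appears in the list above."""
    normalized = code.strip().lower().replace("_", "-")
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(
            f"Unsupported language {code!r}; expected one of "
            f"{sorted(SUPPORTED_LANGUAGES)}"
        )
    return normalized
```

Failing early with a clear message avoids loading the model only to discover that the language code was mistyped.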
Training Data
The model was fine-tuned on the following Hindi datasets:
- Mozilla CommonVoice 18: A diverse dataset of Hindi speech.
- IndicTTS Hindi Dataset: Hindi speech data for text-to-speech training.
💻 Usage Examples
Basic Usage
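from TTS.api import TTS

# Load the XTTS v2 model; gpu=True requires a CUDA-capable GPU.
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2", gpu=True)

# Synthesize speech in the voice of the reference clip and write it to a WAV file.
tts.tts_to_file(
    text="It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.",
    file_path="output.wav",
    speaker_wav="/path/to/target/speaker.wav",  # reference clip of the voice to clone
    language="hi",  # Hindi language code; the text should normally be in the same language
)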
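The snippet above loads the stock XTTS v2 model by name. To run a fine-tuned checkpoint from a local directory, one option is the lower-level `Xtts` class in Coqui TTS. The sketch below assumes the directory contains `config.json`, `model.pth`, and `vocab.json`; these file names are an assumption and may differ from the files published with this model. The wrapper functions `load_finetuned_xtts` and `synthesize_hindi` are hypothetical helpers written for this example.

```python
import os

# Assumed checkpoint layout; adjust to match the files you actually have.
REQUIRED_FILES = ("config.json", "model.pth", "vocab.json")


def load_finetuned_xtts(checkpoint_dir, use_gpu=False):
    """Load an XTTS checkpoint from a local directory (assumed layout above)."""
    for name in REQUIRED_FILES:
        path = os.path.join(checkpoint_dir, name)
        if not os.path.isfile(path):
            raise FileNotFoundError(f"Missing checkpoint file: {path}")

    # Imported lazily so the file check above runs without Coqui TTS installed.
    from TTS.tts.configs.xtts_config import XttsConfig
    from TTS.tts.models.xtts import Xtts

    config = XttsConfig()
    config.load_json(os.path.join(checkpoint_dir, "config.json"))
    model = Xtts.init_from_config(config)
    model.load_checkpoint(config, checkpoint_dir=checkpoint_dir, eval=True)
    if use_gpu:
        model.cuda()
    return model, config


def synthesize_hindi(model, config, text, speaker_wav):
    """Return the generated waveform for Hindi text in the reference voice."""
    outputs = model.synthesize(
        text, config, speaker_wav=speaker_wav, language="hi"
    )
    return outputs["wav"]
```

The returned waveform is at the model's 24kHz output rate and can be written to disk with any audio library that accepts an array of samples.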
📚 Documentation
The [codebase](https://github.com/coqui-ai/TTS) supports both inference and fine-tuning.
Demo Spaces
- XTTS Space: Explore the model's performance on supported languages and try it with your own reference or microphone input.
- [XTTS Voice Chat with Mistral or Zephyr](https://huggingface.co/spaces/coqui/voice-chat-with-mistral): Experience streaming voice chat with Mistral 7B Instruct or Zephyr 7B Beta.
📄 License
This model is licensed under the Coqui Public Model License. Read more about the origin story of CPML here.
📞 Contact
Join our 🐸 Community on Discord and follow us on Twitter. For inquiries, you can also email us at info@coqui.ai.
Information Table
| Property | Details |
|---|---|
| Model Type | XTTS v2 Fine-Tuned on Hindi Datasets |
| Training Data | Mozilla CommonVoice 18, IndicTTS Hindi Dataset |
| Library Name | coqui |
| Pipeline Tag | text-to-speech |
| License | Coqui Public Model License |
| License Link | https://coqui.ai/cpml |