MarmaSpeakTTS: Text-to-Speech Model for Marma Language
MarmaSpeakTTS is a text-to-speech model designed to synthesize speech in the Marma language, serving the Marma people in Bangladesh and Myanmar.
Quick Start
This model can be used with the Hugging Face Transformers library:
from transformers import VitsModel, AutoTokenizer, pipeline
import scipy.io.wavfile
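# Load the fine-tuned VITS model and its tokenizer from the Hugging Face Hub
model = VitsModel.from_pretrained("CLEAR-Global/marmaspeak-tts-v1")
tokenizer = AutoTokenizer.from_pretrained("CLEAR-Global/marmaspeak-tts-v1")
# Wrap both in a text-to-speech pipeline
synthesizer = pipeline("text-to-speech", model=model, tokenizer=tokenizer)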
text = "แแญแฏแแฑแฌแบ แแฌแแฌ แแฎแแฑแแแบแธแ"
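# Synthesize speech; the pipeline returns the waveform and its sampling rate
output = synthesizer(text)
# Use the sampling rate reported by the pipeline instead of a hard-coded value
scipy.io.wavfile.write("output.wav", rate=output["sampling_rate"], data=output["audio"][0])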
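The Quick Start call writes whatever sample type the pipeline returns, which for floating-point audio yields a float WAV file that some players may not open. As a minimal sketch, assuming the waveform is mono and holds float samples in the range [-1.0, 1.0], the helpers below convert it to 16-bit PCM using only the Python standard library. The names `float_to_pcm16` and `write_pcm16_wav` are hypothetical and not part of the model or of Transformers.

```python
import struct
import wave


def float_to_pcm16(samples):
    """Convert float samples in [-1.0, 1.0] to 16-bit signed integers.

    Values outside the range are clipped rather than wrapped.
    """
    pcm = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))
        pcm.append(int(round(clipped * 32767)))
    return pcm


def write_pcm16_wav(path, samples, sample_rate=16000):
    """Write mono float samples to a 16-bit PCM WAV file (hypothetical helper)."""
    pcm = float_to_pcm16(samples)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)       # assumption: mono output
        wav_file.setsampwidth(2)       # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        # Pack all samples as little-endian signed 16-bit integers
        wav_file.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
```

With the `output` dictionary from the Quick Start, a call could look like `write_pcm16_wav("output.wav", output["audio"][0], output["sampling_rate"])`.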
Features
- Language Support: Provides text-to-speech synthesis for the Marma language (ISO 639-3 code: rmz), a Tibeto-Burman language.
Documentation
Model Details
| Property | Details |
|---|---|
| Base model | Massively Multilingual Speech (MMS) |
| Model Type | Text-to-Speech |
| Language | Marma (rmz) |
| Training Data | The model was trained on Marma language audio recordings collected by CLEAR Global. |
| Training script | https://github.com/translatorswb/finetune-hf-vits-marma |
| License | Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International |
Limitations and Biases
- This is an early version of the model and may have limitations in pronunciation and naturalness.
- The model works best with properly normalized Marma text.
- Performance may vary based on the complexity and length of the input text.
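The model card does not define what "properly normalized" means, so the following is only a sketch of one plausible preprocessing step. It assumes that normalization means Unicode NFC plus whitespace cleanup, and that long inputs can be split at the Myanmar-script sentence mark U+104B so that each chunk is synthesized separately. Both function names are hypothetical.

```python
import re
import unicodedata

# Assumption: Marma text uses Myanmar script, where U+104B ends a sentence.
SENTENCE_END = "\u104b"


def normalize_text(text):
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", text)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text):
    """Split normalized text after each sentence-final mark, keeping the mark."""
    parts = re.split("(?<=%s)" % SENTENCE_END, normalize_text(text))
    return [part.strip() for part in parts if part.strip()]
```

Each string returned by `split_sentences` could then be passed to the synthesizer on its own, which keeps individual inputs short.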
Training
The model was fine-tuned from a Massively Multilingual Speech (MMS) VITS model using the training script at https://github.com/translatorswb/finetune-hf-vits-marma.
Ethical Considerations
This model has been developed with permission and input from Marma language speakers. The voice synthesis should be used responsibly and respectfully.
License
The model is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.
Citation
@misc{marma-tts,
author = {CLEAR Global},
title = {MarmaSpeakTTS: A Text-to-Speech Model for Marma Language},
year = {2025},
howpublished = {https://huggingface.co/CLEAR-Global/marmaspeak-tts-v1}
}