🚀 Orpheus Bangla GGUF (16 bit)
This is a proof-of-concept fine-tuned version of the Orpheus 3B TTS model for Bengali language support.
🚀 Quick Start
This model is designed to generate Bengali speech from text. It's suitable for experimenting with TTS systems for Bengali in various scenarios like audiobooks, conversational AI, or speech synthesis tasks.
✨ Features
- It's a fine-tuned version of the Orpheus 3B TTS model for Bengali language support.
- Trained on the SUST-CSE-Speech/banspeech dataset, which contains 955 audio samples from audiobooks.
📦 Installation
The original model card does not provide installation steps.
💻 Usage Examples
The original model card does not provide usage code.
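In the absence of official usage code, one low-risk first step after downloading the model file is to confirm that it really is a GGUF container before handing it to an inference runtime. The sketch below is not from the model authors. It reads only the fixed-size file header and assumes GGUF version 2 or later in little-endian byte order, where the tensor count and metadata key-value count are stored as 64-bit unsigned integers. The function name `read_gguf_header` is hypothetical.

```python
import struct

GGUF_MAGIC = b"GGUF"  # the first four bytes of a GGUF file


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF header.

    Assumption: GGUF v2+ in little-endian layout, i.e. a uint32 version
    followed by two uint64 counts directly after the 4-byte magic.
    """
    with open(path, "rb") as f:
        header = f.read(24)  # 4 (magic) + 4 (version) + 8 + 8 (counts)
    if len(header) < 24 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return version, tensor_count, kv_count
```

Calling `read_gguf_header` on the downloaded model file (the actual file name is not given in the card) should return a small version number and non-zero counts. A `ValueError` suggests a truncated or mislabeled download.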
📚 Documentation
Model Description
This model is a proof-of-concept fine-tuned version of the Orpheus 3B TTS (Text-to-Speech) model for Bengali language support. The model has been trained using the SUST-CSE-Speech/banspeech dataset, which contains 955 audio samples split from audiobooks. This fine-tuning was performed for 10 epochs on a single Google Colab instance equipped with a T4 GPU.
⚠️ Important Note
This model is currently in the proof-of-concept phase and is not recommended for production use.
Intended Use
This model can be used for generating Bengali speech from text. It is ideal for experimenting with TTS systems for Bengali, particularly for audiobooks, conversational AI, or speech synthesis tasks.
Model Training
- Dataset: SUST-CSE-Speech/banspeech (955 audiobook audio samples)
- Training Epochs: 10
- Hardware: Google Colab (single T4 GPU)
- Training Script: A modified Unsloth fine-tuning script was used for the training. The script is available on GitHub: [Orpheus TTS Training Script](https://github.com/asiff00/Training-TTS/blob/main/orpheus/orpheus.ipynb).
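To give a rough sense of how small this training run is, the sketch below estimates the number of optimizer updates implied by 955 samples and 10 epochs. The card does not state the batch size or gradient accumulation setting, so both are hypothetical parameters here, and the helper `optimizer_steps` is illustrative only. It assumes a final partial batch in each epoch still counts as one update; real trainers may round differently.

```python
import math

NUM_SAMPLES = 955  # dataset size stated in the model card
NUM_EPOCHS = 10    # epochs stated in the model card


def optimizer_steps(num_samples, epochs, per_device_batch_size, grad_accum_steps=1):
    """Estimate optimizer updates for a single-GPU run.

    Assumption: batch size and gradient accumulation are NOT given in the
    card; callers must supply their own values.
    """
    effective_batch = per_device_batch_size * grad_accum_steps
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    return steps_per_epoch * epochs


# Hypothetical settings, chosen only for illustration.
steps_bs1 = optimizer_steps(NUM_SAMPLES, NUM_EPOCHS, per_device_batch_size=1)
steps_bs1_acc4 = optimizer_steps(NUM_SAMPLES, NUM_EPOCHS, 1, grad_accum_steps=4)
```

Even at a batch size of 1 with no accumulation, the estimate is under ten thousand updates, which is consistent with the proof-of-concept framing of this model.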
Limitations
- This model was trained on a small dataset and for a limited number of epochs, which may lead to less natural or less accurate speech synthesis.
- Since this is a proof-of-concept model, the synthesis quality may vary based on input text and different conditions. It is not optimized for production environments.
Training Resources
- [TTS Training: Style-TTS2](https://github.com/asiff00/Training-TTS/tree/main/style-tts2)
- [TTS Training: VIT-TTS](https://github.com/asiff00/Training-TTS/tree/main/vit-tts)
- [On-Device Speech-to-Speech Conversational AI](https://github.com/asiff00/On-Device-Speech-to-Speech-Conversational-AI)
- [Bangla Llama](https://github.com/asiff00/Bangla-Llama)
- [Bangla RAG Pipeline, PoRAG](https://github.com/Bangla-RAG/PoRAG)
📄 License
The model is licensed under the apache-2.0 license.
| Property | Details |
|---|---|
| Base Model | canopylabs/orpheus-3b-0.1-pretrained |
| Tags | transformers, llama, gguf, text-to-speech |
| License | apache-2.0 |
| Language | bn |
| Datasets | SUST-CSE-Speech/banspeech |
| Pipeline Tag | text-to-speech |