A lightweight model focused on Japanese text-to-speech. Control prompts have been removed to leave room for subsequent fine-tuning, and the core architecture is based on Llama so that large-language-model techniques can be migrated directly.
## Model Features

- **Streamlined parameter design**: Reduces the parameter count by removing the control-prompt layer
- **LLM-compatible architecture**: Built on the Llama architecture, so techniques from large language models carry over easily (see the loading sketch after this list)
- **Audio quality optimization**: Uses OuteAI's efficient audio decoder for speech synthesis
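Because the checkpoint follows a standard Llama layout, it should load through the usual `transformers` auto classes. Below is a minimal loading sketch; the repo id is a placeholder (not given in this card) and should be replaced with the actual Hugging Face repository name.

```python
# Minimal loading sketch -- the repo id below is a placeholder, not confirmed by this card.
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

REPO_ID = "canary-tts-0.5b"  # placeholder: replace with the actual Hugging Face repo id

config = AutoConfig.from_pretrained(REPO_ID)
print(config.model_type)  # expected to report "llama", reflecting the Llama-compatible layout

tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
model = AutoModelForCausalLM.from_pretrained(REPO_ID, torch_dtype="auto")
model.eval()
```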
## Model Capabilities

- Japanese speech synthesis (see the generation sketch after this list)
- Random voice generation
- Fine-tuning to a specified voice
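These capabilities all come down to autoregressive generation of audio tokens from a Japanese text prompt. The sketch below shows one plausible shape of that step, reusing `model` and `tokenizer` from the loading sketch above; the prompt format, token layout, and generation settings are assumptions, not documented behaviour.

```python
import torch

# Assumption: the model takes plain Japanese text and continues it with audio tokens.
# The real prompt template may differ; check the repository's own examples.
text = "こんにちは、今日はいい天気ですね。"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    generated = model.generate(
        **inputs,
        max_new_tokens=1024,  # audio token sequences are long; tune for clip length
        do_sample=True,       # sampling rather than greedy decoding yields a "random voice"
        temperature=0.8,
        top_p=0.95,
    )

# Everything after the text prompt is treated as audio tokens for the audio decoder.
audio_token_ids = generated[0, inputs["input_ids"].shape[1]:]
print(audio_token_ids.shape)
```

Since the control-prompt layer was removed, no speaker conditioning is supplied here: repeated sampled runs produce different voices, and fine-tuning is the mechanism for pinning the output to a specific voice.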
## Use Cases

### Voice interaction
- **Virtual assistant voice**: Provides basic speech synthesis for Japanese virtual assistants. Out-of-the-box audio quality is rough but can be improved through fine-tuning.

### Content creation
- **Audio content generation**: Automatically converts Japanese text into speech. Subsequent fine-tuning is needed for better results (see the fine-tuning sketch after this list).
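Both use cases assume fine-tuning on recordings of the target voice. A minimal sketch of what that could look like, assuming training pairs of Japanese text and pre-extracted audio-token ids; the data format, loop structure, and hyperparameters here are illustrative assumptions, not the author's recipe.

```python
import torch
from torch.optim import AdamW

# Illustrative assumption: each example is the text prompt followed by the target
# speaker's audio tokens, already mapped to ids in the model's vocabulary.
text_ids = tokenizer("おはようございます。", add_special_tokens=False)["input_ids"]
audio_ids = [0, 1, 2]  # stand-in: real audio-token ids produced by the audio tokenizer
examples = [{"text_ids": text_ids, "audio_ids": audio_ids}]

optimizer = AdamW(model.parameters(), lr=1e-5)
model.train()

for epoch in range(3):
    for ex in examples:
        input_ids = torch.tensor([ex["text_ids"] + ex["audio_ids"]])
        # Standard causal-LM objective over the whole sequence; masking the text
        # portion of the labels (setting it to -100) is a common variant.
        labels = input_ids.clone()
        loss = model(input_ids=input_ids, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```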
# 🚀 Canary-TTS-0.5B
Canary-TTS-0.5B is a text-to-speech (TTS) base model trained on top of llm-jp/llm-jp-3-150m-instruct3. Control prompts have been removed to make room for further training, which also reduces the parameter count.
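The card does not spell out how generated tokens become a waveform, only that OuteAI's audio decoder is used. The sketch below shows the general shape of that final step with a deliberately hypothetical `decode_to_waveform` helper; the actual decoder class, token-to-code mapping, and sample rate must be taken from the repository's own code.

```python
import numpy as np
import soundfile as sf

def decode_to_waveform(audio_token_ids) -> np.ndarray:
    """Hypothetical placeholder for OuteAI's audio decoder.

    In practice this step maps the generated token ids back to codec codes and
    runs them through the pretrained neural decoder shipped with the model.
    """
    raise NotImplementedError("replace with the decoder bundled in the repository")

SAMPLE_RATE = 24_000  # assumption: a typical neural-codec rate; verify against the decoder

waveform = decode_to_waveform(audio_token_ids)  # audio_token_ids from the generation sketch above
sf.write("output.wav", waveform, SAMPLE_RATE)
```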
## Disclaimer

- **No warranty of appropriateness**: The creator makes no warranties regarding the accuracy, legality, or appropriateness of results obtained from using this model.
- **User responsibility**: Comply with all applicable laws and regulations when using this model; all responsibility for generated content rests with the user.
- **Creator's disclaimer**: The creator of this repository and model accepts no liability for copyright infringement or other legal issues.
- **Response to deletion requests**: If a copyright issue arises, the problematic resources or data will be deleted promptly.