German text-to-speech model supporting four speakers (Lena, Anna, Max, Tom), combining professional recordings with synthetic data
Model Features
Multi-Speaker Support
Offers four distinct German speakers with different styles (Lena/Anna/Max/Tom)
High-Quality Data Fusion
Combines original recordings (captured with a studio microphone and Mimic Studio) with curated synthetic data, totaling over 18 hours of training
Expressiveness Control
Adjust output stability and creativity via the temperature parameter (a value around 0.5 is recommended)
Model Capabilities
German Text-to-Speech
Multi-Speaker Voice Synthesis
Voice Expressiveness Adjustment
Use Cases
Voice Interaction
German Voice Assistant
Provides natural voice interaction for German users
Generates clear and natural response voices
Content Creation
Audio Content Production
Quickly generates German podcast/audiobook voiceovers
Supports different speaker style selections
🚀 SauerkrautTTS-Preview-0.1
SauerkrautTTS-Preview-0.1 is a fine-tuned Text-to-Speech (TTS) model. It's based on the powerful canopylabs/orpheus-3b-0.1-ft. This model offers four distinct German-speaking voices, enabling clear and natural speech outputs.
🚀 Quick Start
For seamless inference and practical examples, check out our detailed instructions and ready-to-use scripts available on:
This preview model introduces four distinct German-speaking voices: Lena, Anna, Max, and Tom.
These voices are crafted using original audio recordings captured with a Rhode Studio microphone and Mimic Studio, alongside carefully curated synthetic data.
The high-quality and well-curated dataset enables the model to produce clear and natural speech outputs, even in this initial release.
The synthetic audio data enriches the model, resulting in versatile and expressive voice capabilities.
Example Output and Comparison
💻 Usage Examples
Basic Usage
temperature = 0.5  # Adjust lower for clearer output, higher for creativity
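The temperature setting above can be bundled with a speaker choice before it is handed to an inference script. The following is only a minimal sketch: `SynthesisRequest` and its `prompt()` method are hypothetical helpers, not part of this model or any library, and the `"<speaker>: <text>"` prefix format is an assumption that should be checked against the official scripts.

```python
from dataclasses import dataclass

# The four voices named in this model card.
SPEAKERS = ("Lena", "Anna", "Max", "Tom")


@dataclass(frozen=True)
class SynthesisRequest:
    """Hypothetical container for one TTS request (not a library class)."""

    text: str
    speaker: str = "Lena"
    temperature: float = 0.5  # value recommended in this card

    def __post_init__(self):
        # Reject inputs that the model card gives no basis for.
        if self.speaker not in SPEAKERS:
            raise ValueError(f"unknown speaker {self.speaker!r}; choose from {SPEAKERS}")
        if self.temperature <= 0:
            raise ValueError("temperature must be positive")
        if not self.text.strip():
            raise ValueError("text must not be empty")

    def prompt(self) -> str:
        # ASSUMPTION: the voice is selected with a "<speaker>: <text>" prefix.
        # Verify the real prompt format in the official inference scripts.
        return f"{self.speaker}: {self.text.strip()}"


request = SynthesisRequest("Guten Morgen, wie geht es dir?", speaker="Anna")
print(request.prompt())
```

Validating the speaker name up front means a typo fails early with a clear error instead of producing audio in an unintended voice.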
Advanced Usage
To achieve optimal results, we recommend using a lower temperature for clear and stable outputs. Higher temperatures will enhance dynamism and expressiveness but might introduce instability.
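To see why lower temperatures give more stable output, it helps to look at how temperature reshapes a probability distribution over candidate tokens. The sketch below assumes the model samples with standard temperature scaling (logits divided by the temperature before the softmax), which is typical for language-model-based TTS but is not confirmed by this card. The logits are made-up numbers, not values from the model.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate tokens; not taken from the model.
logits = [2.0, 1.0, 0.5]

for t in (0.3, 0.5, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature drops, more probability mass concentrates on the top-scoring token, so sampling becomes more predictable. Raising it flattens the distribution, which allows more varied but less stable choices.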
🔮 Future Plans
This model represents our first exploratory step into advanced German-language TTS. Expect significant improvements in upcoming versions, including:
Enhanced voice clarity
Expanded speaker diversity
Greater stability across temperature ranges
Stay tuned for future releases and updates!
📄 License
SauerkrautTTS-Preview-0.1 is openly available under the [CC BY-NC 4.0 License](https://creativecommons.org/licenses/by-nc/4.0/), encouraging reuse, remixing, and improvements by the community.
🙏 Acknowledgments
We thank Unsloth for their invaluable training script, which we utilized in a lightly modified form for training this model.