RouWei-0.7: A Powerful Text-to-Image Model
RouWei-0.7 is a large-scale fine-tune based on Illustrious. It is trained on a carefully selected and balanced dataset of 7M unique pictures and is aimed at high-quality anime image generation.

Quick Start
This model is designed for text-to-image generation, capable of working with both short booru tag-based and long complex natural text prompts. You can achieve the best results by combining tags and natural text phrases.
Features
Key Advantages
- Better Prompt Following: It can accurately understand and follow prompts.
- Great Aesthetic and Stability: Offers excellent aesthetic, anatomy, and stability, along with high versatility.
- Vibrant Colors: Generates images with vibrant colors and smooth gradients without burning effects.
- Full Brightness Range: Maintains a full brightness range even in the epsilon-prediction version.
- Rich Knowledge: Has knowledge of tens of thousands of styles and almost any character.
Improvements over Vanilla Illustrious and NoobAI
- No Watermarks: Eliminates annoying watermarks.
- Better Prompt Segmentation: Avoids tag bleed and improves prompt segmentation.
- No Character Tag Bleed: Character tags no longer cause side effects such as unwanted changes to outfit, style, or composition.
- Better Coherence and Anatomy: Ensures better coherence and anatomy in generated images.
- Accurate Artist Styles: Reproduces artist styles exactly as they should be.
- Stable Styles: Each style, including the base, is stable without random fluctuations on different seeds.
- New Knowledge: Incorporates new knowledge for better generation.
Installation
No specific installation steps are provided in the original README.
Usage Examples
Basic Settings
- Resolution: ~1 megapixel for txt2img, any aspect ratio with both sides a multiple of 64 (e.g., 1024x1024, 1152x, 1216x832,...).
- Sampler: Euler_a.
- CFG: 4..8 for epsilon/3..5 for vpred.
- Steps: 20..28 steps.
- Hires fix: x1.5 latent upscale with denoise 0.6, or any GAN upscaler with denoise 0.3..0.55.
Important Note
The vpred version requires a lower CFG value.
Examples can be found in the repo, with more on Civitai.
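The resolution rule above (about 1 megapixel, both sides a multiple of 64) can be sketched as a small helper that lists candidate sizes. The 10% area tolerance and the 512..2048 side limits are assumptions made for this illustration, not values from the model card.

```python
def candidate_resolutions(target_pixels=1024 * 1024, step=64, tolerance=0.1,
                          min_side=512, max_side=2048):
    """List (width, height) pairs with sides that are multiples of `step`
    and an area within `tolerance` of `target_pixels`.

    `tolerance`, `min_side` and `max_side` are illustrative assumptions.
    """
    sizes = []
    for width in range(min_side, max_side + 1, step):
        for height in range(min_side, max_side + 1, step):
            area = width * height
            if abs(area - target_pixels) <= tolerance * target_pixels:
                sizes.append((width, height))
    return sizes


if __name__ == "__main__":
    sizes = candidate_resolutions()
    print(len(sizes), "candidate sizes, for example:", sizes[:3])
```

Both 1024x1024 and 1216x832 from the list above pass this filter; tightening `tolerance` narrows the set.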
Quality Tags
- Positive:
masterpiece, best quality
- Negative:
low quality, worst quality
Usage Tip
Keep the negative prompt as clean as possible. Spamming popular sequences will not improve results and may lead to unwanted effects.
Artist Styles
The model knows over 35k artist styles. You can find the list and grids with examples on Mega. Use them with the "by " prefix; otherwise, they won't work properly.
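As a rough illustration of how the pieces above fit together, the sketch below assembles a positive prompt from the quality tags, artist names, booru tags, and an optional natural-text phrase. The function name, the ordering of the parts, and the artist name in the usage line are hypothetical; the "by " prefix reflects the requirement stated above.

```python
# Quality tags as listed in the Quality Tags section.
POSITIVE_QUALITY = ("masterpiece", "best quality")
NEGATIVE_PROMPT = "low quality, worst quality"


def build_prompt(tags, artists=(), natural_text="", quality_tags=POSITIVE_QUALITY):
    """Join quality tags, "by <artist>" entries, tags and natural text.

    The ordering used here is an arbitrary choice for this sketch,
    not something the model card prescribes.
    """
    parts = list(quality_tags)
    parts += [f"by {name.strip()}" for name in artists if name.strip()]
    parts += list(tags)
    if natural_text.strip():
        parts.append(natural_text.strip())
    return ", ".join(part.strip() for part in parts if part.strip())


if __name__ == "__main__":
    # "example artist" is a placeholder, not a real entry from the style list.
    print(build_prompt(["1girl", "long hair"], artists=["example artist"]))
    print(NEGATIVE_PROMPT)
```

Keeping the negative prompt to the two listed tags follows the usage tip about not padding it with popular sequences.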
General Styles
2.5d, anime screencap, bold line, sketch, cgi, digital painting, flat colors, smooth shading, minimalistic, ink style, oil style, pastel style
Natural Text
You can use natural text in combination with booru tags. About 2M pictures in the dataset have hybrid natural-text captions. Version 0.7 improves prompt understanding and segmentation. For best performance, keep track of where CLIP's 75-token chunks begin and end.
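One way to keep an eye on the 75-token chunks is to group prompt parts so that no tag or phrase straddles a chunk boundary. The sketch below is only an approximation: `rough_token_count` is a hypothetical stand-in that counts words and punctuation marks, and real CLIP tokenization will often produce different counts, so pass a real tokenizer-based counter as `count_tokens` if you have one.

```python
import re


def rough_token_count(text):
    """Crude stand-in for a CLIP tokenizer: counts words and punctuation marks."""
    return len(re.findall(r"\w+|[^\w\s]", text))


def split_into_chunks(parts, limit=75, count_tokens=rough_token_count):
    """Group prompt parts into chunks whose estimated size stays within `limit`.

    Each part is charged one extra token for the comma that separates it
    from the next part. A single part larger than `limit` is kept whole
    in its own chunk rather than being split.
    """
    chunks, current, used = [], [], 0
    for part in parts:
        cost = count_tokens(part) + 1
        if current and used + cost > limit:
            chunks.append(current)
            current, used = [], 0
        current.append(part)
        used += cost
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    parts = ["1girl", "long hair", "a quiet street at dusk with warm lights"] * 10
    for index, chunk in enumerate(split_into_chunks(parts), start=1):
        print(f"chunk {index}: {len(chunk)} parts")
```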
Brightness/Colors/Contrast
You can use extra meta tags to control these aspects:
low brightness, high brightness, low saturation, high saturation, low gamma, high gamma, sharp colors, soft colors, hdr, sdr
Vpred Version
The vpred version of RouWei-0.7 is now available. It works well out of the box without burning issues. Use a lower CFG (3..5); all other generation settings are the same. Avoid experimental samplers designed to reduce burning, as they may lead to low-quality images.
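The difference between the two variants comes down to the CFG range, which can be captured in a small lookup with a sanity check. This is a minimal sketch: the dictionary layout and the `check_settings` helper are illustrative names, and only the numeric ranges come from the settings listed earlier.

```python
# Ranges taken from the Basic Settings list; the structure is illustrative.
RECOMMENDED = {
    "epsilon": {"cfg": (4.0, 8.0), "steps": (20, 28)},
    "vpred": {"cfg": (3.0, 5.0), "steps": (20, 28)},
}


def check_settings(variant, cfg, steps):
    """Return a list of warnings for values outside the recommended ranges."""
    if variant not in RECOMMENDED:
        raise ValueError(f"unknown variant: {variant!r}")
    limits = RECOMMENDED[variant]
    warnings = []
    cfg_low, cfg_high = limits["cfg"]
    if not cfg_low <= cfg <= cfg_high:
        warnings.append(f"CFG {cfg} is outside {cfg_low}..{cfg_high} for {variant}")
    steps_low, steps_high = limits["steps"]
    if not steps_low <= steps <= steps_high:
        warnings.append(f"{steps} steps is outside {steps_low}..{steps_high}")
    return warnings


if __name__ == "__main__":
    # A CFG that suits epsilon is flagged for vpred.
    print(check_settings("epsilon", 7, 24))
    print(check_settings("vpred", 7, 24))
```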
Documentation
Dataset
The dataset consists of 7M unique pictures (~2M with natural text captions) picked and balanced from 14M pictures of anime art and other media, including private datasets. The dataset cut-off is 20th December 2024. A more detailed description is available on Civitai.
Prompting
The model can work with both short booru tag-based and long complex natural text prompts. For tags, classic danbooru-style comma-separated tags without underscores are used.
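If your tags come straight from a booru site, they usually contain underscores and may be space-separated. A minimal sketch of converting them to the comma-separated, underscore-free form described above follows; the helper name is hypothetical, and tags where the underscore is part of the tag itself (some emoticon tags, for instance) would need an exception list that this sketch omits.

```python
def normalize_tags(raw):
    """Convert booru-style tags to comma-separated tags without underscores.

    Accepts either a comma-separated string or a whitespace-separated one.
    Underscores inside each tag are replaced with spaces.
    """
    pieces = raw.split(",") if "," in raw else raw.split()
    tags = [" ".join(piece.replace("_", " ").split()) for piece in pieces]
    return ", ".join(tag for tag in tags if tag)


if __name__ == "__main__":
    print(normalize_tags("1girl long_hair blue_eyes"))
```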
Technical Details
No specific technical details are provided in the original README.
License
The license is the same as that of Illustrious. Please check the original page for limitations. You are free to use the model in your merges, finetunes, etc., but please leave a link.
Thanks
Thanks to a number of anonymous persons, Bakariso, dga, Fi., ello, K., LOL2024, NeuroSenko, rred, Soviet Cat, Sv1., T., and other fellow brothers for their help.
Donations
- BTC: bc1qwv83ggq8rvv07uk6dv4njs0j3yygj3aax4wg6c
- ETH/USDT(e): 0x04C8a749F49aE8a56CB84cF0C99CD9E92eDB17db
- XMR: 47F7JAyKP8tMBtzwxpoZsUVB8wzg2VrbtDKBice9FAS1FikbHEXXPof4PAb42CQ5ch8p8Hs4RvJuzPHDtaVSdQzD6ZbA5TZ
| Property | Details |
|---|---|
| Model Type | Text-to-Image |
| Training Data | 7M unique pictures (~2M with natural text captions) picked and balanced from 14M of anime art and other media, including private datasets |
| Base Model | Minthy/RouWei-0.6 |
| Library Name | diffusers |
| Pipeline Tag | text-to-image |
| Tags | anime |