RouWei-0.6 Open-source Text-to-Image Model - Free Deployment to Generate Beautiful Anime-style Images

RouWei-0.6

Developed by Minthy

RouWei-0.6 is a text-to-image model fine-tuned extensively on Illustrious-xl-early-release-v0, specializing in anime-style image generation with exceptional prompt-following capability and aesthetic performance.

Image Generation, English. Tags: High-precision anime generation, Multi-style compatibility, Natural language understanding

Downloads 36

Release Time: 12/9/2024

Model Overview

Trained on 4.5 million carefully selected images (including 800,000 with natural text descriptions), the model optimizes prompt adherence, color representation, and style stability for high-quality anime-style image generation.

Model Features

Precise Prompt Following

The optimized model more accurately understands and executes complex prompts.

Aesthetic Performance

Excellent color representation, smooth gradients, and stable human anatomy.

Style Diversity

Mastery of tens of thousands of artist styles and generic styles with stable performance.

No Distracting Elements

Eliminates watermarks and tag infiltration issues for cleaner outputs.

Model Capabilities

Anime-style image generation

Artist style imitation

Natural text understanding

High-quality image rendering

Use Cases

Art Creation

Anime Character Design

Generate anime characters in various styles based on text descriptions.

High-quality, style-consistent anime character images.

Scene Creation

Generate complex anime scenes and backgrounds.

Well-composed and color-rich scene images.

Concept Design

Rapid Prototyping

Quickly generate concept art for game or animation projects.

Diverse design options and style variants.

🚀 Minthy/RouWei-0.6

This project is a large-scale fine-tuned version of the Illustrious model, leveraging state-of-the-art techniques to achieve excellent text-to-image generation performance.

🚀 Quick Start

This model is a large-scale fine-tuned version of Illustrious, using state-of-the-art techniques. A dataset of 4.5M pictures (0.8M with natural text captions) was selected and balanced from 12M anime art and other media, including private datasets. A more detailed description is available on Civitai.

✨ Features

Key Advantages

Better Prompt Following: The model can more accurately follow user prompts.
Great Aesthetics, Anatomy, Stability and Versatility: It offers high-quality, well-structured, and diverse outputs.
Vibrant Colors and Smooth Gradients: Produces vivid colors and smooth gradients without burning effects.
Full Brightness Range: Maintains a full brightness range even in the epsilon-prediction version.
Rich Knowledge: Has knowledge of tens of thousands of styles and almost any character.

Comparison with Vanilla Illustrious and NoobAI

No Annoying Watermarks: Eliminates the problem of watermarks.
Better Prompt Segmentation: Avoids tag bleed and improves prompt segmentation.
No Character Tag Bleed: Character tags no longer cause side effects such as unwanted outfit, style, and composition changes.
Better Coherence and Anatomy: Ensures better overall coherence and anatomical accuracy.
Accurate Artist Styles: Reproduces artist styles as they should be.
Stable Styles: Each style, including the base style, is stable without random fluctuations on different seeds.
New Knowledge: Incorporates new knowledge.

Features and Prompting

The model is designed to work with both short booru tag-based prompts and long, complex natural-text prompts. The best results come from combining tags with natural-text phrases. Tags follow the classic Danbooru style: comma-separated, without underscores.
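As an illustration of the tag format described above, here is a minimal sketch that converts raw booru tags into a comma-separated, underscore-free string. The function name `normalize_tags` is hypothetical and not part of any tool shipped with the model, and dropping duplicate tags is an extra assumption made here for tidiness.

```python
def normalize_tags(raw_tags):
    """Join booru tags into a comma-separated string without underscores.

    Hypothetical helper for illustration only. Note that it replaces every
    underscore, so emoticon-style tags that contain underscores would need
    special handling.
    """
    cleaned = []
    for tag in raw_tags:
        tag = tag.strip().replace("_", " ")
        # Skip empty entries and duplicates (an assumption, not a model requirement).
        if tag and tag not in cleaned:
            cleaned.append(tag)
    return ", ".join(cleaned)


print(normalize_tags(["1girl", "long_hair", "looking_at_viewer", "long_hair"]))
# prints: 1girl, long hair, looking at viewer
```

The resulting string can be pasted directly into a prompt field or combined with natural-text phrases.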

Basic Settings

Image Resolution: about 1 megapixel for txt2img; any aspect ratio works as long as both dimensions are multiples of 64 (e.g., 1024x1024, 1152x, 1216x832, ...).
Sampler: Euler_a.
CFG: 4..8 for epsilon, 3..5 for vpred.
Steps: 20..28. LCM/PCM are untested; CFG++ samplers work fine.
Hires fix: x1.5 latent upscale with denoise 0.6, or any GAN upscaler with denoise 0.3..0.55.
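The resolution rule above (roughly 1 megapixel, dimensions in multiples of 64) can be sketched as a small calculation. This is only one possible way to pick a size: the function name `pick_resolution` and the choice to round each side to the nearest multiple of 64 are assumptions made for illustration, not something the model requires.

```python
import math


def pick_resolution(aspect_ratio, target_pixels=1024 * 1024, multiple=64):
    """Return (width, height) near target_pixels, each a multiple of `multiple`.

    Hypothetical helper: aspect_ratio is width divided by height.
    """
    ideal_width = math.sqrt(target_pixels * aspect_ratio)
    ideal_height = ideal_width / aspect_ratio
    # Round each side to the nearest multiple, never going below one multiple.
    width = max(multiple, round(ideal_width / multiple) * multiple)
    height = max(multiple, round(ideal_height / multiple) * multiple)
    return width, height


print(pick_resolution(1.0))         # prints: (1024, 1024)
print(pick_resolution(1216 / 832))  # prints: (1216, 832)
```

Because each side is rounded independently, the resulting aspect ratio may differ slightly from the requested one, and the pixel count only approximates 1 megapixel.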

⚠️ Important Note

Please note that the vpred version requires a lower CFG value.

Examples can be found in the image folder in the repo.

Quality Tags

There are only 4 quality tags:

Positive: masterpiece, best quality
Negative: low quality, worst quality

Meta tags like lowres have been removed, so do not use them. Low-resolution images were either removed or upscaled and cleaned with DAT, depending on their importance.

Negative Prompt

worst quality, low quality, watermark

💡 Usage Tip

For best results, keep the negative prompt as clean as possible. Spamming popular tag sequences will not improve results: the flaws they target have already been fixed, so they only introduce unwanted effects, biases, and poorer quality.

Artist Styles

The model knows over 22k artist styles. A list and grids with examples are available on Mega. Prefix each artist name with "by "; styles will not work properly without it.

The 0.6.1 vpred version also has the following styles: by nyalia, by flooxyfloox, by koni, by truck-kun, by 748cm, by galawave, by aruhshura, by kyomu, by youlichu, by alens, by chlenix, by cleandongye, by fltccktl, by merratatustle, by xi410, by youmuanon, by memento mori

General Styles

2.5d, anime screencap, bold line, sketch, cgi, digital painting, flat colors, smooth shading, minimalistic, ink style, oil style, pastel style

Natural Text

Natural text works well in combination with booru tags. Put styles and quality tags first, then the natural text. Prompts made only of booru tags also work. The dataset contains over 800k pictures with hybrid natural-text captions made by Opus-Vision, GPT-4o, and ToriiGate.
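The ordering described above (styles and quality tags first, natural text last) can be sketched as a small prompt builder. The function `build_prompt` and its parameters are hypothetical, and placing booru tags between the quality tags and the natural text is an assumption; the source only states that styles and quality tags come first.

```python
QUALITY_TAGS = ["masterpiece", "best quality"]
NEGATIVE_PROMPT = "worst quality, low quality, watermark"


def build_prompt(artists=(), styles=(), tags=(), natural_text=""):
    """Assemble a prompt: artist styles, general styles, quality tags, tags, text.

    Hypothetical helper for illustration; the position of booru tags is an assumption.
    """
    parts = []
    for name in artists:
        name = name.strip()
        if not name:
            continue
        # Artist styles need the "by " prefix to work properly.
        parts.append(name if name.startswith("by ") else f"by {name}")
    parts.extend(styles)
    parts.extend(QUALITY_TAGS)
    parts.extend(tags)
    if natural_text.strip():
        parts.append(natural_text.strip())
    return ", ".join(part for part in parts if part)


prompt = build_prompt(
    artists=["kyomu"],  # listed among the 0.6.1 vpred styles
    styles=["flat colors"],
    tags=["1girl", "long hair"],
    natural_text="A girl reads a book under a large tree.",
)
print(prompt)
print(NEGATIVE_PROMPT)
```

Keeping the negative prompt as a single short constant follows the advice to leave it as clean as possible.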

Brightness/Colors/Contrast

You can use extra meta tags to control brightness, colors, and contrast: low brightness, high brightness, low gamma, high gamma, sharp colors, soft colors, hdr, sdr, limited range

Vpred Version

The vpred version has index 0.6.1 because it was retrained from the base to fix observed flaws and now works flawlessly. To use it, you need the latest dev build of A1111, ComfyUI, or reForge. Remember to lower your CFG to 3..5, as higher values will lead to oversaturation.
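To make the CFG guidance concrete, the following sketch clamps a requested CFG value into the recommended range for each version. The names `CFG_RANGES` and `clamp_cfg` are hypothetical; only the numeric ranges (4..8 for epsilon, 3..5 for vpred) come from this document.

```python
# Recommended CFG ranges taken from the settings in this document.
CFG_RANGES = {
    "epsilon": (4.0, 8.0),
    "vpred": (3.0, 5.0),
}


def clamp_cfg(cfg, prediction_type):
    """Clamp a CFG value into the recommended range for the given version.

    Hypothetical helper: prediction_type must be "epsilon" or "vpred".
    """
    if prediction_type not in CFG_RANGES:
        raise ValueError(f"unknown prediction type: {prediction_type!r}")
    low, high = CFG_RANGES[prediction_type]
    return min(max(cfg, low), high)


print(clamp_cfg(7.0, "epsilon"))  # prints: 7.0
print(clamp_cfg(7.0, "vpred"))    # prints: 5.0
```

A value that suits the epsilon version, such as 7, is pulled down to 5 for vpred, which matches the warning about oversaturation at higher values.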

📚 Documentation

![image](https://huggingface.co/Minthy/RouWei-0.6/resolve/main/images/alltogether.jpg)

📄 License

The license is the same as for Illustrious. Please check the original page for limitations. You are free to use the model in your merges, finetunes, etc., but please leave a link.

📦 Donations

BTC: bc1qwv83ggq8rvv07uk6dv4njs0j3yygj3aax4wg6c
ETH/USDT(e): 0x04C8a749F49aE8a56CB84cF0C99CD9E92eDB17db

🌐 Discord Server

join

⚠️ Safety

The model tends to generate NSFW images for corresponding prompts. Consider adding extra filtering. Outputs may be inaccurate and provocative and must not be used as a reference.
