ArtiWaifu Diffusion 2.0 Open-Source Anime Image Generation Model - Supports a Wide Range of Styles and Characters


Developed by Eugeoter

ArtiWaifu Diffusion 2.0 is a high-quality anime-style image generation model fine-tuned based on Stable Diffusion XL, supporting over 9,000 art styles and more than 6,000 anime characters.

Image Generation, English, Open Source License: Other. Tags: #High-fidelity Anime Generation #Multi-character Scene Support #9000+ Art Styles

Downloads 141

Release Time : 8/29/2024

Model Overview

A text-to-image model specifically designed for generating high-quality anime images, excelling in producing highly recognizable styles and character images.

Model Features

Extensive Style and Character Coverage

Supports over 9,000 art styles and more than 6,000 anime characters, capable of generating highly accurate anime images.

High-Quality Aesthetic Expression

Through special training strategies, the generated images exhibit high-quality aesthetic performance, including details, colors, and composition.

Stable Anatomical Structure

Compared to version 1.0, version 2.0 provides more stable anatomical structures when generating characters, reducing deformation issues.

Model Capabilities

Generate high-quality anime images

Support multiple art styles

Generate specific anime characters

Support multi-character scene generation

Use Cases

Anime Art Creation

Character Design

Generate high-quality images of specific anime characters for use in character design and concept art.

Highly accurate character images with rich details and aesthetic performance.

Style Mixing

Generate unique anime images with mixed styles by overlaying multiple style tags.

Innovative images blending multiple art styles.

Multi-character Scene Generation

Dual Character Scenes

Generate interactive scene images featuring two or more characters.

Stable multi-character compositions with natural interactions.

🚀 ArtiWaifu Diffusion 2.0

We have released ArtiWaifu Diffusion 2.0, a model designed to generate aesthetically pleasing, faithfully rendered anime-style illustrations. It is a fine-tune of Stable Diffusion XL that covers over 9,000 art styles and more than 6,000 anime characters, each of which can be invoked through trigger words. As a model specialized for anime, it excels at producing high-quality images, especially those with highly recognizable styles and characters, while maintaining consistently high aesthetic quality.


✨ Features

More art styles and characters compared to ArtiWaifu Diffusion 1.0.
More stable anatomy in the generated images.

📦 Model Details

The AWA Diffusion model is fine-tuned from ArtiWaifu Diffusion 1.0 on a curated dataset of 2.5M high-quality anime images covering a wide range of both popular and niche anime concepts. It employs our most advanced training strategies, so users can easily prompt the model for specific characters or styles while maintaining high image quality and aesthetic expression.

Model Information

Property	Details
Developed by	Euge
Funded by	Neta.art
Model Type	Generative text-to-image model
Finetuned from model	[ArtiWaifu Diffusion 1.0](https://huggingface.co/Eugeoter/artiwaifu-diffusion-1.0)
License	[Fair AI Public License 1.0-SD](https://freedevproject.org/faipl-1.0-sd/)

💻 Usage Guide

This guide (i) introduces the model's recommended usage and prompt-writing strategies as suggestions for generation, and (ii) serves as a reference for model usage, detailing the writing patterns and strategies for trigger words, quality tags, rating tags, style tags, and character tags.

Basic Usage

CFG scale: 5-11
Resolution: Total area (width x height) around 1024x1024, with both width and height multiples of 32 and no lower than 256x256.
Sampling method: Euler A (20+ steps) or DPM++ 2M Karras (~35 steps)

Due to the special training method, AWA's optimal inference step count is higher than regular values. As the inference steps increase, the quality of the generated images can continue to improve...
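The resolution rule above can be sketched as a small helper. This is only an illustration, not part of the model's tooling: the function name and the choice to round each side to the nearest multiple of 32 are assumptions, and "not lower than 256x256" is read here as a per-side minimum of 256.

```python
import math


def awa_resolution(aspect_ratio: float,
                   target_area: int = 1024 * 1024,
                   multiple: int = 32,
                   minimum: int = 256) -> tuple[int, int]:
    """Return (width, height) close to target_area for a width/height ratio.

    Hypothetical helper: both sides are snapped to a multiple of `multiple`
    and are never smaller than `minimum`.
    """
    if aspect_ratio <= 0:
        raise ValueError("aspect_ratio must be positive")

    # Ideal, unrounded sides for the requested area and aspect ratio.
    width = math.sqrt(target_area * aspect_ratio)
    height = width / aspect_ratio

    def snap(value: float) -> int:
        # Round to the nearest multiple, then enforce the minimum side length.
        return max(minimum, round(value / multiple) * multiple)

    return snap(width), snap(height)


print(awa_resolution(1.0))    # (1024, 1024)
print(awa_resolution(2 / 3))  # (832, 1248)
```

The resulting width and height can be passed to whichever inference tool you use; the area stays close to 1024x1024 while the aspect ratio varies.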

⚠️ Important Note

Question: Why not use the standard SDXL resolution?

Answer: Because the bucketing algorithm used in training does not adhere to a fixed set of buckets. Although this does not conform to positional encoding, we have not observed any adverse effects.

Prompting Strategies

All text-to-image diffusion models are notoriously sensitive to prompts, and AWA Diffusion is no exception. A misspelling in the prompt, or even replacing spaces with underscores, can affect the generated results.

AWA Diffusion encourages users to write prompts as tags separated by a comma and a space (", "). Although the model also supports natural language descriptions, or a mix of both, the tag-by-tag format is more stable and user-friendly.

When describing a specific ACG concept, such as a character, style, or scene, we recommend choosing tags from the Danbooru tags and replacing the underscores in them with spaces so that the model accurately understands your needs. For example, bishop_(chess) should be written as bishop (chess). In inference tools such as AUTOMATIC1111 WebUI that use parentheses to weight prompts, all parentheses within tags should be escaped, i.e., bishop \(chess\).
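The conversion described above is mechanical, so it can be sketched in a few lines. The function name and the escape_parentheses flag are illustrative, not part of any tool; the sketch replaces every underscore with a space, as the guideline states.

```python
def danbooru_to_prompt_tag(tag: str, escape_parentheses: bool = False) -> str:
    """Convert a Danbooru tag to prompt form (hypothetical helper).

    Underscores become spaces. With escape_parentheses=True, parentheses are
    backslash-escaped for tools that use them for prompt weighting.
    """
    prompt_tag = tag.strip().replace("_", " ")
    if escape_parentheses:
        prompt_tag = prompt_tag.replace("(", r"\(").replace(")", r"\)")
    return prompt_tag


print(danbooru_to_prompt_tag("bishop_(chess)"))
# bishop (chess)
print(danbooru_to_prompt_tag("bishop_(chess)", escape_parentheses=True))
# bishop \(chess\)
```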

Tag Ordering

Most diffusion models, AWA Diffusion included, understand logically ordered tags better. Tag ordering is not mandatory, but it helps the model understand your needs. Generally, the earlier a tag appears, the greater its impact on generation.

Here's an example of tag ordering: art style (by xxx) -> character (1 frieren (sousou no frieren)) -> race (elf) -> composition (cowboy shot) -> painting style (impasto) -> theme (fantasy theme) -> main environment (in the forest, at day) -> background (gradient background) -> action (sitting on ground) -> expression (expressionless) -> main characteristics (white hair) -> other characteristics (twintails, green eyes, parted lip) -> clothing (wearing a white dress) -> clothing accessories (frills) -> other items (holding a magic wand) -> secondary environment (grass, sunshine) -> aesthetics (beautiful color, detailed) -> quality (best quality) -> secondary description (birds, cloud, butterfly)

Tag order is not set in stone. Flexibility in writing prompts can yield better results. For example, if the effect of a concept (such as style) is too strong and detracts from the aesthetic appeal of the image, you can move it to a later position to reduce its impact.
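One way to keep prompts in the suggested order is to store tags by category and join them at the end. The sketch below is a minimal illustration: the category keys are names invented here to mirror the example ordering, and build_prompt is a hypothetical helper.

```python
# Category order mirroring the example above; the key names are illustrative.
TAG_ORDER = [
    "art_style", "character", "race", "composition", "painting_style",
    "theme", "main_environment", "background", "action", "expression",
    "main_characteristics", "other_characteristics", "clothing",
    "clothing_accessories", "other_items", "secondary_environment",
    "aesthetics", "quality", "secondary_description",
]


def build_prompt(tags_by_category: dict[str, list[str]],
                 order: list[str] = TAG_ORDER) -> str:
    """Join tags with ", " following `order`; reject unknown categories."""
    unknown = set(tags_by_category) - set(order)
    if unknown:
        raise KeyError(f"unknown categories: {sorted(unknown)}")
    parts: list[str] = []
    for category in order:
        parts.extend(tags_by_category.get(category, []))
    return ", ".join(parts)


prompt = build_prompt({
    "quality": ["best quality"],
    "character": ["1 frieren (sousou no frieren)"],
    "art_style": ["by xxx"],  # placeholder style tag, as in the example
    "composition": ["cowboy shot"],
})
print(prompt)
# by xxx, 1 frieren (sousou no frieren), cowboy shot, best quality
```

Passing a custom order list, for instance one with "art_style" moved toward the end, is a simple way to weaken a concept that dominates the image.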

Negative Prompt

Negative prompts are not necessary for AWA Diffusion. If you do use them, more is not better: keep them as concise as possible and easy for the model to recognize, because too many negative words may degrade the results. Here are some recommended scenarios for using negative prompts:

Watermark: signature, logo, artist name;
Quality: worst quality, lowres, ugly, abstract;
Style: real life, 3d, celluloid, sketch, draft;
Human anatomy: deformed hand, fused fingers, extra limbs, extra arms, missing arm, extra legs, missing leg, extra digits, fewer digits.
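Because concise negative prompts are recommended, it can help to include only the scenarios a given image actually needs. The following sketch assumes nothing beyond the tag lists above; the dictionary keys and the build_negative_prompt helper are illustrative names.

```python
# Negative tags grouped by the scenarios listed above.
NEGATIVE_TAGS = {
    "watermark": ["signature", "logo", "artist name"],
    "quality": ["worst quality", "lowres", "ugly", "abstract"],
    "style": ["real life", "3d", "celluloid", "sketch", "draft"],
    "anatomy": [
        "deformed hand", "fused fingers", "extra limbs", "extra arms",
        "missing arm", "extra legs", "missing leg", "extra digits",
        "fewer digits",
    ],
}


def build_negative_prompt(*scenarios: str) -> str:
    """Join the tags of the selected scenarios, skipping duplicates."""
    tags: list[str] = []
    for scenario in scenarios:
        for tag in NEGATIVE_TAGS[scenario]:
            if tag not in tags:
                tags.append(tag)
    return ", ".join(tags)


print(build_negative_prompt("watermark", "quality"))
# signature, logo, artist name, worst quality, lowres, ugly, abstract
```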

Trigger Words

Add trigger words to your prompts to inform the model about the concept you want to generate. Trigger words can include character names, artistic styles, scenes, actions, quality, etc.

Tips for Trigger Words

Typos: The model is very sensitive to the spelling of trigger words. Even a single letter difference can cause a trigger to fail or lead to unexpected results.
Bracket Escaping: When using inference tools that rely on parentheses for weighting prompts, such as AUTOMATIC1111 WebUI, escape the parentheses in trigger words, e.g., 1 lucy (cyberpunk) -> 1 lucy \(cyberpunk\).
Triggering Effect Preview: Search for a tag on Danbooru to preview it and better understand its meaning and usage.

Style Tags

Style tags are divided into two types: Painting Style Tags and Artistic Style Tags. Painting Style Tags describe the painting techniques or media used in the image, such as oil painting, watercolor, flat color, and impasto. Artistic Style Tags represent the artistic style of the artist behind the image.

AWA Diffusion supports the following Painting Style Tags:

Painting style tags available in the Danbooru tags, such as oil painting, watercolor, flat color, etc.;
All painting style tags supported by [AID XL 0.8](https://civitai.com/models/124189/anime-illust-diffusion-xl), such as flat-pasto, etc.;
All style tags supported by [Neta Art XL 2.0](https://huggingface.co/neta-art/neta-xl-2.0), such as gufeng, etc.;

See the [Painting Style Tags List](https://huggingface.co/Eugeoter/artiwaifu-diffusion-1.0/blob/main/references/style.csv) for the full list of painting style tags.

AWA Diffusion supports the following Artistic Style Tags:

Artistic style tags available in the Danbooru tags, such as by yoneyama mai, by wlop, etc.;
All artistic style tags supported by [AID XL 0.8](https://civitai.com/models/124189/anime-illust-diffusion-xl), such as by antifreeze3, by 7thknights, etc.;
Some style tags manually collected from Pixiv, such as by trickortreat, by shiroski, etc.;

See the [Artistic Style Tags List](https://huggingface.co/Eugeoter/artiwaifu-diffusion-2.0/blob/main/references/artist.csv) for the full list of artistic style tags.

The higher the tag count in the tag repository, the more thoroughly the artistic style has been trained, and the higher the fidelity in generation. Typically, artistic style tags with a count higher than 50 yield better generation results.
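A tag list with counts can be filtered to the tags most likely to work well. The sketch below is an assumption-heavy illustration: the column names "tag" and "count" are hypothetical (check the actual header of the CSV file before use), and the sample rows are placeholders, not real entries.

```python
import csv
import io


def well_trained_tags(csv_text: str,
                      min_count: int = 50,
                      tag_column: str = "tag",      # assumed column name
                      count_column: str = "count",  # assumed column name
                      ) -> list[str]:
    """Return tags whose count is strictly higher than min_count."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[tag_column] for row in reader
            if int(row[count_column]) > min_count]


# Placeholder rows for illustration only; not taken from the real list.
SAMPLE = """tag,count
by example artist a,320
by example artist b,50
by example artist c,12
"""

print(well_trained_tags(SAMPLE))
# ['by example artist a']
```

The same helper works for a character list by raising min_count, since the threshold is just a parameter.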

Tips for Style Tags

Intensity Adjustment: You can adjust the intensity of a style by altering the order or weighting of style tags in your prompt. Frontloading a style tag enhances its effect, while placing it later reduces its effect.
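Both adjustments can be sketched as small helpers. The names move_tag and weight_tag are hypothetical, and the (tag:weight) form is the attention syntax used by AUTOMATIC1111 WebUI, which is an assumption about your inference tool; other tools may use a different weighting syntax.

```python
def move_tag(tags: list[str], tag: str, position: int) -> list[str]:
    """Return a new list with `tag` placed at `position`.

    Existing copies of the tag are removed first; if the tag was absent,
    it is simply inserted.
    """
    reordered = [t for t in tags if t != tag]
    reordered.insert(position, tag)
    return reordered


def weight_tag(tag: str, weight: float) -> str:
    """Wrap a tag as (tag:weight), escaping parentheses inside the tag."""
    escaped = tag.replace("(", r"\(").replace(")", r"\)")
    return f"({escaped}:{weight:g})"


tags = ["by wlop", "1 frieren (sousou no frieren)", "cowboy shot", "best quality"]

# Weaken the style by moving it to the end of the prompt.
print(", ".join(move_tag(tags, "by wlop", len(tags) - 1)))
# 1 frieren (sousou no frieren), cowboy shot, best quality, by wlop

# Or keep its position and lower its weight instead.
print(weight_tag("by wlop", 0.8))
# (by wlop:0.8)
```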

⚠️ Important Note

Question: Why include the prefix by in artistic style tags?

Answer: To clearly inform the model that you want to generate a specific artistic style rather than something else, we recommend including the prefix by in artistic style tags. This differentiates by xxx from xxx, especially when xxx itself carries other meanings, such as dino which could represent either a dinosaur or an artist's identifier. Similarly, when triggering characters, add a 1 as a prefix to the character trigger word.

Character Tags

Character tags describe the character IP in the generated image. Using character tags will guide the model to generate the appearance features of the character.

Character tags should be taken from the [Character Tag List](https://huggingface.co/Eugeoter/artiwaifu-diffusion-2.0/blob/main/references/character.csv). To generate a specific character, first find the corresponding trigger word in the tag repository, replace all underscores (_) in it with spaces, and prepend 1 to the character name. For example, 1 ayanami rei triggers the model to generate the character Rei Ayanami from the anime "EVA," corresponding to the Danbooru tag ayanami_rei; 1 asuna (sao) triggers the model to generate the character Asuna from "Sword Art Online," corresponding to the Danbooru tag asuna_(sao).
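The rule for building a character trigger word from a Danbooru tag can be written as a one-line transformation. The function name is illustrative; the two examples reproduce the ones given above.

```python
def character_trigger(danbooru_tag: str) -> str:
    """Build a character trigger word: underscores to spaces, '1 ' prefix."""
    return "1 " + danbooru_tag.strip().replace("_", " ")


print(character_trigger("ayanami_rei"))   # 1 ayanami rei
print(character_trigger("asuna_(sao)"))   # 1 asuna (sao)
```

If your inference tool uses parentheses for weighting, remember to escape them in the result before use.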


The higher the tag count in the tag repository, the more thoroughly the character has been trained, and the higher the fidelity in generation. Typically, character tags with a count higher than 100 yield better generation results.

Tips for Character Tags

Character Costuming: To allow more flexible costuming, character tags do not deliberately steer the model toward the character's official attire. To generate a character in a specific official outfit, describe the attire in the prompt in addition to the trigger word, e.g., "1 lucy (cyberpunk), wearing a white cropped jacket, underneath bodysuit, shorts, thighhighs, hip vent".
Series Annotations: Some character tags include a parenthetical annotation after the character name. The parentheses and the annotation cannot be omitted, e.g., 1 lucy (cyberpunk) cannot be written as 1 lucy. Beyond that, no further annotation is needed; for example, you do NOT need to add the character's series tag after the character tag.
Known Issue 1: Certain characters may be generated with unexpected feature deformations, e.g., 1 asui tsuyu, which triggers the character Tsuyu Asui from "My Hero Academia", may produce an extra black line between the eyes. This happens because the model misinterprets the large round eyes as glasses, so glasses should be included in the negative prompt to avoid the issue.
Known Issue 2: For less popular characters, AWA Diffusion might reproduce the character's features incompletely due to insufficient data or training. In such cases, we recommend extending the character description in your prompt beyond just the character name.

📄 License

This model is licensed under the [Fair AI Public License 1.0-SD](https://freedevproject.org/faipl-1.0-sd/).
