Animagine XL 4.0 Zero Open-source Image Generation Model - Free Support for High-quality Anime Image Production

Animagine XL 4.0 Zero

Developed by cagliostrolab

Animagine XL 4.0 Zero is an anime-themed text-to-image model fine-tuned from Stable Diffusion XL 1.0 on 8.4 million anime-style images for high-quality anime image generation.

Image Generation | English | #Anime-style generation #High-resolution images #SDXL fine-tuning

Downloads 798

Release Time: 2/13/2025

Model Overview

This model is specifically designed for generating and modifying anime-themed images based on text prompts, making it an ideal foundation for LoRA training and further fine-tuning.

Model Features

Large-scale high-quality training data

Trained on 8.4 million diverse anime-style images with a knowledge cutoff date of January 7, 2025

Tag ranking training method

Employs a tag ranking approach for identity and style training, providing more precise control

Optimized prompt structure

Supports structured prompt inputs including character, source work, rating, and quality enhancement tags

Special tag support

Supports various special control tags such as quality tags, rating tags, era tags, and classification tags

Model Capabilities

Anime-style image generation

High-quality detail rendering

Style control

Character feature preservation

Negative prompt control

Use Cases

Anime creation

Anime character generation

Generate specific anime character images based on text descriptions

High-fidelity, detailed character images

Anime scene creation

Generate anime scenes with specific styles and atmospheres

Stylistically consistent scene images

Content creation

Anime illustration creation

Generate concept art and illustrations for stories or games

Professional-level anime-style artwork

🚀 Animagine XL 4.0 Zero

The ultimate anime-themed finetuned SDXL model, delivering high-quality anime-style image generation.

🚀 Quick Start

Animagine XL 4.0 Zero is a powerful anime-themed text-to-image model. You can use it in various ways, such as through Hugging Face Spaces, ComfyUI, Stable Diffusion Webui, or with the diffusers library.

✨ Features

Anime Themed: Based on a massive dataset of 8.4M diverse anime-style images.
Finetuned SDXL: Retrained from Stable Diffusion XL 1.0 for better performance.
Pretrained Base Model: Ideal for LoRA training and further finetuning.
Special Tag Support: Allows control of image generation through various special tags.

📦 Installation

🧨 Diffusers Installation

1. Install Required Libraries

pip install diffusers transformers accelerate safetensors --upgrade

2. Example Code

The following example uses the lpw_stable_diffusion_xl pipeline, which can better handle long, weighted, and detailed prompts. The model is already in FP16 format, so there's no need to specify variant="fp16" in the from_pretrained call.

import torch
from diffusers import StableDiffusionXLPipeline

# Load the FP16 weights with the community long-prompt-weighting pipeline.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "cagliostrolab/animagine-xl-4.0-zero",
    torch_dtype=torch.float16,
    use_safetensors=True,
    custom_pipeline="lpw_stable_diffusion_xl",
    add_watermarker=False,
)
pipe.to("cuda")

# Raw string: keeps the backslashes that escape literal parentheses in tags
# without triggering Python's invalid-escape-sequence warning.
prompt = r"1girl, arima kana, oshi no ko, hoshimachi suisei, hoshimachi suisei \(1st costume\), cosplay, looking at viewer, smile, outdoors, night, v, masterpiece, high score, great score, absurdres"
negative_prompt = "lowres, bad anatomy, bad hands, text, error, missing finger, extra digits, fewer digits, cropped, worst quality, low quality, low score, bad score, average score, signature, watermark, username, blurry"

# Portrait resolution (832 x 1216) from the recommended list.
image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=832,
    height=1216,
    guidance_scale=6,
    num_inference_steps=25,
).images[0]

image.save("./arima_kana.png")

💻 Usage Examples

Basic Usage

The model was trained with tag-based captions and the tag-ordering method. Use the following structured template:

1girl/1boy/1other, character name, from which series, rating, everything else in any order and end with quality enhancement

Add quality enhancement tags at the end of your prompt:

masterpiece, high score, great score, absurdres

Recommended negative prompt:

lowres, bad anatomy, bad hands, text, error, missing finger, extra digits, fewer digits, cropped, worst quality, low quality, low score, bad score, average score, signature, watermark, username, blurry
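The template above can be assembled programmatically. The following is a minimal sketch, not part of the model's tooling: the `build_prompt` function and its parameter names are hypothetical, and it only joins tags in the order the template describes (subject, character, series, rating, everything else, quality tags).

```python
# Quality enhancement tags quoted from the model card.
QUALITY_TAGS = ("masterpiece", "high score", "great score", "absurdres")


def build_prompt(subject, character=None, series=None, rating=None,
                 extra_tags=(), quality_tags=QUALITY_TAGS):
    """Join tags in the documented order; empty slots are skipped.

    Hypothetical helper for illustration only.
    """
    parts = [subject, character, series, rating, *extra_tags, *quality_tags]
    # Drop None/blank entries and trim stray whitespace around each tag.
    return ", ".join(p.strip() for p in parts if p and p.strip())


if __name__ == "__main__":
    print(build_prompt("1girl", "arima kana", "oshi no ko", "safe",
                       extra_tags=["looking at viewer", "smile"]))
```

Keeping the quality tags as a default argument means they always land at the end of the prompt, which is where the template asks for them.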

Advanced Usage

Optimal settings:

CFG Scale: 4 - 7 (5 Recommended)
Sampling Steps: 25 - 28 (28 Recommended)
Preferred Sampler: Euler Ancestral (Euler a)
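For diffusers users, the settings above might be applied as in the sketch below. The helper names (`generation_kwargs`, `use_euler_ancestral`) are hypothetical, and rejecting out-of-range values is a design choice of this sketch, not a behavior of the model. The scheduler swap uses `EulerAncestralDiscreteScheduler`, which is the diffusers class generally understood to correspond to "Euler a".

```python
# Ranges and recommended values taken from the settings list above.
CFG_RANGE = (4.0, 7.0)
STEP_RANGE = (25, 28)
RECOMMENDED = {"guidance_scale": 5.0, "num_inference_steps": 28}


def generation_kwargs(guidance_scale=RECOMMENDED["guidance_scale"],
                      num_inference_steps=RECOMMENDED["num_inference_steps"]):
    """Return keyword arguments for the pipeline call (hypothetical helper)."""
    if not CFG_RANGE[0] <= guidance_scale <= CFG_RANGE[1]:
        raise ValueError(f"guidance_scale {guidance_scale} outside {CFG_RANGE}")
    if not STEP_RANGE[0] <= num_inference_steps <= STEP_RANGE[1]:
        raise ValueError(f"num_inference_steps {num_inference_steps} outside {STEP_RANGE}")
    return {"guidance_scale": guidance_scale,
            "num_inference_steps": num_inference_steps}


def use_euler_ancestral(pipe):
    """Swap the pipeline's scheduler for Euler Ancestral, reusing its config."""
    # Imported lazily so the settings helper works without diffusers installed.
    from diffusers import EulerAncestralDiscreteScheduler
    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    return pipe
```

With a loaded pipeline, usage would look like `use_euler_ancestral(pipe)` followed by `pipe(prompt, **generation_kwargs())`.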

Recommended resolutions:

Orientation	Dimensions	Aspect Ratio
Square	1024 x 1024	1:1
Landscape	1152 x 896	9:7
Landscape	1216 x 832	3:2 (approx.)
Landscape	1344 x 768	7:4
Landscape	1536 x 640	12:5
Portrait	896 x 1152	7:9
Portrait	832 x 1216	2:3 (approx.)
Portrait	768 x 1344	4:7
Portrait	640 x 1536	5:12
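When a target image size is not in the table, one option is to snap to the recommended resolution with the nearest aspect ratio. The sketch below is an illustration only: `closest_resolution` is a hypothetical helper, and comparing ratios on a log scale is a choice made here so that landscape and portrait shapes are treated symmetrically.

```python
import math

# (width, height) pairs copied from the recommended resolutions table.
RECOMMENDED_RESOLUTIONS = [
    (1024, 1024),
    (1152, 896), (1216, 832), (1344, 768), (1536, 640),
    (896, 1152), (832, 1216), (768, 1344), (640, 1536),
]


def closest_resolution(width, height):
    """Return the recommended (width, height) closest in aspect ratio."""
    target = math.log(width / height)
    return min(RECOMMENDED_RESOLUTIONS,
               key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))


if __name__ == "__main__":
    # A 16:9 request maps onto the 7:4 landscape entry.
    print(closest_resolution(1920, 1080))
```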

Final prompt structure example:

1girl, firefly \(honkai: star rail\), honkai \(series\), honkai: star rail, safe, casual, solo, looking at viewer, outdoors, smile, reaching towards viewer, night, masterpiece, high score, great score, absurdres
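The example above writes literal parentheses in tags as `\(` and `\)`, presumably because prompt-weighting syntax treats bare parentheses as emphasis markers. A small sketch of that escaping step follows; `escape_parens` is a hypothetical helper, not part of any library.

```python
import re


def escape_parens(tag: str) -> str:
    """Backslash-escape parentheses that are not already escaped."""
    # The lookbehind skips parentheses already preceded by a backslash,
    # so applying the function twice leaves the tag unchanged.
    return re.sub(r"(?<!\\)([()])", r"\\\1", tag)


if __name__ == "__main__":
    tags = ["1girl", "firefly (honkai: star rail)", "honkai (series)"]
    print(", ".join(escape_parens(t) for t in tags))
```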

📚 Documentation

Special Tags

The model supports various special tags to control different aspects of image generation:

Quality Tags: masterpiece, best quality, low quality, worst quality
Score Tags: high score, great score, good score, average score, bad score, low score
Temporal Tags: year 2005, year {n}, year 2025
Rating Tags: safe, sensitive, nsfw, explicit
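The special tags can be validated before they go into a prompt. The sketch below assumes that `year {n}` accepts any year from 2005 through 2025 inclusive, which is an interpretation of the list above and not a documented guarantee; `year_tag` and `rating_tag` are hypothetical helpers.

```python
# Rating tags quoted from the list above.
RATING_TAGS = ("safe", "sensitive", "nsfw", "explicit")
# Assumed inclusive range, inferred from "year 2005, year {n}, year 2025".
YEAR_RANGE = (2005, 2025)


def year_tag(year: int) -> str:
    """Format a temporal tag, rejecting years outside the assumed range."""
    low, high = YEAR_RANGE
    if not low <= year <= high:
        raise ValueError(f"year must be between {low} and {high}, got {year}")
    return f"year {year}"


def rating_tag(rating: str) -> str:
    """Normalize a rating tag and check that it is one of the listed values."""
    tag = rating.strip().lower()
    if tag not in RATING_TAGS:
        raise ValueError(f"unknown rating tag: {rating!r}")
    return tag
```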

Training Information

Parameter	Value
Hardware	7 x H100 80GB SXM5
Num Images	8,401,464
UNet Learning Rate	2.5e-6
Text Encoder Learning Rate	1.25e-6
Scheduler	Constant With Warmup
Warmup Steps	5%
Batch Size	32
Gradient Accumulation Steps	2
Training Resolution	1024x1024
Optimizer	Adafactor
Input Perturbation Noise	0.1
Debiased Estimation Loss	Enabled
Mixed Precision	fp16

🔧 Technical Details

The model was retrained from Stable Diffusion XL 1.0 with a massive dataset of 8.4M diverse anime-style images. It was trained using state-of-the-art hardware and optimized hyperparameters for approximately 2650 GPU hours.
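As a rough back-of-the-envelope check, the figures above can be combined as follows. This sketch rests on assumptions the source does not confirm: that all 7 GPUs ran concurrently for the whole run, and that the listed batch size of 32 is either per GPU or global (both cases are computed).

```python
import math

# Figures quoted from the training table and the paragraph above.
NUM_IMAGES = 8_401_464
NUM_GPUS = 7
BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 2
GPU_HOURS = 2650

# Assumption: all GPUs were busy for the full run.
wall_clock_hours = GPU_HOURS / NUM_GPUS
wall_clock_days = wall_clock_hours / 24

# The table does not say whether the batch size is per GPU or global.
effective_batch_if_global = BATCH_SIZE * GRAD_ACCUM_STEPS
effective_batch_if_per_gpu = BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_GPUS


def steps_per_epoch(effective_batch: int) -> int:
    """Optimizer steps needed to see every image once."""
    return math.ceil(NUM_IMAGES / effective_batch)


if __name__ == "__main__":
    print(f"wall clock: {wall_clock_hours:.1f} h ({wall_clock_days:.1f} days)")
    print("steps/epoch if batch is global: ", steps_per_epoch(effective_batch_if_global))
    print("steps/epoch if batch is per GPU:", steps_per_epoch(effective_batch_if_per_gpu))
```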

📄 License

This model adopts the original CreativeML Open RAIL++-M License from Stability AI.

✅ Permitted: Commercial use, modifications, distributions, private use
❌ Prohibited: Illegal activities, harmful content generation, discrimination, exploitation
⚠️ Requirements: Include license copy, state changes, preserve notices
📝 Warranty: Provided "AS IS" without warranties

Acknowledgement

This project is made possible thanks to the contributions of Stability AI, Novel AI, and Waifu Diffusion Team. We're also grateful for the kickstarter grant from Main and the support from the community. Special thanks to:

Moescape AI: Our collaboration partner in model distribution and testing
Lesser Rabbit: For providing computing and research grants
Kohya SS: For developing the open-source training framework
discus0434: For creating the Aesthetic Predictor 2.5
Early testers: For providing feedback and quality assurance

Contributors

Model

Gradio

Damar Jati

Relations, finance, and quality assurance

Data

Fundraising

We've introduced new fundraising methods through GitHub Sponsors. You can support us in the following ways:

Donate: Contribute via ETH, USDT, or USDC to 0xd8A1dA94BA7E6feCe8CfEacc1327f498fCcBFC0C or sponsor us on GitHub.
Share: Spread the word about our models.
Feedback: Let us know how we can improve.

Why do we use Cryptocurrency?

Our PayPal account was banned when we used Ko-fi and PayPal for fundraising. To ensure transparency, we've switched to cryptocurrency.

Want to Donate in Non-Crypto Currency?

If you prefer a non-crypto donation, contact us via our Discord server or GitHub Sponsors.

Join Our Discord Server

Join our Discord server: https://discord.gg/cqh9tZgbGc

Limitations

Prompt Format: Limited to tag-based text prompts.
Anatomy: May struggle with complex anatomical details.
Text Generation: Text rendering in images is not supported.
New Characters: Recent characters may have lower accuracy.
Multiple Characters: Scenes with multiple characters need careful prompt engineering.
Resolution: Higher resolutions may show degradation.
Style Consistency: May require specific style tags.
