Animagine XL 4.0 open-source image generation model: trained on a large corpus of anime images for stable, high-quality anime output


Animagine XL 4.0

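Developed by cagliostrolab

Animagine XL 4.0 is an anime-themed text-to-image model fine-tuned from Stable Diffusion XL 1.0. It was trained on 8.4 million anime-style images, which markedly improves output stability and image quality.

Image generation, English. Tags: #anime style generation, #high-resolution optimization, #tag ordering control

Downloads: 70.33k

Release date: 1/10/2025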

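Model Overview

A text-driven diffusion model designed for generating high-quality anime-themed images, with fine-grained control over character traits, art style, and image quality.

Model Features

Improved image quality

Optimization with additional datasets markedly improves output stability, anatomical accuracy, and color rendering.

Tag-ordering training method

Trained with a structured prompt format, so tags can precisely control character traits and art style.

Multi-resolution support

Provides 9 preset resolutions, from square (1024x1024) to ultra-wide (1536x640).

Era style control

Year tags (e.g., 2005 or 2023) select the anime art style of a given period.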

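Model Capabilities

Anime character generation

Stylized image synthesis

Multi-tag conditional control

High-resolution output (up to 1536px on the longer side)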

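Use Cases

Anime creation

Character design

Generate concept art for original anime characters

Precise control over hair color, clothing, expression, and other traits

Fan works

Generate derivative images based on existing anime characters

Supports combining a character with a different style (e.g., Hatsune Miku in a 2023 style)

Content production

Illustration

Quickly generate commercial-grade anime illustrations

Supports a variety of aspect ratios and art styles

Visual concept development

Generate concept art for game and animation projects

Batch-produce character and scene designs in a consistent style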

🚀 Animagine XL 4.0

Animagine XL 4.0 is billed as the ultimate anime-themed fine-tune of SDXL and is the latest release in the Animagine XL series. It generates and modifies anime-style images from text prompts.


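🚀 Quick Start

You can use the model in any of the following ways:

Use it in our Hugging Face Spaces.
Use it in ComfyUI or Stable Diffusion Webui.
Call it through the 🧨 diffusers library.

✨ Key Features

Strong image generation: trained on a large dataset of anime-style images, the model produces high-quality, varied anime-themed images.
Continued refinement: further tuning on additional datasets improved stability, anatomical accuracy, noise reduction, color saturation, and overall color accuracy.
Special tag support: special tags control different aspects of generation, such as quality, style, and era.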

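📦 Installation

1. Install the required libraries

pip install diffusers transformers accelerate safetensors --upgrade

2. Example code

The example below uses the lpw_stable_diffusion_xl pipeline, which handles long, weighted, and detailed prompts better. The model is uploaded in FP16 format, so there is no need to pass variant="fp16" to from_pretrained.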

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "cagliostrolab/animagine-xl-4.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    custom_pipeline="lpw_stable_diffusion_xl",
    add_watermarker=False
)
pipe.to('cuda')

prompt = "1girl, arima kana, oshi no ko, hoshimachi suisei, hoshimachi suisei \\(1st costume\\), cosplay, looking at viewer, smile, outdoors, night, v, masterpiece, high score, great score, absurdres"
negative_prompt = "lowres, bad anatomy, bad hands, text, error, missing finger, extra digits, fewer digits, cropped, worst quality, low quality, low score, bad score, average score, signature, watermark, username, blurry"

image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=832,
    height=1216,
    guidance_scale=5,
    num_inference_steps=28
).images[0]

image.save("./arima_kana.png")

💻 Usage Examples

Basic usage

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "cagliostrolab/animagine-xl-4.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    custom_pipeline="lpw_stable_diffusion_xl",
    add_watermarker=False
)
pipe.to('cuda')

prompt = "1girl, arima kana, oshi no ko, hoshimachi suisei, hoshimachi suisei \\(1st costume\\), cosplay, looking at viewer, smile, outdoors, night, v, masterpiece, high score, great score, absurdres"
negative_prompt = "lowres, bad anatomy, bad hands, text, error, missing finger, extra digits, fewer digits, cropped, worst quality, low quality, low score, bad score, average score, signature, watermark, username, blurry"

image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=832,
    height=1216,
    guidance_scale=5,
    num_inference_steps=28
).images[0]

image.save("./arima_kana.png")

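Advanced usage

For more control, adjust further parameters such as the prompt, negative prompt, image size, guidance scale, and number of inference steps to obtain images of different styles and quality. For example: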

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "cagliostrolab/animagine-xl-4.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    custom_pipeline="lpw_stable_diffusion_xl",
    add_watermarker=False
)
pipe.to('cuda')

# Custom prompt and negative prompt
prompt = "1boy, male focus, green hair, sweater, looking at viewer, upper body, beanie, outdoors, night, turtleneck, masterpiece, high score, great score, absurdres"
negative_prompt = "lowres, bad anatomy, bad hands, text, error, missing finger, extra digits, fewer digits, cropped, worst quality, low quality, low score, bad score, average score, signature, watermark, username, blurry"

# Adjust the image size, guidance scale, and number of inference steps
image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    guidance_scale=6,
    num_inference_steps=30
).images[0]

image.save("./custom_image.png")
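
Since the advanced example varies one setting at a time, a small sweep helper can make such comparisons less repetitive. The following is a minimal sketch, not part of the model card: `sweep_guidance` and `FakePipe` are hypothetical names, and the helper only assumes that `pipe` accepts the same keyword arguments used in the examples above and returns an object with an `images` list.

```python
from types import SimpleNamespace


def sweep_guidance(pipe, prompt, negative_prompt, scales,
                   width=1024, height=1024, steps=28):
    """Call `pipe` once per guidance scale and return {scale: first image}."""
    results = {}
    for scale in scales:
        output = pipe(
            prompt,
            negative_prompt=negative_prompt,
            width=width,
            height=height,
            guidance_scale=scale,
            num_inference_steps=steps,
        )
        results[scale] = output.images[0]
    return results


class FakePipe:
    """Hypothetical stand-in so the sketch runs without a GPU or model download.

    It records the keyword arguments of each call and returns a placeholder
    string where a real pipeline would return a PIL image.
    """

    def __init__(self):
        self.calls = []

    def __call__(self, prompt, **kwargs):
        self.calls.append(kwargs)
        return SimpleNamespace(images=[f"image@cfg={kwargs['guidance_scale']}"])
```

With a real pipeline, `sweep_guidance(pipe, prompt, negative_prompt, [4, 5, 6, 7])` would return one image per scale, which can then be saved and compared side by side.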

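📚 Detailed Documentation

Usage Guide

A summary of the prompting guidelines is available as an image.

1. Prompt structure

The model was trained with tag-based captions and a tag-ordering method. Use the following structured template:

1girl/1boy/1other, character name, series of origin, rating, everything else in any order, ending with the quality enhancement tags

2. Quality enhancement tags

Append the following tags to the end of the prompt: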

masterpiece, high score, great score, absurdres

3. Recommended negative prompt

lowres, bad anatomy, bad hands, text, error, missing finger, extra digits, fewer digits, cropped, worst quality, low quality, low score, bad score, average score, signature, watermark, username, blurry

4. Optimal settings

CFG scale: 4 - 7 (5 recommended)
Sampling steps: 25 - 28 (28 recommended)
Preferred sampler: Euler Ancestral (Euler a)
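
As a rough illustration of how these settings could be applied with diffusers, the sketch below checks values against the recommended ranges and swaps in the Euler Ancestral scheduler. The helper names and the choice to return notes rather than raise errors are assumptions made for this example; `EulerAncestralDiscreteScheduler.from_config` is the usual diffusers idiom for replacing a scheduler, and it is imported lazily so the range check runs without diffusers installed.

```python
CFG_RANGE = (4.0, 7.0)   # recommended CFG scale range from the list above
STEP_RANGE = (25, 28)    # recommended sampling step range from the list above
DEFAULTS = {"guidance_scale": 5.0, "num_inference_steps": 28}


def check_settings(guidance_scale, num_inference_steps):
    """Return notes for values outside the recommended ranges (empty if none)."""
    notes = []
    if not CFG_RANGE[0] <= guidance_scale <= CFG_RANGE[1]:
        notes.append(
            f"guidance_scale {guidance_scale} is outside "
            f"{CFG_RANGE[0]:g}-{CFG_RANGE[1]:g}"
        )
    if not STEP_RANGE[0] <= num_inference_steps <= STEP_RANGE[1]:
        notes.append(
            f"num_inference_steps {num_inference_steps} is outside "
            f"{STEP_RANGE[0]}-{STEP_RANGE[1]}"
        )
    return notes


def use_euler_ancestral(pipe):
    """Replace the pipeline's scheduler with Euler Ancestral, reusing its config."""
    # Imported here so check_settings works even when diffusers is not installed.
    from diffusers import EulerAncestralDiscreteScheduler

    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    return pipe
```

The ranges are recommendations rather than hard limits, which is why the check reports notes instead of rejecting values.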

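5. Recommended resolutions

Orientation	Dimensions	Aspect ratio
Square	1024 x 1024	1:1
Landscape	1152 x 896	9:7
Landscape	1216 x 832	≈3:2
Landscape	1344 x 768	7:4
Landscape	1536 x 640	12:5
Portrait	896 x 1152	7:9
Portrait	832 x 1216	≈2:3
Portrait	768 x 1344	4:7
Portrait	640 x 1536	5:12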
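
When a target canvas does not match one of these presets, one option is to pick the preset with the closest aspect ratio. The sketch below is an illustration under that assumption; `nearest_preset` is a hypothetical helper, and comparing ratios on a log scale is a design choice made here so that landscape and portrait shapes are treated symmetrically.

```python
import math

# The nine recommended (width, height) presets from the table above.
PRESETS = [
    (1024, 1024),
    (1152, 896), (1216, 832), (1344, 768), (1536, 640),
    (896, 1152), (832, 1216), (768, 1344), (640, 1536),
]


def nearest_preset(width, height):
    """Return the preset whose aspect ratio is closest to width:height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)
    # Distance between log-ratios treats 2:1 and 1:2 as equally far from 1:1.
    return min(PRESETS, key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))
```

For example, a 1920 x 1080 target maps to the 1344 x 768 preset, and the returned pair can be passed as `width` and `height` to the pipeline call.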

6. Final prompt structure example

1girl, firefly \(honkai: star rail\), honkai \(series\), honkai: star rail, safe, casual, solo, looking at viewer, outdoors, smile, reaching towards viewer, night, masterpiece, high score, great score, absurdres
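
The template and the example above can be assembled programmatically. The following is a minimal sketch, assuming tags are supplied unescaped; `build_prompt` and `escape_parens` are hypothetical helpers. The prompts in this document escape parentheses with backslashes, presumably because the long-prompt-weighting syntax reserves bare parentheses for emphasis, so the sketch does the same.

```python
QUALITY_TAGS = ["masterpiece", "high score", "great score", "absurdres"]


def escape_parens(tag):
    """Backslash-escape parentheses in a tag (assumes the tag is not yet escaped)."""
    return tag.replace("(", r"\(").replace(")", r"\)")


def build_prompt(subject, character, series, rating, extra_tags):
    """Join tags in the documented order and end with the quality tags.

    subject:    e.g. "1girl", "1boy", or "1other"
    character:  character name tag
    series:     list of series tags
    rating:     rating tag such as "safe"
    extra_tags: remaining tags, in any order
    """
    tags = [subject, escape_parens(character)]
    tags += [escape_parens(s) for s in series]
    tags.append(rating)
    tags += [escape_parens(t) for t in extra_tags]
    tags += QUALITY_TAGS
    return ", ".join(tags)


example = build_prompt(
    "1girl",
    "firefly (honkai: star rail)",
    ["honkai (series)", "honkai: star rail"],
    "safe",
    ["casual", "solo", "looking at viewer", "outdoors", "smile",
     "reaching towards viewer", "night"],
)
```

Here `example` reproduces the final prompt shown above, escaped parentheses included.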

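Special Tags

The model supports a range of special tags that control different aspects of image generation. These tags have been carefully weighted and tested to give consistent results across prompts.

Quality tags

Quality tags are basic controls that directly affect overall image quality and level of detail. The available quality tags are: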

masterpiece
best quality
low quality
worst quality


Example image using the `"masterpiece, best quality"` quality tags with an empty negative prompt.	Example image using the `"low quality, worst quality"` quality tags with an empty negative prompt.

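Score tags

Score tags offer finer control over image quality than the basic quality tags, and they influence this model's output quality more strongly. The available score tags are: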

high score
great score
good score
average score
bad score
low score


Example image using the `"high score, great score"` score tags with an empty negative prompt.	Example image using the `"bad score, low score"` score tags with an empty negative prompt.

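Temporal tags

Temporal tags steer the art style toward a specific period or year, which is useful for generating images with the artistic characteristics of a given era. The supported year tags are: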

year 2005
year {n}
year 2025


Example image of Hatsune Miku with the `"year 2007"` temporal tag.	Example image of Hatsune Miku with the `"year 2023"` temporal tag.
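
To compare eras as in the Hatsune Miku pair, the same prompt can be generated once per year tag. The sketch below is illustrative only: it assumes the tag list above means every year from 2005 to 2025 inclusive, `year_tag` and `prompts_by_year` are hypothetical helpers, and the tags in the usage line are example inputs rather than a prompt taken from the model card.

```python
MIN_YEAR, MAX_YEAR = 2005, 2025  # assumed inclusive bounds, read off the tag list


def year_tag(year):
    """Return the temporal tag for `year`, for example 2007 -> "year 2007"."""
    if isinstance(year, bool) or not isinstance(year, int):
        raise TypeError("year must be an int")
    if not MIN_YEAR <= year <= MAX_YEAR:
        raise ValueError(f"year must be between {MIN_YEAR} and {MAX_YEAR}")
    return f"year {year}"


def prompts_by_year(tags_before, years, tags_after):
    """Build one prompt per year, placing the year tag between two tag lists."""
    return {
        y: ", ".join([*tags_before, year_tag(y), *tags_after])
        for y in years
    }


variants = prompts_by_year(
    ["1girl", "hatsune miku", "vocaloid", "safe"],      # illustrative tags
    [2007, 2023],
    ["masterpiece", "high score", "great score", "absurdres"],
)
```

Each value in `variants` differs only in its year tag, so differences between the resulting images can be attributed to the era setting (given a fixed seed).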

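Rating tags

Rating tags help control the content safety level of generated images. Use them responsibly and in accordance with applicable laws and platform policies. The supported ratings are: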

safe
sensitive
nsfw
explicit
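
The conventions described in this guide (subject tag first, a rating tag, quality enhancement tags last) can be checked mechanically before a prompt is sent to the model. The following is a minimal sketch of such a check; `lint_prompt` is a hypothetical helper, and its hints are suggestions rather than rules the model enforces.

```python
SUBJECTS = {"1girl", "1boy", "1other"}
RATINGS = {"safe", "sensitive", "nsfw", "explicit"}
QUALITY_SUFFIX = ["masterpiece", "high score", "great score", "absurdres"]


def lint_prompt(prompt):
    """Return a list of hints where `prompt` departs from the documented template."""
    tags = [t.strip() for t in prompt.split(",") if t.strip()]
    hints = []
    if not tags or tags[0] not in SUBJECTS:
        hints.append("prompt does not start with 1girl, 1boy, or 1other")
    if not any(t in RATINGS for t in tags):
        hints.append("no rating tag found")
    if tags[-len(QUALITY_SUFFIX):] != QUALITY_SUFFIX:
        hints.append("prompt does not end with the quality enhancement tags")
    return hints
```

An empty list means the prompt follows the template; prompts that omit a rating tag still run, so the hint is informational.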

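🔧 Technical Details

The model was trained on state-of-the-art hardware with tuned hyperparameters, with the aim of maximizing output quality. The technical specifications and parameters used during training are listed below:

Parameter	Value
Hardware	7 x H100 80GB SXM5
Number of images	8,401,464
UNet learning rate	2.5e-6
Text encoder learning rate	1.25e-6
Scheduler	Constant With Warmup
Warmup steps	5%
Batch size	32
Gradient accumulation steps	2
Training resolution	1024x1024
Optimizer	Adafactor
Input perturbation noise	0.1
Debiased estimation loss	Enabled
Mixed precision	fp16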
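
Some back-of-the-envelope arithmetic follows from these values. The sketch below makes two assumptions that the table does not settle: whether the batch size of 32 is per GPU or global (both cases are computed), and that "Constant With Warmup" means a linear ramp from zero followed by a constant rate, which is the usual definition. The number of epochs is not stated, so the total step count is left as a parameter.

```python
import math

NUM_IMAGES = 8_401_464
BATCH_SIZE = 32
GRAD_ACCUM = 2
NUM_GPUS = 7
UNET_LR = 2.5e-6
WARMUP_FRACTION = 0.05  # "Warmup steps: 5%", read here as a fraction of total steps


def effective_batch(batch_size, grad_accum, num_gpus=1):
    """Images consumed per optimizer step."""
    return batch_size * grad_accum * num_gpus


def steps_per_epoch(num_images, eff_batch):
    """Optimizer steps needed to see every image once."""
    return math.ceil(num_images / eff_batch)


def constant_with_warmup(step, base_lr, warmup_steps):
    """Linear ramp from 0 to base_lr over warmup_steps, then constant."""
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr


# Case 1: batch size 32 is global.   Case 2: batch size 32 is per GPU.
global_case = steps_per_epoch(NUM_IMAGES, effective_batch(BATCH_SIZE, GRAD_ACCUM))
per_gpu_case = steps_per_epoch(
    NUM_IMAGES, effective_batch(BATCH_SIZE, GRAD_ACCUM, NUM_GPUS)
)
```

Under the global reading an epoch takes 131,273 optimizer steps at an effective batch of 64; under the per-GPU reading it takes 18,754 steps at an effective batch of 448. For a hypothetical run of `total_steps`, the warmup length would be `int(WARMUP_FRACTION * total_steps)`.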

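📄 License

The model adopts Stability AI's original CreativeML Open RAIL++-M License without any modifications or additional restrictions. The license terms are exactly those of the original SDXL license, including:

✅ Permitted: commercial use, modification, distribution, private use
❌ Prohibited: illegal activities, generation of harmful content, discrimination, exploitation
⚠️ Required: include a copy of the license, state changes, preserve notices
📝 Warranty: provided "as is", without warranty

Refer to the original SDXL license for the complete and authoritative terms and conditions.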

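Acknowledgements

This long-running project would not have been possible without the groundbreaking work, innovative contributions, and comprehensive documentation of Stability AI, Novel AI, and the Waifu Diffusion Team. We are especially grateful to Main for the initial funding that allowed us to progress beyond V2. For this release, we sincerely thank everyone in the community for their continued support, in particular: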