Animagine XL 4.0 Zero: Open-Source Image Generation Model for Free, High-Quality Anime Image Creation

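Animagine XL 4.0 Zero

Developed by cagliostrolab

Animagine XL 4.0 Zero is an anime-themed text-to-image model fine-tuned from Stable Diffusion XL 1.0. It was trained on 8.4 million anime-style images and supports high-quality anime image generation.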

Image generation · English · #anime-style generation #high-resolution images #SDXL fine-tune

Downloads: 798

Release date: 2/13/2025

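Model Overview

This model is built to generate and modify anime-themed images from text prompts, and it is a suitable base for LoRA training and further fine-tuning.

Model Features

Large-scale, high-quality training data

Trained on 8.4 million diverse anime-style images, with a knowledge cutoff of January 7, 2025.

Tag-ordering training method

Uses a tag-ordering method for identity and style training, which gives more precise control.

Optimized prompt structure

Supports structured prompts that include the character, source series, rating, and quality-enhancing tags.

Special tag support

Supports several kinds of special control tags: quality tags, score tags, year tags, and rating tags.

Model Capabilities

Anime-style image generation

High-quality detail rendering

Style control

Character feature preservation

Negative prompt control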

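Use Cases

Anime creation

Anime character generation

Generate images of specific anime characters from text descriptions

High-fidelity, richly detailed character images

Anime scene creation

Generate anime scenes with a specific style and atmosphere

Stylistically consistent scene images

Content creation

Anime illustration

Generate concept art and illustrations for stories or games

Professional-grade anime-style artwork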

🚀 Animagine XL 4.0 Zero

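Animagine XL 4.0 Zero is an anime-themed fine-tuned SDXL model and the latest release in the Animagine XL series. It generates and modifies anime-themed images from text prompts.

🚀 Quick Start

You can use the model in any of the following ways:

Use it in Hugging Face Spaces.
Use it in ComfyUI or Stable Diffusion WebUI.
Call it through the 🧨 diffusers library.

✨ Key Features

Strong anime image generation: trained on a large dataset of anime-style images, it can generate high-quality, diverse anime-themed images.
Usable as a pretrained base model: well suited to LoRA training and further fine-tuning, so it can serve as the starting point for customized models.
Support for several special tags: special tags control aspects of generation such as quality, style, and time period.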

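📦 Installation Guide

1. Install the required libraries

pip install diffusers transformers accelerate safetensors --upgrade

2. Example code

The example below uses the lpw_stable_diffusion_xl pipeline, which handles long, weighted, and detailed prompts better. The model is uploaded in FP16 format, so there is no need to pass variant="fp16" to from_pretrained.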

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "cagliostrolab/animagine-xl-4.0-zero",
    torch_dtype=torch.float16,
    use_safetensors=True,
    custom_pipeline="lpw_stable_diffusion_xl",
    add_watermarker=False
)
pipe.to('cuda')

# Raw string: keeps \( and \) literal without an invalid-escape warning
prompt = r"1girl, arima kana, oshi no ko, hoshimachi suisei, hoshimachi suisei \(1st costume\), cosplay, looking at viewer, smile, outdoors, night, v, masterpiece, high score, great score, absurdres"
negative_prompt = "lowres, bad anatomy, bad hands, text, error, missing finger, extra digits, fewer digits, cropped, worst quality, low quality, low score, bad score, average score, signature, watermark, username, blurry"

image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=832,
    height=1216,
    guidance_scale=6,
    num_inference_steps=25
).images[0]

image.save("./arima_kana.png")

💻 Usage Examples

Basic usage

import torch
from diffusers import StableDiffusionXLPipeline

# Load the model
pipe = StableDiffusionXLPipeline.from_pretrained(
    "cagliostrolab/animagine-xl-4.0-zero",
    torch_dtype=torch.float16,
    use_safetensors=True,
    custom_pipeline="lpw_stable_diffusion_xl",
    add_watermarker=False
)
pipe.to('cuda')

# Set the prompt and the negative prompt
prompt = "1girl, cute, smile, outdoors"
negative_prompt = "lowres, bad anatomy"

# Generate the image
image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=832,
    height=1216,
    guidance_scale=6,
    num_inference_steps=25
).images[0]

# Save the image
image.save("./example.png")

Advanced usage

import torch
from diffusers import StableDiffusionXLPipeline

# Load the model
pipe = StableDiffusionXLPipeline.from_pretrained(
    "cagliostrolab/animagine-xl-4.0-zero",
    torch_dtype=torch.float16,
    use_safetensors=True,
    custom_pipeline="lpw_stable_diffusion_xl",
    add_watermarker=False
)
pipe.to('cuda')

# Set a detailed prompt and negative prompt
# (raw string: keeps \( and \) literal without an invalid-escape warning)
prompt = r"1girl, arima kana, oshi no ko, hoshimachi suisei, hoshimachi suisei \(1st costume\), cosplay, looking at viewer, smile, outdoors, night, v, masterpiece, high score, great score, absurdres, year 2025"
negative_prompt = "lowres, bad anatomy, bad hands, text, error, missing finger, extra digits, fewer digits, cropped, worst quality, low quality, low score, bad score, average score, signature, watermark, username, blurry"

# Adjust the generation parameters (landscape 3:2 output)
width = 1216
height = 832
guidance_scale = 7
num_inference_steps = 28

# Generate the image
image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=width,
    height=height,
    guidance_scale=guidance_scale,
    num_inference_steps=num_inference_steps
).images[0]

# Save the image
image.save("./advanced_example.png")

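📚 Detailed Documentation

Usage Guide

1. Prompt structure

The model was trained with tag-based captions and a tag-ordering method. Use the following structured template:

1girl/1boy/1other, character name, series of origin, rating, other descriptive tags in any order, followed by the quality-enhancing tags

2. Quality-enhancing tags

Append the following tags to the end of the prompt:

masterpiece, high score, great score, absurdres

3. Recommended negative prompt

lowres, bad anatomy, bad hands, text, error, missing finger, extra digits, fewer digits, cropped, worst quality, low quality, low score, bad score, average score, signature, watermark, username, blurry

4. Optimal settings

CFG Scale: 4 - 7 (5 recommended)
Sampling steps: 25 - 28 (28 recommended)
Preferred sampler: Euler Ancestral (Euler a)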
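
The settings above could be applied in diffusers roughly as follows. This is a minimal sketch, not the authors' code: `clamp_settings` and `use_euler_ancestral` are hypothetical helper names, and only the numeric ranges come from the list above. The scheduler swap relies on diffusers' `EulerAncestralDiscreteScheduler.from_config`, imported lazily so the helpers can be defined even where diffusers is not installed.

```python
# Hypothetical helpers; the ranges and defaults mirror the recommended settings.
CFG_RANGE = (4.0, 7.0)
STEP_RANGE = (25, 28)
DEFAULTS = {"guidance_scale": 5.0, "num_inference_steps": 28}


def clamp_settings(guidance_scale=None, num_inference_steps=None):
    """Return generation kwargs kept inside the recommended ranges."""
    cfg = DEFAULTS["guidance_scale"] if guidance_scale is None else guidance_scale
    steps = (
        DEFAULTS["num_inference_steps"]
        if num_inference_steps is None
        else num_inference_steps
    )
    cfg = min(max(float(cfg), CFG_RANGE[0]), CFG_RANGE[1])
    steps = min(max(int(steps), STEP_RANGE[0]), STEP_RANGE[1])
    return {"guidance_scale": cfg, "num_inference_steps": steps}


def use_euler_ancestral(pipe):
    """Swap the pipeline's scheduler for Euler Ancestral, reusing its config."""
    # Imported here so this module loads even where diffusers is absent.
    from diffusers import EulerAncestralDiscreteScheduler

    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(
        pipe.scheduler.config
    )
    return pipe
```

With a loaded pipeline, a call might look like `use_euler_ancestral(pipe)` followed by `pipe(prompt, negative_prompt=negative_prompt, **clamp_settings())`.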

5. Recommended resolutions

Orientation	Dimensions	Aspect ratio
Square	1024 x 1024	1:1
Landscape	1152 x 896	9:7
	1216 x 832	3:2
	1344 x 768	7:4
	1536 x 640	12:5
Portrait	896 x 1152	7:9
	832 x 1216	2:3
	768 x 1344	4:7
	640 x 1536	5:12
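
When a target size is not in the table, one option is to snap to the recommended resolution with the closest aspect ratio. The sketch below is illustrative only: `closest_resolution` is a hypothetical helper, and comparing ratios on a log scale is a design choice made here so that portrait and landscape targets are treated symmetrically.

```python
import math

# Recommended (width, height) pairs copied from the table above.
RECOMMENDED_RESOLUTIONS = [
    (1024, 1024),
    (1152, 896), (1216, 832), (1344, 768), (1536, 640),
    (896, 1152), (832, 1216), (768, 1344), (640, 1536),
]


def closest_resolution(target_width, target_height):
    """Pick the recommended size whose aspect ratio is nearest the target's."""
    if target_width <= 0 or target_height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(target_width / target_height)
    return min(
        RECOMMENDED_RESOLUTIONS,
        key=lambda wh: abs(math.log(wh[0] / wh[1]) - target),
    )


width, height = closest_resolution(1920, 1080)
print(width, height)
```

The returned pair can be passed straight to the pipeline call as `width=` and `height=`.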

6. Example of a final prompt

1girl, firefly \(honkai: star rail\), honkai \(series\), honkai: star rail, safe, casual, solo, looking at viewer, outdoors, smile, reaching towards viewer, night, masterpiece, high score, great score, absurdres
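
The template can also be assembled programmatically. The following is a minimal sketch under stated assumptions: `build_prompt` and its parameter names are hypothetical, the tag order follows the structure described in this guide, and the allowed rating values are the four rating tags the model card lists. Raw strings keep the escaped parentheses intact.

```python
QUALITY_TAGS = ["masterpiece", "high score", "great score", "absurdres"]
RATINGS = {"safe", "sensitive", "nsfw", "explicit"}


def build_prompt(subject, character=None, series=(), rating=None, details=()):
    """Join tags in order: subject, character, series, rating, details, quality."""
    if rating is not None and rating not in RATINGS:
        raise ValueError(f"unknown rating tag: {rating!r}")
    tags = [subject]
    if character:
        tags.append(character)
    tags.extend(series)
    if rating:
        tags.append(rating)
    tags.extend(details)
    tags.extend(QUALITY_TAGS)  # quality-enhancing tags always go last
    return ", ".join(tags)


prompt = build_prompt(
    subject="1girl",
    character=r"firefly \(honkai: star rail\)",
    series=[r"honkai \(series\)", "honkai: star rail"],
    rating="safe",
    details=["casual", "solo", "looking at viewer", "outdoors", "smile",
             "reaching towards viewer", "night"],
)
print(prompt)
```

Called with these arguments, the helper reproduces the example prompt shown above.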

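Special Tags

The model supports a range of special tags that control different aspects of image generation. These tags were carefully weighted and tested to give consistent results across different prompts.

Quality tags

Quality tags are basic controls that directly affect overall image quality and level of detail. The available quality tags are:

masterpiece
best quality
low quality
worst quality

Sample image using the `"masterpiece, best quality"` quality tags with an empty negative prompt.	Sample image using the `"low quality, worst quality"` quality tags with an empty negative prompt.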

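Score tags

Score tags give finer control over image quality than the basic quality tags. In this model they have a stronger influence on steering output quality. The available score tags are:

high score
great score
good score
average score
bad score
low score

Sample image using the `"high score, great score"` score tags with an empty negative prompt.	Sample image using the `"bad score, low score"` score tags with an empty negative prompt.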

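Temporal tags

Temporal tags let you steer the art style toward a specific period or year, which is useful for generating images with the artistic traits of a particular era. The supported year tags are:

year 2005
year {n}
year 2025

Sample image of Hatsune Miku with the `"year 2007"` temporal tag.	Sample image of Hatsune Miku with the `"year 2023"` temporal tag.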
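
A small helper can guard against year tags outside the listed range. This is a sketch with assumptions: it reads `year {n}` as covering every integer year from 2005 through 2025, which the list implies but does not state outright, and `year_tag` and `with_year` are hypothetical names. The tag is appended at the end of the prompt, matching the placement of `year 2025` in the advanced usage example.

```python
MIN_YEAR, MAX_YEAR = 2005, 2025  # assumed inclusive bounds from the tag list


def year_tag(year):
    """Return the temporal tag for a year, e.g. 2007 becomes 'year 2007'."""
    if not MIN_YEAR <= year <= MAX_YEAR:
        raise ValueError(f"year must be between {MIN_YEAR} and {MAX_YEAR}")
    return f"year {year}"


def with_year(prompt, year):
    """Append a temporal tag to a comma-separated prompt."""
    return f"{prompt}, {year_tag(year)}"


print(with_year("1girl, hatsune miku, vocaloid", 2007))
```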

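Rating tags

Rating tags help control the content-safety level of generated images. Use them responsibly and in compliance with applicable laws and platform policies. The supported ratings are: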

safe
sensitive
nsfw
explicit

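🔧 Technical Details

The model was trained on state-of-the-art hardware with optimized hyperparameters. The table below lists the specifications and parameters used during training:

Parameter	Value
Hardware	7 x H100 80GB SXM5
Number of images	8,401,464
UNet learning rate	2.5e-6
Text encoder learning rate	1.25e-6
Scheduler	Constant With Warmup
Warmup steps	5%
Batch size	32
Gradient accumulation steps	2
Training resolution	1024x1024
Optimizer	Adafactor
Input perturbation noise	0.1
Debiased estimation loss	Enabled
Mixed precision	fp16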
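
To make the "Constant With Warmup" row concrete, the sketch below computes the learning rate at a given step. It assumes the warmup is a linear ramp from zero, as in the `constant_with_warmup` schedule provided by transformers and diffusers; the source does not say which trainer was used. `TOTAL_STEPS` is a hypothetical value, since the total step count is not given.

```python
BASE_LR_UNET = 2.5e-6
BASE_LR_TEXT_ENCODER = 1.25e-6
WARMUP_FRACTION = 0.05  # "Warmup steps: 5%" from the table


def constant_with_warmup(step, total_steps, base_lr,
                         warmup_fraction=WARMUP_FRACTION):
    """Linear ramp from 0 to base_lr over the warmup steps, then constant."""
    warmup_steps = max(1, round(total_steps * warmup_fraction))
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr


TOTAL_STEPS = 10_000  # hypothetical; not stated in the source
for step in (0, 250, 500, 5_000):
    unet_lr = constant_with_warmup(step, TOTAL_STEPS, BASE_LR_UNET)
    te_lr = constant_with_warmup(step, TOTAL_STEPS, BASE_LR_TEXT_ENCODER)
    print(step, unet_lr, te_lr)
```

Under these assumptions the UNet rate reaches 2.5e-6 after the first 5% of steps and stays there, and the text encoder follows the same shape at half the rate.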

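📄 License

This model adopts Stability AI's original CreativeML Open RAIL++-M License, with no modifications or additional restrictions. The license terms are identical to those of the original SDXL license, including:

✅ Permitted: commercial use, modification, distribution, private use
❌ Prohibited: illegal activities, generating harmful content, discrimination, exploitation
⚠️ Required: include a copy of the license, state changes, preserve notices
📝 Warranty: provided "as is", without warranty

Refer to the original SDXL license for the complete and authoritative terms and conditions.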

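Acknowledgements

This long-running project would not have succeeded without the groundbreaking work, innovative contributions, and comprehensive documentation of Stability AI, Novel AI, and the Waifu Diffusion Team. We are especially grateful to Main for the seed funding that allowed us to keep advancing the project beyond V2. For this release, we sincerely thank everyone in the community for their continued support, especially: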