anime-painter open-source model - generates very high-quality images from anime sketches, with lines of any type and width

Anime Painter

Developed by xinsir

A model based on controlnet-scribble-sdxl-1.0 that generates very high-quality images from anime sketches and supports lines of any type and width.

Image Generation Open Source License: Apache-2.0 #AnimeSketchGeneration #HighQualityImages #ScribbleToIllustration

Downloads 1,443

Release Time : 5/12/2024

Model Overview

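The model lets users generate high-quality anime images from simple sketches. It is especially suited to people with no drawing skills who want to create anime illustrations quickly.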

Model Features

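High-quality anime image generation

Generates very high-quality anime images from simple or even blurry sketches.

Support for many line types

Supports lines of any type and width, including hand-drawn sketches.

Strong prompt following

Generates images that match Danbooru tags and natural-language descriptions.

Low image distortion rate

The probability of generating abnormal human anatomy is significantly reduced.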

Model Capabilities

Anime image generation

Sketch-to-image

Text-to-image conversion

High-quality image rendering

Use Cases

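Anime creation

Character design

Generate anime character designs from simple sketches and tags.

Produces visually striking anime character images.

Scene creation

Generate anime scenes from sketches and detailed descriptions.

Produces anime scenes that follow the sketch outline and are semantically coherent.

Creative assistance

Creation by non-professional users

Helps people with no drawing skills create anime illustrations quickly.

Produces polished images even from simple, blurry sketches.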

🚀 Controlnet-scribble-sdxl-1.0-anime

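This model lets anyone become an anime artist, even with no knowledge of drawing. It generates high-quality images from anime sketches and supports all line types and widths, which makes creating simple and fun.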

[Image: sunset]

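🚀 Quick Start

Controlnet-scribble-sdxl-1.0-anime is a ControlNet-based model that generates high-quality images from anime sketches. It supports lines of every type and width, and produces polished anime illustrations even from simple, blurry sketches.

Code example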

from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers import DDIMScheduler, EulerAncestralDiscreteScheduler
from controlnet_aux import PidiNetDetector, HEDdetector
from diffusers.utils import load_image
from huggingface_hub import HfApi
from pathlib import Path
from PIL import Image
import torch
import numpy as np
import cv2
import os
import random


def nms(x, t, s):
    x = cv2.GaussianBlur(x.astype(np.float32), (0, 0), s)

    f1 = np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]], dtype=np.uint8)
    f2 = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], dtype=np.uint8)
    f3 = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=np.uint8)
    f4 = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]], dtype=np.uint8)

    y = np.zeros_like(x)

    for f in [f1, f2, f3, f4]:
        np.putmask(y, cv2.dilate(x, kernel=f) == x, x)

    z = np.zeros_like(y, dtype=np.uint8)
    z[y > t] = 255
    return z


controlnet_conditioning_scale = 1.0
prompt = "your prompt, the longer the better, describe the image in as much detail as possible"
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'


eulera_scheduler = EulerAncestralDiscreteScheduler.from_pretrained("gsdf/CounterfeitXL", subfolder="scheduler")


controlnet = ControlNetModel.from_pretrained(
    "xinsir/anime-painter",
    torch_dtype=torch.float16
)

# When testing with another base model, change the VAE as well.
vae = AutoencoderKL.from_pretrained("gsdf/CounterfeitXL", subfolder="vae", torch_dtype=torch.float16)

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "gsdf/CounterfeitXL",
    controlnet=controlnet,
    vae=vae,
    safety_checker=None,
    torch_dtype=torch.float16,
    scheduler=eulera_scheduler,
)

# Assumption: a CUDA GPU is available; float16 inference is generally impractical on CPU.
pipe = pipe.to("cuda")

# Use either HED to generate a fake scribble from an existing image, or a sketch you drew yourself.
# The random branch below only demonstrates both methods; in practice, pick one.
if random.random() > 0.5:
  # Method 1
  # With HED, provide an image (real or anime); its HED edge lines are extracted and used as the scribble.
  # For details on HED detection, see https://github.com/lllyasviel/ControlNet/blob/main/gradio_fake_scribble2image.py
  # Below is an example using the controlnet_aux HED detector.

  source_img = Image.open("your image path, the image can be real or anime, HED detector will extract its edge boundary")
  processor = HEDdetector.from_pretrained('lllyasviel/Annotators')
  controlnet_img = processor(source_img, scribble=False)
  controlnet_img.save("a hed detect path for an image")

  # The following processing simulates a human-drawn sketch; different thresholds produce different line widths.
  controlnet_img = np.array(controlnet_img)
  controlnet_img = nms(controlnet_img, 127, 3)
  controlnet_img = cv2.GaussianBlur(controlnet_img, (0, 0), 3)

  # Higher threshold, thinner line.
  random_val = int(round(random.uniform(0.01, 0.10), 2) * 255)
  controlnet_img[controlnet_img > random_val] = 255
  controlnet_img[controlnet_img < 255] = 0
  controlnet_img = Image.fromarray(controlnet_img)

else:
  # Method 2
  # Use a sketch image you drew entirely yourself.
  control_path = "the sketch image you draw with some tools, like drawing board, the path you save it"
  controlnet_img = Image.open(control_path)  # The image must be black and white (0 or 255), like the listed examples.
# Resize to about 1024*1024 pixels (or the same resolution bucket) for the best performance.
width, height = controlnet_img.size
ratio = np.sqrt(1024. * 1024. / (width * height))
new_width, new_height = int(width * ratio), int(height * ratio)
controlnet_img = controlnet_img.resize((new_width, new_height))

images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=controlnet_img,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    width=new_width,
    height=new_height,
    num_inference_steps=30,
    ).images

images[0].save("your image save path; PNG usually gives better image quality than JPG or WebP, but the files are much larger")
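
The resize step in the script above scales the sketch to roughly 1024 * 1024 pixels with `int(width * ratio)`, which can leave sides that are not multiples of 8. Many Stable Diffusion pipelines work most cleanly with dimensions divisible by 8, so the following is a minimal sketch of an alternative helper. The function name `fit_to_pixel_budget` and the `multiple=8` default are assumptions for illustration, not part of the original model card.

```python
import math


def fit_to_pixel_budget(width, height, target_area=1024 * 1024, multiple=8):
    """Scale (width, height) to roughly `target_area` pixels.

    Keeps the aspect ratio approximately and rounds each side to the
    nearest multiple of `multiple` (hypothetical default: 8).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    ratio = math.sqrt(target_area / (width * height))
    new_width = max(multiple, round(width * ratio / multiple) * multiple)
    new_height = max(multiple, round(height * ratio / multiple) * multiple)
    return new_width, new_height


if __name__ == "__main__":
    print(fit_to_pixel_budget(1920, 1080))  # (1368, 768)
    print(fit_to_pixel_budget(512, 512))    # (1024, 1024)
```

The returned pair could replace `new_width, new_height` before calling `controlnet_img.resize(...)` and `pipe(...)`.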

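✨ Key Features

High-quality image generation: generates high-quality images from anime sketches and supports all line types and widths.
Easy to use: even people with no drawing experience can produce polished anime illustrations from a simple scribble plus tags.
Strong performance: in the evaluation, the model shows clear improvements in aesthetic score, prompt following, and image deformity rate.
Wider creative reach: helps people unfamiliar with anime or cartoons create their own characters in a simple way and unlock their creativity.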

📚 Documentation

Model Description

Property	Details
Developer	xinsir
Model type	ControlNet_SDXL
License	apache-2.0
Fine-tuned from	stabilityai/stable-diffusion-xl-base-1.0

Model Sources

Paper: https://arxiv.org/abs/2302.05543

Examples

Example 1

prompt: 1girl, breasts, solo, long hair, pointy ears, red eyes, horns, navel, sitting, cleavage, toeless legwear, hair ornament, smoking pipe, oni horns, thighhighs, detached sleeves, looking at viewer, smile, large breasts, holding smoking pipe, wide sleeves, bare shoulders, flower, barefoot, holding, nail polish, black thighhighs, jewelry, hair flower, oni, japanese clothes, fire, kiseru, very long hair, ponytail, black hair, long sleeves, bangs, red nails, closed mouth, toenails, navel cutout, cherry blossoms, water, red dress, fingernails
[Image 0]

Example 2

prompt: 1girl, solo, blonde hair, weapon, sword, hair ornament, hair flower, flower, dress, holding weapon, holding sword, holding, gloves, breasts, full body, black dress, thighhighs, looking at viewer, boots, bare shoulders, bangs, medium breasts, standing, black gloves, short hair with long locks, thigh boots, sleeveless dress, elbow gloves, sidelocks, black background, black footwear, yellow eyes, sleeveless
[Image 1]

Example 3

prompt: 1girl, solo, holding, white gloves, smile, purple eyes, gloves, closed mouth, balloon, holding microphone, microphone, blue flower, long hair, puffy sleeves, purple flower, blush, puffy short sleeves, short sleeves, bangs, dress, shoes, very long hair, standing, pleated dress, white background, flower, full body, blue footwear, one side up, arm up, hair bun, brown hair, food, mini crown, crown, looking at viewer, hair between eyes, heart balloon, heart, tilted headwear, single side bun, hand up
[Image 2]

Example 4

prompt: tiger, 1boy, male focus, blue eyes, braid, animal ears, tiger ears, 2022, solo, smile, chinese zodiac, year of the tiger, looking at viewer, hair over one eye, weapon, holding, white tiger, grin, grey hair, polearm, arm up, white hair, animal, holding weapon, arm behind head, multicolored hair, holding polearm
[Image 3]

Example 5

prompt: 1boy, male child, glasses, male focus, shorts, solo, closed eyes, bow, bowtie, smile, open mouth, red bow, jacket, red bowtie, white background, shirt, happy, black shorts, child, simple background, long sleeves, ^_^, short hair, white shirt, brown hair, black-framed eyewear, :d, facing viewer, black hair
[Image 4]

Example 6

prompt: solo, 1girl, swimsuit, blue eyes, plaid headwear, bikini, blue hair, virtual youtuber, side ponytail, looking at viewer, navel, grey bikini, ribbon, long hair, parted lips, blue nails, hat, breasts, plaid, hair ribbon, water, arm up, bracelet, star (symbol), cowboy shot, stomach, thigh strap, hair between eyes, beach, small breasts, jewelry, wet, bangs, plaid bikini, nail polish, grey headwear, blue ribbon, adapted costume, choker, ocean, bare shoulders, outdoors, beret
[Image 5]

Example 7

prompt: fruit, food, no humans, food focus, cherry, simple background, english text, strawberry, signature, border, artist name, cream
[Image 6]

Example 8

prompt: 1girl, solo, ball, swimsuit, bikini, mole, beachball, white bikini, breasts, hairclip, navel, looking at viewer, hair ornament, chromatic aberration, holding, holding ball, pool, cleavage, water, collarbone, mole on breast, blush, bangs, parted lips, bare shoulders, mole on thigh, bare arms, smile, large breasts, blonde hair, halterneck, hair between eyes, stomach
[Image 7]

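Evaluation Data

The test data was randomly sampled from popular wallpaper-style anime images (from pixiv, nijijourney, and similar sources). The project aims to let everyone draw anime illustrations. We selected 100 images, generated a text prompt for each with waifu-tagger [https://huggingface.co/spaces/SmilingWolf/wd-tagger], and generated 4 images per prompt, for 400 images in total. The image resolution should be 1024 * 1024 for SDXL or 512 * 768 for SD1.5; for a fair comparison, we resized the SDXL-generated images to 512 * 768. We computed the LAION aesthetic score to measure visual appeal and perceptual similarity to measure control ability, and found that perceived image quality agrees well with the metric values.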

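Quantitative Results

Metric	xinsir/anime-painter	lllyasviel/control_v11p_sd15_scribble
laion_aesthetic	5.95	5.86
perceptual similarity	0.5171	0.577

Note: a higher laion_aesthetic score is better, and a lower perceptual similarity is better. The values above were computed on images saved in webp format. When images are saved as png, the aesthetic score rises by 0.1 - 0.3, but the relative ordering stays the same.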
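
Because the two metrics point in opposite directions, it can help to encode the direction explicitly when comparing models. The snippet below is a minimal sketch using only the values from the table above; the dictionary layout and the helper names `best_model` and `relative_change` are illustrative assumptions, not part of the original evaluation code.

```python
# Values copied from the results table; the data layout is hypothetical.
RESULTS = {
    "xinsir/anime-painter": {
        "laion_aesthetic": 5.95,
        "perceptual_similarity": 0.5171,
    },
    "lllyasviel/control_v11p_sd15_scribble": {
        "laion_aesthetic": 5.86,
        "perceptual_similarity": 0.577,
    },
}

# True: a higher value is better. False: a lower value is better.
HIGHER_IS_BETTER = {"laion_aesthetic": True, "perceptual_similarity": False}


def best_model(metric):
    """Return the model name that wins on `metric`, respecting its direction."""
    pick = max if HIGHER_IS_BETTER[metric] else min
    return pick(RESULTS, key=lambda name: RESULTS[name][metric])


def relative_change(metric, model, baseline):
    """Return (model - baseline) / baseline for `metric`."""
    base = RESULTS[baseline][metric]
    return (RESULTS[model][metric] - base) / base


if __name__ == "__main__":
    for metric in HIGHER_IS_BETTER:
        change = relative_change(
            metric, "xinsir/anime-painter", "lllyasviel/control_v11p_sd15_scribble"
        )
        print(f"{metric}: best={best_model(metric)}, change={change:+.1%}")
```

On the webp numbers this gives about +1.5% on the aesthetic score and about -10.4% on perceptual similarity, where the negative sign is an improvement because lower is better.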

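Conclusion

In the evaluation, this model achieves a better aesthetic score on anime images than lllyasviel/control_v11p_sd15_scribble. Because no other sdxl-1.0-scribble model was available, we could not make an SDXL-to-SDXL comparison. In addition, thanks to a larger base model and complex data augmentation, this model shows stronger control ability in the perceptual similarity test and a lower probability of generating images with abnormal human anatomy.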

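⚠️ Important Notes

To generate anime images with this model, choose an anime SDXL base model from huggingface [https://huggingface.co/models?pipeline_tag=text-to-image&sort=trending&search=blue] or civitai [https://civitai.com/search/models?baseModel=SDXL%201.0&sortBy=models_v8&query=anime]. The examples shown here are based on CounterfeitXL [https://huggingface.co/gsdf/CounterfeitXL/tree/main]. Different base models have different image styles; you can also use bluepencil or another model.
The model was trained on a large number of anime images that were strictly filtered, with visual quality comparable to nijijourney or popular anime illustrations. It was trained on top of controlnet-sdxl-1.0 [https://arxiv.org/abs/2302.05543]; technical details are not disclosed in this report.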

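💡 Usage Tips

For especially appealing images, describe the image with both danbooru tags and natural language. Because far fewer anime images exist than real images, a natural-language-only input such as "a girl walk in the street" carries limited information; describe the content in more detail, for example "a girl, blue shirt, white hair, black eye, smile, pink flower, cherry blossoms ...".
When describing an image, first list its elements with tags, then describe the scene in natural language, in as much detail as possible. Otherwise the generated images may vary widely at random.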
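
The tags-first, scene-second ordering described above can be sketched as a small helper. This is only an illustration: the function name `build_prompt`, the de-duplication step, and the comma separator between the tag list and the scene text are assumptions, not something the model card prescribes.

```python
def build_prompt(tags, scene_description=""):
    """Join danbooru-style tags first, then append a natural-language scene.

    Duplicate tags (case-insensitive) and empty entries are dropped so the
    prompt stays compact. This is a hypothetical convenience helper.
    """
    seen = set()
    cleaned = []
    for tag in tags:
        tag = tag.strip()
        if tag and tag.lower() not in seen:
            seen.add(tag.lower())
            cleaned.append(tag)
    parts = [", ".join(cleaned)] if cleaned else []
    scene = scene_description.strip()
    if scene:
        parts.append(scene)
    return ", ".join(parts)


if __name__ == "__main__":
    prompt = build_prompt(
        ["a girl", "blue shirt", "white hair", "smile", "cherry blossoms"],
        "walking down a quiet street in spring",
    )
    print(prompt)
```

The resulting string can be passed as `prompt` to the pipeline call in the code example.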