Anime Painter
🚀 Controlnet-scribble-sdxl-1.0-anime
This model lets anyone become an anime painter, even with no drawing experience. It generates high-quality images from anime sketches, supports lines of various types and widths, and makes creation simple and fun.
🚀 Quick Start
Controlnet-scribble-sdxl-1.0-anime is a ControlNet-based model that generates high-quality images from anime sketches. It supports lines of various types and widths, and can produce polished anime illustrations even from simple, rough sketches.
Example code
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers import DDIMScheduler, EulerAncestralDiscreteScheduler
from controlnet_aux import PidiNetDetector, HEDdetector
from diffusers.utils import load_image
from huggingface_hub import HfApi
from pathlib import Path
from PIL import Image
import torch
import numpy as np
import cv2
import os
import random

def nms(x, t, s):
    # thin the HED edge map into scribble-like lines by keeping only directional maxima
    x = cv2.GaussianBlur(x.astype(np.float32), (0, 0), s)
    f1 = np.array([[0, 0, 0], [1, 1, 1], [0, 0, 0]], dtype=np.uint8)
    f2 = np.array([[0, 1, 0], [0, 1, 0], [0, 1, 0]], dtype=np.uint8)
    f3 = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=np.uint8)
    f4 = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]], dtype=np.uint8)
    y = np.zeros_like(x)
    for f in [f1, f2, f3, f4]:
        np.putmask(y, cv2.dilate(x, kernel=f) == x, x)
    z = np.zeros_like(y, dtype=np.uint8)
    z[y > t] = 255
    return z

controlnet_conditioning_scale = 1.0
prompt = "your prompt, the longer the better, you can describe it in as much detail as possible"
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'

eulera_scheduler = EulerAncestralDiscreteScheduler.from_pretrained("gsdf/CounterfeitXL", subfolder="scheduler")

controlnet = ControlNetModel.from_pretrained(
    "xinsir/anime-painter",
    torch_dtype=torch.float16
)

# when testing with another base model, you need to change the vae as well
vae = AutoencoderKL.from_pretrained("gsdf/CounterfeitXL", subfolder="vae", torch_dtype=torch.float16)

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "gsdf/CounterfeitXL",
    controlnet=controlnet,
    vae=vae,
    safety_checker=None,
    torch_dtype=torch.float16,
    scheduler=eulera_scheduler,
)

# you can either use HED to generate a fake scribble from an image, or use a sketch image drawn entirely by yourself
if random.random() > 0.5:
    # Method 1
    # if you use HED, provide an image (real or anime), extract its HED lines and use them as the scribble
    # for details about HED detection, see https://github.com/lllyasviel/ControlNet/blob/main/gradio_fake_scribble2image.py
    # below is an example using the diffusers HED detector
    image_path = Image.open("your image path, the image can be real or anime, HED detector will extract its edge boundary")
    processor = HEDdetector.from_pretrained('lllyasviel/Annotators')
    controlnet_img = processor(image_path, scribble=False)
    controlnet_img.save("a hed detect path for an image")

    # the following processing simulates human sketch drawing; different thresholds generate different line widths
    controlnet_img = np.array(controlnet_img)
    controlnet_img = nms(controlnet_img, 127, 3)
    controlnet_img = cv2.GaussianBlur(controlnet_img, (0, 0), 3)

    # higher threshold, thinner line
    random_val = int(round(random.uniform(0.01, 0.10), 2) * 255)
    controlnet_img[controlnet_img > random_val] = 255
    controlnet_img[controlnet_img < 255] = 0
    controlnet_img = Image.fromarray(controlnet_img)
else:
    # Method 2
    # if you use a sketch image drawn entirely by yourself
    control_path = "the sketch image you draw with some tool, like a drawing board, the path you save it to"
    controlnet_img = Image.open(control_path)  # note that the image must be black and white (0 or 255), like the examples we list

# must resize to 1024*1024 or the same resolution bucket to get the best performance
width, height = controlnet_img.size
ratio = np.sqrt(1024. * 1024. / (width * height))
new_width, new_height = int(width * ratio), int(height * ratio)
controlnet_img = controlnet_img.resize((new_width, new_height))

images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=controlnet_img,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    width=new_width,
    height=new_height,
    num_inference_steps=30,
).images

images[0].save("your image save path, png format is usually better than jpg or webp in terms of image quality but the file is much bigger")
✨ Key Features
- High-quality image generation: generates high-quality images from anime sketches, supporting lines of various types and widths.
- Easy to use: even people with no drawing experience can produce polished anime illustrations from a simple scribble plus a few tags.
- Strong performance: in evaluation, the model shows clear improvements in metrics such as aesthetic score, prompt following, and the rate of deformed images.
- Expands creative boundaries: helps people unfamiliar with anime or cartoons create their own characters in a simple way and unleash their creativity.
📚 Detailed Documentation
Model Description
| Property | Details |
|---|---|
| Developed by | xinsir |
| Model type | ControlNet_SDXL |
| License | apache-2.0 |
| Fine-tuned from | stabilityai/stable-diffusion-xl-base-1.0 |
Model Sources
Examples
Example 1
prompt: 1girl, breasts, solo, long hair, pointy ears, red eyes, horns, navel, sitting, cleavage, toeless legwear, hair ornament, smoking pipe, oni horns, thighhighs, detached sleeves, looking at viewer, smile, large breasts, holding smoking pipe, wide sleeves, bare shoulders, flower, barefoot, holding, nail polish, black thighhighs, jewelry, hair flower, oni, japanese clothes, fire, kiseru, very long hair, ponytail, black hair, long sleeves, bangs, red nails, closed mouth, toenails, navel cutout, cherry blossoms, water, red dress, fingernails
Example 2
prompt: 1girl, solo, blonde hair, weapon, sword, hair ornament, hair flower, flower, dress, holding weapon, holding sword, holding, gloves, breasts, full body, black dress, thighhighs, looking at viewer, boots, bare shoulders, bangs, medium breasts, standing, black gloves, short hair with long locks, thigh boots, sleeveless dress, elbow gloves, sidelocks, black background, black footwear, yellow eyes, sleeveless
Example 3
prompt: 1girl, solo, holding, white gloves, smile, purple eyes, gloves, closed mouth, balloon, holding microphone, microphone, blue flower, long hair, puffy sleeves, purple flower, blush, puffy short sleeves, short sleeves, bangs, dress, shoes, very long hair, standing, pleated dress, white background, flower, full body, blue footwear, one side up, arm up, hair bun, brown hair, food, mini crown, crown, looking at viewer, hair between eyes, heart balloon, heart, tilted headwear, single side bun, hand up
Example 4
prompt: tiger, 1boy, male focus, blue eyes, braid, animal ears, tiger ears, 2022, solo, smile, chinese zodiac, year of the tiger, looking at viewer, hair over one eye, weapon, holding, white tiger, grin, grey hair, polearm, arm up, white hair, animal, holding weapon, arm behind head, multicolored hair, holding polearm
Example 5
prompt: 1boy, male child, glasses, male focus, shorts, solo, closed eyes, bow, bowtie, smile, open mouth, red bow, jacket, red bowtie, white background, shirt, happy, black shorts, child, simple background, long sleeves, ^_^, short hair, white shirt, brown hair, black-framed eyewear, :d, facing viewer, black hair
Example 6
prompt: solo, 1girl, swimsuit, blue eyes, plaid headwear, bikini, blue hair, virtual youtuber, side ponytail, looking at viewer, navel, grey bikini, ribbon, long hair, parted lips, blue nails, hat, breasts, plaid, hair ribbon, water, arm up, bracelet, star (symbol), cowboy shot, stomach, thigh strap, hair between eyes, beach, small breasts, jewelry, wet, bangs, plaid bikini, nail polish, grey headwear, blue ribbon, adapted costume, choker, ocean, bare shoulders, outdoors, beret
Example 7
prompt: fruit, food, no humans, food focus, cherry, simple background, english text, strawberry, signature, border, artist name, cream
Example 8
prompt: 1girl, solo, ball, swimsuit, bikini, mole, beachball, white bikini, breasts, hairclip, navel, looking at viewer, hair ornament, chromatic aberration, holding, holding ball, pool, cleavage, water, collarbone, mole on breast, blush, bangs, parted lips, bare shoulders, mole on thigh, bare arms, smile, large breasts, blonde hair, halterneck, hair between eyes, stomach
Evaluation Data
The test data was randomly sampled from popular anime wallpaper images (e.g. pixiv, nijijourney). The goal of the project is to let everyone draw anime illustrations. We selected 100 images, generated prompts with waifu-tagger [https://huggingface.co/spaces/SmilingWolf/wd-tagger], and generated 4 images per prompt, 400 images in total. The image resolution should be 1024 * 1024 for SDXL or 512 * 768 for SD1.5; for a fair comparison, the SDXL-generated images were resized to 512 * 768. We computed the Laion aesthetic score to measure visual appeal and the perceptual similarity to measure control ability, and found that image quality aligns well with these metric values.
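The evaluation script itself is not included in this card. Below is a minimal sketch of the resize-and-compare step, assuming the perceptual similarity metric is LPIPS (via the `lpips` package) computed between paired generated and reference images; the directory names `generated/` and `reference/` and the one-to-one file pairing are hypothetical.

```python
# Minimal evaluation sketch (not the authors' script): resize images to 512x768
# and compute LPIPS perceptual similarity between paired generated/reference files.
import os

import lpips          # pip install lpips
import numpy as np
import torch
from PIL import Image

loss_fn = lpips.LPIPS(net='alex')  # lower value = more similar

def to_tensor(path, size=(512, 768)):
    # LPIPS expects NCHW float tensors scaled to [-1, 1]
    img = Image.open(path).convert("RGB").resize(size)
    arr = np.asarray(img).astype(np.float32) / 127.5 - 1.0
    return torch.from_numpy(arr).permute(2, 0, 1).unsqueeze(0)

gen_dir, ref_dir = "generated/", "reference/"  # hypothetical paired directories
scores = []
for name in sorted(os.listdir(gen_dir)):
    d = loss_fn(to_tensor(os.path.join(gen_dir, name)),
                to_tensor(os.path.join(ref_dir, name)))
    scores.append(d.item())

print("mean perceptual similarity:", sum(scores) / len(scores))
```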
Quantitative Results
| Metric | xinsir/anime-painter | lllyasviel/control_v11p_sd15_scribble |
|---|---|---|
| laion_aesthetic | 5.95 | 5.86 |
| perceptual similarity | 0.5171 | 0.577 |
Note: a higher laion_aesthetic score is better, while a lower perceptual similarity is better. The values above were computed on images saved in webp format; when saving as png, the aesthetic score increases by 0.1 - 0.3, but the relative ranking remains unchanged.
Conclusion
In evaluation, this model achieves a better aesthetic score on anime images than lllyasviel/control_v11p_sd15_scribble. As no other sdxl-1.0-scribble models were available, no further comparisons were made. In addition, thanks to the larger base model and sophisticated data augmentation, this model shows stronger control ability in the perceptual similarity test and a lower probability of generating images with abnormal human anatomy.
⚠️ Important Notes
- To generate anime images with this model, you need to choose an anime sdxl base model from huggingface [https://huggingface.co/models?pipeline_tag=text-to-image&sort=trending&search=blue] or civitai [https://civitai.com/search/models?baseModel=SDXL%201.0&sortBy=models_v8&query=anime]. The examples shown here are based on CounterfeitXL [https://huggingface.co/gsdf/CounterfeitXL/tree/main]; different base models produce different image styles, and you can also use bluepencil or other models (a sketch of swapping the base model follows this list).
- The model was trained on a large number of anime images that were strictly filtered, with visual quality comparable to nijijourney or popular anime illustrations. The model was trained based on controlnet-sdxl-1.0 [https://arxiv.org/abs/2302.05543]; the technical details are not disclosed in this report.
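As a minimal sketch of the base-model swap mentioned above: only the base repo id changes, while the ControlNet weights stay the same, and the VAE and scheduler should be loaded from the new base repository. The `your-favorite/anime-sdxl-model` id below is a placeholder, not a real checkpoint.

```python
# Swapping the base model: a minimal sketch under the assumptions stated above.
from diffusers import (AutoencoderKL, ControlNetModel,
                       EulerAncestralDiscreteScheduler,
                       StableDiffusionXLControlNetPipeline)
import torch

base = "your-favorite/anime-sdxl-model"  # placeholder repo id, pick one from huggingface or civitai

# the ControlNet stays the same regardless of the base model
controlnet = ControlNetModel.from_pretrained("xinsir/anime-painter", torch_dtype=torch.float16)

# load the vae and scheduler from the same repository as the new base model
vae = AutoencoderKL.from_pretrained(base, subfolder="vae", torch_dtype=torch.float16)
scheduler = EulerAncestralDiscreteScheduler.from_pretrained(base, subfolder="scheduler")

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    base,
    controlnet=controlnet,
    vae=vae,
    torch_dtype=torch.float16,
    scheduler=scheduler,
)
```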
💡 Usage Tips
- If you want to generate especially appealing images, it is recommended to describe them with both danbooru tags and natural language. Since there are far fewer anime images than real-world images, a natural-language-only prompt such as "a girl walk in the street" carries limited information; describe the image content in more detail, e.g. "a girl, blue shirt, white hair, black eye, smile, pink flower, cherry blossoms ...".
- When describing an image, first use tags to describe the elements in it, then use natural language to describe the scene; the more detailed the description, the better. Otherwise the generated image may be highly random. A small prompt-building sketch follows this list.
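As a small illustration of the tagging advice above, a prompt can be assembled from danbooru-style tags followed by a natural-language scene description. The tag list and scene sentence below are made-up examples, and the snippet reuses `pipe`, `controlnet_img`, `new_width`, and `new_height` from the quick-start example.

```python
# Building a prompt from danbooru-style tags plus a natural-language scene sentence.
# The tags and sentence are illustrative only.
tags = ["1girl", "solo", "white hair", "blue shirt", "black eyes", "smile",
        "pink flower", "cherry blossoms", "long hair", "looking at viewer"]
scene = "a girl standing in a quiet street in spring, petals drifting in the wind"
prompt = ", ".join(tags) + ", " + scene

negative_prompt = ("longbody, lowres, bad anatomy, bad hands, missing fingers, "
                   "extra digit, fewer digits, cropped, worst quality, low quality")

# `pipe`, `controlnet_img`, `new_width`, `new_height` come from the quick-start code above
images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=controlnet_img,
    controlnet_conditioning_scale=1.0,
    width=new_width,
    height=new_height,
    num_inference_steps=30,
).images
```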

