ControlNet-openpose-sdxl-1.0 Open-Source Image Generation Model - Precise Pose Control for High-Quality Images

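Developed by xinsir

A state-of-the-art ControlNet-openpose-sdxl-1.0 model designed to generate high-quality images under pose control

Image generation | License: Apache-2.0 | #SDXL pose control #High-precision human pose generation #Anime style support

Downloads: 41.49k

Release date: May 13, 2024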

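Model Overview

This model is a ControlNet extension built on Stable Diffusion XL. It generates high-quality images conditioned on human pose and is particularly well suited to anime and photorealistic styles.

Model Features

High-precision pose control

Uses an improved Openpose detector for more precise control of human pose

Multi-resolution support

Supports image generation from low resolution up to very high resolution (above 4000 px)

Style diversity

Generates images in styles ranging from photorealistic to anime

Performance

Scores higher on the mAP metric than other open-source Openpose models

Model Capabilities

Image generation from text prompts

Image generation controlled by human pose

High-resolution image generation

Conversion across multiple art styles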

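Use Cases

Art creation

Anime character design

Generate anime-style characters in a specified pose

Produces anime characters with a consistent style and an accurate pose

Photorealistic scenes with people

Generate photorealistic scenes that follow a specific human pose

The generated figures have natural poses and blend well into the scene

Concept design

Character prototyping

Quickly generate character prototypes in a variety of poses

Speeds up the design process and offers a wider range of options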

🚀 ControlNet-openpose-sdxl-1.0 Model

ControlNet-openpose-sdxl-1.0 combines OpenPose and ControlNet for text-to-image generation. Given a text prompt and pose information, it generates high-quality images that follow the specified pose.

✨ Key Features

Developer: xinsir
Model type: ControlNet_SDXL
License: Apache-2.0
Fine-tuned from: stabilityai/stable-diffusion-xl-base-1.0

Model Sources

Paper: https://arxiv.org/abs/2302.05543

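💻 Usage Examples

Replace the default pose-drawing function for better results

Thanks to feiyuuu for reporting this issue. With the default pose lines, results can be unstable, because the pose labels used in training were drawn with thicker lines for better visual appearance. The steps below resolve this mismatch.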
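The fix involves editing a file inside the installed controlnet_aux package, and install locations differ between environments. As a minimal sketch, the helper below (find_module_file is a hypothetical name, not part of controlnet_aux) uses the standard library to report where a module lives. The dotted module name is inferred from the directory layout quoted in this section and is an assumption.

```python
import importlib.util
from typing import Optional


def find_module_file(module_name: str) -> Optional[str]:
    """Return the source path of an importable module, or None if it is missing."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ImportError:
        # Raised when a parent package (e.g. controlnet_aux) is not installed.
        return None
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Assumed module name, based on the path controlnet_aux/open_pose/util.py.
    print(find_module_file("controlnet_aux.open_pose.util"))
```

If the package is installed, the printed path is the util.py to edit; if it prints None, the active environment is probably not the one where controlnet_aux was installed.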

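Find the util.py file in the controlnet_aux Python package. It is usually located at /your anaconda3 path/envs/your env name/lib/python3.8/site-packages/controlnet_aux/open_pose/util.py (the Python version in the path may differ). Replace the draw_bodypose function in that file with the code below, which relies on names (math, cv2, np, List, Keypoint) that the original file should already import: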

def draw_bodypose(canvas: np.ndarray, keypoints: List[Keypoint]) -> np.ndarray:
    """
    Draw keypoints and limbs representing body pose on a given canvas.

    Args:
        canvas (np.ndarray): A 3D numpy array representing the canvas (image) on which to draw the body pose.
        keypoints (List[Keypoint]): A list of Keypoint objects representing the body keypoints to be drawn.

    Returns:
        np.ndarray: A 3D numpy array representing the modified canvas with the drawn body pose.

    Note:
        The function expects the x and y coordinates of the keypoints to be normalized between 0 and 1.
    """
    H, W, C = canvas.shape

    # Scale limb and joint thickness with the longer image side so the drawn
    # pose matches the thicker lines used in the training labels.
    longest_side = max(W, H)
    if longest_side < 500:
        ratio = 1.0
    elif longest_side < 1000:
        ratio = 2.0
    elif longest_side < 2000:
        ratio = 3.0
    elif longest_side < 3000:
        ratio = 4.0
    elif longest_side < 4000:
        ratio = 5.0
    elif longest_side < 5000:
        ratio = 6.0
    else:
        ratio = 7.0

    # Base half-thickness of a limb in pixels; multiplied by ratio when drawing.
    stickwidth = 4

    limbSeq = [
        [2, 3], [2, 6], [3, 4], [4, 5], 
        [6, 7], [7, 8], [2, 9], [9, 10], 
        [10, 11], [2, 12], [12, 13], [13, 14], 
        [2, 1], [1, 15], [15, 17], [1, 16], 
        [16, 18],
    ]

    # One RGB color per limb (first loop) and per keypoint (second loop).
    colors = [[255, 0, 0], [255, 85, 0], [255, 170, 0], [255, 255, 0], [170, 255, 0], [85, 255, 0], [0, 255, 0],
              [0, 255, 85], [0, 255, 170], [0, 255, 255], [0, 170, 255], [0, 85, 255], [0, 0, 255], [85, 0, 255],
              [170, 0, 255], [255, 0, 255], [255, 0, 170], [255, 0, 85]]

    for (k1_index, k2_index), color in zip(limbSeq, colors):
        keypoint1 = keypoints[k1_index - 1]
        keypoint2 = keypoints[k2_index - 1]

        if keypoint1 is None or keypoint2 is None:
            continue

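        # Names are swapped relative to their contents (kept from the original code):
        # Y holds pixel x-coordinates and X holds pixel y-coordinates.
        Y = np.array([keypoint1.x, keypoint2.x]) * float(W)
        X = np.array([keypoint1.y, keypoint2.y]) * float(H)
        mX = np.mean(X)
        mY = np.mean(Y)
        length = ((X[0] - X[1]) ** 2 + (Y[0] - Y[1]) ** 2) ** 0.5
        angle = math.degrees(math.atan2(X[0] - X[1], Y[0] - Y[1]))
        # Draw the limb as a filled ellipse; its half-thickness is stickwidth * ratio.
        polygon = cv2.ellipse2Poly((int(mY), int(mX)), (int(length / 2), int(stickwidth * ratio)), int(angle), 0, 360, 1)
        cv2.fillConvexPoly(canvas, polygon, [int(float(c) * 0.6) for c in color])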

    for keypoint, color in zip(keypoints, colors):
        if keypoint is None:
            continue

        x, y = keypoint.x, keypoint.y
        x = int(x * W)
        y = int(y * H)
        cv2.circle(canvas, (int(x), int(y)), int(4 * ratio), color, thickness=-1)

    return canvas
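To make the thickness rule in draw_bodypose easier to check, the sketch below isolates the resolution-to-multiplier mapping. The helper names pose_line_scale and limb_half_width are hypothetical and exist only for illustration; the thresholds and the base value of 4 are copied from the function above.

```python
def pose_line_scale(width: int, height: int) -> float:
    """Thickness multiplier chosen from the longer image side, in pixels."""
    longest_side = max(width, height)
    # Exclusive upper bounds and the multiplier used below each bound,
    # mirroring the if/elif chain in draw_bodypose.
    steps = [(500, 1.0), (1000, 2.0), (2000, 3.0),
             (3000, 4.0), (4000, 5.0), (5000, 6.0)]
    for upper_bound, scale in steps:
        if longest_side < upper_bound:
            return scale
    return 7.0


def limb_half_width(width: int, height: int, stickwidth: int = 4) -> int:
    """Half-thickness in pixels of a drawn limb (the ellipse's minor semi-axis)."""
    return int(stickwidth * pose_line_scale(width, height))
```

For a 1024 x 1024 canvas the multiplier is 3.0, so each limb is drawn with a half-thickness of 12 pixels and each keypoint as a circle of radius 12 pixels.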

Model Quick Start

Use the following code to get started with the model:

from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers import EulerAncestralDiscreteScheduler
from controlnet_aux import OpenposeDetector
from PIL import Image
import torch
import numpy as np
import cv2

controlnet_conditioning_scale = 1.0
prompt = "your prompt, the longer the better; describe it in as much detail as possible"
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'



eulera_scheduler = EulerAncestralDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")


controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-openpose-sdxl-1.0",
    torch_dtype=torch.float16
)

# When testing with a different base model, change the VAE to match as well.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)


pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    safety_checker=None,
    torch_dtype=torch.float16,
    scheduler=eulera_scheduler,
)

# Move the pipeline to the GPU; the float16 weights assume a CUDA device is available.
pipe.to("cuda")

processor = OpenposeDetector.from_pretrained('lllyasviel/ControlNet')


controlnet_img = cv2.imread("your image path")
controlnet_img = processor(controlnet_img, hand_and_face=False, output_type='cv2')


# Resize to roughly 1024 * 1024 pixels (or the matching bucket resolution) for best results.
height, width, _  = controlnet_img.shape
ratio = np.sqrt(1024. * 1024. / (width * height))
new_width, new_height = int(width * ratio), int(height * ratio)
controlnet_img = cv2.resize(controlnet_img, (new_width, new_height))
controlnet_img = Image.fromarray(controlnet_img)

images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=controlnet_img,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    width=new_width,
    height=new_height,
    num_inference_steps=30,
).images

# PNG usually preserves quality better than JPG or WebP, at the cost of a much larger file.
images[0].save("your image save path")
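The resize step in the quick start scales the pose image to roughly 1024 * 1024 pixels while keeping its aspect ratio. The sketch below shows the same arithmetic as a standalone helper. The name fit_to_pixel_budget and the rounding to a multiple of 8 are assumptions added here (SDXL's VAE works on latents downsampled by a factor of 8); the original snippet only truncates with int().

```python
import math
from typing import Tuple


def fit_to_pixel_budget(
    width: int,
    height: int,
    target_pixels: int = 1024 * 1024,
    multiple: int = 8,
) -> Tuple[int, int]:
    """Scale (width, height) to about target_pixels, keeping the aspect ratio.

    Each side is rounded to the nearest multiple of `multiple`.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Same scale factor as in the quick start: sqrt(target area / current area).
    ratio = math.sqrt(target_pixels / (width * height))
    new_width = max(multiple, round(width * ratio / multiple) * multiple)
    new_height = max(multiple, round(height * ratio / multiple) * multiple)
    return new_width, new_height
```

For example, a 1920 x 1080 input maps to 1368 x 768, which stays close to the 1024 * 1024 pixel budget. The result could then be passed to cv2.resize and to the width and height arguments of the pipeline call.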