ControlNet-openpose-sdxl-1.0 open-source image generation model: precise pose control for high-quality images

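Developed by xinsir

A leading ControlNet-openpose-sdxl-1.0 model, designed for generating high-quality images under pose control

Image generation | Open-source license: Apache-2.0 | #SDXL pose control #High-precision human pose generation #Anime style adaptation

Downloads: 41.49k

Released: 5/13/2024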

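Model Overview

This model is a ControlNet extension built on Stable Diffusion XL. It focuses on generating high-quality images under human pose control and is particularly suited to anime and photorealistic styles.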

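Model Features

High-precision pose control

An improved Openpose detector gives more precise control over human pose

Multi-resolution support

Supports image generation from low resolution up to ultra-high resolution (above 4000px)

Style diversity

Generates a range of styles, from photorealistic to anime

Performance

Scores higher on the mAP metric than other open-source Openpose models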

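Model Capabilities

Image generation from text prompts

Image generation controlled by human pose

High-resolution image generation

Conversion across multiple art styles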

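Use Cases

Art creation

Anime character design

Generate anime-style characters from a specified pose

Produces anime characters with a consistent style and an accurate pose

Photorealistic scenes with people

Generate photorealistic scenes that match a specific human pose

The generated poses look natural and blend well with the scene

Concept design

Character prototyping

Quickly generate character prototypes in a variety of poses

Speeds up the design process and offers a wider range of options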

🚀 ControlNet-openpose-sdxl-1.0 Model

ControlNet-openpose-sdxl-1.0 is an advanced model that combines OpenPose and ControlNet for text-to-image generation. It produces high-quality images from an input text prompt together with pose information.

✨ Key Details

Developer: xinsir
Model type: ControlNet_SDXL
License: Apache-2.0
Fine-tuned from: stabilityai/stable-diffusion-xl-base-1.0

Model Sources

Paper: https://arxiv.org/abs/2302.05543

Examples

[Gallery of 20 example images]

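💻 Usage Examples

Replace the default pose-drawing function for better results

Thanks to feiyuuu for reporting this issue. Results can be unstable with the default pose lines, because the pose labels used in training were drawn with thicker lines for better visual effect. The following change resolves the discrepancy: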

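Find the util.py file in the controlnet_aux Python package. The path is usually:
/your anaconda3 path/envs/your env name/lib/python3.8/site-packages/controlnet_aux/open_pose/util.py
Then replace the draw_bodypose function in that file with the replacement function given below.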
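Because the site-packages path differs between environments, it may be easier to ask Python where the package is installed before editing anything. The following is a minimal sketch using only the standard library. `find_package_file` is a hypothetical helper name introduced here for illustration, and the relative path `open_pose/util.py` is taken from the path quoted above. The helper returns None when the package or file cannot be found.

```python
from importlib import util as importlib_util
from pathlib import Path
from typing import Optional


def find_package_file(package: str, relative_path: str) -> Optional[Path]:
    """Return the path of a file inside an installed package, or None.

    Hypothetical helper for illustration; it is not part of controlnet_aux.
    """
    try:
        # For a top-level name this looks the package up without importing it.
        spec = importlib_util.find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    # A package can have more than one search location; check each in turn.
    for location in spec.submodule_search_locations:
        candidate = Path(location) / relative_path
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    path = find_package_file("controlnet_aux", "open_pose/util.py")
    print(path if path else "controlnet_aux/open_pose/util.py not found")
```

The replacement draw_bodypose function itself follows this sketch.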

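# Drop-in replacement for draw_bodypose in controlnet_aux/open_pose/util.py.
# It relies on names the original function in that file already uses:
# math, cv2, numpy as np, List, and Keypoint.
def draw_bodypose(canvas: np.ndarray, keypoints: List[Keypoint]) -> np.ndarray: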
    """
    Draw keypoints and limbs representing body pose on a given canvas.

    Args:
        canvas (np.ndarray): A 3D numpy array representing the canvas (image) on which to draw the body pose.
        keypoints (List[Keypoint]): A list of Keypoint objects representing the body keypoints to be drawn.

    Returns:
        np.ndarray: A 3D numpy array representing the modified canvas with the drawn body pose.

    Note:
        The function expects the x and y coordinates of the keypoints to be normalized between 0 and 1.
    """
    H, W, C = canvas.shape

    # Scale limb and joint thickness with the longest side of the canvas.
    longest_side = max(W, H)
    if longest_side < 500:
        ratio = 1.0
    elif longest_side < 1000:
        ratio = 2.0
    elif longest_side < 2000:
        ratio = 3.0
    elif longest_side < 3000:
        ratio = 4.0
    elif longest_side < 4000:
        ratio = 5.0
    elif longest_side < 5000:
        ratio = 6.0
    else:
        ratio = 7.0

    stickwidth = 4

    limbSeq = [
        [2, 3], [2, 6], [3, 4], [4, 5], 
        [6, 7], [7, 8], [2, 9], [9, 10], 
        [10, 11], [2, 12], [12, 13], [13, 14], 
        [2, 1], [1, 15], [15, 17], [1, 16], 
        [16, 18],
    ]

    colors = [[255, 0, 0], [255, 85, 0], [255, 170, 0], [255, 255, 0], [170, 255, 0], [85, 255, 0], [0, 255, 0], \
              [0, 255, 85], [0, 255, 170], [0, 255, 255], [0, 170, 255], [0, 85, 255], [0, 0, 255], [85, 0, 255], \
              [170, 0, 255], [255, 0, 255], [255, 0, 170], [255, 0, 85]]

    for (k1_index, k2_index), color in zip(limbSeq, colors):
        keypoint1 = keypoints[k1_index - 1]
        keypoint2 = keypoints[k2_index - 1]

        if keypoint1 is None or keypoint2 is None:
            continue

        Y = np.array([keypoint1.x, keypoint2.x]) * float(W)
        X = np.array([keypoint1.y, keypoint2.y]) * float(H)
        mX = np.mean(X)
        mY = np.mean(Y)
        length = ((X[0] - X[1]) ** 2 + (Y[0] - Y[1]) ** 2) ** 0.5
        angle = math.degrees(math.atan2(X[0] - X[1], Y[0] - Y[1]))
        polygon = cv2.ellipse2Poly((int(mY), int(mX)), (int(length / 2), int(stickwidth * ratio)), int(angle), 0, 360, 1)
        cv2.fillConvexPoly(canvas, polygon, [int(float(c) * 0.6) for c in color])

    for keypoint, color in zip(keypoints, colors):
        if keypoint is None:
            continue

        x, y = keypoint.x, keypoint.y
        x = int(x * W)
        y = int(y * H)
        cv2.circle(canvas, (int(x), int(y)), int(4 * ratio), color, thickness=-1)

    return canvas
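
To see how thick the pose lines become at a given resolution, the scaling rule can be isolated from the drawing code. The sketch below mirrors the thresholds in the function above. The names `thickness_ratio` and `stroke_sizes` are hypothetical and introduced only for this illustration; they are not part of controlnet_aux.

```python
from bisect import bisect_right
from typing import Tuple

# Exclusive upper bounds on the longest canvas side, in pixels, mirroring
# the if/elif chain in draw_bodypose.
SIZE_BOUNDS = [500, 1000, 2000, 3000, 4000, 5000]
BASE_STICK_WIDTH = 4   # stickwidth in draw_bodypose
BASE_JOINT_RADIUS = 4  # base radius passed to cv2.circle in draw_bodypose


def thickness_ratio(width: int, height: int) -> float:
    """Return the scale factor draw_bodypose uses for this canvas size."""
    # bisect_right counts how many bounds are <= the longest side,
    # so sizes below 500 give 1.0 and sizes of 5000 or more give 7.0.
    return float(bisect_right(SIZE_BOUNDS, max(width, height)) + 1)


def stroke_sizes(width: int, height: int) -> Tuple[int, int]:
    """Return (limb half-width, joint radius) in pixels for this canvas size."""
    ratio = thickness_ratio(width, height)
    return int(BASE_STICK_WIDTH * ratio), int(BASE_JOINT_RADIUS * ratio)


if __name__ == "__main__":
    for size in (448, 512, 1024, 4096):
        print(size, thickness_ratio(size, size), stroke_sizes(size, size))
```

On a 1024 x 1024 canvas the ratio is 3.0, so limbs are drawn with a half-width of 12 pixels and joints with a radius of 12 pixels, compared with 4 pixels when the ratio is 1.0.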

Quick Start

Use the following code to get started with the model:

from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers import EulerAncestralDiscreteScheduler
from controlnet_aux import OpenposeDetector
from PIL import Image
import torch
import numpy as np
import cv2



# Strength of the pose conditioning.
controlnet_conditioning_scale = 1.0
# Placeholder prompt: longer, more detailed descriptions tend to work better.
prompt = "your prompt, the longer the better; describe the image in as much detail as possible"
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'



eulera_scheduler = EulerAncestralDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")


controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-openpose-sdxl-1.0",
    torch_dtype=torch.float16
)

# When testing with a different base model, change the VAE as well.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)


pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    torch_dtype=torch.float16,
    scheduler=eulera_scheduler,
)
# Assumption: a CUDA GPU is available; the float16 weights are meant for GPU inference.
pipe.to("cuda")

processor = OpenposeDetector.from_pretrained('lllyasviel/ControlNet')


controlnet_img = cv2.imread("your image path")
controlnet_img = processor(controlnet_img, hand_and_face=False, output_type='cv2')


# Resize the pose image to about 1024 * 1024 pixels in total (or to a matching bucket resolution) for best results.
height, width, _  = controlnet_img.shape
ratio = np.sqrt(1024. * 1024. / (width * height))
new_width, new_height = int(width * ratio), int(height * ratio)
controlnet_img = cv2.resize(controlnet_img, (new_width, new_height))
controlnet_img = Image.fromarray(controlnet_img)

images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=controlnet_img,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    width=new_width,
    height=new_height,
    num_inference_steps=30,
    ).images

# PNG usually preserves quality better than JPG or WebP, at the cost of a much larger file.
images[0].save("your image save path")
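
The resize step above keeps the aspect ratio while bringing the pixel count to roughly 1024 * 1024, but it does not round the result. The sketch below shows one way to do the same calculation while snapping each side to a multiple of 8. That rounding is an assumption made here, based on the 8x downsampling of the SDXL VAE; it is not part of the original snippet, and `target_size` is a hypothetical helper name.

```python
import math
from typing import Tuple


def target_size(width: int, height: int,
                target_area: int = 1024 * 1024,
                multiple: int = 8) -> Tuple[int, int]:
    """Scale (width, height) to about target_area pixels, keeping aspect ratio.

    Each side is rounded to the nearest multiple of `multiple`
    (an assumption for illustration, not taken from the original code).
    """
    scale = math.sqrt(target_area / (width * height))

    def snap(value: int) -> int:
        # Round the scaled side to the nearest multiple, never below one step.
        return max(multiple, int(round(value * scale / multiple)) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    print(target_size(1024, 1024))  # (1024, 1024)
    print(target_size(1920, 1080))  # (1368, 768)
```

The returned pair could be used in place of new_width and new_height in both the cv2.resize call and the pipeline call.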