controlnet-canny-sdxl-1.0 Open-Source Image Generation Model - Precise Control of Image Output via Canny Edge Detection

Controlnet Canny Sdxl 1.0

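Developed by diffusers

A ControlNet model trained on Stable Diffusion XL that uses Canny edge detection to give precise control over image generation

Image Generation #SDXL Edge Control #Precise Image Generation #Architectural Scene Design

Downloads: 13.17k

Released: 8/1/2023

Model Overview

This model is a set of ControlNet weights trained on Stable Diffusion XL. It conditions the image generation process on Canny edge maps, which gives more precise control over the composition of the generated image.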

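Model Features

Precise edge control

Image contours are extracted with the Canny edge detection algorithm and used to control the composition of the generated image

High-resolution support

Supports high-resolution image generation at 1024 pixels and above, refined through two-stage training

SDXL compatible

Built on the Stable Diffusion XL base model and retains its high-quality image generation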

Model Capabilities

Image generation conditioned on edge detection

High-resolution image synthesis

Precise composition control

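Use Cases

Creative design

Concept art

Artists can use an edge sketch to guide the generation of richly detailed concept art

Example: the futuristic research complex image

Product design

Designers can generate high-quality product renders from simple outlines

Photography enhancement

Photorealistic scene generation

Generates photorealistic scenes from edge information

Examples: the couple watching a sunset and the street photograph of a woman

Film style emulation

Generates high-quality images in a specific film style (such as Kodak Ektar 100)

Examples: the street photograph of a woman and the tornado scene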

🚀 SDXL-controlnet: Canny

SDXL-controlnet: Canny is a set of ControlNet weights trained on stabilityai/stable-diffusion-xl-base-1.0 with Canny conditioning. Some example images follow:

Example images

Prompt	Image
a couple watching a romantic sunset, 4k photo
ultrarealistic shot of a furry blue bird
a woman, close up, detailed, beautiful, street photography, photorealistic, detailed, Kodak ektar 100, natural, candid shot
Cinematic, neoclassical table in the living room, cinematic, contour, lighting, highly detailed, winter, golden hour
a tornado hitting grass field, 1980's film grain. overcast, muted colors.

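🚀 Quick Start

Install dependencies

First, install the following libraries:

pip install accelerate transformers safetensors opencv-python diffusers

Run the example code

Once installation is complete, run the following code: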

from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers.utils import load_image
from PIL import Image
import torch
import numpy as np
import cv2

prompt = "aerial view, a futuristic research complex in a bright foggy jungle, hard lighting"
negative_prompt = "low quality, bad quality, sketches"

# Source image from which the Canny edge map is extracted
image = load_image("https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png")

controlnet_conditioning_scale = 0.5  # recommended for good generalization

# Load the ControlNet weights, a VAE variant patched to run in fp16, and the SDXL base pipeline
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0",
    torch_dtype=torch.float16
)
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    torch_dtype=torch.float16,
)
# Keep models on the CPU and move each to the GPU only while it runs, to save GPU memory
pipe.enable_model_cpu_offload()

# Build the conditioning image: Canny edges (thresholds 100 and 200) stacked into three channels
image = np.array(image)
image = cv2.Canny(image, 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
image = Image.fromarray(image)

images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=image,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
).images

images[0].save("hug_lab.png")
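The edge-map preparation in the script above can be factored into reusable helpers. The following is a minimal sketch, not part of the original model card: the function names `edges_to_control_array` and `make_canny_condition` are hypothetical, and the default thresholds simply repeat the 100 and 200 used above.

```python
import numpy as np


def edges_to_control_array(edges: np.ndarray) -> np.ndarray:
    """Stack a single-channel (H, W) edge map into an (H, W, 3) uint8 array."""
    if edges.ndim != 2:
        raise ValueError("expected a 2-D edge map")
    return np.repeat(edges[:, :, None], 3, axis=2).astype(np.uint8)


def make_canny_condition(image, low_threshold=100, high_threshold=200):
    """Turn a PIL image into a 3-channel Canny edge image for the pipeline.

    Hypothetical helper: OpenCV and Pillow are imported lazily so that
    edges_to_control_array stays usable when they are not installed.
    """
    import cv2
    from PIL import Image

    rgb = np.array(image.convert("RGB"))  # uint8 array of shape (H, W, 3)
    edges = cv2.Canny(rgb, low_threshold, high_threshold)  # (H, W), values 0 or 255
    return Image.fromarray(edges_to_control_array(edges))
```

Lower thresholds generally keep more edges and constrain the composition more tightly, while higher thresholds keep only the strongest contours. Suitable values depend on the input image.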

For more details, see the official documentation for StableDiffusionXLControlNetPipeline.
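Since the example sets `controlnet_conditioning_scale` to 0.5, it can be useful to compare a few values side by side. The sketch below is an illustration only: `sweep_conditioning_scale` is a hypothetical helper, the scale values are arbitrary, and it assumes only that the pipeline is callable with the same keyword arguments as in the example and returns an object with an `images` list.

```python
def sweep_conditioning_scale(pipe, prompt, control_image,
                             scales=(0.3, 0.5, 0.8), **pipe_kwargs):
    """Run the pipeline once per scale and return {scale: first image}.

    Extra keyword arguments (for example negative_prompt) are passed
    through to the pipeline unchanged.
    """
    results = {}
    for scale in scales:
        output = pipe(
            prompt,
            image=control_image,
            controlnet_conditioning_scale=scale,
            **pipe_kwargs,
        )
        results[scale] = output.images[0]
    return results
```

Each call samples fresh noise, so differences between the outputs reflect both the scale and the random seed unless a seeded generator is recreated for every call.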

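🔧 Technical Details

Training script

Our training script was built on top of the official training script provided here.

Training data

The checkpoint was first trained for 20,000 steps on LAION 6a resized to a max minimum dimension of 384. It was then trained for a further 20,000 steps on LAION 6a resized to a max minimum dimension of 1024 and filtered to contain only images with a minimum dimension of 1024. We found that the additional high-resolution fine-tuning was necessary for image quality.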
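The resizing and filtering described above can be sketched as follows. This is one possible reading, not the authors' preprocessing code: it assumes "max minimum dimension" means the shorter side is scaled down to at most the target while preserving aspect ratio, and both function names are hypothetical.

```python
def resize_to_max_min_dim(width, height, target):
    """Return (width, height) scaled so the shorter side is at most `target`.

    Assumption: images whose shorter side is already <= target are left
    unchanged (no upscaling).
    """
    short = min(width, height)
    if short <= target:
        return width, height
    new_width = max(1, round(width * target / short))
    new_height = max(1, round(height * target / short))
    return new_width, new_height


def keep_for_high_res_stage(width, height, min_dim=1024):
    """Second-stage filter: keep only images whose shorter side is >= min_dim."""
    return min(width, height) >= min_dim
```

Under this reading, a 1920x1080 image would be resized to 683x384 for the first stage and dropped from the second stage only if its shorter side were below 1024.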

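Compute

A single machine with 8 A100 GPUs.

Batch size

Data parallel with a per-GPU batch size of 8, for a total batch size of 64.

Hyperparameters

Constant learning rate of 1e-4, scaled by batch size to a total learning rate of 64e-4.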
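The batch size and learning rate figures above follow from simple arithmetic. The snippet below reproduces them as a sketch, assuming linear scaling of the base learning rate by the total batch size; the function name `scaled_learning_rate` is hypothetical.

```python
def scaled_learning_rate(base_lr, per_gpu_batch, num_gpus):
    """Return (total_batch, scaled_lr) under linear learning-rate scaling."""
    total_batch = per_gpu_batch * num_gpus
    return total_batch, base_lr * total_batch


# Values stated in this section: 8 GPUs, per-GPU batch size 8, base rate 1e-4
total_batch, total_lr = scaled_learning_rate(1e-4, per_gpu_batch=8, num_gpus=8)
print(total_batch, total_lr)
```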

Mixed precision

fp16.

📄 License

This project is released under the OpenRail++ license.

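Summary

Property	Details
Model type	SDXL-controlnet: Canny
Base model	stabilityai/stable-diffusion-xl-base-1.0
Training data	20,000 steps on LAION 6a resized to a max minimum dimension of 384, then 20,000 steps on LAION 6a resized to a max minimum dimension of 1024 and filtered to contain only images with a minimum dimension of 1024
Compute	A single machine with 8 A100 GPUs
Batch size	Per-GPU batch size of 8, total batch size of 64
Hyperparameters	Constant learning rate of 1e-4, scaled by batch size to a total learning rate of 64e-4
Mixed precision	fp16
License	OpenRail++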