Open-source controlnet-canny-sdxl-1.0-small Model - Controlled by Canny edge detection, compact and efficient!

Controlnet Canny Sdxl 1.0 Small

Developed by diffusers

A small ControlNet trained on Stable Diffusion XL for Canny edge conditioning. The checkpoint is 7 times smaller than the original SDXL ControlNet checkpoint.

Image Generation #Canny edge control #SDXL lightweight optimization #Image generation control

Release Time : 8/15/2023

Model Overview

This model is a lightweight ControlNet for Stable Diffusion XL that steers image generation with Canny edge maps. It is used together with the original SDXL base model, so generation quality still comes from SDXL while the ControlNet checkpoint itself is significantly smaller.

Model Features

Lightweight design

The model size is 7 times smaller than the original XL control network, making it more suitable for resource-limited environments

Precise edge control

Achieves high-precision image structure control through Canny edge detection

Retains SDXL features

Trained on stabilityai/stable-diffusion-xl-base-1.0, preserving the powerful generation capabilities of the original model

Experimental nature

The model is still experimental, and the community is encouraged to keep optimizing and improving it

Model Capabilities

Edge detection-based image generation

High-precision structure control

Photorealistic image generation

Artistic style image generation

Use Cases

Creative design

Concept art creation

Generate complete artistic concept images from edge sketches

Examples show a bird's-eye view of a futuristic research base

Portrait photography enhancement

Generate high-quality portrait photos based on edge contours

Examples show close-up female portraits with photorealistic quality

Film and gaming

Character design

Generate complete character designs from simple line drawings

Examples show the image of Megatron in an apocalyptic world

Scene design

Generate complex scenes based on edge maps

Examples show a ruined city background

🚀 Small SDXL-controlnet: Canny

These are small controlnet weights trained on stabilityai/stable-diffusion-xl-base-1.0 with canny conditioning, and this checkpoint is 7x smaller than the original XL controlnet checkpoint.

🚀 Quick Start

You can find some example images below.

Example Images

Prompt: aerial view, a futuristic research complex in a bright foggy jungle, hard lighting
Prompt: a woman, close up, detailed, beautiful, street photography, photorealistic, detailed, Kodak ektar 100, natural, candid shot
Prompt: megatron in an apocalyptic world ground, ruined city in the background, photorealistic
Prompt: a couple watching sunset, 4k photo

✨ Features

Trained on stabilityai/stable-diffusion-xl-base-1.0 with canny conditioning.
7x smaller than the original XL controlnet checkpoint.

📦 Installation

Make sure to first install the libraries:

pip install accelerate transformers safetensors opencv-python diffusers
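
After installing, it can help to confirm that every library is importable before loading any weights. The following is a minimal sketch using only the standard library; the helper name `missing_packages` and the `REQUIRED` mapping are illustrative and not part of any of these libraries. Note that `opencv-python` is imported as `cv2`.

```python
from importlib.util import find_spec

# pip package name -> importable module name (opencv-python is imported as cv2)
REQUIRED = {
    "accelerate": "accelerate",
    "transformers": "transformers",
    "safetensors": "safetensors",
    "opencv-python": "cv2",
    "diffusers": "diffusers",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        # Print a ready-to-run install command for whatever is absent
        print("Missing libraries, install with: pip install " + " ".join(missing))
    else:
        print("All required libraries are installed.")
```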

💻 Usage Examples

Basic Usage

from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers.utils import load_image
from PIL import Image
import torch
import numpy as np
import cv2

prompt = "aerial view, a futuristic research complex in a bright foggy jungle, hard lighting"
negative_prompt = "low quality, bad quality, sketches"

image = load_image("https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png")

controlnet_conditioning_scale = 0.5  # recommended for good generalization

controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0-small",
    torch_dtype=torch.float16
)
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    torch_dtype=torch.float16,
)
pipe.enable_model_cpu_offload()

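# Build the Canny edge map used as the conditioning image
image = np.array(image)
image = cv2.Canny(image, 100, 200)  # hysteresis thresholds: low=100, high=200
image = image[:, :, None]  # add a channel axis: (H, W) -> (H, W, 1)
image = np.concatenate([image, image, image], axis=2)  # repeat to 3 channels
image = Image.fromarray(image)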

images = pipe(
    prompt, negative_prompt=negative_prompt, image=image, controlnet_conditioning_scale=controlnet_conditioning_scale,
).images

images[0].save("hug_lab.png")

To get more details, check out the official documentation of StableDiffusionXLControlNetPipeline.
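
The two numbers passed to `cv2.Canny` in the example (100 and 200) are the low and high hysteresis thresholds. As a rough illustration of what they do, here is a one-dimensional sketch in plain Python. It is not OpenCV's implementation: the function names are hypothetical, the input is a made-up list of gradient magnitudes, and the exact handling of values equal to a threshold is an assumption.

```python
def classify_gradients(magnitudes, low=100, high=200):
    """Label each gradient magnitude with a double threshold.

    'strong': above the high threshold, always kept as an edge
    'weak'  : between the thresholds, kept only if linked to a strong pixel
    'none'  : below the low threshold, discarded
    """
    labels = []
    for m in magnitudes:
        if m > high:
            labels.append("strong")
        elif m >= low:
            labels.append("weak")
        else:
            labels.append("none")
    return labels


def hysteresis_1d(magnitudes, low=100, high=200):
    """Keep strong pixels plus weak pixels connected to them (1-D neighbours)."""
    labels = classify_gradients(magnitudes, low, high)
    n = len(labels)
    keep = [label == "strong" for label in labels]
    # Left-to-right pass: weak pixels inherit from a kept left neighbour
    for i in range(1, n):
        if labels[i] == "weak" and keep[i - 1]:
            keep[i] = True
    # Right-to-left pass: weak pixels inherit from a kept right neighbour
    for i in range(n - 2, -1, -1):
        if labels[i] == "weak" and keep[i + 1]:
            keep[i] = True
    return keep


# The weak pixels next to the strong one survive; the isolated weak pixel does not
edges = hysteresis_1d([50, 150, 250, 150, 50, 150, 50])
```

Lowering either threshold generally yields a denser edge map, which gives the ControlNet more structure to follow.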

🔧 Technical Details

Training

Our training script was built on top of the official training script that we provide here. You can refer to this script for full disclosure.

This checkpoint was not trained with distillation. We simply use a smaller ControlNet initialized from the SDXL UNet. We encourage the community to try distillation too. This resource might be of help in this regard.
To learn more about how the ControlNet was initialized, refer to this code block.
The smaller ControlNet does not have any attention blocks.
The model works well on most conditioning images, but for more complex conditionings the bigger checkpoints might be better. We are still working on improving the quality of this checkpoint and are looking for feedback from the community.
We recommend experimenting with the controlnet_conditioning_scale and guidance_scale arguments for potentially better image generation quality.
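
One way to experiment with these two arguments is a small grid sweep. The sketch below is only an illustration: `sweep` and the `generate` callable are hypothetical names, and the scale values are arbitrary starting points (only 0.5 comes from the usage example). `generate` stands in for a call to the pipeline, so the loop itself can be tried without a GPU.

```python
from itertools import product


def sweep(generate, conditioning_scales=(0.3, 0.5, 0.7), guidance_scales=(5.0, 7.5)):
    """Call `generate` once per (controlnet_conditioning_scale, guidance_scale) pair.

    Returns a dict mapping each pair to whatever `generate` returned.
    """
    results = {}
    for c_scale, g_scale in product(conditioning_scales, guidance_scales):
        results[(c_scale, g_scale)] = generate(
            controlnet_conditioning_scale=c_scale,
            guidance_scale=g_scale,
        )
    return results


# With the objects from Basic Usage (pipe, prompt, negative_prompt, image),
# `generate` could be wrapped like this:
#
# def generate(**kwargs):
#     return pipe(prompt, negative_prompt=negative_prompt, image=image, **kwargs).images[0]
#
# for (c_scale, g_scale), img in sweep(generate).items():
#     img.save(f"hug_lab_c{c_scale}_g{g_scale}.png")
```

Saving one file per pair makes it easy to compare how strictly the output follows the edge map against how closely it follows the prompt.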

Training Data

The model was trained on 3M images from the LAION aesthetic 6 plus subset, with a batch size of 256 for 50k steps and a constant learning rate of 3e-5.
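
For a sense of scale, the figures above imply roughly how many passes over the data the training made. This is back-of-the-envelope arithmetic, assuming that 256 is the total batch size per optimizer step and that the dataset holds exactly 3,000,000 images; the source does not confirm either.

```python
num_images = 3_000_000   # "3M images" (assumed exact)
batch_size = 256         # assumed to be the total batch size per step
steps = 50_000           # "50k steps"

samples_seen = batch_size * steps          # 12,800,000 samples processed
approx_epochs = samples_seen / num_images  # about 4.27 passes over the data

print(f"samples seen: {samples_seen:,}")
print(f"approximate epochs: {approx_epochs:.2f}")
```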

Compute

One 8xA100 machine

Mixed Precision

FP16

📄 License

openrail++

⚠️ Important Note

This checkpoint is experimental and there's a lot of room for improvement. We encourage the community to build on top of it, improve it, and provide us with feedback.

Property	Details
Model Type	Small SDXL-controlnet: Canny
Training Data	3M images from LAION aesthetic 6 plus subset
Compute	One 8xA100 machine
Mixed Precision	FP16
License	openrail++
