Small SDXL-controlnet: Canny
These are small controlnet weights trained on stabilityai/stable-diffusion-xl-base-1.0 with canny conditioning, and this checkpoint is 7x smaller than the original XL controlnet checkpoint.
Quick Start
You can find some example images below.
Example Images
- Prompt: aerial view, a futuristic research complex in a bright foggy jungle, hard lighting
- Prompt: a woman, close up, detailed, beautiful, street photography, photorealistic, detailed, Kodak ektar 100, natural, candid shot
- Prompt: megatron in an apocalyptic world ground, runied city in the background, photorealistic
- Prompt: a couple watching sunset, 4k photo

⨠Features
- Trained on
stabilityai/stable-diffusion-xl-base-1.0
with canny conditioning.
- 7x smaller than the original XL controlnet checkpoint.
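One way to sanity-check a size claim like this is to count parameters in both checkpoints. The sketch below is only illustrative: `count_parameters` and `size_ratio` are hypothetical helper names, and the source does not say whether "7x" refers to parameter count or file size, so the number you measure may differ. The helpers work on any object that exposes `.parameters()` yielding tensors with `.numel()`, as `torch.nn.Module` does.

```python
def count_parameters(model):
    """Total number of parameters in a torch.nn.Module-like object."""
    return sum(p.numel() for p in model.parameters())


def size_ratio(large_model, small_model):
    """How many times more parameters `large_model` has than `small_model`."""
    return count_parameters(large_model) / count_parameters(small_model)
```

With `torch` and `diffusers` installed, you could load this checkpoint and a larger SDXL ControlNet with `ControlNetModel.from_pretrained(...)` and pass both models to `size_ratio`.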
Installation
Make sure to first install the libraries:
pip install accelerate transformers safetensors opencv-python diffusers
Usage Examples
Basic Usage
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers.utils import load_image
from PIL import Image
import torch
import numpy as np
import cv2

prompt = "aerial view, a futuristic research complex in a bright foggy jungle, hard lighting"
negative_prompt = "low quality, bad quality, sketches"

# Source image; its Canny edge map is used as the conditioning image.
image = load_image("https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png")

controlnet_conditioning_scale = 0.5

# Load the small Canny ControlNet in half precision.
controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0-small",
    torch_dtype=torch.float16,
)
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    torch_dtype=torch.float16,
)
# Offload idle submodules to the CPU to reduce GPU memory use.
pipe.enable_model_cpu_offload()

# Compute Canny edges (thresholds 100 and 200), then stack the
# single-channel edge map into a 3-channel image.
image = np.array(image)
image = cv2.Canny(image, 100, 200)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
image = Image.fromarray(image)

images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=image,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
).images
images[0].save("hug_lab.png")

To get more details, check out the official documentation of StableDiffusionXLControlNetPipeline.
Technical Details
Training
Our training script was built on top of the official training script that we provide here.
You can refer to this script for full disclosure.
- This checkpoint was not trained with distillation. It is simply a smaller ControlNet initialized from the SDXL UNet. We encourage the community to try distillation too. This resource might be of help in this regard.
- To learn more about how the ControlNet was initialized, refer to this code block.
- It does not have any attention blocks.
- The model works fairly well on most conditioning images, but for more complex conditionings the bigger checkpoints might be better. We are still working on improving the quality of this checkpoint and are looking for feedback from the community.
- We recommend playing around with the controlnet_conditioning_scale and guidance_scale arguments for potentially better image generation quality.
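To make the last recommendation concrete, here is a minimal sketch of a grid sweep over the two arguments. `sweep_settings` is a hypothetical helper, not part of diffusers; it only assumes that `pipe` accepts `controlnet_conditioning_scale` and `guidance_scale` as keyword arguments and returns an object with an `.images` list, as the pipeline in the usage example does.

```python
from itertools import product


def sweep_settings(pipe, prompt, image, conditioning_scales, guidance_scales, **kwargs):
    """Run `pipe` once per (controlnet_conditioning_scale, guidance_scale) pair.

    Returns a dict mapping each (conditioning_scale, guidance_scale) pair
    to the first image generated with those settings.
    """
    results = {}
    for cond_scale, guidance in product(conditioning_scales, guidance_scales):
        output = pipe(
            prompt,
            image=image,
            controlnet_conditioning_scale=cond_scale,
            guidance_scale=guidance,
            **kwargs,  # e.g. negative_prompt, passed through unchanged
        )
        results[(cond_scale, guidance)] = output.images[0]
    return results
```

For instance, `sweep_settings(pipe, prompt, image, [0.3, 0.5, 0.8], [5.0, 7.5], negative_prompt=negative_prompt)` would run six generations that you can save and compare side by side. The specific values are arbitrary starting points, not recommendations from the authors.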
Training Data
The model was trained on 3M images from the LAION aesthetic 6 plus subset, with a batch size of 256 for 50k steps at a constant learning rate of 3e-5.
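As a rough back-of-the-envelope check, these numbers imply how many samples the model saw during training. The sketch assumes that 256 is the total effective batch size per optimizer step (not a per-GPU value) and that images were sampled roughly uniformly; neither is stated in the source.

```python
num_images = 3_000_000   # dataset size stated above
batch_size = 256         # ASSUMPTION: total effective batch size per step
train_steps = 50_000     # training steps stated above

samples_seen = batch_size * train_steps
approx_epochs = samples_seen / num_images

print(f"samples seen: {samples_seen:,}")
print(f"approximate passes over the data: {approx_epochs:.2f}")
```

Under those assumptions the run covers 12.8M samples, or a little over four passes over the 3M-image subset.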
Compute
One 8xA100 machine
Mixed Precision
FP16
License
openrail++
Important Note
This checkpoint is experimental and there's a lot of room for improvement. We encourage the community to build on top of it, improve it, and provide us with feedback.
| Property | Details |
|---|---|
| Model Type | Small SDXL-controlnet: Canny |
| Training Data | 3M images from LAION aesthetic 6 plus subset |
| Compute | One 8xA100 machine |
| Mixed Precision | FP16 |
| License | openrail++ |