ControlNet v1.1 Open-Source AI Model - Free Image Generation and Super Resolution with Tile Image Condition Support

Control V11f1e Sd15 Tile

Developed by lllyasviel

ControlNet v1.1 is a neural network structure that controls pre-trained large diffusion models by adding extra conditions. This checkpoint is conditioned on tile images, which makes it particularly suitable for image generation and super-resolution tasks.

Category: Image Generation (Other) | Open Source License: OpenRAIL | Tags: #Image super-resolution #Detail enhancement #Block processing

Downloads 14.39k

Release Time: 5/4/2023

Model Overview

This model is trained on top of Stable Diffusion v1-5 and generates high-quality images from an input tile image condition. It is suitable for scenarios such as image enhancement and detail generation.

Model Features

Tile image condition control

It generates a detailed, high-quality image of the same size as the input tile image condition. This resembles a super-resolution model, but the model is not limited to upscaling.

Efficient training

Learning remains robust even on small datasets (fewer than 50,000 samples), and training is about as fast as fine-tuning a diffusion model.

Strong compatibility

It can be used in conjunction with Stable Diffusion v1-5 and other diffusion models (such as dreamboothed stable diffusion).

Model Capabilities

Image super-resolution

Detail enhancement

Conditional image generation

Image-to-image conversion

Use Cases

Image processing

Image detail enhancement

Perform detail enhancement and super-resolution processing on low-resolution or blurred images

Generate high-quality images of the same size as the input image but containing richer details

Artistic creation

Generate artistic style images based on tile image conditions

Add artistic style details while maintaining the structure of the input image

🚀 Controlnet - v1.1 - Tile Version

ControlNet v1.1 is a neural network structure that adds extra conditions to control diffusion models. It can be used in combination with Stable Diffusion to generate high-quality images, offering more flexibility and control in image generation.

🚀 Quick Start

Installation

First, install diffusers and the related packages:

$ pip install diffusers transformers accelerate

Usage

Here is a code example to show you how to use this model:

import torch
from PIL import Image
from diffusers import ControlNetModel, DiffusionPipeline
from diffusers.utils import load_image

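def resize_for_condition_image(input_image: Image.Image, resolution: int) -> Image.Image:
    """Scale the shorter side to `resolution`, then snap both sides to multiples of 64."""
    input_image = input_image.convert("RGB")
    W, H = input_image.size
    # Scale factor that maps the shorter side onto the target resolution.
    k = float(resolution) / min(H, W)
    H *= k
    W *= k
    # Round each side to the nearest multiple of 64.
    H = int(round(H / 64.0)) * 64
    W = int(round(W / 64.0)) * 64
    img = input_image.resize((W, H), resample=Image.LANCZOS)
    return img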

controlnet = ControlNetModel.from_pretrained('lllyasviel/control_v11f1e_sd15_tile', 
                                             torch_dtype=torch.float16)
pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5",
                                         custom_pipeline="stable_diffusion_controlnet_img2img",
                                         controlnet=controlnet,
                                         torch_dtype=torch.float16).to('cuda')
# Requires the optional xformers package; remove this line if it is not installed.
pipe.enable_xformers_memory_efficient_attention()

source_image = load_image('https://huggingface.co/lllyasviel/control_v11f1e_sd15_tile/resolve/main/images/original.png')

condition_image = resize_for_condition_image(source_image, 1024)
image = pipe(prompt="best quality", 
             negative_prompt="blur, lowres, bad anatomy, bad hands, cropped, worst quality", 
             image=condition_image, 
             controlnet_conditioning_image=condition_image, 
             width=condition_image.size[0],
             height=condition_image.size[1],
             strength=1.0,
             generator=torch.manual_seed(0),
             num_inference_steps=32,
            ).images[0]

image.save('output.png')

The original and generated images (figure: tile_output).
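To make the resizing step easier to follow, here is a minimal sketch that reproduces only the size arithmetic of resize_for_condition_image, without loading an image. The helper name condition_size is hypothetical and not part of the original example.

```python
from typing import Tuple


def condition_size(width: int, height: int, resolution: int) -> Tuple[int, int]:
    """Return the (width, height) that resize_for_condition_image would produce.

    Hypothetical helper: it mirrors the arithmetic of the original function
    (scale the shorter side to `resolution`, round each side to a multiple of 64)
    but never touches pixel data.
    """
    k = float(resolution) / min(height, width)
    new_w = int(round(width * k / 64.0)) * 64
    new_h = int(round(height * k / 64.0)) * 64
    return new_w, new_h


# A 512x384 input scaled so that its shorter side becomes 1024 pixels.
print(condition_size(512, 384, 1024))  # (1344, 1024)
```

Because each side is rounded to a multiple of 64, the aspect ratio can shift slightly: here 512x384 (4:3) becomes 1344x1024 rather than the exact 1365x1024.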

✨ Features

Extra Conditional Control: ControlNet adds extra conditions to diffusion models, enabling more precise control over image generation.
Combination with Stable Diffusion: It can be used in combination with Stable Diffusion, such as runwayml/stable-diffusion-v1-5, to generate high-quality images.
Multiple Conditioning Types: There are 14 different checkpoints, each trained with a different type of conditioning, providing a wide range of application scenarios.

💻 Usage Examples

Basic Usage

The code example in the Quick Start section shows the basic usage of this model. You can adjust parameters such as prompt, negative_prompt, and strength to generate different images.
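As a rough guide to what strength does, the sketch below assumes the community pipeline follows the same convention as the built-in diffusers img2img pipelines, where strength decides how many of the scheduled denoising steps actually run. The function name effective_steps is hypothetical, and the behavior of the custom pipeline may differ.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Number of denoising steps actually executed for a given strength.

    Assumption: this mirrors the convention of the built-in diffusers img2img
    pipelines, which skip the first part of the schedule when strength < 1.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(num_inference_steps * strength), num_inference_steps)


print(effective_steps(32, 1.0))   # 32: the full schedule, as in the Quick Start
print(effective_steps(32, 0.5))   # 16: only the second half of the schedule
```

Under this assumption, lowering strength keeps the output closer to the input image because fewer denoising steps are applied to it.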

Advanced Usage

You can experiment with different diffusion models (e.g., dreamboothed stable diffusion) and different conditioning images to explore more possibilities of this model.

📚 Documentation

Model Details

Property	Details
Developed by	Lvmin Zhang, Maneesh Agrawala
Model Type	Diffusion-based text-to-image generation model
Language(s)	English
License	[The CreativeML OpenRAIL M license](https://huggingface.co/spaces/CompVis/stable-diffusion-license) is an [Open RAIL M license](https://www.licenses.ai/blog/2022/8/18/naming-convention-of-responsible-ai-licenses), adapted from the work that BigScience and the RAIL Initiative are jointly carrying out in the area of responsible AI licensing. See also [the article about the BLOOM Open RAIL license](https://bigscience.huggingface.co/blog/the-bigscience-rail-license) on which our license is based.
Resources for more information	GitHub Repository, Paper
Cite as	@misc{zhang2023adding, title={Adding Conditional Control to Text-to-Image Diffusion Models}, author={Lvmin Zhang and Maneesh Agrawala}, year={2023}, eprint={2302.05543}, archivePrefix={arXiv}, primaryClass={cs.CV} }

Introduction

ControlNet was proposed in Adding Conditional Control to Text-to-Image Diffusion Models by Lvmin Zhang and Maneesh Agrawala. According to the abstract, ControlNet controls pretrained large diffusion models to support additional input conditions. It learns task-specific conditions in an end-to-end way, and training is fast enough to be performed on personal devices.

Other released checkpoints v1-1

The authors released 14 different checkpoints, each trained with [Stable Diffusion v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) on a different type of conditioning:

Model Name	Control Image Overview	Condition Image
lllyasviel/control_v11p_sd15_canny	Trained with canny edge detection	A monochrome image with white edges on a black background.
lllyasviel/control_v11e_sd15_ip2p	Trained with pixel-to-pixel instruction	No condition.
lllyasviel/control_v11p_sd15_inpaint	Trained with image inpainting	No condition.
lllyasviel/control_v11p_sd15_mlsd	Trained with multi-level line segment detection	An image with annotated line segments.
lllyasviel/control_v11f1p_sd15_depth	Trained with depth estimation	An image with depth information, usually represented as a grayscale image.
lllyasviel/control_v11p_sd15_normalbae	Trained with surface normal estimation	An image with surface normal information, usually represented as a color-coded image.
lllyasviel/control_v11p_sd15_seg	Trained with image segmentation	An image with segmented regions, usually represented as a color-coded image.
lllyasviel/control_v11p_sd15_lineart	Trained with line art generation	An image with line art, usually black lines on a white background.
lllyasviel/control_v11p_sd15s2_lineart_anime	Trained with anime line art generation	An image with anime-style line art.
lllyasviel/control_v11p_sd15_openpose	Trained with human pose estimation	An image with human poses, usually represented as a set of keypoints or skeletons.
lllyasviel/control_v11p_sd15_scribble	Trained with scribble-based image generation	An image with scribbles, usually random or user-drawn strokes.
lllyasviel/control_v11p_sd15_softedge	Trained with soft edge image generation	An image with soft edges, usually to create a more painterly or artistic effect.
lllyasviel/control_v11e_sd15_shuffle	Trained with image shuffling	An image with shuffled patches or regions.
lllyasviel/control_v11f1e_sd15_tile	Trained with image tiling	A blurry image or part of an image.
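When switching between these checkpoints in code, a small lookup table keeps the repository names in one place. The sketch below copies the names from the table above; the short keys and the checkpoint_for helper are hypothetical conveniences, not part of diffusers or the original model card.

```python
# Short conditioning names (hypothetical keys) mapped to the repository IDs
# listed in the table above.
CHECKPOINTS = {
    "canny": "lllyasviel/control_v11p_sd15_canny",
    "ip2p": "lllyasviel/control_v11e_sd15_ip2p",
    "inpaint": "lllyasviel/control_v11p_sd15_inpaint",
    "mlsd": "lllyasviel/control_v11p_sd15_mlsd",
    "depth": "lllyasviel/control_v11f1p_sd15_depth",
    "normalbae": "lllyasviel/control_v11p_sd15_normalbae",
    "seg": "lllyasviel/control_v11p_sd15_seg",
    "lineart": "lllyasviel/control_v11p_sd15_lineart",
    "lineart_anime": "lllyasviel/control_v11p_sd15s2_lineart_anime",
    "openpose": "lllyasviel/control_v11p_sd15_openpose",
    "scribble": "lllyasviel/control_v11p_sd15_scribble",
    "softedge": "lllyasviel/control_v11p_sd15_softedge",
    "shuffle": "lllyasviel/control_v11e_sd15_shuffle",
    "tile": "lllyasviel/control_v11f1e_sd15_tile",
}


def checkpoint_for(condition: str) -> str:
    """Return the repository ID for a conditioning type, or raise a clear error."""
    try:
        return CHECKPOINTS[condition.lower()]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(f"unknown condition {condition!r}; expected one of: {known}")


# The result can be passed to ControlNetModel.from_pretrained(...) as in the Quick Start.
print(checkpoint_for("tile"))  # lllyasviel/control_v11f1e_sd15_tile
```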

More information

For more information, please also have a look at the Diffusers ControlNet Blog Post and the official docs.

🔧 Technical Details

ControlNet is a neural network structure that adds extra conditions to diffusion models. It can learn task-specific conditions in an end-to-end way, and the training process is fast. The model can be trained on personal devices with a small training dataset, or it can scale to large amounts of data when powerful computation clusters are available.

📄 License

This model is under [The CreativeML OpenRAIL M license](https://huggingface.co/spaces/CompVis/stable-diffusion-license), which is an [Open RAIL M license](https://www.licenses.ai/blog/2022/8/18/naming-convention-of-responsible-ai-licenses).
