controlnet-inpaint-endpoint Open-source Model - Realize Conditional Control of Image Restoration Based on Stable Diffusion

Controlnet Inpaint Endpoint

Developed by OrderAndChaos

This is the inpainting checkpoint of ControlNet v1.1, a neural network structure that controls Stable Diffusion models by adding extra conditions; here the condition is an image with a masked region.

Categories: Image Generation, Other | Open Source License: OpenRAIL | Tags: #Image Inpainting, #ControlNet Control, #Stable Diffusion Integration

Downloads 56

Release Time : 5/23/2023

Model Overview

This model is the image inpainting version of ControlNet and can be used in conjunction with Stable Diffusion to control the image generation process by adding inpainting conditions.

Model Features

Image Inpainting Control

Capable of precisely controlling the area and content of image inpainting based on given images and masks.

Compatibility with Stable Diffusion

Trained on Stable Diffusion v1.5 and recommended for use with it; other Stable Diffusion v1.5 derivatives may also work experimentally.

End-to-End Learning

Learns task-specific conditions through end-to-end training, enabling robust performance even on small datasets.

Model Capabilities

Image Inpainting

Image Generation

Conditional Control Generation

Use Cases

Artistic Creation

Restoring Damaged Artwork

Use this model to fill in damaged or missing regions of old photos and artwork.

The inpainted regions are generated to match the style and detail of the surrounding image.

Design

Product Design Modifications

Quickly modify specific parts of product images during the design process without redrawing.

The modified regions blend with the rest of the image, so the design stays visually consistent.

🚀 Controlnet - v1.1 - InPaint Version

This project focuses on the InPaint Version of Controlnet v1.1, which is a conversion of the original checkpoint into diffusers format. It can be used in combination with Stable Diffusion to control diffusion models by adding extra conditions, enabling more diverse image generation.

🚀 Quick Start

Prerequisites

You need a Hugging Face access token (HF_TOKEN) and the URL of a deployed Hugging Face Inference Endpoint for this model (API_ENDPOINT).
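One way to avoid hard-coding these two values is to read them from environment variables. The sketch below assumes the variable names `HF_TOKEN` and `API_ENDPOINT` (matching the constants in the usage script) and uses a hypothetical helper, `load_endpoint_config`; it is an illustration, not part of the project.

```python
import os


def load_endpoint_config(env=None):
    """Return (token, endpoint) from the environment, or raise ValueError.

    `load_endpoint_config` is a hypothetical helper for illustration.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    endpoint = env.get("API_ENDPOINT", "").strip()
    # Collect the names of any settings that are absent or empty.
    settings = (("HF_TOKEN", token), ("API_ENDPOINT", endpoint))
    missing = [name for name, value in settings if not value]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))
    return token, endpoint
```

Failing early with a clear message is usually easier to debug than an authorization error returned by the endpoint.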

Installation

$ pip install diffusers transformers accelerate requests

Usage

import base64
import requests

HF_TOKEN = 'hf_xxxxxxxxxxxxx'
API_ENDPOINT = 'https://xxxxxxxxxxx.us-east-1.aws.endpoints.huggingface.cloud'

def load_image(path):
    try:
        with open(path, 'rb') as file:
            return file.read()
    except FileNotFoundError as error:
        print('Error reading image:', error)


def get_b64_image(path):
    image_buffer = load_image(path)
    if image_buffer:
        return base64.b64encode(image_buffer).decode('utf-8')


def process_images(original_image_path, mask_image_path, result_path, prompt, width, height):
    original_b64 = get_b64_image(original_image_path)
    mask_b64 = get_b64_image(mask_image_path)

    if not original_b64 or not mask_b64:
        return

    body = {
        'inputs': prompt,
        'image': original_b64,
        'mask_image': mask_b64,
        'width': width,
        'height': height
    }

    headers = {
        'Authorization': f'Bearer {HF_TOKEN}',
        'Content-Type': 'application/json',
        'Accept': 'image/png'
    }

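    response = requests.post(
        API_ENDPOINT,
        json=body,
        headers=headers
    )
    # Stop on an HTTP error so an error message is never saved as a PNG.
    if not response.ok:
        print('Request failed:', response.status_code, response.text)
        return
    blob = response.content

    save_image(blob, result_path)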


def save_image(blob, file_path):
    with open(file_path, 'wb') as file:
        file.write(blob)
    print('File saved successfully!')


if __name__ == '__main__':
    original_image_path = 'images/original.png'
    mask_image_path = 'images/mask.png'
    result_path = 'images/result.png'
    process_images(original_image_path, mask_image_path, result_path, 'cyberpunk mona lisa', 512, 768)
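The request body and the returned bytes can both be checked before anything is written to disk. The following is a minimal sketch, not part of the project: `build_payload` and `is_png` are hypothetical helpers, the field names are the ones used in the script above, and the multiple-of-8 check assumes the endpoint forwards `width` and `height` to a Stable Diffusion pipeline.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def build_payload(prompt, image_bytes, mask_bytes, width, height):
    """Build the JSON body used in the usage script (hypothetical helper)."""
    # Assumption: the endpoint passes width/height to a Stable Diffusion
    # pipeline, which expects both to be multiples of 8.
    if width % 8 or height % 8:
        raise ValueError("width and height must be multiples of 8")
    return {
        "inputs": prompt,
        "image": base64.b64encode(image_bytes).decode("utf-8"),
        "mask_image": base64.b64encode(mask_bytes).decode("utf-8"),
        "width": width,
        "height": height,
    }


def is_png(blob):
    """Return True if the response body starts with the PNG signature."""
    return blob[:8] == PNG_SIGNATURE
```

Calling `is_png(response.content)` before saving would catch the case where the endpoint returns a JSON error body instead of an image.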

✨ Features

Control Diffusion Models: Controlnet is a neural network structure that can control diffusion models by adding extra conditions.
Inpaint Functionality: This specific checkpoint is designed for image inpainting, allowing you to modify or complete images.
Compatibility: It can be used in combination with Stable Diffusion, such as runwayml/stable-diffusion-v1-5.

📚 Documentation

Model Details

Property	Details
Developed by	Lvmin Zhang, Maneesh Agrawala
Model Type	Diffusion-based text-to-image generation model
Language(s)	English
License	The CreativeML OpenRAIL M license is an Open RAIL M license, adapted from the work that BigScience and the RAIL Initiative are jointly carrying out in the area of responsible AI licensing. See also the article about the BLOOM Open RAIL license, on which this license is based.
Resources for more information	GitHub Repository, Paper
Cite as	@misc{zhang2023adding, title={Adding Conditional Control to Text-to-Image Diffusion Models}, author={Lvmin Zhang and Maneesh Agrawala}, year={2023}, eprint={2302.05543}, archivePrefix={arXiv}, primaryClass={cs.CV} }

Introduction

Controlnet was proposed in Adding Conditional Control to Text-to-Image Diffusion Models by Lvmin Zhang and Maneesh Agrawala. The abstract states that ControlNet is a neural network structure to control pretrained large diffusion models to support additional input conditions. It can learn task-specific conditions in an end-to-end way, and the learning is robust even with a small training dataset. Training a ControlNet is as fast as fine-tuning a diffusion model, and it can be trained on personal devices or scaled to large amounts of data with powerful computation clusters.

Example

This checkpoint was trained on Stable Diffusion v1-5, so using it with that model is recommended. Experimentally, it can also be used with other diffusion models, such as DreamBooth fine-tuned versions of Stable Diffusion.

import torch
import os
from diffusers.utils import load_image
from PIL import Image
import numpy as np
from diffusers import (
    ControlNetModel,
    StableDiffusionControlNetPipeline,
    UniPCMultistepScheduler,
)
checkpoint = "lllyasviel/control_v11p_sd15_inpaint"
original_image = load_image(
    "https://huggingface.co/lllyasviel/control_v11p_sd15_inpaint/resolve/main/images/original.png"
)
mask_image = load_image(
    "https://huggingface.co/lllyasviel/control_v11p_sd15_inpaint/resolve/main/images/mask.png"
)

def make_inpaint_condition(image, image_mask):
    image = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    image_mask = np.array(image_mask.convert("L"))
    assert image.shape[0:2] == image_mask.shape[0:2], "image and image_mask must have the same image size"
    image[image_mask < 128] = -1.0 # set as masked pixel 
    image = np.expand_dims(image, 0).transpose(0, 3, 1, 2)
    image = torch.from_numpy(image)
    return image

control_image = make_inpaint_condition(original_image, mask_image)
prompt = "best quality"
negative_prompt="lowres, bad anatomy, bad hands, cropped, worst quality"
controlnet = ControlNetModel.from_pretrained(checkpoint, torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()
generator = torch.manual_seed(2)
image = pipe(prompt, negative_prompt=negative_prompt, num_inference_steps=30, 
             generator=generator, image=control_image).images[0]
os.makedirs('images', exist_ok=True)  # make sure the output directory exists
image.save('images/output.png')

Example images: original, mask, and inpainted output.
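To see what the conditioning step in the example produces, it can help to run it on a tiny array. The sketch below is a NumPy-only illustration that takes arrays instead of PIL images; `inpaint_condition_array` and its `threshold` parameter are hypothetical names, and the comparison direction is copied from the example rather than independently verified.

```python
import numpy as np


def inpaint_condition_array(image_rgb, mask_gray, threshold=128):
    """NumPy-only version of the conditioning step (hypothetical helper).

    image_rgb: uint8 array of shape (H, W, 3)
    mask_gray: uint8 array of shape (H, W)
    Returns a float32 array of shape (1, 3, H, W).
    """
    image = image_rgb.astype(np.float32) / 255.0
    if image.shape[:2] != mask_gray.shape[:2]:
        raise ValueError("image and mask must have the same height and width")
    # Same comparison as the example: pixels whose mask value is below the
    # threshold are flagged with -1.0 in all three channels.
    image[mask_gray < threshold] = -1.0
    # Add a batch axis and move channels first: (H, W, 3) -> (1, 3, H, W).
    return image[None].transpose(0, 3, 1, 2)


image_rgb = np.full((2, 2, 3), 255, dtype=np.uint8)       # all-white 2x2 image
mask_gray = np.array([[0, 255], [255, 0]], dtype=np.uint8)
cond = inpaint_condition_array(image_rgb, mask_gray)
print(cond.shape)        # (1, 3, 2, 2)
print(cond[0, :, 0, 0])  # flagged pixel: -1.0 in every channel
```

Unflagged pixels keep their normalized values in [0, 1], so -1.0 is an unambiguous marker for the region the pipeline should treat as masked.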

Other released checkpoints v1-1

The authors released 14 different checkpoints, each trained with Stable Diffusion v1-5 on a different type of conditioning:

Model Name	Control Image Overview	Condition Image	Control Image Example	Generated Image Example
lllyasviel/control_v11p_sd15_canny	Trained with canny edge detection	A monochrome image with white edges on a black background.
lllyasviel/control_v11e_sd15_ip2p	Trained with pixel to pixel instruction	No condition.
lllyasviel/control_v11p_sd15_inpaint	Trained with image inpainting	No condition.
lllyasviel/control_v11p_sd15_mlsd	Trained with multi-level line segment detection	An image with annotated line segments.
lllyasviel/control_v11f1p_sd15_depth	Trained with depth estimation	An image with depth information, usually represented as a grayscale image.
lllyasviel/control_v11p_sd15_normalbae	Trained with surface normal estimation	An image with surface normal information, usually represented as a color-coded image.
lllyasviel/control_v11p_sd15_seg	Trained with image segmentation	An image with segmented regions, usually represented as a color-coded image.
lllyasviel/control_v11p_sd15_lineart	Trained with line art generation	An image with line art, usually black lines on a white background.
lllyasviel/control_v11p_sd15s2_lineart_anime	Trained with anime line art generation	...	...	...
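When switching between conditioning types in code, a small lookup table can keep the checkpoint IDs in one place. This is only a sketch: the IDs are copied from the rows listed above, while the short keys and the `checkpoint_for` helper are hypothetical names chosen for illustration.

```python
# Checkpoint IDs copied from the table above; the short keys are
# hypothetical labels chosen for this sketch.
CHECKPOINTS = {
    "canny": "lllyasviel/control_v11p_sd15_canny",
    "ip2p": "lllyasviel/control_v11e_sd15_ip2p",
    "inpaint": "lllyasviel/control_v11p_sd15_inpaint",
    "mlsd": "lllyasviel/control_v11p_sd15_mlsd",
    "depth": "lllyasviel/control_v11f1p_sd15_depth",
    "normalbae": "lllyasviel/control_v11p_sd15_normalbae",
    "seg": "lllyasviel/control_v11p_sd15_seg",
    "lineart": "lllyasviel/control_v11p_sd15_lineart",
    "lineart_anime": "lllyasviel/control_v11p_sd15s2_lineart_anime",
}


def checkpoint_for(condition):
    """Return the checkpoint ID for a conditioning type, or raise KeyError."""
    try:
        return CHECKPOINTS[condition]
    except KeyError:
        known = ", ".join(sorted(CHECKPOINTS))
        raise KeyError(f"unknown condition {condition!r}; known: {known}") from None
```

The returned string can be passed as the `checkpoint` value in the example, for instance `checkpoint_for("inpaint")`.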

🔧 Technical Details

ControlNet is a neural network structure to control diffusion models by adding extra conditions. The original checkpoint was released in lllyasviel/ControlNet-v1-1 by Lvmin Zhang. This conversion into diffusers format allows for easier integration with Stable Diffusion.
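The ControlNet paper cited above expresses "adding extra conditions" for a single network block. As a brief sketch in the paper's notation (these symbols do not appear elsewhere on this page): $\mathcal{F}(\cdot;\Theta)$ is a frozen block of the pretrained model, $\Theta_c$ is its trainable copy, $c$ is the conditioning input, and $\mathcal{Z}$ is a "zero convolution", a 1×1 convolution whose weights and bias start at zero.

```latex
y_c = \mathcal{F}(x;\Theta)
    + \mathcal{Z}\bigl(\mathcal{F}\bigl(x + \mathcal{Z}(c;\Theta_{z1});\,\Theta_c\bigr);\,\Theta_{z2}\bigr)
```

Because both zero convolutions output zero at initialization, $y_c = \mathcal{F}(x;\Theta)$ at the first training step, so training starts from the behavior of the unmodified pretrained model.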

For more details, please also have a look at the 🧨 Diffusers docs.

📄 License

The project is licensed under The CreativeML OpenRAIL M license, which is an Open RAIL M license.
