BRIA-2.3-ControlNet-Inpainting: Open-Source Image Inpainting Model

BRIA 2.3 ControlNet Inpainting

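Developed by briaai

An image inpainting model trained on a commercially licensed dataset, with legal liability coverage

Image Generation · License: Other · #CommerciallyLicensedInpainting #LegalLiabilityCoverage #ControlNetArchitecture

Downloads: 25

Released: 6/13/2024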

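Model Overview

BRIA 2.3 is an inpainting model that fills the masked regions of an image according to a user-supplied text prompt. It supports object removal, replacement, addition, and modification, as well as image expansion.

Model Features

Commercial Legal Coverage

The training data is fully licensed, and the model comes with full legal liability coverage for copyright infringement, privacy violations, and harmful content.

Fast Inpainting

Combined with FAST-LORA, inpainting takes only 1.6 seconds on an A10 GPU.

Multiple Use Scenarios

Supports a range of editing tasks: object removal, replacement, addition, modification, and image expansion.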

Model Capabilities

Image inpainting

Object removal

Content replacement

Image expansion

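Use Cases

Image Editing

Removing unwanted objects

Remove unwanted people or objects from a photo

Produces a natural, seamless result

Content replacement

Replace specific elements in an image

Keeps the overall style of the image consistent

Creative Design

Image expansion

Extend the image borders or fill in missing regions

Generates extended content that matches the style of the original image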

🚀 BRIA 2.3 ControlNet Inpainting Fast

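BRIA 2.3 is an image inpainting model built for commercial use. It is trained on a large, multi-source, commercially licensed dataset, which keeps image quality high while making the model safe for commercial deployment: it is designed to avoid copyright and privacy infringement and to limit harmful content. Given a text prompt from the user, the model fills the masked region of an image. This covers removing, replacing, adding, and modifying objects, as well as expanding the image.

🚀 Quick Start

Join our Discord community for more information, tutorials, and tools, and to connect with other users!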

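✨ Key Features

New: BRIA 2.3 ControlNet Inpainting can be used with Fast-LORA on top of BRIA 2.3 Text-to-Image, which makes inpainting very fast: only 1.6 seconds on an A10 GPU.
Quality and safety: Trained exclusively on the largest multi-source, commercially licensed dataset, for the best quality and safe commercial use.
Legal liability coverage: Full legal liability coverage for copyright and privacy infringement, plus harmful-content mitigation.
Broad range of applications: Removing, replacing, adding, and modifying objects in an image, as well as image expansion.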

📦 Installation

Download

from huggingface_hub import hf_hub_download
import os

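# Save next to this script; fall back to the current directory when
# __file__ is undefined (for example in a notebook or interactive session).
try:
    local_dir = os.path.dirname(__file__)
except NameError:
    local_dir = '.'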

hf_hub_download(repo_id="briaai/BRIA-2.3-ControlNet-Inpainting", filename='controlnet.py', local_dir=local_dir)
hf_hub_download(repo_id="briaai/BRIA-2.3-ControlNet-Inpainting", filename='config.json', local_dir=local_dir)
hf_hub_download(repo_id="briaai/BRIA-2.3-ControlNet-Inpainting", filename='image_processor.py', local_dir=local_dir)
hf_hub_download(repo_id="briaai/BRIA-2.3-ControlNet-Inpainting", filename='pipeline_controlnet_sd_xl.py', local_dir=local_dir)
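
The usage example imports `pipeline_controlnet_sd_xl` and `controlnet` as local modules, so it can help to confirm that the four files above actually landed in `local_dir` before running it. The following is a minimal sketch of such a check; the helper name `missing_files` and the constant `REQUIRED_FILES` are hypothetical and not part of the BRIA repository.

```python
import os

# The four files fetched by the download snippet above.
REQUIRED_FILES = [
    "controlnet.py",
    "config.json",
    "image_processor.py",
    "pipeline_controlnet_sd_xl.py",
]

def missing_files(local_dir, required=REQUIRED_FILES):
    """Return the names in `required` that are not regular files in `local_dir`.

    Hypothetical helper for illustration only.
    """
    return [
        name for name in required
        if not os.path.isfile(os.path.join(local_dir, name))
    ]
```

Calling `missing_files(local_dir)` after the downloads should return an empty list; any name it does return points to a download that needs to be repeated.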

💻 Usage Example

Basic Usage

from diffusers import (
    AutoencoderKL,
    LCMScheduler,
)
from pipeline_controlnet_sd_xl import StableDiffusionXLControlNetPipeline
from controlnet import ControlNetModel
import torch
import numpy as np
from PIL import Image
import requests
import PIL
from io import BytesIO
from torchvision import transforms
import os

def resize_image_to_retain_ratio(image):
    pixel_number = 1024*1024
    granularity_val = 8
    ratio = image.size[0] / image.size[1]
    width = int((pixel_number * ratio) ** 0.5)
    width = width - (width % granularity_val)
    height = int(pixel_number / width)
    height = height - (height % granularity_val)

    image = image.resize((width, height))
    return image

def download_image(url):
    response = requests.get(url)
    return PIL.Image.open(BytesIO(response.content)).convert("RGB")

def get_masked_image(image, image_mask, width, height):
    # The mask marks the area to inpaint (e.g. the object to remove) in white.
    image_mask = image_mask.resize((width, height))
    image_mask_pil = image_mask
    image = np.array(image.convert("RGB")).astype(np.float32) / 255.0
    image_mask = np.array(image_mask_pil.convert("L")).astype(np.float32) / 255.0
    # Compare both height and width of the image and the mask.
    assert image.shape[0:2] == image_mask.shape[0:2], "image and image_mask must have the same image size"
    masked_image_to_present = image.copy()
    masked_image_to_present[image_mask > 0.5] = (0.5, 0.5, 0.5)  # grey out masked pixels in the preview copy
    image[image_mask > 0.5] = 0.5  # grey out masked pixels in the control image
    image = Image.fromarray((image * 255.0).astype(np.uint8))
    masked_image_to_present = Image.fromarray((masked_image_to_present * 255.0).astype(np.uint8))
    return image, image_mask_pil, masked_image_to_present

image_transforms = transforms.Compose(
    [
        transforms.ToTensor(),
    ]
)

default_negative_prompt = "Logo,Watermark,Text,Ugly,Morbid,Extra fingers,Poorly drawn hands,Mutation,Blurry,Extra limbs,Gross proportions,Missing arms,Mutated hands,Long neck,Duplicate,Mutilated,Mutilated hands,Poorly drawn face,Deformed,Bad anatomy,Cloned face,Malformed limbs,Missing legs,Too many fingers"

img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

init_image = download_image(img_url).resize((1024, 1024))
mask_image = download_image(mask_url).resize((1024, 1024))

# Rescale to about 1024*1024 pixels with both sides a multiple of 8
init_image = resize_image_to_retain_ratio(init_image)
width, height = init_image.size

# Single-channel mask at the same size as the image
mask_image = mask_image.convert("L").resize(init_image.size)

# Load and initialize the models
controlnet = ControlNetModel.from_pretrained("briaai/BRIA-2.3-ControlNet-Inpainting", torch_dtype=torch.float16)
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained("briaai/BRIA-2.3", controlnet=controlnet.to(dtype=torch.float16), torch_dtype=torch.float16, vae=vae)

pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("briaai/BRIA-2.3-FAST-LORA")
pipe.fuse_lora()
pipe = pipe.to(device="cuda")

# pipe.enable_xformers_memory_efficient_attention()

generator = torch.Generator(device="cuda").manual_seed(123456)

vae = pipe.vae

masked_image, image_mask, masked_image_to_present = get_masked_image(init_image, mask_image, width, height)

masked_image_tensor = image_transforms(masked_image)
masked_image_tensor = (masked_image_tensor - 0.5) / 0.5

masked_image_tensor = masked_image_tensor.unsqueeze(0).to(device="cuda")
control_latents = vae.encode(  
        masked_image_tensor[:, :3, :, :].to(vae.dtype)
    ).latent_dist.sample()   
control_latents = control_latents * vae.config.scaling_factor 

image_mask = np.array(image_mask)[:,:]
mask_tensor = torch.tensor(image_mask, dtype=torch.float32)[None, ...]
# binarize the mask
mask_tensor = torch.where(mask_tensor > 128.0, 255.0, 0)       

mask_tensor = mask_tensor / 255.0

mask_tensor = mask_tensor.to(device="cuda")
mask_resized = torch.nn.functional.interpolate(mask_tensor[None, ...], size=(control_latents.shape[2], control_latents.shape[3]), mode='nearest')

masked_image = torch.cat([control_latents, mask_resized], dim=1)

prompt = ""

gen_img = pipe(negative_prompt=default_negative_prompt, prompt=prompt, 
            controlnet_conditioning_scale=1.0, 
            num_inference_steps=12, 
            height=height, width=width, 
            image = masked_image, # control image
            init_image = init_image,     
            mask_image = mask_tensor,
            guidance_scale = 1.2,
            generator=generator).images[0]

display(gen_img)  # works in a notebook; in a plain script, use gen_img.save("output.png") instead
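
The size arithmetic in `resize_image_to_retain_ratio` is easier to follow on a non-square input. The sketch below repeats the same computation on plain integers and also derives the latent resolution that the mask is interpolated to. The function names `target_size` and `latent_size` are hypothetical, and the downsampling factor of 8 is an assumption about the SDXL VAE used above, not something the example states.

```python
def target_size(width, height, pixel_number=1024 * 1024, granularity=8):
    """Same arithmetic as resize_image_to_retain_ratio, on integers only."""
    ratio = width / height
    new_w = int((pixel_number * ratio) ** 0.5)
    new_w -= new_w % granularity   # round width down to a multiple of 8
    new_h = int(pixel_number / new_w)
    new_h -= new_h % granularity   # round height down to a multiple of 8
    return new_w, new_h

def latent_size(width, height, vae_scale=8):
    """Spatial size of the VAE latents as (width, height).

    Assumption: the VAE downsamples each side by a factor of 8.
    """
    return width // vae_scale, height // vae_scale

w, h = target_size(1920, 1080)
print(w, h)               # 1360 768
print(latent_size(w, h))  # (170, 96)
```

For a 1920x1080 input the image is rescaled to 1360x768, which stays close to one megapixel while keeping both sides divisible by 8. Under the factor-of-8 assumption, `control_latents` would then have a spatial shape of 96 x 170 (height x width), and `mask_resized` is interpolated to that same shape before the two are concatenated along the channel dimension.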

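📚 Detailed Documentation

Model Description

Property	Details
Developed by	BRIA AI
Model type	Latent diffusion image-to-image model
License	bria-2.3 inpainting licensing terms & conditions. A purchased license is required to access and use the model.
Description	BRIA 2.3 inpainting was trained exclusively on a professional-grade, licensed dataset. It is designed for commercial use and comes with full legal liability coverage.
More information	BRIA AI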