ControlNet Tile SDXL 1.0
A ControlNet Tile model based on Stable Diffusion XL, focused on image detail restoration, variation generation, and super-resolution upscaling
Downloads 20.64k
Release Time : 6/26/2024
Model Overview
This model uses a Tile ControlNet to exert fine-grained control over images, and is mainly used for image deblurring, detail restoration, style variation generation, and super-resolution upscaling
Model Features
Image Detail Restoration
Restores and enhances detail in blurry or low-quality images
Style Variation Generation
Generates variants in different styles while preserving the main content of the original image
Super-Resolution Upscaling
Supports super-resolution upscaling at arbitrary ratios, up to 3x
Aspect-Ratio Adaptive
Handles images of any aspect ratio
Model Capabilities
Image deblurring
Detail enhancement
Style transfer
Super-resolution upscaling
Image variation generation
Use Cases
Image Restoration
Blurry image restoration
Restores detail in blurry or low-quality images
Noticeably improves image sharpness and detail
Creative Design
Image style variation
Generates variants that keep the original content but change the style
Yields artwork in a diverse range of styles
Image Enhancement
Super-resolution upscaling
High-quality upscaling of low-resolution images
Retains good detail even after 3x upscaling
🚀 ControlNet Tile SDXL
ControlNet Tile SDXL is a text-to-image generation project that supports image deblurring, image variation generation, and image super-resolution, providing a versatile set of solutions for image generation and processing.
🚀 Quick Start
This project supports image deblurring, image variation generation, and image super-resolution. The examples and usage code are given below.
✨ Key Features
- Image deblurring: removes blur from degraded images and recovers image detail.
- Image variation generation: similar to Midjourney, generates different variants of an input image.
- Image super-resolution: supports any aspect ratio and any upscaling factor, e.g. 3 * 3 upscaling (see the resolution sketch after this list).
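As a concrete illustration of the arbitrary-aspect-ratio handling, here is a minimal sketch of the resolution arithmetic the usage examples below rely on: scale the input to roughly a 1024 * 1024 pixel area while keeping its aspect ratio, and (for the super-resolution case) round down to a multiple of 48 so the image splits evenly into a 3 * 3 tile grid. The helper name bucket_resolution and the sample input size are illustrative assumptions, not part of this repository.
import numpy as np

def bucket_resolution(width, height, area=1024 * 1024, multiple=48):
    # Keep the aspect ratio, target roughly `area` pixels, and round down to a
    # multiple of `multiple` (48 = 16 * 3, so each tile of a 3 x 3 grid remains
    # a multiple of 16), mirroring the preprocessing in the examples below.
    ratio = np.sqrt(area / (width * height))
    W = int(width * ratio) // multiple * multiple
    H = int(height * ratio) // multiple * multiple
    return W, H

print(bucket_resolution(512, 768))  # -> (816, 1248), a ~1M-pixel bucket with the same 2:3 aspect ratio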
📦 Installation
No specific installation steps are provided in the documentation; see the links referenced by the code:
- https://huggingface.co/TTPlanet/TTPLanet_SDXL_Controlnet_Tile_Realistic/blob/main/TTP_tile_preprocessor_v5.py
- https://github.com/lllyasviel/ControlNet-v1-1-nightly/blob/main/gradio_tile.py
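As a rough setup sketch (no official requirements file is provided), the examples below assume diffusers, transformers, accelerate, torch, opencv-python, pillow and numpy are installed, plus the two helper scripts referenced above. One possible way to fetch those scripts is via huggingface_hub; the repository and file names below are taken from the link and the code comment in the deblur example, so treat them as assumptions rather than a documented install path.
# Assumed environment: pip install diffusers transformers accelerate torch opencv-python pillow numpy huggingface_hub
from huggingface_hub import hf_hub_download

# Tile preprocessor script from the repository linked above
preprocessor_path = hf_hub_download(
    repo_id="TTPlanet/TTPLanet_SDXL_Controlnet_Tile_Realistic",
    filename="TTP_tile_preprocessor_v5.py",
)

# The deblur example imports guided_filter; per its comment the file is uploaded
# in this model repo, so it can presumably be fetched the same way.
guided_filter_path = hf_hub_download(
    repo_id="xinsir/controlnet-tile-sdxl-1.0",
    filename="guided_filter.py",
)

print(preprocessor_path, guided_filter_path)  # local cache paths of the downloaded scripts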
💻 Usage Examples
Basic Usage
Image deblurring (Tile blur)
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers import DDIMScheduler, EulerAncestralDiscreteScheduler
from PIL import Image
from guided_filter import FastGuidedFilter  # this file has been uploaded to this repo
import torch
import numpy as np
import cv2
import random
def resize_image_control(control_image, resolution):
    HH, WW, _ = control_image.shape
    crop_h = random.randint(0, HH - resolution[1])
    crop_w = random.randint(0, WW - resolution[0])
    crop_image = control_image[crop_h:crop_h+resolution[1], crop_w:crop_w+resolution[0], :]
    return crop_image, crop_w, crop_h

def apply_gaussian_blur(image_np, ksize=5, sigmaX=1.0):
    if ksize % 2 == 0:
        ksize += 1  # ksize must be odd
    blurred_image = cv2.GaussianBlur(image_np, (ksize, ksize), sigmaX=sigmaX)
    return blurred_image

def apply_guided_filter(image_np, radius, eps, scale):
    filter = FastGuidedFilter(image_np, radius, eps, scale)
    return filter.filter(image_np)
controlnet_conditioning_scale = 1.0
prompt = "your prompt, the longer the better, you can describe it as detail as possible"
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'
eulera_scheduler = EulerAncestralDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")
controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-tile-sdxl-1.0",
    torch_dtype=torch.float16
)
# when testing with another base model, you need to change the VAE as well.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    safety_checker=None,
    torch_dtype=torch.float16,
    scheduler=eulera_scheduler,
)
pipe.to("cuda")  # the float16 weights are intended to run on a CUDA device
controlnet_img = cv2.imread("your original image path")
height, width, _ = controlnet_img.shape
ratio = np.sqrt(1024. * 1024. / (width * height))
W, H = int(width * ratio), int(height * ratio)
crop_w, crop_h = 0, 0
controlnet_img = cv2.resize(controlnet_img, (W, H))
blur_strength = random.sample([i / 10. for i in range(10, 201, 2)], k=1)[0]
radius = random.sample([i for i in range(1, 40, 2)], k=1)[0]
eps = random.sample([i / 1000. for i in range(1, 101, 2)], k=1)[0]
scale_factor = random.sample([i / 10. for i in range(10, 181, 5)], k=1)[0]
if random.random() > 0.5:
    controlnet_img = apply_gaussian_blur(controlnet_img, ksize=int(blur_strength), sigmaX=blur_strength / 2)
if random.random() > 0.5:
    # Apply Guided Filter
    controlnet_img = apply_guided_filter(controlnet_img, radius, eps, scale_factor)
# Resize the image down and back up to simulate a blurred, low-detail input
controlnet_img = cv2.resize(controlnet_img, (int(W / scale_factor), int(H / scale_factor)), interpolation=cv2.INTER_AREA)
controlnet_img = cv2.resize(controlnet_img, (W, H), interpolation=cv2.INTER_CUBIC)
controlnet_img = cv2.cvtColor(controlnet_img, cv2.COLOR_BGR2RGB)
controlnet_img = Image.fromarray(controlnet_img)
# resize the image to 1024 * 1024 or to a resolution in the same bucket to get the best performance
new_width, new_height = W, H  # W and H were computed above from the ~1024*1024 bucket
images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=controlnet_img,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    width=new_width,
    height=new_height,
    num_inference_steps=30,
).images
images[0].save("your image save path")  # PNG usually preserves more quality than JPG or WebP, but files are much larger
Image variation generation (Tile var)
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers import DDIMScheduler, EulerAncestralDiscreteScheduler
from PIL import Image
import torch
import numpy as np
import cv2
controlnet_conditioning_scale = 1.0
prompt = "your prompt, the longer the better, you can describe it as detail as possible"
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'
eulera_scheduler = EulerAncestralDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")
controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-tile-sdxl-1.0",
    torch_dtype=torch.float16
)
# when testing with another base model, you need to change the VAE as well.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    safety_checker=None,
    torch_dtype=torch.float16,
    scheduler=eulera_scheduler,
)
pipe.to("cuda")  # the float16 weights are intended to run on a CUDA device
controlnet_img = cv2.imread("your original image path")
height, width, _ = controlnet_img.shape
ratio = np.sqrt(1024. * 1024. / (width * height))
W, H = int(width * ratio), int(height * ratio)
crop_w, crop_h = 0, 0
controlnet_img = cv2.resize(controlnet_img, (W, H))
controlnet_img = cv2.cvtColor(controlnet_img, cv2.COLOR_BGR2RGB)
controlnet_img = Image.fromarray(controlnet_img)
# resize the image to 1024 * 1024 or to a resolution in the same bucket to get the best performance
new_width, new_height = W, H  # W and H were computed above from the ~1024*1024 bucket
images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=controlnet_img,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    width=new_width,
    height=new_height,
    num_inference_steps=30,
).images
images[0].save("your image save path")  # PNG usually preserves more quality than JPG or WebP, but files are much larger
Image super-resolution (Tile super)
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers import DDIMScheduler, EulerAncestralDiscreteScheduler
from PIL import Image
import torch
import numpy as np
import cv2
import random
controlnet_conditioning_scale = 1.0
prompt = "your prompt, the longer the better, you can describe it as detail as possible"
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'
eulera_scheduler = EulerAncestralDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")
controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-tile-sdxl-1.0",
    torch_dtype=torch.float16
)
# when testing with another base model, you need to change the VAE as well.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    safety_checker=None,
    torch_dtype=torch.float16,
    scheduler=eulera_scheduler,
)
pipe.to("cuda")  # the float16 weights are intended to run on a CUDA device
controlnet_img = cv2.imread("your original image path")
height, width, _ = controlnet_img.shape
ratio = np.sqrt(1024. * 1024. / (width * height))
W, H = int(width * ratio) // 48 * 48, int(height * ratio) // 48 * 48
controlnet_img = cv2.resize(controlnet_img, (W, H))
controlnet_img = cv2.cvtColor(controlnet_img, cv2.COLOR_BGR2RGB)
controlnet_img = Image.fromarray(controlnet_img)
# resize the image to 1024 * 1024 or to a resolution in the same bucket to get the best performance
target_width = W // 3
target_height = H // 3
images = []  # collect the 3 x 3 tiles, each resized back up to (W, H)
for i in range(3):  # three rows
    for j in range(3):  # three columns
        left = j * target_width
        top = i * target_height
        right = left + target_width
        bottom = top + target_height
        # Crop the tile using the computed bounds
        cropped_image = controlnet_img.crop((left, top, right, bottom))
        cropped_image = cropped_image.resize((W, H))
        images.append(cropped_image)
seed = random.randint(0, 2147483647)
generator = torch.Generator('cuda').manual_seed(seed)
result_images = []
for sub_img in images:
    new_width, new_height = W, H
    out = pipe(prompt=[prompt]*1,
               image=sub_img,
               control_image=sub_img,
               negative_prompt=[negative_prompt]*1,
               generator=generator,
               width=new_width,
               height=new_height,
               num_inference_steps=30,
               crops_coords_top_left=(W, H),
               target_size=(W, H),
               original_size=(W * 2, H * 2),
               )
    result_images.append(out.images[0])
new_im = Image.new('RGB', (new_width*3, new_height*3))
# Paste the generated tiles onto the new image
new_im.paste(result_images[0], (0, 0))
new_im.paste(result_images[1], (new_width, 0))
new_im.paste(result_images[2], (new_width * 2, 0))
new_im.paste(result_images[3], (0, new_height))
new_im.paste(result_images[4], (new_width, new_height))
new_im.paste(result_images[5], (new_width * 2, new_height))
new_im.paste(result_images[6], (0, new_height * 2))
new_im.paste(result_images[7], (new_width, new_height * 2))
new_im.paste(result_images[8], (new_width * 2, new_height * 2))
new_im.save("your image save path")  # PNG usually preserves more quality than JPG or WebP, but files are much larger
Advanced Usage
No advanced usage is documented for this project.
📄 License
This project is licensed under the Apache-2.0 license.