ControlNet Tile SDXL 1.0
A ControlNet Tile model based on Stable Diffusion XL, focused on image detail restoration, variation generation, and super-resolution upscaling
Downloads: 20.64k
Released: 6/26/2024
Model Overview
The model uses a Tile ControlNet for fine-grained control over images and is mainly used for image deblurring, detail restoration, style variation generation, and super-resolution upscaling.
Model Highlights
Image detail restoration
Restores and enhances detail in blurry or low-quality images
Style variation generation
Generates variants in different styles while preserving the main content of the original image
Super-resolution upscaling
Supports super-resolution upscaling at arbitrary scale factors, e.g. the 3 × 3 tiled upscale demonstrated below
Aspect-ratio flexibility
Handles images of any aspect ratio
Model Capabilities
Image deblurring
Detail enhancement
Style transfer
Super-resolution upscaling
Image variation generation
Use Cases
Image restoration
Blurry image repair
Restores detail in blurry or low-quality images
Noticeably improves sharpness and detail rendition
Creative design
Image style variations
Generates variants that keep the original content but differ in style
Yields artwork in a range of styles
Image enhancement
Super-resolution upscaling
High-quality upscaling of low-resolution images
Detail remains well preserved after a 3× upscale
🚀 ControlNet Tile SDXL
ControlNet Tile SDXL is a text-to-image generation project that supports image deblurring, image variation generation, and image super-resolution, providing a versatile set of solutions for image generation and processing.
🚀 Quick Start
The project covers image deblurring, image variation generation, and image super-resolution; the examples and usage code are given below.
✨ Key Features
- Image deblurring: removes blur from degraded images and restores fine detail.
- Image variation generation: similar to Midjourney, generates different variants of an input image.
- Image super-resolution: supports any aspect ratio and any upscale factor, e.g. a 3 × 3 tiled upscale (see the sketch below and the Tile super example).
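The "3 × 3" above refers to regenerating the image as a 3 × 3 grid of tiles, each rendered at the full working resolution, so the stitched result is 3× larger per side (this is exactly what the Tile super example does). A quick arithmetic sketch, using the 1024 × 1024 working size assumed throughout this card:

n = 3                          # tiles per side
work_w, work_h = 1024, 1024    # assumed per-tile working resolution
out_w, out_h = work_w * n, work_h * n
print(out_w, out_h)            # 3072 3072, i.e. 9x the pixel count of a single 1024 x 1024 generation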
📦 Installation
The upstream documentation does not give explicit installation steps; the preprocessing code it references is linked below:
- https://huggingface.co/TTPlanet/TTPLanet_SDXL_Controlnet_Tile_Realistic/blob/main/TTP_tile_preprocessor_v5.py
- https://github.com/lllyasviel/ControlNet-v1-1-nightly/blob/main/gradio_tile.py
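No package list is given upstream either; judging from the imports in the examples below, a typical environment would be something like pip install diffusers transformers accelerate safetensors opencv-python pillow huggingface_hub (an assumption, not an official requirement). The Tile blur example also imports guided_filter, which the author notes is uploaded in this repo; the sketch below assumes the file is stored as guided_filter.py at the root of xinsir/controlnet-tile-sdxl-1.0 and copies it next to your script:

# Sketch: download the guided_filter helper referenced by the Tile blur example.
# Assumption: it is stored as "guided_filter.py" at the root of the model repo.
import shutil
from huggingface_hub import hf_hub_download

path = hf_hub_download(repo_id="xinsir/controlnet-tile-sdxl-1.0", filename="guided_filter.py")
shutil.copy(path, "guided_filter.py")  # so that `from guided_filter import FastGuidedFilter` resolves
print("saved helper to ./guided_filter.py")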
💻 Usage Examples
Basic Usage
Image deblurring (Tile blur)
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers import DDIMScheduler, EulerAncestralDiscreteScheduler
from PIL import Image
from guided_filter import FastGuidedFilter  # this helper file is provided in this repo
import torch
import random
import numpy as np
import cv2

def resize_image_control(control_image, resolution):
    HH, WW, _ = control_image.shape
    crop_h = random.randint(0, HH - resolution[1])
    crop_w = random.randint(0, WW - resolution[0])
    crop_image = control_image[crop_h:crop_h + resolution[1], crop_w:crop_w + resolution[0], :]
    return crop_image, crop_w, crop_h

def apply_gaussian_blur(image_np, ksize=5, sigmaX=1.0):
    if ksize % 2 == 0:
        ksize += 1  # ksize must be odd
    blurred_image = cv2.GaussianBlur(image_np, (ksize, ksize), sigmaX=sigmaX)
    return blurred_image

def apply_guided_filter(image_np, radius, eps, scale):
    filter = FastGuidedFilter(image_np, radius, eps, scale)
    return filter.filter(image_np)

controlnet_conditioning_scale = 1.0
prompt = "your prompt, the longer the better, you can describe it as detail as possible"
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'

eulera_scheduler = EulerAncestralDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")

controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-tile-sdxl-1.0",
    torch_dtype=torch.float16
)

# when testing with another base model, remember to change the vae as well.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    safety_checker=None,
    torch_dtype=torch.float16,
    scheduler=eulera_scheduler,
)

controlnet_img = cv2.imread("your original image path")
height, width, _ = controlnet_img.shape
ratio = np.sqrt(1024. * 1024. / (width * height))
W, H = int(width * ratio), int(height * ratio)

crop_w, crop_h = 0, 0
controlnet_img = cv2.resize(controlnet_img, (W, H))

# randomly sample degradation parameters (blur strength, guided-filter radius/eps, downscale factor)
blur_strength = random.sample([i / 10. for i in range(10, 201, 2)], k=1)[0]
radius = random.sample([i for i in range(1, 40, 2)], k=1)[0]
eps = random.sample([i / 1000. for i in range(1, 101, 2)], k=1)[0]
scale_factor = random.sample([i / 10. for i in range(10, 181, 5)], k=1)[0]

if random.random() > 0.5:
    controlnet_img = apply_gaussian_blur(controlnet_img, ksize=int(blur_strength), sigmaX=blur_strength / 2)

if random.random() > 0.5:
    # apply guided filter
    controlnet_img = apply_guided_filter(controlnet_img, radius, eps, scale_factor)

# downscale then upscale to simulate a low-quality source
controlnet_img = cv2.resize(controlnet_img, (int(W / scale_factor), int(H / scale_factor)), interpolation=cv2.INTER_AREA)
controlnet_img = cv2.resize(controlnet_img, (W, H), interpolation=cv2.INTER_CUBIC)

controlnet_img = cv2.cvtColor(controlnet_img, cv2.COLOR_BGR2RGB)
controlnet_img = Image.fromarray(controlnet_img)

# the pipeline performs best at 1024 * 1024 or the matching bucket resolution
new_width, new_height = W, H

images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=controlnet_img,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    width=new_width,
    height=new_height,
    num_inference_steps=30,
).images

images[0].save("your image save path")  # png usually preserves more quality than jpg/webp but produces larger files
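The random degradation above mirrors how the conditioning images were degraded for training. For repeatable results at inference time it can be preferable to fix the preprocessing; the snippet below is a minimal sketch of that idea, reusing apply_gaussian_blur and the W/H values defined above, and its parameter values are illustrative assumptions rather than recommended defaults:

# deterministic preprocessing sketch (assumed parameter values; reuses names defined above)
fixed_blur_ksize = 9       # assumption: moderate blur
fixed_sigma = 4.5          # assumption: sigma roughly ksize / 2
fixed_scale_factor = 2.0   # assumption: simulate a 2x-downscaled source

det_img = cv2.imread("your original image path")
det_img = cv2.resize(det_img, (W, H))
det_img = apply_gaussian_blur(det_img, ksize=fixed_blur_ksize, sigmaX=fixed_sigma)
det_img = cv2.resize(det_img, (int(W / fixed_scale_factor), int(H / fixed_scale_factor)), interpolation=cv2.INTER_AREA)
det_img = cv2.resize(det_img, (W, H), interpolation=cv2.INTER_CUBIC)
det_img = Image.fromarray(cv2.cvtColor(det_img, cv2.COLOR_BGR2RGB))
# pass det_img to the pipe call above in place of controlnet_img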
Image variation generation (Tile var)
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers import DDIMScheduler, EulerAncestralDiscreteScheduler
from PIL import Image
import torch
import numpy as np
import cv2

controlnet_conditioning_scale = 1.0
prompt = "your prompt, the longer the better, you can describe it as detail as possible"
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'

eulera_scheduler = EulerAncestralDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")

controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-tile-sdxl-1.0",
    torch_dtype=torch.float16
)

# when testing with another base model, remember to change the vae as well.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    safety_checker=None,
    torch_dtype=torch.float16,
    scheduler=eulera_scheduler,
)

controlnet_img = cv2.imread("your original image path")
height, width, _ = controlnet_img.shape
ratio = np.sqrt(1024. * 1024. / (width * height))
W, H = int(width * ratio), int(height * ratio)

crop_w, crop_h = 0, 0
controlnet_img = cv2.resize(controlnet_img, (W, H))

controlnet_img = cv2.cvtColor(controlnet_img, cv2.COLOR_BGR2RGB)
controlnet_img = Image.fromarray(controlnet_img)

# the pipeline performs best at 1024 * 1024 or the matching bucket resolution
new_width, new_height = W, H

images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=controlnet_img,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    width=new_width,
    height=new_height,
    num_inference_steps=30,
).images

images[0].save("your image save path")  # png usually preserves more quality than jpg/webp but produces larger files
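The comment above recommends working at 1024 × 1024 or a matching bucket resolution. The helper below is a small sketch of one way to compute such a size: scale to roughly a 1024² area while keeping the aspect ratio, then snap each side to a multiple of 64 (a common SDXL bucket granularity; the exact rounding rule is an assumption, not something this model card specifies):

import numpy as np

def bucket_resolution(width, height, target_area=1024 * 1024, multiple=64):
    # scale to roughly `target_area` pixels at the same aspect ratio,
    # then snap each side to a multiple of `multiple`
    ratio = np.sqrt(target_area / (width * height))
    new_w = max(multiple, int(round(width * ratio / multiple)) * multiple)
    new_h = max(multiple, int(round(height * ratio / multiple)) * multiple)
    return new_w, new_h

print(bucket_resolution(1920, 1080))  # -> (1344, 768)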
Image super-resolution (Tile super)
from diffusers import ControlNetModel, StableDiffusionXLControlNetImg2ImgPipeline, AutoencoderKL
from diffusers import DDIMScheduler, EulerAncestralDiscreteScheduler
from PIL import Image
import torch
import random
import numpy as np
import cv2

controlnet_conditioning_scale = 1.0
prompt = "your prompt, the longer the better, you can describe it as detail as possible"
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'

eulera_scheduler = EulerAncestralDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")

controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-tile-sdxl-1.0",
    torch_dtype=torch.float16
)

# when testing with another base model, remember to change the vae as well.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)

# the img2img ControlNet pipeline is used here because each tile is passed both as the
# init image and as the control image in the call below.
pipe = StableDiffusionXLControlNetImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    safety_checker=None,
    torch_dtype=torch.float16,
    scheduler=eulera_scheduler,
)

controlnet_img = cv2.imread("your original image path")
height, width, _ = controlnet_img.shape
ratio = np.sqrt(1024. * 1024. / (width * height))
# round to a multiple of 48 so each side of the 3 x 3 tiles stays divisible by 16
W, H = int(width * ratio) // 48 * 48, int(height * ratio) // 48 * 48
controlnet_img = cv2.resize(controlnet_img, (W, H))

controlnet_img = cv2.cvtColor(controlnet_img, cv2.COLOR_BGR2RGB)
controlnet_img = Image.fromarray(controlnet_img)

# the pipeline performs best at 1024 * 1024 or the matching bucket resolution
target_width = W // 3
target_height = H // 3

# split the image into a 3 x 3 grid of tiles, each resized back to the full working size
images = []
for i in range(3):      # 3 rows
    for j in range(3):  # 3 columns
        left = j * target_width
        top = i * target_height
        right = left + target_width
        bottom = top + target_height

        # crop the tile at the computed bounds
        cropped_image = controlnet_img.crop((left, top, right, bottom))
        cropped_image = cropped_image.resize((W, H))
        images.append(cropped_image)

seed = random.randint(0, 2147483647)
generator = torch.Generator('cuda').manual_seed(seed)

result_images = []
for sub_img in images:
    new_width, new_height = W, H
    out = pipe(prompt=[prompt]*1,
               image=sub_img,
               control_image=sub_img,
               negative_prompt=[negative_prompt]*1,
               generator=generator,
               width=new_width,
               height=new_height,
               num_inference_steps=30,
               crops_coords_top_left=(W, H),
               target_size=(W, H),
               original_size=(W * 2, H * 2),
               )
    result_images.append(out.images[0])

# stitch the 3 x 3 tiles back onto the upscaled canvas
new_im = Image.new('RGB', (new_width * 3, new_height * 3))
for idx, tile in enumerate(result_images):
    row, col = divmod(idx, 3)
    new_im.paste(tile, (col * new_width, row * new_height))

new_im.save("your image save path")  # png usually preserves more quality than jpg/webp but produces larger files
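The 3 × 3 layout above is hardcoded. Below is a minimal sketch that generalizes the same split, regenerate, and stitch idea to an arbitrary n × n grid; it reuses pipe, prompt, negative_prompt and the preprocessed controlnet_img from the example above, and the tile_upscale helper (its name and defaults) is an illustrative assumption rather than part of the upstream code:

def tile_upscale(img, n=3, steps=30, seed=0):
    # img: a PIL image whose width and height are divisible by n
    # (see the // 48 * 48 rounding above for the n = 3 case)
    W, H = img.size
    tile_w, tile_h = W // n, H // n
    generator = torch.Generator('cuda').manual_seed(seed)
    canvas = Image.new('RGB', (W * n, H * n))
    for idx in range(n * n):
        row, col = divmod(idx, n)
        tile = img.crop((col * tile_w, row * tile_h, (col + 1) * tile_w, (row + 1) * tile_h)).resize((W, H))
        out = pipe(prompt=[prompt],
                   image=tile,
                   control_image=tile,
                   negative_prompt=[negative_prompt],
                   generator=generator,
                   width=W,
                   height=H,
                   num_inference_steps=steps,
                   target_size=(W, H),
                   original_size=(W * n, H * n),
                   ).images[0]
        canvas.paste(out, (col * W, row * H))
    return canvas

upscaled = tile_upscale(controlnet_img, n=3)
upscaled.save("your image save path")

As in the 3 × 3 example, tiles are generated independently, so visible seams can appear at tile borders.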
Advanced Usage
The upstream documentation does not provide advanced usage examples.
📄 License
This project is released under the Apache-2.0 license.