InstructCLIP-InstructPix2Pix: Open-Source Image Editing Model - Edit Images with Text Instructions

InstructCLIP-InstructPix2Pix

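Developed by SherryXTChen

InstructCLIP is an instruction-guided image editing model improved through automated data refinement using contrastive learning. It combines CLIP with Stable Diffusion to edit images according to text instructions.

Text-to-image | English | License: Apache-2.0 | #instruction-guided-image-editing #contrastive-learning-optimization #multimodal-instruction-understanding

Downloads: 450

Release date: 3/15/2025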

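Model overview

The model is built on the Stable Diffusion architecture and draws on CLIP-style contrastive learning. It focuses on image editing guided by text instructions.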

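Model features

Instruction-guided editing

Edits and transforms images according to natural-language instructions

Contrastive learning optimization

Uses CLIP-style contrastive learning to refine training data quality automatically

Stable Diffusion foundation

Built on the Stable Diffusion architecture for high-quality image generation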

Model capabilities

Text-to-image conversion

Image-to-image conversion

Instruction-based image editing

Image style transfer

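Use cases

Creative design

Artistic style transfer

Convert an ordinary photo into a 3D sculpture or another artistic style

The example shows a photo converted into a 3D sculpture

Content creation

Image content modification

Modify specific elements of an image according to a text instruction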

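🚀 InstructCLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning

This model is based on the paper Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning. It uses contrastive learning to refine training data automatically, which improves how closely edits follow the given instruction.

GitHub: https://github.com/SherryXTChen/Instruct-CLIP.git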

🚀 Quick start

The steps below show how to use the model for image editing.

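✨ Main features

Multiple base models: the model card lists timbrooks/instruct-pix2pix, SherryXTChen/Instruct-CLIP, and SherryXTChen/LatentDiffusionDINOv2 as base models.
Task-specific dataset: trained on the SherryXTChen/InstructCLIP-InstructPix2Pix-Data dataset, which targets instruction-guided image editing.
Multiple application areas: tagged for Stable Diffusion, text-to-image, and image-to-image use.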

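📦 Installation

The source documentation does not list installation steps. See the official diffusers documentation for installation instructions.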
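
Since no installation steps are given, a quick environment check can help before running the usage example. The sketch below reports which packages are installed. The package list is an assumption based on the imports in the basic usage code, not an official requirements list. peft is included because recent diffusers releases use it as the LoRA backend. The helper name `report_installed` is hypothetical.

```python
from importlib import metadata

# Assumed package set for the basic usage example; adjust it to your setup.
ASSUMED_PACKAGES = ("diffusers", "transformers", "torch", "peft", "requests", "Pillow")


def report_installed(packages=ASSUMED_PACKAGES):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Calling `report_installed()` and printing the result shows at a glance which entries are `None` and still need to be installed.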

💻 Usage examples

Basic usage

import PIL.Image
import PIL.ImageOps
import requests
import torch
from diffusers import EulerAncestralDiscreteScheduler, StableDiffusionInstructPix2PixPipeline

# Load the InstructPix2Pix base pipeline in half precision, then apply the InstructCLIP LoRA weights.
model_id = "timbrooks/instruct-pix2pix"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe.load_lora_weights("SherryXTChen/InstructCLIP-InstructPix2Pix")
pipe.to("cuda")  # this example assumes a CUDA GPU is available
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

url = "https://raw.githubusercontent.com/SherryXTChen/Instruct-CLIP/refs/heads/main/assets/1_input.jpg"


def download_image(url):
    """Download an image, apply its EXIF orientation, and convert it to RGB."""
    response = requests.get(url, stream=True)
    response.raise_for_status()  # fail early on HTTP errors
    image = PIL.Image.open(response.raw)
    image = PIL.ImageOps.exif_transpose(image)
    return image.convert("RGB")


image = download_image(url)

# Describe the edit as a text instruction and run 20 denoising steps.
prompt = "as a 3 d sculpture"
images = pipe(prompt, image=image, num_inference_steps=20).images
images[0].save("output.jpg")
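
The basic example passes the downloaded image to the pipeline at its original size. Large inputs raise memory use, and the Stable Diffusion VAE works on latents at 1/8 of the pixel resolution, so it can be convenient to bound the size and keep both sides divisible by 8 before editing. The helper below is a minimal sketch of that size calculation. The name `fit_size` and the 512-pixel default are assumptions for illustration, not values from the model card.

```python
def fit_size(width, height, max_side=512, multiple=8):
    """Scale (width, height) so the longer side is at most max_side,
    then round each side down to a multiple of `multiple` (never below it)."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Only shrink; images already within max_side keep their scale.
    scale = min(1.0, max_side / max(width, height))
    new_width = max(multiple, int(width * scale) // multiple * multiple)
    new_height = max(multiple, int(height * scale) // multiple * multiple)
    return new_width, new_height
```

With a PIL image this could be applied as `image = image.resize(fit_size(*image.size))` before calling the pipeline.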

Advanced usage

The source documentation provides no advanced usage example. Extend the basic usage code as your needs require.
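
One possible extension, sketched here under assumptions rather than taken from the model card, is to run the same edit at several `image_guidance_scale` values and compare the results. `guidance_scale`, `image_guidance_scale`, `num_inference_steps`, and `generator` are call parameters of the diffusers InstructPix2Pix pipeline. The helper name `sweep_image_guidance` and the default scale values are hypothetical.

```python
def sweep_image_guidance(pipe, image, prompt, scales=(1.0, 1.5, 2.0),
                         guidance_scale=7.5, steps=20, seed=None):
    """Run the same edit at several image_guidance_scale values.

    Returns {scale: edited image}. `pipe` is any callable that accepts the
    StableDiffusionInstructPix2PixPipeline call arguments used below.
    """
    results = {}
    for scale in scales:
        generator = None
        if seed is not None:
            import torch  # imported lazily; only needed when a seed is given
            # Re-seed for every run so the scale is the only thing that changes.
            generator = torch.Generator("cpu").manual_seed(seed)
        output = pipe(
            prompt,
            image=image,
            num_inference_steps=steps,
            guidance_scale=guidance_scale,
            image_guidance_scale=scale,
            generator=generator,
        )
        results[scale] = output.images[0]
    return results
```

For example, `sweep_image_guidance(pipe, image, "as a 3 d sculpture", seed=0)` returns one image per scale. A higher `image_guidance_scale` keeps the output closer to the input image, while a higher `guidance_scale` follows the text instruction more strongly.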

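📚 Detailed documentation

The source documentation provides no detailed usage instructions or parameter descriptions. See the official diffusers documentation for more information.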

🔧 Technical details

The source documentation gives no implementation details. See the paper Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning for the technical principles behind the model.
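
For orientation only, the sketch below shows a generic CLIP-style symmetric contrastive (InfoNCE) loss in plain Python, to illustrate what "contrastive learning" refers to. It is not the paper's implementation. Pairing an image-change embedding with an instruction embedding is an assumption made here for illustration, and the function name and temperature value are hypothetical.

```python
import math


def _normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def _log_softmax(row):
    """Numerically stable log-softmax of a list of logits."""
    top = max(row)
    log_sum = top + math.log(sum(math.exp(v - top) for v in row))
    return [v - log_sum for v in row]


def symmetric_contrastive_loss(change_embeds, text_embeds, temperature=0.07):
    """Generic symmetric InfoNCE loss over a batch of paired embeddings.

    Row i of change_embeds is assumed to match row i of text_embeds;
    every other row in the batch serves as a negative.
    """
    a = [_normalize(v) for v in change_embeds]
    b = [_normalize(v) for v in text_embeds]
    n = len(a)
    # Cosine similarities scaled by the temperature.
    logits = [[sum(x * y for x, y in zip(a[i], b[j])) / temperature
               for j in range(n)] for i in range(n)]
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_a_to_b = -sum(_log_softmax(logits[i])[i] for i in range(n)) / n
    loss_b_to_a = -sum(_log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (loss_a_to_b + loss_b_to_a)
```

The loss is low when each embedding is most similar to its own partner and high when partners are mismatched, which is the property a contrastive objective rewards.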

📄 License

This project is licensed under Apache-2.0.

📋 Model information

Property	Details
Base models	timbrooks/instruct-pix2pix, SherryXTChen/Instruct-CLIP, SherryXTChen/LatentDiffusionDINOv2
Dataset	SherryXTChen/InstructCLIP-InstructPix2Pix-Data
Language	en
Library	diffusers
License	apache-2.0
Tags	stable-diffusion, stable-diffusion-diffusers, text-to-image, diffusers, diffusers-training, image-to-image
Inference	true
Pipeline tag	image-to-image