FLUX.1-dev-edit-v0: a free, open-source image editing model for style transfer and content modification

FLUX.1 Dev Edit V0

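Developed by sayakpaul

An image editing model based on the Flux Control framework, supporting a range of style transfers and content modifications

Image generation   License: Other   #controllable-image-editing #style-transfer #conditional-diffusion-model

Downloads: 114

Released: 1/18/2025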

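Model overview

FLUX.1-dev-edit-v0 is a diffusion-based image editing model that applies style transfer and content edits to an input image according to a text prompt. It was fine-tuned from FLUX.1-dev with the Flux Control framework and supports a range of image editing tasks.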

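Model features

Precise image editing

Modifies specific elements of an image according to a text prompt, such as changing an object's color or switching the season of a scene.

Style transfer

Converts images into a range of artistic styles, such as Japanese woodblock print or impasto oil painting.

Efficient inference

With a turbo LoRA, inference can run in 8 steps while largely preserving image quality.

Flexible control

Parameters such as the guidance scale adjust the editing strength to suit different needs.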

Model capabilities

Image style transfer

Object attribute modification

Scene content editing

Artistic effect generation

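Use cases

Creative design

Artistic style transfer

Converts an ordinary photo into the style of a traditional Japanese woodblock print.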

https://huggingface.co/datasets/sayakpaul/sample-datasets/resolve/main/flux-edit-artifacts/edited_car.jpg

Seasonal scene conversion

Converts an ordinary landscape photo into a snowy winter scene.

https://huggingface.co/datasets/sayakpaul/sample-datasets/resolve/main/flux-edit-artifacts/edited_green_creature.jpg

Product design

Product appearance changes

Quickly changes a product's color or material appearance.

https://huggingface.co/datasets/sayakpaul/sample-datasets/resolve/main/flux-edit-artifacts/edited_mushroom.jpg

🚀 Flux Edit

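Flux Edit is a set of control weights for image editing, fine-tuned with the Flux Control framework.

🚀 Quick start

These control weights were trained on black-forest-labs/FLUX.1-dev (base model) and TIGER-Lab/OmniEdit-Filtered-1.2M (dataset) for image editing. We used the Flux Control framework for fine-tuning.

✨ Key features

Trained from FLUX.1-dev on the OmniEdit-Filtered-1.2M dataset for text-guided image editing.
Fine-tuned with the Flux Control framework.
Inference quality and speed can be tuned through parameters (such as guidance_scale) and techniques (such as a turbo LoRA or quantization).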

💻 Usage examples

Basic usage

from diffusers import FluxControlPipeline, FluxTransformer2DModel
from diffusers.utils import load_image
import torch 

path = "sayakpaul/FLUX.1-dev-edit-v0" 
edit_transformer = FluxTransformer2DModel.from_pretrained(path, torch_dtype=torch.bfloat16)
pipeline = FluxControlPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", transformer=edit_transformer, torch_dtype=torch.bfloat16
).to("cuda")

url = "https://huggingface.co/datasets/sayakpaul/sample-datasets/resolve/main/flux-edit-artifacts/assets/mushroom.jpg"
image = load_image(url) # resize as needed.
print(image.size)

prompt = "turn the color of mushroom to gray"
image = pipeline(
    control_image=image,
    prompt=prompt,
    guidance_scale=30., # change this as needed.
    num_inference_steps=50, # change this as needed.
    max_sequence_length=512,
    height=image.height,
    width=image.width,
    generator=torch.manual_seed(0)
).images[0]
image.save("edited_image.png")
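
The `# resize as needed.` comment above leaves the resizing step open. The following is a minimal sketch of one way to pick a target size, not part of the original model card. It assumes that FLUX pipelines in diffusers work best with heights and widths divisible by 16, and that capping the longer side at 1024 pixels is a reasonable default; `snap_size`, `max_side`, and `multiple` are hypothetical names introduced here for illustration.

```python
def snap_size(width: int, height: int, max_side: int = 1024, multiple: int = 16) -> tuple[int, int]:
    """Shrink (width, height) so the longer side is at most max_side,
    then round each side down to the nearest multiple (never below one multiple)."""
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic keeps the aspect ratio without float rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    new_w = max(multiple, width // multiple * multiple)
    new_h = max(multiple, height // multiple * multiple)
    return new_w, new_h


# Hypothetical usage with the PIL image loaded above:
# image = image.resize(snap_size(*image.size))
print(snap_size(2048, 1536))
```

Rounding down rather than up avoids upscaling the input, at the cost of a slight change in aspect ratio.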

Advanced usage

Speeding up inference with a turbo LoRA:

from diffusers import FluxControlPipeline, FluxTransformer2DModel
from diffusers.utils import load_image
from huggingface_hub import hf_hub_download
import torch

path = "sayakpaul/FLUX.1-dev-edit-v0"
edit_transformer = FluxTransformer2DModel.from_pretrained(path, torch_dtype=torch.bfloat16)
pipeline = FluxControlPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", transformer=edit_transformer, torch_dtype=torch.bfloat16
).to("cuda")

# load the turbo LoRA
pipeline.load_lora_weights(
    hf_hub_download("ByteDance/Hyper-SD", "Hyper-FLUX.1-dev-8steps-lora.safetensors"), adapter_name="hyper-sd"
)
pipeline.set_adapters(["hyper-sd"], adapter_weights=[0.125])


url = "https://huggingface.co/datasets/sayakpaul/sample-datasets/resolve/main/flux-edit-artifacts/assets/mushroom.jpg"
image = load_image(url) # resize as needed.
print(image.size)

prompt = "turn the color of mushroom to gray"
image = pipeline(
    control_image=image,
    prompt=prompt,
    guidance_scale=30., # change this as needed.
    num_inference_steps=8, # change this as needed.
    max_sequence_length=512,
    height=image.height,
    width=image.width,
    generator=torch.manual_seed(0)
).images[0]
image.save("edited_image.png")

Inference comparison: 50 steps vs. 8 steps

[Side-by-side comparison images: 50 steps | 8 steps]

Effect of guidance_scale

[Collage images for each prompt at guidance_scale (gs) 10, 20, 30, and 40]

Prompts compared:
Give this the look of a traditional Japanese woodblock print.
transform the setting to a winter scene
turn the color of mushroom to gray
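
A comparison across guidance scales like this one can be reproduced with a small sweep helper. The sketch below is an illustration rather than the author's script: `sweep_guidance` and `edit_fn` are hypothetical names, and the helper only assumes that `edit_fn` takes a guidance scale and returns one result. The commented usage reuses the pipeline call from the usage examples.

```python
from typing import Callable, Dict, Iterable, TypeVar

T = TypeVar("T")


def sweep_guidance(
    edit_fn: Callable[[float], T],
    scales: Iterable[float] = (10.0, 20.0, 30.0, 40.0),
) -> Dict[float, T]:
    """Call edit_fn once per guidance scale and collect the results keyed by scale."""
    return {float(gs): edit_fn(float(gs)) for gs in scales}


# Hypothetical usage with the pipeline, image, and prompt defined earlier:
# results = sweep_guidance(
#     lambda gs: pipeline(
#         control_image=image,
#         prompt=prompt,
#         guidance_scale=gs,
#         num_inference_steps=50,
#         max_sequence_length=512,
#         height=image.height,
#         width=image.width,
#         generator=torch.manual_seed(0),  # same seed so only the scale varies
#     ).images[0]
# )
# for gs, edited in results.items():
#     edited.save(f"edited_gs{int(gs)}.png")
```

Re-seeding the generator on every call keeps the noise fixed, so differences between outputs can be attributed to the guidance scale alone.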

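🔧 Technical details

Training details

The fine-tuning codebase is linked from the original model card. The training hyperparameters were:

Per-GPU batch size: 4
Gradient accumulation steps: 4
Guidance scale: 30
BF16 mixed precision
AdamW optimizer (8-bit, from bitsandbytes)
Constant learning rate: 5e-5
Weight decay: 1e-6
Training steps: 20000

Training ran on a single node with 8 H100 GPUs.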
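
These settings imply an effective batch size, which can be worked out with a few lines of arithmetic. The sketch below assumes that all 8 GPUs ran data-parallel, that the 20000 training steps count optimizer updates (after gradient accumulation), and that the dataset holds roughly 1.2 million samples, as its name suggests. None of these assumptions is confirmed by the source, so the epoch estimate is only indicative.

```python
per_gpu_batch = 4
grad_accum_steps = 4
num_gpus = 8
train_steps = 20_000
dataset_size = 1_200_000  # assumption: approximate size inferred from the "1.2M" in the dataset name

# Samples contributing to each optimizer update across all GPUs.
effective_batch = per_gpu_batch * grad_accum_steps * num_gpus
# Total samples processed, assuming train_steps counts optimizer updates.
samples_seen = effective_batch * train_steps
approx_epochs = samples_seen / dataset_size

print(effective_batch)           # 128
print(samples_seen)              # 2560000
print(round(approx_epochs, 2))   # 2.13
```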

We used a simplified flow mechanism to perform the linear interpolation. In pseudocode:

sigmas = torch.rand(batch_size)
timesteps = (sigmas * noise_scheduler.config.num_train_timesteps).long()
...

noisy_model_input = (1.0 - sigmas) * pixel_latents + sigmas * noise

Here, pixel_latents is computed from the source image and noise is sampled from a Gaussian distribution. See the repository for more details.
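
The pseudocode above can be turned into a runnable sketch. This is an illustration under stated assumptions, not the repository's training code: `interpolate_latents` is a hypothetical helper, `num_train_timesteps=1000` is an assumed scheduler setting, and the reshape of `sigmas` is added here so that a per-sample sigma broadcasts over the latent dimensions (the elided `...` in the pseudocode may handle this differently).

```python
import torch


def interpolate_latents(pixel_latents, num_train_timesteps=1000, generator=None):
    """Blend clean latents with Gaussian noise using one random sigma per sample."""
    batch_size = pixel_latents.shape[0]
    # One sigma in [0, 1) per sample; sigma near 0 keeps the latents, near 1 gives mostly noise.
    sigmas = torch.rand(batch_size, generator=generator)
    timesteps = (sigmas * num_train_timesteps).long()
    # Sample noise in float32 on CPU, then match the latents' device and dtype.
    noise = torch.randn(pixel_latents.shape, generator=generator)
    noise = noise.to(device=pixel_latents.device, dtype=pixel_latents.dtype)
    # Reshape sigmas to (B, 1, 1, ...) so they broadcast over the remaining dimensions.
    sigmas_b = sigmas.view(batch_size, *([1] * (pixel_latents.ndim - 1)))
    sigmas_b = sigmas_b.to(device=pixel_latents.device, dtype=pixel_latents.dtype)
    noisy_model_input = (1.0 - sigmas_b) * pixel_latents + sigmas_b * noise
    return noisy_model_input, noise, timesteps, sigmas


# Example with dummy latents of shape (batch, channels, height, width).
latents = torch.zeros(2, 4, 8, 8)
noisy, noise, timesteps, sigmas = interpolate_latents(
    latents, generator=torch.Generator().manual_seed(0)
)
print(noisy.shape, timesteps.shape)
```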