FLUX.1-dev-edit-v0: an open-source image editing model for style transfer and content modification

FLUX.1 Dev Edit V0

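Developed by sayakpaul

An image editing model based on the Flux Control framework that supports a range of style transfers and content modifications.

Image generation · License: Other · #ControllableImageEditing #StyleTransfer #ConditionalDiffusionModel

Downloads: 114

Released: 1/18/2025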

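Model overview

FLUX.1-dev-edit-v0 is a diffusion-based image editing model that applies style transfers and content edits to an input image according to a text prompt. It is fine-tuned from FLUX.1-dev with the Flux Control framework and supports several kinds of image editing tasks.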

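Model features

Precise image editing

Modifies specific elements of an image according to a text prompt, for example changing an object's color or switching the season of a scene.

Style transfer

Converts images into a range of artistic styles, such as Japanese woodblock prints or impasto oil paintings.

Efficient inference

With a turbo LoRA, inference can run in 8 steps instead of 50 while aiming to preserve image quality.

Flexible control

Parameters such as guidance_scale adjust how strongly the edit is applied.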

Model capabilities

Image style transfer

Object attribute modification

Scene content editing

Artistic effect generation

Use cases

Creative design

Artistic style transfer

Convert an ordinary photo into the style of a traditional Japanese woodblock print.

https://huggingface.co/datasets/sayakpaul/sample-datasets/resolve/main/flux-edit-artifacts/edited_car.jpg

Seasonal scene conversion

Convert an ordinary landscape photo into a winter snow scene.

https://huggingface.co/datasets/sayakpaul/sample-datasets/resolve/main/flux-edit-artifacts/edited_green_creature.jpg

Product design

Product appearance modification

Quickly change a product's color or material look.

https://huggingface.co/datasets/sayakpaul/sample-datasets/resolve/main/flux-edit-artifacts/edited_mushroom.jpg

🚀 Flux Edit

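Flux Edit is a set of control weights for image editing, fine-tuned with the Flux Control framework.

🚀 Quick start

These control weights were trained on black-forest-labs/FLUX.1-dev with the TIGER-Lab/OmniEdit-Filtered-1.2M dataset for image editing. We fine-tuned them using the Flux Control framework.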

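✨ Key features

Trained on black-forest-labs/FLUX.1-dev with the large-scale TIGER-Lab/OmniEdit-Filtered-1.2M dataset.
Fine-tuned with the Flux Control framework.
Inference quality and performance can be tuned through parameters (such as guidance_scale) and techniques (such as a turbo LoRA or quantization).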

💻 Usage examples

Basic usage

from diffusers import FluxControlPipeline, FluxTransformer2DModel
from diffusers.utils import load_image
import torch 

path = "sayakpaul/FLUX.1-dev-edit-v0" 
edit_transformer = FluxTransformer2DModel.from_pretrained(path, torch_dtype=torch.bfloat16)
pipeline = FluxControlPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", transformer=edit_transformer, torch_dtype=torch.bfloat16
).to("cuda")

url = "https://huggingface.co/datasets/sayakpaul/sample-datasets/resolve/main/flux-edit-artifacts/assets/mushroom.jpg"
image = load_image(url) # resize as needed.
print(image.size)

prompt = "turn the color of mushroom to gray"
image = pipeline(
    control_image=image,
    prompt=prompt,
    guidance_scale=30., # change this as needed.
    num_inference_steps=50, # change this as needed.
    max_sequence_length=512,
    height=image.height,
    width=image.width,
    generator=torch.manual_seed(0)
).images[0]
image.save("edited_image.png")
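The `# resize as needed` comment above leaves the target size open. The sketch below shows one way to choose it, under two assumptions that are not stated in the original: that the pipeline works best when height and width are multiples of 16, and that capping the longer side at 1024 pixels is a reasonable memory budget. `fit_dims` is a hypothetical helper, not part of diffusers.

```python
def fit_dims(width, height, max_side=1024, multiple=16):
    """Return (width, height) shrunk to fit max_side and rounded down to a multiple.

    max_side=1024 and multiple=16 are illustrative assumptions, not values
    taken from the model card.
    """
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic keeps the aspect ratio without float rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    # Round each side down to the nearest multiple, but never below one multiple.
    width = max(multiple, width // multiple * multiple)
    height = max(multiple, height // multiple * multiple)
    return width, height


print(fit_dims(2000, 1000))  # (1024, 512)
print(fit_dims(1000, 750))   # (992, 736)
```

With a PIL image loaded through `load_image`, this could be applied as `image = image.resize(fit_dims(*image.size))` before the pipeline call, since both `image.size` and `resize` use (width, height) order.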

Advanced usage

Speed up inference with a turbo LoRA:

from diffusers import FluxControlPipeline, FluxTransformer2DModel
from diffusers.utils import load_image
from huggingface_hub import hf_hub_download
import torch

path = "sayakpaul/FLUX.1-dev-edit-v0"
edit_transformer = FluxTransformer2DModel.from_pretrained(path, torch_dtype=torch.bfloat16)
pipeline = FluxControlPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", transformer=edit_transformer, torch_dtype=torch.bfloat16
).to("cuda")

# load the turbo LoRA
pipeline.load_lora_weights(
    hf_hub_download("ByteDance/Hyper-SD", "Hyper-FLUX.1-dev-8steps-lora.safetensors"), adapter_name="hyper-sd"
)
pipeline.set_adapters(["hyper-sd"], adapter_weights=[0.125])


url = "https://huggingface.co/datasets/sayakpaul/sample-datasets/resolve/main/flux-edit-artifacts/assets/mushroom.jpg"
image = load_image(url) # resize as needed.
print(image.size)

prompt = "turn the color of mushroom to gray"
image = pipeline(
    control_image=image,
    prompt=prompt,
    guidance_scale=30., # change this as needed.
    num_inference_steps=8, # change this as needed.
    max_sequence_length=512,
    height=image.height,
    width=image.width,
    generator=torch.manual_seed(0)
).images[0]
image.save("edited_image.png")

Inference speed comparison

[Figure: side-by-side results at 50 steps and at 8 steps]

Effect of guidance_scale on the results

[Figure: result collages at guidance_scale (gs) 10, 20, 30, and 40 for each of the following prompts]

Give this the look of a traditional Japanese woodblock print.
transform the setting to a winter scene
turn the color of mushroom to gray
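A comparison like the one above can be produced by running the same prompt at several guidance scales. The following is a minimal sketch, assuming a `pipeline` and `image` set up as in the basic usage example; `guidance_sweep` and its `make_generator` argument are hypothetical names introduced here for illustration.

```python
def guidance_sweep(pipeline, image, prompt, scales=(10.0, 20.0, 30.0, 40.0),
                   steps=50, make_generator=None):
    """Run one edit per guidance scale and return {scale: edited_image}.

    `pipeline` is any callable with the same keyword arguments as the
    pipeline call in the basic usage example.
    """
    results = {}
    for gs in scales:
        kwargs = dict(
            control_image=image,
            prompt=prompt,
            guidance_scale=gs,
            num_inference_steps=steps,
            max_sequence_length=512,
            height=image.height,
            width=image.width,
        )
        if make_generator is not None:
            # Re-seed for every run so the outputs differ only by guidance scale.
            kwargs["generator"] = make_generator()
        results[gs] = pipeline(**kwargs).images[0]
    return results
```

A typical call would be `guidance_sweep(pipeline, image, prompt, make_generator=lambda: torch.manual_seed(0))`, followed by saving each returned image under a name that includes its scale.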

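🔧 Technical details

Training details

The fine-tuning codebase is available in a separate repository. The training hyperparameters were:

Per-GPU batch size: 4
Gradient accumulation steps: 4
Guidance scale: 30
BF16 mixed precision
AdamW optimizer (8-bit, from bitsandbytes)
Constant learning rate: 5e-5
Weight decay: 1e-6
Training steps: 20000

Training ran on a single node with 8 H100 GPUs.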
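These numbers imply an effective batch size, which the short calculation below works out. It rests on assumptions not stated in the original: that all 8 GPUs were used data-parallel, that the 20000 training steps are optimizer steps rather than micro-batches, and that the dataset holds roughly 1.2 million samples (inferred only from the name OmniEdit-Filtered-1.2M).

```python
def effective_batch_size(per_gpu_batch, grad_accum_steps, num_gpus):
    """Samples contributing to one optimizer step under data parallelism."""
    return per_gpu_batch * grad_accum_steps * num_gpus


def samples_seen(effective_batch, train_steps):
    """Total samples processed, assuming train_steps counts optimizer steps."""
    return effective_batch * train_steps


batch = effective_batch_size(per_gpu_batch=4, grad_accum_steps=4, num_gpus=8)
total = samples_seen(batch, train_steps=20_000)
approx_epochs = total / 1_200_000  # dataset size is an assumption from its name

print(batch, total, round(approx_epochs, 2))  # 128 2560000 2.13
```

Under those assumptions the run corresponds to roughly two passes over the dataset.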

We used a simplified flow mechanism to perform the linear interpolation. In pseudo-code:

sigmas = torch.rand(batch_size)
timesteps = (sigmas * noise_scheduler.config.num_train_timesteps).long()
...

noisy_model_input = (1.0 - sigmas) * pixel_latents + sigmas * noise

Here, pixel_latents is computed from the source images and noise is sampled from a Gaussian distribution. See the repository for more details.
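To make the interpolation concrete, here is a dependency-free sketch of the same two steps for a single sample, using plain Python lists in place of tensors. It is an illustration of the pseudo-code, not the training code itself: `sigma_to_timestep` and `interpolate` are hypothetical names, and `num_train_timesteps=1000` is an assumed scheduler setting.

```python
import random


def sigma_to_timestep(sigma, num_train_timesteps=1000):
    """Map sigma in [0, 1) to an integer timestep, truncating like .long()."""
    return int(sigma * num_train_timesteps)


def interpolate(latents, noise, sigma):
    """Linear blend: sigma=0 returns the clean latents, sigma=1 returns pure noise."""
    return [(1.0 - sigma) * x + sigma * n for x, n in zip(latents, noise)]


rng = random.Random(0)
latents = [0.5, -1.0, 2.0]                      # stand-in for pixel_latents
noise = [rng.gauss(0.0, 1.0) for _ in latents]  # Gaussian noise, same shape
sigma = rng.random()                            # one sigma per sample, uniform in [0, 1)

timestep = sigma_to_timestep(sigma)
noisy_model_input = interpolate(latents, noise, sigma)
```

In a batched tensor version, each per-sample sigma would need to be reshaped so that it broadcasts across the latent dimensions before the blend.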