FLUX.1-dev-ControlNet-Union-Pro-2.0 open-source model - supports multiple control modes and improves aesthetics

FLUX.1 Dev ControlNet Union Pro 2.0

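Developed by Shakker-Labs

A unified ControlNet built on the FLUX.1-dev model that supports multiple control modes and improves control quality and aesthetics.

Image generation · English · Open source · License: Other · #Multi-condition control #Aesthetic enhancement #Image generation optimization

Downloads: 20.40k

Release date: 4/14/2025

Model overview

This model is a ControlNet for text-to-image generation. It supports multiple control modes, including canny, soft edge, depth, pose, and gray, and can be combined with the FLUX.1-dev base model to generate high-quality images.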

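Model features

Multiple control modes

Supports multiple control modes, including canny, soft edge, depth, pose, and gray.

Improved control

Improves control quality and aesthetics for canny and pose compared with the previous version.

Smaller model size

Removes the mode embedding, which reduces the model size.

Multi-ControlNet support

Can be combined with other ControlNets for multi-condition control.

Model capabilities

Text-to-image generation

Image-conditioned control

Joint multi-condition control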

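Use cases

Creative design

Portrait generation

Generates high-quality portraits from pose maps.

Produces portraits that follow the required pose and are aesthetically pleasing.

Scene generation

Generates images of 3D scenes from depth maps.

Produces realistic scenes that are consistent with the depth relationships.

Art creation

Art style transfer

Generates images in different art styles from edge maps.

Applies different art styles while preserving the original structure.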

🚀 FLUX.1-dev-ControlNet-Union-Pro-2.0

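This repository contains a unified ControlNet for the FLUX.1-dev model, released by Shakker Labs. An online demo is also available. A community-contributed FP8-quantized version can be found at ABDALLALSWAITI/FLUX.1-dev-ControlNet-Union-Pro-2.0-fp8.

✨ Key features

Compared with Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro, this release makes the following improvements.

Removes the mode embedding, which reduces the model size.
Improves canny and pose, giving better control and aesthetics.
Adds support for soft edge. Removes support for tile.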

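📚 Documentation

Model card

This ControlNet consists of 6 double blocks and 0 single blocks. The mode embedding has been removed.
We trained for 300k steps on a dataset of 20 million high-quality general and human images, at 512x512 resolution in BFloat16 with a batch size of 128 and a learning rate of 2e-5. Guidance was sampled uniformly from [1, 7], and the text drop rate was set to 0.20.
This model supports multiple control modes, including canny, soft edge, depth, pose, and gray, and can be used just like a regular ControlNet.
It can also be used together with other ControlNets.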
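As a rough sanity check on the training figures above, the sketch below works out how many samples 300k steps at batch size 128 cover and how that compares with the 20M-image dataset. It also illustrates one way per-sample guidance and text dropout could be drawn. The function name `sample_training_condition` and the use of an empty prompt for dropped text are assumptions made for illustration; this is not the authors' training code.

```python
import random

# Figures quoted in the model card.
STEPS = 300_000
BATCH_SIZE = 128
DATASET_SIZE = 20_000_000
GUIDANCE_RANGE = (1.0, 7.0)
TEXT_DROP_RATE = 0.20

# Total samples processed and the approximate number of passes over the data.
samples_seen = STEPS * BATCH_SIZE
approx_epochs = samples_seen / DATASET_SIZE


def sample_training_condition(prompt: str, rng: random.Random):
    """Draw a guidance value and optionally drop the text prompt.

    Assumption: a dropped prompt is replaced with an empty string.
    """
    guidance = rng.uniform(*GUIDANCE_RANGE)
    if rng.random() < TEXT_DROP_RATE:
        prompt = ""
    return prompt, guidance


if __name__ == "__main__":
    print(f"samples seen: {samples_seen:,}")
    print(f"approximate epochs: {approx_epochs:.2f}")
    print(sample_training_condition("a portrait photo", random.Random(0)))
```

Under these figures the model sees roughly 38.4 million samples, a little under two passes over the dataset.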

Showcase

Control mode	Image
Canny
Soft edge
Pose
Depth
Gray

💻 Usage examples

Basic usage

import torch
from diffusers.utils import load_image
from diffusers import FluxControlNetPipeline, FluxControlNetModel

base_model = 'black-forest-labs/FLUX.1-dev'
controlnet_model_union = 'Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0'

controlnet = FluxControlNetModel.from_pretrained(controlnet_model_union, torch_dtype=torch.bfloat16)
pipe = FluxControlNetPipeline.from_pretrained(base_model, controlnet=controlnet, torch_dtype=torch.bfloat16)
pipe.to("cuda")

# replace with other conds
control_image = load_image("./conds/canny.png")
width, height = control_image.size

prompt = "A young girl stands gracefully at the edge of a serene beach, her long, flowing hair gently tousled by the sea breeze. She wears a soft, pastel-colored dress that complements the tranquil blues and greens of the coastal scenery. The golden hues of the setting sun cast a warm glow on her face, highlighting her serene expression. The background features a vast, azure ocean with gentle waves lapping at the shore, surrounded by distant cliffs and a clear, cloudless sky. The composition emphasizes the girl's serene presence amidst the natural beauty, with a balanced blend of warm and cool tones."

image = pipe(
    prompt, 
    control_image=control_image,
    width=width,
    height=height,
    controlnet_conditioning_scale=0.7,
    control_guidance_end=0.8,
    num_inference_steps=30, 
    guidance_scale=3.5,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
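The example above passes the control image's size straight through as `width` and `height`. A minimal sketch of a helper that rounds each dimension down to a multiple of 16 follows. The multiple of 16 is an assumption about what the FLUX pipelines in diffusers expect for output dimensions, and `snap_to_multiple` and `snap_size` are hypothetical names, not part of the diffusers API.

```python
from typing import Tuple


def snap_to_multiple(value: int, multiple: int = 16) -> int:
    """Round a dimension down to the nearest multiple (assumed to be 16)."""
    if value < multiple:
        raise ValueError(f"dimension {value} is smaller than {multiple}")
    return (value // multiple) * multiple


def snap_size(size: Tuple[int, int], multiple: int = 16) -> Tuple[int, int]:
    """Apply snap_to_multiple to a (width, height) pair such as PIL's Image.size."""
    width, height = size
    return snap_to_multiple(width, multiple), snap_to_multiple(height, multiple)


if __name__ == "__main__":
    print(snap_size((1023, 770)))
```

In the pipeline call this could be used as `width, height = snap_size(control_image.size)`, resizing the control image to the same size before passing it in.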

Advanced usage

import torch
from diffusers.utils import load_image

# https://github.com/huggingface/diffusers/pull/11350
# You can import these directly from diffusers by installing the latest version from source
# from diffusers import FluxControlNetPipeline, FluxControlNetModel

# use local files for now
from pipeline_flux_controlnet import FluxControlNetPipeline
from controlnet_flux import FluxControlNetModel

base_model = 'black-forest-labs/FLUX.1-dev'
controlnet_model_union = 'Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0'

controlnet = FluxControlNetModel.from_pretrained(controlnet_model_union, torch_dtype=torch.bfloat16)
pipe = FluxControlNetPipeline.from_pretrained(base_model, controlnet=[controlnet], torch_dtype=torch.bfloat16) # use [] to enable multi-CNs
pipe.to("cuda")

# replace with other conds
control_image = load_image("./conds/canny.png")
width, height = control_image.size

prompt = "A young girl stands gracefully at the edge of a serene beach, her long, flowing hair gently tousled by the sea breeze. She wears a soft, pastel-colored dress that complements the tranquil blues and greens of the coastal scenery. The golden hues of the setting sun cast a warm glow on her face, highlighting her serene expression. The background features a vast, azure ocean with gentle waves lapping at the shore, surrounded by distant cliffs and a clear, cloudless sky. The composition emphasizes the girl's serene presence amidst the natural beauty, with a balanced blend of warm and cool tones."

image = pipe(
    prompt, 
    control_image=[control_image, control_image], # try with different conds such as canny&depth, pose&depth
    width=width,
    height=height,
    controlnet_conditioning_scale=[0.35, 0.35],
    control_guidance_end=[0.8, 0.8],
    num_inference_steps=30, 
    guidance_scale=3.5,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
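The multi-condition call above passes parallel lists for the control images, the conditioning scales, and the guidance end points. The sketch below builds those lists from a set of condition images and keeps them the same length. `build_multi_condition_kwargs` is a hypothetical helper, and splitting a total scale evenly across conditions is an assumption that simply mirrors the 0.35 and 0.35 values used in the example.

```python
from typing import Any, Dict, Sequence


def build_multi_condition_kwargs(
    control_images: Sequence[Any],
    total_scale: float = 0.7,
    guidance_end: float = 0.8,
) -> Dict[str, list]:
    """Build aligned per-condition lists for a multi-condition pipeline call.

    Assumption: the total conditioning scale is shared equally by all conditions.
    """
    count = len(control_images)
    if count == 0:
        raise ValueError("at least one control image is required")
    per_condition_scale = total_scale / count
    return {
        "control_image": list(control_images),
        "controlnet_conditioning_scale": [per_condition_scale] * count,
        "control_guidance_end": [guidance_end] * count,
    }


if __name__ == "__main__":
    # Strings stand in for PIL images in this standalone demo.
    print(build_multi_condition_kwargs(["canny.png", "depth.png"]))
```

The returned dictionary can be unpacked into the call, for example `pipe(prompt, **build_multi_condition_kwargs([canny_image, depth_image]), ...)`, where `canny_image` and `depth_image` are placeholders for your own condition images.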

Recommended parameters

You can adjust controlnet_conditioning_scale and control_guidance_end for stronger control and better detail preservation. For more stable results, a detailed prompt is strongly recommended. In some cases, using multiple conditions helps.

Canny: use cv2.Canny, with controlnet_conditioning_scale=0.7 and control_guidance_end=0.8.
Soft edge: use AnylineDetector, with controlnet_conditioning_scale=0.7 and control_guidance_end=0.8.
Depth: use depth-anything, with controlnet_conditioning_scale=0.8 and control_guidance_end=0.8.
Pose: use DWPose, with controlnet_conditioning_scale=0.9 and control_guidance_end=0.65.
Gray: use cv2.cvtColor, with controlnet_conditioning_scale=0.9 and control_guidance_end=0.8.
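The per-mode recommendations above can be kept in a small lookup table so that each call picks up matching values. The following is a minimal sketch; the dictionary layout, the mode keys, and the `recommended_kwargs` helper are assumptions for illustration, while the numeric values and preprocessor names are the ones listed above.

```python
# Recommended settings per control mode, as listed in the model card.
RECOMMENDED_SETTINGS = {
    "canny": {"preprocessor": "cv2.Canny",
              "controlnet_conditioning_scale": 0.7, "control_guidance_end": 0.8},
    "soft_edge": {"preprocessor": "AnylineDetector",
                  "controlnet_conditioning_scale": 0.7, "control_guidance_end": 0.8},
    "depth": {"preprocessor": "depth-anything",
              "controlnet_conditioning_scale": 0.8, "control_guidance_end": 0.8},
    "pose": {"preprocessor": "DWPose",
             "controlnet_conditioning_scale": 0.9, "control_guidance_end": 0.65},
    "gray": {"preprocessor": "cv2.cvtColor",
             "controlnet_conditioning_scale": 0.9, "control_guidance_end": 0.8},
}


def recommended_kwargs(mode: str) -> dict:
    """Return the pipeline keyword arguments recommended for a control mode."""
    key = mode.strip().lower().replace(" ", "_")
    if key not in RECOMMENDED_SETTINGS:
        raise ValueError(f"unknown control mode: {mode!r}")
    settings = RECOMMENDED_SETTINGS[key]
    # The preprocessor name is informational only and is not a pipeline argument.
    return {name: value for name, value in settings.items() if name != "preprocessor"}


if __name__ == "__main__":
    print(recommended_kwargs("Soft edge"))
```

The result can be unpacked into the pipeline call, for example `pipe(prompt, control_image=control_image, **recommended_kwargs("depth"), ...)`.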

🔗 Related resources

🙏 Acknowledgements

This model was developed by Shakker Labs. The original idea was inspired by xinsir/controlnet-union-sdxl-1.0. All rights reserved.

📖 Citation

If you find this project useful in your research, please cite it as follows.

@misc{flux-cn-union-pro-2,
    author = {Shakker-Labs},
    title = {ControlNet-Union},
    year = {2025},
    howpublished={\url{https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0}},
}