# FLUX.1-dev-ControlNet-Union-Pro-2.0 (FP8 Quantized)
This repository offers an FP8 quantized variant of [Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0). Note that this is not a fine-tuned model; it is a direct quantization of the original BFloat16 weights to FP8, aimed at reducing memory use and improving inference performance. An online demo is also provided.
## 🚀 Quick Start
This section provides a high-level overview of how to get started with the model. For detailed code, see the 💻 Usage Examples section below.
## ✨ Features
- Quantized Model: Directly quantized from BFloat16 to FP8 for lower memory use and faster inference.
- Multiple Control Modes: Supports the canny, soft edge, depth, pose, and gray control modes.
- Multi-ControlNet Inference: Supports multi-ControlNet inference, including joint use with other ControlNets.
## 📦 Installation
To use this model, you need a PyTorch build with FP8 support. Install the required libraries with pip or your preferred package manager:

```bash
pip install diffusers torch
```
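The `torch.float8_e4m3fn` dtype used below is only available in newer PyTorch builds (roughly 2.1 and later). A quick sanity check:

```python
import torch

# FP8 (E4M3) support is required to load the quantized weights.
assert hasattr(torch, "float8_e4m3fn"), "This PyTorch build lacks FP8 support; please upgrade torch."
print(torch.__version__)
```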
## 💻 Usage Examples
### Basic Usage
```python
import torch
from diffusers.utils import load_image
from diffusers import FluxControlNetPipeline, FluxControlNetModel

base_model = 'black-forest-labs/FLUX.1-dev'
controlnet_model_union_fp8 = 'ABDALLALSWAITI/FLUX.1-dev-ControlNet-Union-Pro-2.0-fp8'

# Load the ControlNet weights using the FP8 data type
controlnet = FluxControlNetModel.from_pretrained(controlnet_model_union_fp8, torch_dtype=torch.float8_e4m3fn)
pipe = FluxControlNetPipeline.from_pretrained(base_model, controlnet=controlnet, torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Replace with other condition images as needed
control_image = load_image("./conds/canny.png")
width, height = control_image.size

prompt = "A young girl stands gracefully at the edge of a serene beach, her long, flowing hair gently tousled by the sea breeze. She wears a soft, pastel-colored dress that complements the tranquil blues and greens of the coastal scenery. The golden hues of the setting sun cast a warm glow on her face, highlighting her serene expression. The background features a vast, azure ocean with gentle waves lapping at the shore, surrounded by distant cliffs and a clear, cloudless sky. The composition emphasizes the girl's serene presence amidst the natural beauty, with a balanced blend of warm and cool tones."

image = pipe(
    prompt,
    control_image=control_image,
    width=width,
    height=height,
    controlnet_conditioning_scale=0.7,
    control_guidance_end=0.8,
    num_inference_steps=30,
    guidance_scale=3.5,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
```
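The example above loads a precomputed condition image from `./conds/canny.png`. If you need to generate one yourself, a minimal sketch using `cv2.Canny` (the condition source recommended in the table further below) could look like this; the input path and thresholds are illustrative assumptions:

```python
import cv2
import numpy as np
from PIL import Image

# Hypothetical source photo; adjust the path and thresholds for your image.
img = cv2.imread("./input.png")
edges = cv2.Canny(img, 100, 200)
# Stack the single-channel edge map into 3 channels, since the pipeline expects an RGB image.
Image.fromarray(np.stack([edges] * 3, axis=-1)).save("./conds/canny.png")
```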
### Advanced Usage
```python
import torch
from diffusers.utils import load_image
# Use the local pipeline files for now
from pipeline_flux_controlnet import FluxControlNetPipeline
from controlnet_flux import FluxControlNetModel

base_model = 'black-forest-labs/FLUX.1-dev'
controlnet_model_union_fp8 = 'ABDALLALSWAITI/FLUX.1-dev-ControlNet-Union-Pro-2.0-fp8'

# Load the ControlNet weights using the FP8 data type
controlnet = FluxControlNetModel.from_pretrained(controlnet_model_union_fp8, torch_dtype=torch.float8_e4m3fn)
pipe = FluxControlNetPipeline.from_pretrained(base_model, controlnet=[controlnet], torch_dtype=torch.bfloat16)  # pass a list to enable multi-ControlNet
pipe.to("cuda")

# Replace with other condition images as needed
control_image = load_image("./conds/canny.png")
width, height = control_image.size

prompt = "A young girl stands gracefully at the edge of a serene beach, her long, flowing hair gently tousled by the sea breeze. She wears a soft, pastel-colored dress that complements the tranquil blues and greens of the coastal scenery. The golden hues of the setting sun cast a warm glow on her face, highlighting her serene expression. The background features a vast, azure ocean with gentle waves lapping at the shore, surrounded by distant cliffs and a clear, cloudless sky. The composition emphasizes the girl's serene presence amidst the natural beauty, with a balanced blend of warm and cool tones."

image = pipe(
    prompt,
    control_image=[control_image, control_image],  # try different condition pairs, e.g. canny & depth or pose & depth
    width=width,
    height=height,
    controlnet_conditioning_scale=[0.35, 0.35],
    control_guidance_end=[0.8, 0.8],
    num_inference_steps=30,
    guidance_scale=3.5,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
```
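On GPUs with limited VRAM, the standard diffusers offloading helpers can be used in place of `pipe.to("cuda")`; this is an optional sketch, not part of the original example:

```python
# Stream submodules to the GPU on demand instead of keeping the whole pipeline resident.
pipe.enable_model_cpu_offload()
# Optionally decode latents tile by tile to reduce peak VAE memory.
pipe.vae.enable_tiling()
```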
## 📚 Documentation
### Quantization Details
This model has been quantized from the original BFloat16 format to FP8 format using PyTorch's native FP8 support. Here are the specifics:
| Property | Details |
|---|---|
| Quantization Technique | Native FP8 quantization |
| Precision | E4M3 format (4 bits for exponent, 3 bits for mantissa) |
| Library Used | PyTorch's built-in FP8 support |
| Data Type | `torch.float8_e4m3fn` |
| Original Model | BFloat16 format ([Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0)) |
| Model Size Reduction | ~50% smaller than the original model |
The benefits of FP8 quantization include:
- Reduced Memory Usage: Approximately 50% smaller model size compared to BFloat16/FP16.
- Faster Inference: Potential speed improvements, especially on hardware with FP8 support.
- Minimal Quality Loss: Carefully calibrated quantization process to preserve output quality.
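The exact conversion script is not included here, but a direct dtype cast of this kind can be sketched with stock diffusers and PyTorch calls; the output path below is a placeholder:

```python
import torch
from diffusers import FluxControlNetModel

# Load the original BFloat16 ControlNet from Shakker Labs.
controlnet = FluxControlNetModel.from_pretrained(
    "Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro-2.0",
    torch_dtype=torch.bfloat16,
)
# Cast the floating-point weights to FP8 (E4M3) and save;
# this roughly halves the on-disk model size.
controlnet.to(torch.float8_e4m3fn)
controlnet.save_pretrained("./FLUX.1-dev-ControlNet-Union-Pro-2.0-fp8")
```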
### Keynotes
In comparison with [Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro):
- Smaller Model Size: The mode embedding has been removed, reducing model size.
- Improved Performance: Enhanced performance on canny and pose, offering better control and aesthetics.
- New Feature: Added support for soft edge; removed support for tile.
### Model Cards
- Model Structure: This ControlNet consists of 6 double blocks and 0 single blocks; the mode embedding is removed.
- Training Details: Trained from scratch for 300k steps on a dataset of 20M high-quality general and human images, at 512x512 resolution in BFloat16, with batch size 128, learning rate 2e-5, and guidance uniformly sampled from [1, 7]. The text drop ratio is set to 0.20.
- Control Modes: Supports multiple control modes, including canny, soft edge, depth, pose, and gray. It can be used like a normal ControlNet.
- Joint Usage: Can be jointly used with other ControlNets, as in the sketch below.
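As a hedged sketch of joint usage with the stock diffusers classes (exact class availability depends on your diffusers version), the union model can be wrapped together with another ControlNet, for example the depth model listed under Resources below:

```python
import torch
from diffusers import FluxControlNetModel, FluxMultiControlNetModel, FluxControlNetPipeline

union = FluxControlNetModel.from_pretrained(
    "ABDALLALSWAITI/FLUX.1-dev-ControlNet-Union-Pro-2.0-fp8", torch_dtype=torch.float8_e4m3fn
)
depth = FluxControlNetModel.from_pretrained(
    "Shakker-Labs/FLUX.1-dev-ControlNet-Depth", torch_dtype=torch.bfloat16
)
# Wrap both ControlNets so the pipeline applies them jointly;
# pass matching lists of condition images and scales at call time.
controlnet = FluxMultiControlNetModel([union, depth])
pipe = FluxControlNetPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", controlnet=controlnet, torch_dtype=torch.bfloat16
)
```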
### Showcases
Showcase images for the canny, soft edge, pose, depth, and gray control modes are available on the original model card.
### Recommended Parameters
You can adjust `controlnet_conditioning_scale` and `control_guidance_end` for stronger control and better detail preservation. For better stability, detailed prompts are highly recommended; in some cases, multiple conditions can help.
| Control Mode | Condition Source | `controlnet_conditioning_scale` | `control_guidance_end` |
|---|---|---|---|
| Canny | cv2.Canny | 0.7 | 0.8 |
| Soft Edge | AnylineDetector | 0.7 | 0.8 |
| Depth | [depth-anything](https://github.com/DepthAnything/Depth-Anything-V2) | 0.8 | 0.8 |
| Pose | [DWPose](https://github.com/IDEA-Research/DWPose/tree/onnx) | 0.9 | 0.65 |
| Gray | cv2.cvtColor | 0.9 | 0.8 |
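For instance, the gray condition from the last row can be produced with `cv2.cvtColor`; a minimal sketch with an assumed input path:

```python
import cv2
from PIL import Image

# Hypothetical source photo; replace with your own image.
img = cv2.imread("./input.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Expand back to 3 channels so the pipeline receives an RGB image.
Image.fromarray(cv2.cvtColor(gray, cv2.COLOR_GRAY2RGB)).save("./conds/gray.png")
```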
### Using the FP8 Model
This repository includes the FP8 quantized version of the model. To use it, you'll need PyTorch with FP8 support:
```python
import torch
from diffusers.utils import load_image
from diffusers import FluxControlNetPipeline, FluxControlNetModel

base_model = 'black-forest-labs/FLUX.1-dev'
controlnet_model_union_fp8 = 'ABDALLALSWAITI/FLUX.1-dev-ControlNet-Union-Pro-2.0-fp8'

# Load the ControlNet weights using the FP8 data type
controlnet = FluxControlNetModel.from_pretrained(controlnet_model_union_fp8, torch_dtype=torch.float8_e4m3fn)
pipe = FluxControlNetPipeline.from_pretrained(base_model, controlnet=controlnet, torch_dtype=torch.bfloat16)
pipe.to("cuda")

# The rest of the code is the same as with the original model
```
See `fp8_inference_example.py` for a complete example.
## Resources
- [InstantX/FLUX.1-dev-IP-Adapter](https://huggingface.co/InstantX/FLUX.1-dev-IP-Adapter)
- [InstantX/FLUX.1-dev-Controlnet-Canny](https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Canny)
- [Shakker-Labs/FLUX.1-dev-ControlNet-Depth](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Depth)
- [Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Union-Pro)
## Acknowledgements
This model is developed by [Shakker Labs](https://huggingface.co/Shakker-Labs). The original idea is inspired by [xinsir/controlnet-union-sdxl-1.0](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0). All rights reserved.
## 📄 License
This model is released under the [flux-1-dev-non-commercial-license](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md).