flux-controlnet-lora-test
This is a ControlNet PEFT LoRA derived from black-forest-labs/flux.1-dev, designed for text-to-image and image-to-image tasks.
Quick Start
The main validation prompt used during training was:
A photo-realistic image of a cat
Features
- Supports text-to-image and image-to-image tasks.
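- Ships as a PEFT LoRA adapter (rank 64, alpha 64.0) trained with the LoRA target set to controlnet, so it is loaded on top of the base model rather than replacing it.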
Installation
Installation amounts to setting up a Python environment with the libraries the inference code imports: PyTorch (torch), diffusers, and optimum-quanto for the int8 quantization step. There is no separate install step for the model itself: the base model and the adapter are downloaded and loaded by the inference code under Usage Examples.
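One way to confirm the environment before running the inference code is to check that the required packages are installed. The sketch below is a convenience and not part of the model's own tooling. The package list is inferred from the imports in the usage example, and `peft` is an assumption, added because diffusers relies on it for LoRA loading. The helper name `check_environment` is hypothetical.

```python
from importlib import metadata

# Distribution names inferred from the imports in the usage example.
# "peft" is an assumption: diffusers relies on it for LoRA loading.
REQUIRED = ("torch", "diffusers", "peft", "optimum-quanto")


def check_environment(packages=REQUIRED):
    """Return {package: installed version, or None if it is missing}."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


# Print one line per package so missing ones are easy to spot.
for name, version in check_environment().items():
    print(f"{name}: {version or 'MISSING'}")
```

Any package reported as missing can be installed with pip before moving on to the usage example.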
Usage Examples
Basic Usage
import torch
from diffusers import DiffusionPipeline
from optimum.quanto import quantize, freeze, qint8

model_id = 'black-forest-labs/flux.1-dev'
adapter_id = 'bghira/flux-controlnet-lora-test'

# Load the base model in bfloat16, then attach the LoRA adapter.
pipeline = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipeline.load_lora_weights(adapter_id)

prompt = "A photo-realistic image of a cat"

# Quantize the transformer weights to int8 to reduce memory use.
quantize(pipeline.transformer, weights=qint8)
freeze(pipeline.transformer)

# Pick the best available device once: CUDA, then Apple MPS, then CPU.
device = 'cuda' if torch.cuda.is_available() else 'mps' if torch.backends.mps.is_available() else 'cpu'
pipeline.to(device)

# Generate with the same settings that were used for validation.
model_output = pipeline(
    prompt=prompt,
    num_inference_steps=16,
    generator=torch.Generator(device=device).manual_seed(42),
    width=256,
    height=256,
    guidance_scale=4.0,
).images[0]
model_output.save("output.png", format="PNG")
Documentation
Validation settings
| Property | Details |
|---|---|
| CFG | 4.0 |
| CFG Rescale | 0.0 |
| Steps | 16 |
| Sampler | FlowMatchEulerDiscreteScheduler |
| Seed | 42 |
| Resolution | 256x256 |
| Skip-layer guidance | - |
Note: The validation settings are not necessarily the same as the training settings.
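To reproduce the validation setup at inference time, the settings above can be translated into pipeline call arguments. The following is a minimal sketch. The mapping from labels to argument names (CFG to `guidance_scale`, Steps to `num_inference_steps`) follows the usage example, while the dictionary keys and the helper name `to_pipeline_kwargs` are hypothetical.

```python
# Validation settings copied from the validation settings listed above.
VALIDATION_SETTINGS = {
    "cfg": 4.0,
    "steps": 16,
    "seed": 42,
    "resolution": "256x256",
}


def to_pipeline_kwargs(settings):
    """Translate the listed settings into diffusers-style call arguments."""
    width, height = (int(side) for side in settings["resolution"].split("x"))
    return {
        "guidance_scale": settings["cfg"],
        "num_inference_steps": settings["steps"],
        "width": width,
        "height": height,
    }


kwargs = to_pipeline_kwargs(VALIDATION_SETTINGS)
print(kwargs)
```

The seed is left out of the returned arguments on purpose: in the usage example it is applied through a `torch.Generator` passed as `generator`, not as a plain keyword.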
Example images are available in the model's gallery.
The text encoder was not trained. You may reuse the base model text encoder for inference.
Training settings
| Property | Details |
|---|---|
| Training epochs | 8 |
| Training steps | 250 |
| Learning rate | 0.0001 |
| Learning rate schedule | constant |
| Warmup steps | 500 |
| Max grad value | 2.0 |
| Effective batch size | 1 |
| Micro-batch size | 1 |
| Gradient accumulation steps | 1 |
| Number of GPUs | 1 |
| Gradient checkpointing | True |
| Prediction type | flow_matching (extra parameters=['shift=3.0', 'flux_guidance_mode=constant', 'flux_guidance_value=1.0', 'flux_lora_target=controlnet']) |
| Optimizer | adamw_bf16 |
| Trainable parameter precision | Pure BF16 |
| Base model precision | int8-quanto |
| Caption dropout probability | 0.0% |
| LoRA Rank | 64 |
| LoRA Alpha | 64.0 |
| LoRA Dropout | 0.1 |
| LoRA initialisation style | default |
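Two of the training settings are derived from the others, and a short calculation shows how they fit together. This sketch assumes the conventional definitions: effective batch size is micro-batch size times gradient accumulation steps times number of GPUs, and the LoRA update is scaled by alpha divided by rank (the standard, non-rank-stabilized scaling). The function names are hypothetical.

```python
def effective_batch_size(micro_batch, grad_accum_steps, num_gpus):
    """Samples contributing to one optimizer step across all GPUs."""
    return micro_batch * grad_accum_steps * num_gpus


def lora_scale(alpha, rank):
    """Standard LoRA scaling factor applied to the low-rank update."""
    return alpha / rank


batch = effective_batch_size(micro_batch=1, grad_accum_steps=1, num_gpus=1)
scale = lora_scale(alpha=64.0, rank=64)
print(batch, scale)  # 1 1.0
```

With alpha equal to rank, the scale is 1.0, so under this assumption the adapter's update is applied without extra amplification or damping.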
Datasets - antelope-data-256
| Property | Details |
|---|---|
| Repeats | 0 |
| Total number of images | 29 |
| Total number of aspect buckets | 1 |
| Resolution | 0.065536 megapixels |
| Cropped | True |
| Crop style | center |
| Crop aspect | square |
| Used for regularisation data | No |
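The dataset figures can be cross-checked against the training and validation settings with a little arithmetic. The sketch below makes two assumptions: a square crop, so the megapixel resolution maps to a single side length, and that a repeats value of 0 means each image is seen once per epoch. Both function names are hypothetical.

```python
import math


def square_side_from_megapixels(megapixels):
    """Side length in pixels of a square image with the given area."""
    pixels = round(megapixels * 1_000_000)
    side = math.isqrt(pixels)
    if side * side != pixels:
        raise ValueError(f"{pixels} pixels is not a perfect square")
    return side


def epochs_seen(total_steps, num_images, batch_size=1, repeats=0):
    """Passes over the dataset, assuming repeats=0 means one view per epoch."""
    samples_per_epoch = num_images * (1 + repeats)
    steps_per_epoch = math.ceil(samples_per_epoch / batch_size)
    return total_steps / steps_per_epoch


side = square_side_from_megapixels(0.065536)
epochs = epochs_seen(total_steps=250, num_images=29)
print(side)              # 256
print(round(epochs, 2))  # 8.62
```

The 256-pixel side matches the 256x256 validation resolution. About 8.62 passes over 29 images would be consistent with the 8 training epochs reported, if that figure counts completed passes only.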
Technical Details
The model is a ControlNet PEFT LoRA (rank 64, alpha 64.0, dropout 0.1) derived from black-forest-labs/flux.1-dev. It was trained for 250 steps (8 epochs) at a constant learning rate of 0.0001 with the adamw_bf16 optimizer and an effective batch size of 1, using pure BF16 trainable parameters on an int8-quanto base model. The training data is the antelope-data-256 dataset: 29 images, center-cropped to a square aspect at 0.065536 megapixels (256x256). Validation used the prompt and settings listed under Documentation. The text encoder was not trained, so the base model text encoder can be reused for inference.
License
License: other