🚀 ControlNet++: All-in-one ControlNet for image generation and editing!
ControlNet++ is an all-in-one solution for image generation and editing. It offers a ProMax model with 12 controls and 5 advanced editing features, enabling high-quality image output comparable to Midjourney.
🚀 Quick Start
Inference scripts and more details can be found at: https://github.com/xinsir6/ControlNetPlus/tree/main
✨ Features
ProMax Model Release
The ProMax model has been released! It comes with 12 controls and 5 advanced editing features. Just give it a try!
Visual Display

Network Architecture

Advantages of the Model
- High-Resolution Image Generation: Uses bucket training similar to NovelAI, so it can generate high-resolution images at any aspect ratio.
- Large-Scale High-Quality Data: Trained on a large amount of high-quality data (over 10,000,000 images), covering a wide range of scenarios.
- Enhanced Prompt Following: Uses re-captioned prompts, as in DALL-E 3, with CogVLM generating detailed descriptions, which gives excellent prompt-following ability.
- Effective Training Tricks: Applies various useful tricks during training, including but not limited to data augmentation, multiple loss functions, and multi-resolution training.
- Low Parameter Increase: Has almost the same number of parameters as the original ControlNet, with no significant increase in network parameters or computation.
- Multiple Control Conditions: Supports 10+ control conditions, with no obvious performance drop on any single condition compared to independent training.
- Multi-Condition Generation: Supports multi-condition generation, with condition fusion learned during training. There is no need to set hyperparameters or design complex prompts.
- Compatibility: Compatible with other open-source SDXL models, such as BluePencilXL and CounterfeitXL, as well as with other LoRA models.
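The bucket training mentioned in the first bullet can be illustrated with a minimal sketch. This is not the training code of this project: the function names, the 1024x1024 pixel budget, the 64-pixel step, and the side limits are all assumptions chosen for illustration. The idea is to enumerate a set of resolutions with roughly equal area and assign each training image to the bucket with the closest aspect ratio.

```python
import math

def make_buckets(target_area=1024 * 1024, step=64, min_side=512, max_side=2048):
    """Enumerate (width, height) buckets whose area stays at or below target_area.

    All default values are illustrative assumptions, not the project's settings.
    """
    buckets = set()
    width = min_side
    while width <= max_side:
        # Largest height that is a multiple of `step` with width * height <= target_area.
        height = (target_area // width) // step * step
        height = min(height, max_side)
        if height >= min_side:
            buckets.add((width, height))
            buckets.add((height, width))  # mirrored bucket for the opposite orientation
        width += step
    return sorted(buckets)

def assign_bucket(width, height, buckets):
    """Pick the bucket whose aspect ratio is closest (in log space) to the image's."""
    log_ratio = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - log_ratio))

buckets = make_buckets()
print(assign_bucket(1920, 1080, buckets))  # (1344, 768)
```

Images in the same bucket can then be resized to the bucket resolution and batched together, which is what lets a single model see many aspect ratios during training.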
Technical Innovation
We designed a new architecture that supports 10+ control types in text-to-image generation and produces high-resolution images visually comparable to those of Midjourney. Building on the original ControlNet architecture, we propose two new modules to:
- Extend the original ControlNet to support different image conditions using the same network parameters.
- Support multiple condition inputs without increasing the computational load, which is crucial for designers who need detailed image editing. Different conditions share the same condition encoder, so no extra computation or parameters are added.
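To make the parameter-sharing idea concrete, here is a toy sketch in plain Python. It is not the ControlNet++ network: `SharedConditionEncoder`, the per-type embedding vectors, and the summation used for fusion are hypothetical stand-ins (the real model learns its fusion during training). The only point it shows is that one set of encoder weights can serve any number of conditions, so the parameter count does not depend on how many conditions are passed in.

```python
import random

class SharedConditionEncoder:
    """Toy encoder: a single weight vector reused for every control type (hypothetical)."""

    def __init__(self, dim, num_control_types, seed=0):
        rng = random.Random(seed)
        self.weights = [rng.uniform(-1, 1) for _ in range(dim)]
        # Assumption: a small vector per control type tells the encoder which condition it sees.
        self.type_embeddings = [
            [rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_control_types)
        ]

    def num_parameters(self):
        return len(self.weights) + sum(len(e) for e in self.type_embeddings)

    def encode(self, condition, control_type):
        """Encode one condition vector using the shared weights plus its type embedding."""
        embedding = self.type_embeddings[control_type]
        return [w * x + e for w, x, e in zip(self.weights, condition, embedding)]

def fuse(encoder, conditions):
    """Sum the encoded features of (condition, control_type) pairs.

    Summation is a placeholder for the learned fusion described in the text.
    """
    fused = [0.0] * len(encoder.weights)
    for condition, control_type in conditions:
        for i, value in enumerate(encoder.encode(condition, control_type)):
            fused[i] += value
    return fused

encoder = SharedConditionEncoder(dim=4, num_control_types=12)
pose = [1.0, 0.0, 0.5, 0.2]    # stand-in for an openpose feature vector
depth = [0.3, 0.9, 0.1, 0.7]   # stand-in for a depth feature vector
single = fuse(encoder, [(pose, 0)])
multi = fuse(encoder, [(pose, 0), (depth, 1)])
print(encoder.num_parameters())  # same count whether one or two conditions are used
```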
We conducted thorough experiments on SDXL and achieved superior performance in both control ability and aesthetic score. We released the method and the model to the open-source community for everyone to enjoy.
💻 Usage Examples
Advanced Editing Features in ProMax Model
Tile Deblur

Tile Variation

Tile Super Resolution
The following examples show upscaling from roughly 1 megapixel (1M) to roughly 9 megapixels (9M):
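For a sense of scale, here is a small arithmetic sketch that reads "1M" and "9M" as pixel counts, which is an assumption. A 9x increase in pixels corresponds to a 3x increase per side. The helper names and the 1024-pixel, non-overlapping tile size are hypothetical and only illustrate how many tiles such an output would span.

```python
import math

def upscaled_size(width, height, pixel_factor):
    """Return (width, height) after multiplying the total pixel count by pixel_factor."""
    side_scale = math.sqrt(pixel_factor)  # pixels grow with the square of the side length
    return round(width * side_scale), round(height * side_scale)

def tile_grid(width, height, tile=1024):
    """Columns and rows of non-overlapping tiles needed to cover the image (assumed tile size)."""
    return math.ceil(width / tile), math.ceil(height / tile)

new_w, new_h = upscaled_size(1024, 1024, pixel_factor=9)
print(new_w, new_h)             # 3072 3072
print(tile_grid(new_w, new_h))  # (3, 3)
```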
Image Inpainting

Image Outpainting

Visual Examples
Openpose

Depth

Canny

Lineart

AnimeLineart

Mlsd

Scribble

Hed

Pidi (Softedge)

Teed

Segment

Normal

Multi-Control Visual Examples
Openpose + Canny

Openpose + Depth

Openpose + Scribble

Openpose + Normal

Openpose + Segment

📄 License
This project is licensed under the Apache-2.0 license.
💡 Usage Tip
If you find it useful, please give me a star. Thank you very much! The SDXL ProMax version has been released. Enjoy it!
⚠️ Important Note
I'm sorry to say that, because it has been difficult to balance this project's revenue and expenses, the GPU resources have been reallocated to other, more profitable projects. SD3 training is on hold until I can find enough GPU support. I will try my best to find GPUs to continue training. If this causes any inconvenience, I sincerely apologize. I want to thank everyone who likes this project. Your support keeps me going.
Note: We put the ProMax model, with a promax suffix, in the same Hugging Face model repo. Detailed instructions will be added later.