ICEdit-MoE-LoRA Open-Source Image Editing Model - Achieve Advanced Image Editing Effects with a Small Fraction of the Training Data and Parameters


Developed by sanaka87

ICEdit is an instruction-based image editing method built on large-scale diffusion transformers. It achieves state-of-the-art editing results with only 0.5% of the training data and 1% of the parameters required by previous SOTA methods.

Image Generation | Supports Multiple Languages | Open Source License: Apache-2.0 | #Instructional Image Editing #Low-Data Training #Art Style Transfer

Downloads 4,202

Release Date: 4/30/2025

Model Overview

ICEdit enables instruction-based image editing through in-context generation. It supports high-precision multi-round editing and diverse single-round edits, and it is particularly strong at style transfer, attribute modification, and background changes.

Model Features

Efficient Training

Achieves state-of-the-art editing effects using only 0.5% of the training data and 1% of the parameters required by previous SOTA methods.

Multi-Round Editing

Supports high-precision execution of continuous multi-round editing operations.

Diverse Editing

Produces diverse and visually impressive results in single-round editing.

Lightweight

Compact model parameters suitable for deployment in resource-limited environments.

Model Capabilities

Add objects

Modify color attributes

Apply style transfer

Change background

Object removal (lower success rate)

Use Cases

Artistic Creation

Style Transfer

Convert photos into different artistic styles

High-quality style transfer effects

Attribute Editing

Modify attributes such as clothing and hairstyle

Natural and realistic editing effects

Commercial Design

Advertisement Material Editing

Quickly modify product displays in advertisements

Efficient generation of diverse advertisement materials

🚀 In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer

In-Context Edit is a novel approach that achieves state-of-the-art instruction-based image editing with significantly less training data and fewer parameters than prior SOTA methods.

🚀 Quick Start

This repository will contain the official implementation of ICEdit.

✨ Features

Efficient Editing: Achieves state-of-the-art instruction-based editing using just 0.5% of the training data and 1% of the parameters required by prior SOTA methods.
High Success Rates in Multiple Tasks: High success rates for adding objects, modifying color attributes, applying style transfer, and changing backgrounds.

📦 Installation

Conda environment setup

conda create -n icedit python=3.10
conda activate icedit
pip install -r requirements.txt
pip install -U huggingface_hub

Download pretrained weights

If your machine can connect to Hugging Face, you do not need to download the weights manually. Otherwise, download them to a local directory first.
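
For a machine without direct access, one way to fetch both sets of weights ahead of time is `snapshot_download` from the `huggingface_hub` package installed above. This is a minimal sketch, not part of the official scripts: the base-model repo ID is the one listed in the Information Table of this card, while the LoRA repo ID `sanaka87/ICEdit-MoE-LoRA` and the `weights/` directory layout are assumptions. The base model may also require accepting its license on Hugging Face and logging in before the download succeeds.

```python
from pathlib import Path

# Base-model repo ID as listed in this card. The LoRA repo ID is an ASSUMPTION
# pieced together from the developer name and the model name.
REPOS = {
    "flux.1-fill-dev": "black-forest-labs/FLUX.1-Fill-dev",
    "ICEdit-MoE-LoRA": "sanaka87/ICEdit-MoE-LoRA",
}


def download_plan(root="weights"):
    """Return (repo_id, local_dir) pairs without touching the network."""
    return [(repo_id, Path(root) / name) for name, repo_id in REPOS.items()]


def download_all(root="weights"):
    """Download every repo in the plan into its local directory."""
    # Imported lazily so the plan can be inspected without the package.
    from huggingface_hub import snapshot_download

    for repo_id, local_dir in download_plan(root):
        snapshot_download(repo_id=repo_id, local_dir=str(local_dir))


# Preview the plan; call download_all() on a machine with network access.
for repo_id, local_dir in download_plan():
    print(f"{repo_id} -> {local_dir}")
```

The two resulting directories can then be passed to the scripts through --flux-path and --lora-path.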

💻 Usage Examples

Basic Usage

Our model can only edit images with a width of 512 pixels (there is no restriction on the height). If you pass in an image with a width other than 512 pixels, the model will automatically resize it to 512 pixels.

python scripts/inference.py --image assets/girl.png \
                            --instruction "Make her hair dark green and her clothes checked." \
                            --seed 42
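
The automatic resize to a width of 512 pixels can be pictured with the small helper below. It is a sketch under assumptions, not the script's actual code: the function name `target_size` is hypothetical, and both preserving the aspect ratio and snapping the height down to a multiple of 8 are assumptions about how the resize behaves.

```python
def target_size(width, height, target_width=512, multiple=8):
    """Return (new_width, new_height) for an image rescaled to target_width."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Integer arithmetic keeps the aspect ratio without float rounding issues.
    new_height = height * target_width // width
    # ASSUMPTION: the height is snapped down to a multiple of 8, with a floor
    # of one multiple so the result is never zero.
    new_height = max(multiple, (new_height // multiple) * multiple)
    return target_width, new_height


for size in [(512, 768), (1024, 1536), (300, 200)]:
    print(size, "->", target_size(*size))
```

Resizing an image yourself before inference (for example with Pillow's `Image.resize`) makes the final dimensions explicit instead of leaving them to the script.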

Editing a 512×768 image requires 35 GB of GPU memory. To run on a system with 24 GB of GPU memory (for example, an NVIDIA RTX 3090), add the --enable-model-cpu-offload flag.

python scripts/inference.py --image assets/girl.png \
                            --instruction "Make her hair dark green and her clothes checked." \
                            --enable-model-cpu-offload

If you have downloaded the pretrained weights locally, pass their paths with --flux-path and --lora-path during inference, as in:

python scripts/inference.py --image assets/girl.png \
                            --instruction "Make her hair dark green and her clothes checked." \
                            --flux-path /path/to/flux.1-fill-dev \
                            --lora-path /path/to/ICEdit-MoE-LoRA
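
When inference is driven from another program, the same command line can be assembled in Python. The sketch below uses only the flags shown above; the helper name `build_inference_command` and the rule "offload whenever the GPU has less than 35 GB" are assumptions made for illustration, not behavior of the official script.

```python
import shlex

# From the text: a 512x768 edit needs about 35 GB of GPU memory.
FULL_MEMORY_GB = 35


def build_inference_command(image, instruction, seed=None, gpu_memory_gb=None,
                            flux_path=None, lora_path=None):
    """Build the argument list for scripts/inference.py."""
    cmd = ["python", "scripts/inference.py",
           "--image", image, "--instruction", instruction]
    if seed is not None:
        cmd += ["--seed", str(seed)]
    # ASSUMPTION: enable CPU offload whenever the GPU has less than 35 GB.
    if gpu_memory_gb is not None and gpu_memory_gb < FULL_MEMORY_GB:
        cmd.append("--enable-model-cpu-offload")
    # Local weight paths are only passed when they are given.
    if flux_path is not None:
        cmd += ["--flux-path", flux_path]
    if lora_path is not None:
        cmd += ["--lora-path", lora_path]
    return cmd


cmd = build_inference_command(
    "assets/girl.png",
    "Make her hair dark green and her clothes checked.",
    seed=42,
    gpu_memory_gb=24,
)
print(shlex.join(cmd))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from the repository root; keeping it as a list avoids shell-quoting problems with instructions that contain spaces or quotes.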

Advanced Usage

We provide a Gradio demo so you can edit images in a more user-friendly way. Run the following command to start it:

python scripts/gradio_demo.py --port 7860

As with the inference script, add the --enable-model-cpu-offload flag to run the demo on a system with 24 GB of GPU memory. If you have downloaded the pretrained weights locally, pass their paths when launching the demo, as in:

# --flux-path, --lora-path, and --enable-model-cpu-offload are all optional
python scripts/gradio_demo.py --port 7860 \
                              --flux-path /path/to/flux.1-fill-dev \
                              --lora-path /path/to/ICEdit-MoE-LoRA \
                              --enable-model-cpu-offload

Then you can open the link in your browser to edit images.

🎨 Enjoy your editing!

📚 Documentation

⚠️ Important Note

If you encounter a failure case, please try again with a different seed!
Our base model, FLUX, does not inherently support a wide range of styles, so a large portion of our dataset involves style transfer. As a result, the model may sometimes change the artistic style of your image unexpectedly.
Our training dataset mostly targets realistic images. For non-realistic images, such as anime or blurry pictures, the editing success rate drops, and the final image quality may suffer.
While the success rates for adding objects, modifying color attributes, applying style transfer, and changing backgrounds are high, the success rate for object removal is relatively lower due to the low quality of the OmniEdit removal dataset.
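
Retrying with different seeds is easy to script. The sketch below only prints one inference command per seed, reusing the --seed flag from the basic usage example; the helper name `seed_sweep` and the particular seeds are hypothetical.

```python
import shlex


def seed_sweep(image, instruction, seeds):
    """Yield one shell-ready inference command per seed."""
    for seed in seeds:
        yield shlex.join([
            "python", "scripts/inference.py",
            "--image", image,
            "--instruction", instruction,
            "--seed", str(seed),
        ])


commands = list(seed_sweep("assets/girl.png", "Make her hair dark green.", [0, 1, 2]))
for line in commands:
    print(line)
```

Whether successive runs overwrite each other's output depends on the script, so check its output options before running the commands in a loop.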

The current model is the one used in the experiments in the paper, trained with only 4 A800 GPUs (total batch_size = 2 x 2 x 4 = 16). In the future, we will enhance the dataset, scale up training, and release a more powerful model.

To Do List

[x] Inference Code
[ ] Inference-time Scaling with VLM
[x] Pretrained Weights
[ ] More Inference Demos
[x] Gradio demo
[ ] Comfy UI demo
[ ] Training Code

🎆 News

[2025/4/30] 🔥 We release the Huggingface Demo 🤗! Have a try!
[2025/4/30] 🔥 We release the inference code and pretrained weights on Huggingface 🤗!
[2025/4/30] 🔥 We release the paper on arXiv!
[2025/4/29] We release the project page and demo video! Code will be made available next week. Happy Labor Day!

Comparison with Commercial Models

Compared with commercial models such as Gemini and GPT-4o, our method is comparable or even superior in character ID preservation and instruction following. Unlike those models, ours is open-source, with lower cost, faster speed (about 9 seconds per image), and strong performance.

📄 License

This project is licensed under the Apache-2.0 license.

🔧 Technical Details

BibTeX

If this work is helpful for your research, please consider citing the following BibTeX entry.

@misc{zhang2025ICEdit,
      title={In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer}, 
      author={Zechuan Zhang and Ji Xie and Yu Lu and Zongxin Yang and Yi Yang},
      year={2025},
      eprint={2504.20690},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2504.20690}, 
}

Information Table

Property	Details
Model Type	In-Context Edit
Training Data	osunlp/MagicBrush, TIGER-Lab/OmniEdit-Filtered-1.2M
Base Model	black-forest-labs/FLUX.1-Fill-dev
Pipeline Tag	image-to-image
Library Name	diffusers
Tags	art
