CleanDIFT Model Card
Diffusion models can learn powerful world representations, which are valuable for various tasks. However, they usually require noisy input images. CleanDIFT enables diffusion models to work directly with clean input images, extracting noise-free and timestep-independent features.
Quick Start
Diffusion models learn powerful world representations that have proven valuable for tasks like semantic correspondence detection, depth estimation, semantic segmentation, and classification. However, diffusion models require noisy input images, which destroys information and introduces the noise level as a hyperparameter that needs to be tuned for each task.
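To make the cost of noising concrete, here is a minimal sketch of how the standard forward diffusion process, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, trades signal for noise as the timestep grows. The schedule values (1000 steps, betas from 0.00085 to 0.012, spaced linearly in square-root space) are assumptions chosen to resemble a Stable Diffusion-style schedule; they are not taken from this model card, and the function names are illustrative only.

```python
import math

def scaled_linear_betas(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Betas spaced linearly in sqrt-space (assumed Stable Diffusion-style schedule)."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    return [(lo + (hi - lo) * i / (num_steps - 1)) ** 2 for i in range(num_steps)]

def alpha_bars(betas):
    """Cumulative product of (1 - beta_t): the share of signal variance left at step t."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out

def signal_to_noise(alpha_bar):
    """SNR of x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    return alpha_bar / (1.0 - alpha_bar)

ALPHA_BARS = alpha_bars(scaled_linear_betas())

if __name__ == "__main__":
    # Print how much of the clean image survives at a few candidate timesteps.
    for t in (0, 100, 250, 500, 999):
        a = ALPHA_BARS[t]
        print(
            f"t={t:4d}  signal scale={math.sqrt(a):.3f}  "
            f"noise scale={math.sqrt(1.0 - a):.3f}  SNR={signal_to_noise(a):.3f}"
        )
```

Under these assumptions the signal-to-noise ratio falls monotonically with the timestep, so any choice of t for feature extraction is a trade-off that has to be tuned per task.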
We introduce CleanDIFT, a novel method to extract noise-free, timestep-independent features by enabling diffusion models to work directly with clean input images. The approach is efficient, training on a single GPU in just 30 minutes. We publish these models alongside our paper "CleanDIFT: Diffusion Features without Noise".
We provide checkpoints for Stable Diffusion 1.5 and Stable Diffusion 2.1.
Features
- Noise-free Feature Extraction: Enables diffusion models to work with clean input images, extracting noise-free and timestep-independent features.
- Efficiency: Can be trained on a single GPU in just 30 minutes.
- Model Compatibility: Provides checkpoints for Stable Diffusion 1.5 and 2.1, fully compatible with the diffusers library.
Installation
No specific installation steps are provided. The usage example below imports the diffusers and huggingface_hub Python packages.
Usage Examples
Basic Usage
For detailed examples on how to extract features with CleanDIFT and how to use them for downstream tasks, please refer to the notebooks provided here.
Advanced Usage
Our checkpoints are fully compatible with the diffusers library. If you already have a pipeline using SD 1.5 or SD 2.1 from diffusers, you can simply replace the U-Net state dict:
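from diffusers import UNet2DConditionModel
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file

# Load the original Stable Diffusion 2.1 U-Net (architecture and weights)
unet = UNet2DConditionModel.from_pretrained("stabilityai/stable-diffusion-2-1", subfolder="unet")
# Download the CleanDIFT U-Net checkpoint from the Hugging Face Hub
ckpt_pth = hf_hub_download(repo_id="CompVis/cleandift", filename="cleandift_sd21_unet.safetensors")
# Read the safetensors file and swap in the CleanDIFT weights
state_dict = load_file(ckpt_pth)
unet.load_state_dict(state_dict, strict=True)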
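Because the state dict is loaded with strict=True, PyTorch raises a RuntimeError if the checkpoint and the U-Net do not have exactly the same parameter names, for example when a checkpoint is paired with the wrong base model. The helper below is a hypothetical convenience, not part of diffusers or of the CleanDIFT code, that sketches how the two key sets could be compared before loading. The parameter names in the toy example are made up for illustration.

```python
def compare_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, i.e. what a strict load would reject."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # expected by the model, absent from the checkpoint
    unexpected = sorted(checkpoint_keys - model_keys)   # present in the checkpoint, unknown to the model
    return missing, unexpected

# Toy example with made-up parameter names.
model_keys = ["conv_in.weight", "conv_in.bias", "conv_out.weight"]
checkpoint_keys = ["conv_in.weight", "conv_in.bias", "extra.weight"]

missing, unexpected = compare_state_dict_keys(model_keys, checkpoint_keys)
print(missing)     # ['conv_out.weight']
print(unexpected)  # ['extra.weight']
```

With the objects from the snippet above, the call would be compare_state_dict_keys(unet.state_dict().keys(), state_dict.keys()); two empty lists indicate that a strict load should not fail on key names.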
Documentation
The main documentation can be found in the notebooks provided here.
License
This project is licensed under the MIT license.
Citation
@misc{stracke2024cleandiftdiffusionfeaturesnoise,
title={CleanDIFT: Diffusion Features without Noise},
author={Nick Stracke and Stefan Andreas Baumann and Kolja Bauer and Frank Fundel and Björn Ommer},
year={2024},
eprint={2412.03439},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2412.03439},
}