Marigold Normals v1-1 Model Card
This model card describes marigold-normals-v1-1, a model for monocular surface normals estimation. Given a single image, it predicts a per-pixel surface normals map for use in image analysis and computer vision tasks.
Quick Start
- Interactive Demo: Explore the Hugging Face Spaces demo. You can test the model with example images or upload your own.
- Code Integration: Use diffusers to compute results with just a few lines of code.
- In-depth Exploration: Check out the official codebase to understand the model better.
Features
- Single-Image Analysis: Capable of generating an estimated surface normals map from a single input image.
- Resolution Adaptability: Can process images of any resolution, with optimal performance at around 768 pixels on the longer side.
- Flexible Scheduler: Designed for use with the DDIM scheduler and 1 to 50 denoising steps.
- Uncertainty Map: Can produce an uncertainty map when multiple predictions are ensembled with an ensemble size larger than 2.
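The resolution guidance above can be sketched as a small helper that computes a target size before inference. This is a minimal sketch, not part of the model's codebase: the function name `fit_longer_side` is hypothetical, and the choice to leave smaller images untouched follows the card's advice to resize larger inputs only.

```python
def fit_longer_side(width, height, target=768):
    """Return (new_width, new_height) with the longer side reduced to `target`.

    Hypothetical helper: images whose longer side is already at or below
    `target` are returned unchanged, since the card only recommends
    resizing larger inputs.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    longer = max(width, height)
    if longer <= target:
        return width, height

    # Scale both sides by the same factor to preserve the aspect ratio.
    scale = target / longer
    new_width = max(1, round(width * scale))
    new_height = max(1, round(height * scale))
    return new_width, new_height
```

The resulting size can be passed to any image library's resize call, for example `image.resize(fit_longer_side(*image.size))` with Pillow.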
Usage Examples
Basic Usage
You can try the model through the Hugging Face Spaces demo, either with the provided example images or with your own uploads.
Advanced Usage
For more advanced usage, integrate the model with diffusers.
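The original code sample did not survive in this card, so the following is a minimal sketch of what a diffusers integration might look like. It relies on `MarigoldNormalsPipeline`, `diffusers.utils.load_image`, and `image_processor.visualize_normals` from the diffusers library. The Hub repository id (in particular the `prs-eth/` prefix), the function name `estimate_normals`, the default output path, and the default device are assumptions made for illustration.

```python
# Assumed Hub repository id; the card itself only names "marigold-normals-v1-1".
MODEL_ID = "prs-eth/marigold-normals-v1-1"


def estimate_normals(image_source, output_path="normals.png",
                     num_inference_steps=None, ensemble_size=1, device="cuda"):
    """Predict surface normals for one image and save a color visualization.

    `image_source` is a local path or URL. Leaving `num_inference_steps`
    as None lets the pipeline use its own default step count.
    """
    # Heavy imports are kept local so that importing this module stays cheap.
    import torch
    import diffusers

    # Half precision is only a sensible default on a GPU.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = diffusers.MarigoldNormalsPipeline.from_pretrained(
        MODEL_ID, torch_dtype=dtype
    ).to(device)

    image = diffusers.utils.load_image(image_source)
    result = pipe(
        image,
        num_inference_steps=num_inference_steps,
        ensemble_size=ensemble_size,
    )

    # visualize_normals returns a list of PIL images, one per input image.
    vis = pipe.image_processor.visualize_normals(result.prediction)
    vis[0].save(output_path)
    return result.prediction
```

Calling `estimate_normals("input.jpg")` would download the weights on first use, run inference, and write the visualization to `normals.png`. Raising `ensemble_size` trades runtime for more stable predictions.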
Documentation
Model Details
| Property | Details |
|---|---|
| Developed by | Bingxin Ke, Kevin Qu, Tianfu Wang, Nando Metzger, Shengyu Huang, Bo Li, Anton Obukhov, Konrad Schindler |
| Model Type | Generative latent diffusion-based normals estimation from a single image. |
| Language | English |
| License | CreativeML Open RAIL++-M License |
| Model Description | Generates an estimated surface normals map of an input image. Resolution: the model inherits the base diffusion model's effective resolution of roughly 768 pixels; resize larger input images so that the longer side is 768 pixels for optimal predictions. Steps and scheduler: designed for use with the DDIM scheduler and 1 to 50 denoising steps. Outputs: (1) a surface normals map, whose predicted values are 3-dimensional unit vectors in the screen-space camera frame; (2) an uncertainty map, produced only when multiple predictions are ensembled with an ensemble size larger than 2. |
| Resources for more information | Project Website, Paper, Code |
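To illustrate how ensembling several unit-vector predictions can yield both a combined normal and an uncertainty value, here is a per-pixel sketch in plain Python. It is an illustrative assumption, not the model's or the library's actual ensembling code: the function names are hypothetical, the combined normal is the renormalized mean, and the uncertainty is defined here as the mean angular deviation from that mean, scaled to [0, 1]. The minimum of three predictions mirrors the card's "ensemble size larger than 2" condition.

```python
import math


def normalize(vector):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return tuple(c / norm for c in vector)


def ensemble_pixel(normals):
    """Combine several predicted normals for one pixel.

    Returns (mean_normal, uncertainty), where uncertainty is the mean
    angle between each prediction and the combined normal, divided by pi.
    Raises ValueError if the predictions cancel out to a zero mean.
    """
    if len(normals) < 3:
        raise ValueError("an uncertainty estimate needs at least 3 predictions")

    units = [normalize(n) for n in normals]
    # Average each axis across predictions, then project back to unit length.
    mean = normalize(tuple(sum(axis) / len(units) for axis in zip(*units)))

    angles = []
    for unit in units:
        cosine = sum(a * b for a, b in zip(unit, mean))
        # Clamp to guard against floating-point drift outside [-1, 1].
        angles.append(math.acos(max(-1.0, min(1.0, cosine))))

    uncertainty = sum(angles) / len(angles) / math.pi
    return mean, uncertainty
```

Identical predictions give an uncertainty of zero, and the value grows as the predictions diverge in direction.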
Cite as
@misc{ke2025marigold,
title={Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis},
author={Bingxin Ke and Kevin Qu and Tianfu Wang and Nando Metzger and Shengyu Huang and Bo Li and Anton Obukhov and Konrad Schindler},
year={2025},
eprint={2505.09358},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
@InProceedings{ke2023repurposing,
title={Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation},
author={Bingxin Ke and Anton Obukhov and Shengyu Huang and Nando Metzger and Rodrigo Caye Daudt and Konrad Schindler},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2024}
}
License
This model is released under the CreativeML Open RAIL++-M License.