Marigold-Normals-v1-1 Open-Source Model: Accurately Predict Surface Normal Maps from a Single Image

Marigold Normals v1-1

Developed by prs-eth

A monocular normal estimation model, fine-tuned from Stable Diffusion, that predicts a surface normal map from a single image.

3D Vision | English | #Monocular Normal Estimation #Diffusion Model Fine-tuning #Zero-shot Learning

Downloads 1,850

Release Time: 10/21/2024

Model Overview

This model is used for monocular normal estimation from a single image, fine-tuned from the stable-diffusion-2 model, and can generate estimated surface normal maps and uncertainty maps for input images.

Model Features

High-Resolution Processing

The model inherits the base diffusion model's effective resolution of approximately 768 pixels, making it suitable for processing high-quality images.
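The model card's guidance is to resize larger inputs so that the longer side is 768 pixels. The following is a minimal sketch of that size calculation only; the function name `fit_longer_side` is hypothetical, and leaving smaller images untouched is an assumption based on the card mentioning only larger inputs.

```python
from typing import Tuple


def fit_longer_side(width: int, height: int, target: int = 768) -> Tuple[int, int]:
    """Return (width, height) scaled down so the longer side equals `target`.

    Images whose longer side is already at or below `target` are returned
    unchanged (an assumption; the card only mentions resizing larger inputs).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longer = max(width, height)
    if longer <= target:
        return width, height
    scale = target / longer
    # Round to the nearest pixel and never collapse a side to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The resulting size can be passed to any image library's resize call before inference; the aspect ratio is preserved up to rounding.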

Zero-shot Learning

Handles a wide range of natural scene images without scene-specific or dataset-specific training.

Uncertainty Estimation

Can generate an uncertainty map when multiple predictions are ensembled (ensemble size larger than 2), which helps assess how reliable the predicted normals are.

Flexible Inference

Supports 1 to 50 denoising steps with the DDIM scheduler, so the step count can be chosen to trade speed against accuracy.

Model Capabilities

Image normal estimation

Computer vision analysis

Natural scene processing

Uncertainty quantification

Use Cases

Computer Vision

3D Scene Reconstruction

Estimating surface normals from a single image to assist in 3D scene reconstruction.

Generated normal maps can be used in subsequent 3D modeling workflows.

Augmented Reality

Providing surface normal information for AR applications.

Improves lighting and shadow effects of virtual objects in real-world scenes.
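As an illustration of how a per-pixel normal can feed a lighting computation, here is a minimal sketch of Lambertian (diffuse) shading. It is a generic textbook technique, not something the model card prescribes; `lambert_intensity` is a hypothetical helper, and it assumes the normal and the light direction are expressed in the same coordinate frame.

```python
import math
from typing import List, Sequence


def _unit(v: Sequence[float]) -> List[float]:
    """Normalize a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("zero-length vector")
    return [c / length for c in v]


def lambert_intensity(normal: Sequence[float], light_dir: Sequence[float]) -> float:
    """Diffuse intensity in [0, 1] for a surface normal and a direction
    pointing from the surface toward the light."""
    n, l = _unit(normal), _unit(light_dir)
    # Surfaces facing away from the light receive no diffuse light.
    return max(0.0, sum(a * b for a, b in zip(n, l)))
```

Applying this per pixel to a predicted normal map gives a simple relighting pass; a real AR renderer would add albedo, shadows, and specular terms.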

Industrial Inspection

Surface Defect Detection

Detecting surface anomalies through normal maps.

Enhances the accuracy of automated inspections.
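One simple way to look for anomalies in a normal map is to flag pixels whose normal differs sharply from a neighbor's. The sketch below is an illustrative heuristic under that assumption, not a method from the model card; `flag_discontinuities` and the 30-degree default threshold are hypothetical, and the check will also fire on legitimate object edges.

```python
import math


def angle_deg(a, b):
    """Angle in degrees between two unit vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(dot))


def flag_discontinuities(normals, threshold_deg=30.0):
    """Return a boolean mask marking pixels whose normal deviates from a
    right or bottom neighbor by more than `threshold_deg`.

    `normals` is a list of rows, each row a list of unit 3-vectors.
    """
    h, w = len(normals), len(normals[0])
    mask = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):
                ny, nx = y + dy, x + dx
                if ny < h and nx < w:
                    if angle_deg(normals[y][x], normals[ny][nx]) > threshold_deg:
                        # Mark both pixels on either side of the discontinuity.
                        mask[y][x] = True
                        mask[ny][nx] = True
    return mask
```

On a smooth surface the mask stays empty; an inspection pipeline would typically combine such a mask with domain knowledge about where edges are expected.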

🚀 Marigold Normals v1-1 Model Card

This model card is for the marigold-normals-v1-1 model, which performs monocular normals estimation from a single image. Given an input image, it generates an estimated surface normals map for use in image analysis and computer vision tasks.

🚀 Quick Start

Interactive Demo: Explore the Hugging Face Spaces demo. You can test the model with example images or upload your own.
Code Integration: Use diffusers to compute results with just a few lines of code.
In-depth Exploration: Check out the official codebase to understand the model better.

✨ Features

Single-Image Analysis: Capable of generating an estimated surface normals map from a single input image.
Resolution Adaptability: Can process images of any resolution, with optimal performance at around 768 pixels on the longer side.
Flexible Scheduler: Designed for use with the DDIM scheduler and 1 to 50 denoising steps.
Uncertainty Map: Can produce an uncertainty map when multiple predictions are ensembled with an ensemble size larger than 2.
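To make the ensembling idea concrete, the sketch below combines several unit-normal predictions for one pixel and derives a disagreement score. This is one plausible formulation (normalized mean direction, with uncertainty as the mean angular deviation scaled to [0, 1]) and is not claimed to be the pipeline's exact computation; `ensemble_pixel` is a hypothetical name, and the minimum of three predictions mirrors the card's "larger than 2" condition.

```python
import math


def ensemble_pixel(predictions):
    """Combine unit-normal predictions for a single pixel.

    Returns (mean_normal, uncertainty), where uncertainty is the mean angle
    between each prediction and the mean normal, divided by pi.
    """
    if len(predictions) < 3:
        raise ValueError("need an ensemble size larger than 2")
    count = len(predictions)
    mean = [sum(p[i] for p in predictions) / count for i in range(3)]
    length = math.sqrt(sum(c * c for c in mean))
    if length == 0.0:
        raise ValueError("predictions cancel out; mean direction undefined")
    mean = [c / length for c in mean]  # re-normalize to a unit vector
    angles = []
    for p in predictions:
        dot = sum(a * b for a, b in zip(p, mean))
        angles.append(math.acos(max(-1.0, min(1.0, dot))))
    return mean, sum(angles) / count / math.pi
```

Identical predictions give an uncertainty of 0, while predictions that point in very different directions push the score toward 1.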

💻 Usage Examples

Basic Usage

You can use the model through the Hugging Face Spaces demo:

# Visit the following link to use the interactive demo
# https://huggingface.co/spaces/prs-eth/marigold-normals

Advanced Usage

For more advanced usage, integrate the model with diffusers:

# Refer to the official diffusers documentation for detailed code examples
# https://huggingface.co/docs/diffusers/using-diffusers/marigold_usage
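As a rough orientation, a diffusers integration might look like the sketch below. The repository id `prs-eth/marigold-normals-v1-1` is inferred from the model name and developer and should be verified; `build_call_kwargs` and `predict_normals` are hypothetical helpers, and the default of 4 steps is an arbitrary choice within the documented 1 to 50 range. The linked diffusers documentation remains the authoritative reference.

```python
def build_call_kwargs(num_inference_steps: int = 4, ensemble_size: int = 1) -> dict:
    """Validate settings against the ranges stated in the model card."""
    if not 1 <= num_inference_steps <= 50:
        raise ValueError("the card specifies 1 to 50 denoising steps")
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    kwargs = {
        "num_inference_steps": num_inference_steps,
        "ensemble_size": ensemble_size,
    }
    # The card states uncertainty is produced only for ensemble size > 2.
    if ensemble_size > 2:
        kwargs["output_uncertainty"] = True
    return kwargs


def predict_normals(image_path_or_url: str, **options):
    """Run the pipeline and return a PIL visualization of the normals.

    Heavy imports live inside the function so the helper above can be
    used without torch or diffusers installed.
    """
    import torch
    import diffusers

    use_cuda = torch.cuda.is_available()
    pipe = diffusers.MarigoldNormalsPipeline.from_pretrained(
        "prs-eth/marigold-normals-v1-1",  # assumed repository id
        torch_dtype=torch.float16 if use_cuda else torch.float32,
    ).to("cuda" if use_cuda else "cpu")

    image = diffusers.utils.load_image(image_path_or_url)
    result = pipe(image, **build_call_kwargs(**options))
    # visualize_normals returns a list of PIL images, one per input.
    return pipe.image_processor.visualize_normals(result.prediction)[0]
```

A call such as `predict_normals("input.jpg", num_inference_steps=10, ensemble_size=5).save("normals.png")` would then write a color-coded normal map, where `input.jpg` is a placeholder path.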

📚 Documentation

Model Details

Property	Details
Developed by	Bingxin Ke, Kevin Qu, Tianfu Wang, Nando Metzger, Shengyu Huang, Bo Li, Anton Obukhov, Konrad Schindler
Model Type	Generative latent diffusion-based normals estimation from a single image.
Language	English
License	CreativeML Open RAIL++-M License
Model Description	This model can be used to generate an estimated surface normals map of an input image. Resolution: the model inherits the base diffusion model's effective resolution of roughly 768 pixels; resize larger input images so that the longer side is 768 pixels for optimal predictions. Steps and scheduler: designed for use with the DDIM scheduler and 1 to 50 denoising steps. Outputs: (1) Surface normals map: predicted values are 3-dimensional unit vectors in the screen space camera. (2) Uncertainty map: produced only when multiple predictions are ensembled with an ensemble size larger than 2.
Resources for more information	Project Website, Paper, Code
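Because the predicted values are 3-dimensional unit vectors, a common way to inspect them is to map each component from [-1, 1] to an 8-bit color channel. The sketch below shows that widely used encoding; `normal_to_rgb` is a hypothetical helper, and the axis and sign conventions of any particular visualizer may differ, so treat the colors as illustrative.

```python
def normal_to_rgb(normal):
    """Map a unit normal (x, y, z) with components in [-1, 1] to an
    (R, G, B) triple in [0, 255] using c -> (c + 1) / 2 * 255."""
    channels = []
    for c in normal:
        c = max(-1.0, min(1.0, float(c)))  # clamp against rounding drift
        # Adding 0.5 before truncation rounds to the nearest integer.
        channels.append(int((c + 1.0) * 0.5 * 255.0 + 0.5))
    return tuple(channels)
```

Under this encoding a normal pointing straight along the positive z axis maps to the light blue typical of normal-map images.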

Cite as

@misc{ke2025marigold,
  title={Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis},
  author={Bingxin Ke and Kevin Qu and Tianfu Wang and Nando Metzger and Shengyu Huang and Bo Li and Anton Obukhov and Konrad Schindler},
  year={2025},
  eprint={2505.09358},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@InProceedings{ke2023repurposing,
  title={Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation},
  author={Bingxin Ke and Anton Obukhov and Shengyu Huang and Nando Metzger and Rodrigo Caye Daudt and Konrad Schindler},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}

📄 License

This model is released under the CreativeML Open RAIL++-M License.
