
Marigold Depth HR v1-1

Developed by prs-eth

A monocular depth estimation model based on the latent diffusion model, supporting high-resolution image processing

3D Vision | English | Open Source License: Apache-2.0 | #High-resolution depth estimation #Monocular image processing #Diffusion model fine-tuning

Downloads 257

Release Time: 1/15/2025

Model Overview

This model performs monocular depth estimation from a single image. It is fine-tuned from marigold-depth-v1-0 and supports high-resolution inputs up to 4MP.

Model Features

High-resolution support

Designed to support large-resolution image processing up to 4MP
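
As a rough illustration of the 4MP limit, the sketch below computes a target size that keeps an image within a pixel budget while roughly preserving its aspect ratio. It is not part of the model's code. The function name is hypothetical, "4MP" is assumed to mean 4,000,000 pixels, and rounding each side down to a multiple of 8 is an assumption motivated by the 8x downsampling of Stable Diffusion style latent autoencoders.

```python
import math

MAX_PIXELS = 4_000_000  # assumption: "4MP" read as 4 million pixels
MULTIPLE = 8            # assumption: keep sides divisible by 8 for the latent autoencoder


def fit_to_pixel_budget(width, height, max_pixels=MAX_PIXELS, multiple=MULTIPLE):
    """Return (new_width, new_height) that stays within the pixel budget."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Shrink only when the image exceeds the budget; never upscale.
    scale = min(1.0, math.sqrt(max_pixels / (width * height)))
    # Round each side down to a multiple so the budget is never exceeded.
    new_w = max(multiple, int(width * scale) // multiple * multiple)
    new_h = max(multiple, int(height * scale) // multiple * multiple)
    return new_w, new_h


# A 24MP photo is shrunk; a small image is left at its original size.
large = fit_to_pixel_budget(6000, 4000)
small = fit_to_pixel_budget(1024, 768)
print(large, small)
```

Rounding down rather than to the nearest multiple trades a few pixels of resolution for a guarantee that the result never exceeds the budget.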

Affine-invariant depth map

The predicted values are between 0 and 1, interpolated between the near and far planes selected by the model
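
Because the output is affine-invariant, it matches true depth only up to an unknown scale and shift. A common way to compare or convert such a prediction is a least-squares fit against a few reference depths. The sketch below shows that fit in plain Python; the function name and the sample values are hypothetical, and this is a generic alignment technique rather than a procedure described by this model card.

```python
def fit_scale_shift(pred, ref):
    """Least-squares scale s and shift t so that s * pred + t approximates ref."""
    if len(pred) != len(ref) or len(pred) < 2:
        raise ValueError("need two equally sized sequences with at least 2 samples")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_r = sum(ref) / n
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0:
        raise ValueError("prediction is constant; scale is undetermined")
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    scale = cov / var_p
    shift = mean_r - scale * mean_p
    return scale, shift


pred = [0.0, 0.25, 0.5, 1.0]  # affine-invariant predictions in [0, 1]
ref = [2.0, 3.0, 4.0, 6.0]    # hypothetical reference depths in metres
scale, shift = fit_scale_shift(pred, ref)
aligned = [scale * p + shift for p in pred]
print(scale, shift, aligned)
```

In practice the reference values would come from sparse measurements such as a few LiDAR or structure-from-motion points, and only pixels with valid references would enter the fit.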

Efficient inference

Designed to work with the DDIM scheduler, with the number of denoising steps between 10 and 50
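
To give a sense of what 10 to 50 denoising steps means, the sketch below lists which training timesteps a DDIM-style sampler would visit for a given step count. It assumes 1000 training timesteps and "trailing" timestep spacing; both are assumptions typical of Stable Diffusion style schedulers and are not stated in this model card. The function name is hypothetical.

```python
def trailing_timesteps(num_inference_steps, num_train_timesteps=1000):
    """Evenly spaced timesteps, from the noisiest down, using trailing spacing."""
    if not 1 <= num_inference_steps <= num_train_timesteps:
        raise ValueError("num_inference_steps must be in [1, num_train_timesteps]")
    step = num_train_timesteps / num_inference_steps
    return [round(num_train_timesteps - i * step) - 1
            for i in range(num_inference_steps)]


print(trailing_timesteps(10))
# [999, 899, 799, 699, 599, 499, 399, 299, 199, 99]
print(len(trailing_timesteps(50)))
# 50
```

Fewer steps mean larger jumps between timesteps and faster inference; more steps follow the denoising trajectory more finely at a proportionally higher cost.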

Model Capabilities

Monocular depth estimation

High-resolution image processing

Depth map generation

Use Cases

Computer vision

3D scene reconstruction

Estimate depth information from a single image for 3D scene reconstruction

Generate an affine-invariant depth map
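
For 3D scene reconstruction, a depth map is typically back-projected into a point cloud with a pinhole camera model. The sketch below assumes the depth has already been converted to metric units (the raw affine-invariant output is not metric) and that the camera intrinsics are known. The function name, the intrinsics, and the tiny depth grid are all hypothetical.

```python
def backproject(depth, fx, fy, cx, cy):
    """Turn a row-major depth grid into a list of (X, Y, Z) camera-space points."""
    points = []
    for v, row in enumerate(depth):      # v: pixel row
        for u, z in enumerate(row):      # u: pixel column
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


depth_m = [[2.0, 2.0],
           [4.0, 4.0]]  # hypothetical 2x2 metric depth map
points = backproject(depth_m, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(points)
# [(-1.0, -1.0, 2.0), (1.0, -1.0, 2.0), (-2.0, 2.0, 4.0), (2.0, 2.0, 4.0)]
```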

Augmented reality

Provide fast depth estimation for AR applications

🚀 High-Resolution Marigold Depth v1-0 Model Card

This is the model card for the marigold-depth-hr-v1-0 model, which performs high-resolution monocular depth estimation from a single image. The model is fine-tuned from the marigold-depth-v1-0 model, as described in our papers:

CVPR'2024 paper titled "Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation"
Journal extension titled "Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis"

📚 Documentation

Model Details

Property	Details
Developed by	Bingxin Ke, Kevin Qu, Tianfu Wang, Nando Metzger, Shengyu Huang, Bo Li, Anton Obukhov, Konrad Schindler.
Model type	Generative latent diffusion-based affine-invariant monocular depth estimation from a single image.
Language	English.
License	Apache License, Version 2.0.
Model Description	Generates an estimated depth map of an input image. Resolution: designed to support large resolutions up to 4MP. Steps and scheduler: designed for use with the DDIM scheduler and between 10 and 50 denoising steps. Output: an affine-invariant depth map whose predicted values lie between 0 and 1, interpolating between the near and far planes of the model's choice.
Resources for more information	Project Website, Paper, Code.
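
The settings in the table can be tied together in a short inference sketch. It assumes the checkpoint can be loaded through the `MarigoldDepthPipeline` class of the `diffusers` library and that it is published under the repository ID `prs-eth/marigold-depth-hr-v1-0`; this card confirms neither, so check both before use. The helper names are hypothetical.

```python
def clamp_steps(num_steps, low=10, high=50):
    """Keep the step count inside the 10 to 50 range given in the table."""
    return max(low, min(high, int(num_steps)))


def estimate_depth(image_path, repo_id="prs-eth/marigold-depth-hr-v1-0", num_steps=20):
    """Run depth estimation on one image and return the pipeline's prediction."""
    # Heavy dependencies are imported lazily so clamp_steps works without them.
    import torch
    from diffusers import MarigoldDepthPipeline
    from diffusers.utils import load_image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    # repo_id is an assumed repository ID, not one confirmed by this card.
    pipe = MarigoldDepthPipeline.from_pretrained(repo_id, torch_dtype=dtype).to(device)
    image = load_image(image_path)
    result = pipe(image, num_inference_steps=clamp_steps(num_steps))
    return result.prediction  # affine-invariant depth, values in [0, 1]


print(clamp_steps(4), clamp_steps(25), clamp_steps(80))
# 10 25 50
```

Calling `estimate_depth("photo.jpg")` downloads the weights on first use, so it needs network access and, for 4MP inputs, a GPU with ample memory.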

Cite as

@misc{ke2025marigold,
  title={Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis},
  author={Bingxin Ke and Kevin Qu and Tianfu Wang and Nando Metzger and Shengyu Huang and Bo Li and Anton Obukhov and Konrad Schindler},
  year={2025},
  eprint={2505.09358},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

@InProceedings{ke2023repurposing,
  title={Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation},
  author={Bingxin Ke and Anton Obukhov and Shengyu Huang and Nando Metzger and Rodrigo Caye Daudt and Konrad Schindler},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2024}
}

📄 License

This model is released under the Apache License, Version 2.0.
