🚀 DepthMaster: Taming Diffusion Models for Monocular Depth Estimation
DepthMaster is a tamed single-step diffusion model that customizes generative features in diffusion models for monocular depth estimation. It shows excellent zero-shot performance and detail preservation ability.
🚀 Quick Start
This repository is the official implementation of the paper "DepthMaster: Taming Diffusion Models for Monocular Depth Estimation".
Ziyang Song*, Zerong Wang*, Bo Li, Hao Zhang, Ruijie Zhu, Li Liu, Peng-Tao Jiang†, Tianzhu Zhang†
*Equal Contribution, †Corresponding Author
University of Science and Technology of China, vivo Mobile Communication Co., Ltd.
arXiv 2025

We present DepthMaster, a tamed single-step diffusion model that customizes generative features in diffusion models to suit the discriminative depth estimation task. We introduce a Feature Alignment module to mitigate overfitting to texture and a Fourier Enhancement module to refine fine-grained details. DepthMaster exhibits state-of-the-art zero-shot performance and superior detail preservation ability, surpassing other diffusion-based methods across various datasets.
✨ Features
- Customized Feature Adaptation: DepthMaster customizes generative features in diffusion models to fit the discriminative depth estimation task.
- Overfitting Mitigation: The Feature Alignment module helps mitigate overfitting to texture details.
- Detail Refinement: The Fourier Enhancement module refines fine-grained details.
- Superior Performance: It shows state-of-the-art zero-shot performance and excellent detail preservation ability across various datasets.
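
The Fourier Enhancement module is described here only at a high level. As a rough intuition for what working in the frequency domain means, the minimal sketch below splits a 1-D depth profile into low- and high-frequency parts with a naive DFT and then re-weights the high-frequency part. It is not DepthMaster's module: the function names `split_frequencies` and `enhance_details`, the `cutoff` and `gain` parameters, and the toy 1-D signal are all assumptions made for illustration.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT matching dft() above."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def split_frequencies(signal, cutoff):
    """Split a real signal into low- and high-frequency components.

    Hypothetical helper: bins whose distance from the DC bin is <= cutoff
    are treated as "low"; everything else is treated as "high".
    """
    n = len(signal)
    spectrum = dft(signal)
    low_spec, high_spec = [], []
    for k, coeff in enumerate(spectrum):
        freq = min(k, n - k)  # symmetric mask keeps the outputs real
        if freq <= cutoff:
            low_spec.append(coeff)
            high_spec.append(0)
        else:
            low_spec.append(0)
            high_spec.append(coeff)
    low = [v.real for v in idft(low_spec)]
    high = [v.real for v in idft(high_spec)]
    return low, high


def enhance_details(signal, cutoff, gain):
    """Re-weight the high-frequency part by `gain` (an assumed parameter)."""
    low, high = split_frequencies(signal, cutoff)
    return [l + gain * h for l, h in zip(low, high)]


# Toy depth profile along one scanline: a smooth ramp plus a fine ripple.
profile = [0.1 * t + 0.05 * math.sin(2 * math.pi * 6 * t / 16) for t in range(16)]
sharpened = enhance_details(profile, cutoff=2, gain=1.5)
```

With `gain=1.0` the function returns the input unchanged (up to floating-point error), because the two parts sum back to the original signal. A gain above 1 amplifies fine structure, and a gain below 1 smooths it.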
📚 Documentation
Model Information
| Property | Details |
|---|---|
| Base Model | stabilityai/stable-diffusion-2 |
| Pipeline Tag | depth-estimation |
🎓 Citation
Please cite our paper:
@article{song2025depthmaster,
  title={DepthMaster: Taming Diffusion Models for Monocular Depth Estimation},
  author={Song, Ziyang and Wang, Zerong and Li, Bo and Zhang, Hao and Zhu, Ruijie and Liu, Li and Jiang, Peng-Tao and Zhang, Tianzhu},
  journal={arXiv preprint arXiv:2501.02576},
  year={2025}
}
Acknowledgements
The code is based on Marigold.
📄 License
This work is licensed under the Apache License, Version 2.0 (as defined in the LICENSE).
By downloading and using the code and model you agree to the terms in the LICENSE.
