Prompt Depth Anything open-source model: high-resolution, accurate metric depth estimation at up to 4K resolution


Prompt Depth Anything Vitl Hf

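Developed by depth-anything

Prompt Depth Anything is a high-resolution, accurate metric depth estimation method. It uses prompting to unlock the potential of depth foundation models and can produce accurate metric depth at up to 4K resolution.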

3D Vision

Transformers

License: Apache-2.0 #high-resolution-depth-estimation #LiDAR-prompt #4K-metric-depth

Downloads: 241

Released: 12/23/2024

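Model Overview

The model uses iPhone LiDAR as a prompt to guide the generation of high-resolution, accurate metric depth. It is suited to downstream applications such as 3D reconstruction and generalized robotic grasping.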

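Model Features

High-resolution depth estimation

Produces accurate metric depth at up to 4K resolution.

Prompt guidance

Uses iPhone LiDAR as a prompt to guide the model toward accurate depth.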
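
As a rough illustration of what such a LiDAR prompt looks like numerically, the sketch below converts a raw depth map into meters and marks missing readings. It assumes the prompt is stored as 16-bit integers in millimeters with 0 meaning "no reading", a common convention for depth images that this page does not confirm. `decode_prompt_depth` is a hypothetical helper, not part of any library, and the input values are made up.

```python
import numpy as np


def decode_prompt_depth(raw_mm: np.ndarray, scale_to_meter: float = 0.001) -> np.ndarray:
    """Convert integer depth (assumed millimeters) to float32 meters; 0 becomes NaN."""
    depth_m = raw_mm.astype(np.float32) * scale_to_meter
    depth_m[raw_mm == 0] = np.nan  # assumption: 0 encodes a missing reading
    return depth_m


# Tiny synthetic 2x3 "LiDAR" map in millimeters (made-up values, not real data).
raw = np.array([[1500, 0, 2000], [750, 1250, 0]], dtype=np.uint16)
depth = decode_prompt_depth(raw)

# Fraction of pixels that carry a usable reading.
valid_ratio = float(np.isfinite(depth).mean())
print(depth)
print(f"valid pixels: {valid_ratio:.2f}")
```

Checking the share of valid pixels before inference can be a quick sanity test that the prompt was loaded with the expected units and encoding.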

Scalable data pipeline

Introduces a scalable data pipeline for training the method.

Model Capabilities

High-resolution depth estimation

Accurate metric depth

3D reconstruction

Robotic grasping

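Use Cases

3D Reconstruction

3D scene reconstruction

Uses high-resolution depth estimation to reconstruct 3D scenes.

Improves reconstruction accuracy and resolution.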
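
One common first step toward 3D reconstruction from a metric depth map is back-projecting each pixel into a 3D point with a pinhole camera model. The sketch below shows that step under stated assumptions: the intrinsics (`fx`, `fy`, `cx`, `cy`) are made-up values, the depth map is synthetic, and `backproject_depth` is a hypothetical helper rather than anything provided by the model or by Transformers.

```python
import numpy as np


def backproject_depth(depth_m: np.ndarray, fx: float, fy: float,
                      cx: float, cy: float) -> np.ndarray:
    """Back-project an (H, W) metric depth map to an (N, 3) point cloud in the camera frame."""
    h, w = depth_m.shape
    # u is the column (x) index and v is the row (y) index of each pixel.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    # Keep only pixels with a finite, positive depth.
    valid = np.isfinite(points[:, 2]) & (points[:, 2] > 0)
    return points[valid]


# Synthetic example: a flat wall 2 m from the camera, seen in a 4x6 image.
depth = np.full((4, 6), 2.0, dtype=np.float32)
points = backproject_depth(depth, fx=5.0, fy=5.0, cx=2.5, cy=1.5)
print(points.shape)
```

With real data, the intrinsics would come from the capture device and must be scaled to match the resolution of the depth map being back-projected.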

Robotics

Generalized robotic grasping

Uses accurate depth estimation to improve robotic grasping.

Improves grasp success rate and precision.

🚀 Prompt-Depth-Anything-Vitl

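Prompt Depth Anything is a high-resolution, accurate metric depth estimation method. It targets the limited resolution and accuracy of existing depth estimation and supports downstream applications such as 3D reconstruction and robotic grasping.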

🚀 Quick Start

This model is compatible with Hugging Face Transformers; see the documentation for detailed usage.

💻 Usage Example

Basic usage

import requests
import torch  # needed for torch.no_grad() below
from PIL import Image
from transformers import PromptDepthAnythingForDepthEstimation, PromptDepthAnythingImageProcessor

# Load the RGB input image.
url = "https://github.com/DepthAnything/PromptDA/blob/main/assets/example_images/image.jpg?raw=true"
image = Image.open(requests.get(url, stream=True).raw)

# Load the image processor and the model weights.
image_processor = PromptDepthAnythingImageProcessor.from_pretrained("depth-anything/prompt-depth-anything-vitl-hf")
model = PromptDepthAnythingForDepthEstimation.from_pretrained("depth-anything/prompt-depth-anything-vitl-hf")

# Load the low-resolution LiDAR depth that serves as the prompt.
prompt_depth_url = "https://github.com/DepthAnything/PromptDA/blob/main/assets/example_images/arkit_depth.png?raw=true"
prompt_depth = Image.open(requests.get(prompt_depth_url, stream=True).raw)

# Prepare the image and the prompt for the model.
inputs = image_processor(images=image, return_tensors="pt", prompt_depth=prompt_depth)

# Run inference without tracking gradients.
with torch.no_grad():
    outputs = model(**inputs)

# Resize the prediction back to the original image size.
post_processed_output = image_processor.post_process_depth_estimation(
    outputs,
    target_sizes=[(image.height, image.width)],
)

predicted_depth = post_processed_output[0]["predicted_depth"]
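
One possible follow-up is to store the prediction as a 16-bit image in millimeters. The sketch below is a minimal version of that conversion, assuming the predicted depth is in meters (as the metric-depth description suggests) and has already been turned into a NumPy array, for example with `predicted_depth.detach().cpu().numpy()`. `depth_to_uint16_mm` is a hypothetical helper, and a small synthetic array stands in for the real prediction.

```python
import numpy as np


def depth_to_uint16_mm(depth_m: np.ndarray) -> np.ndarray:
    """Convert depth in meters to uint16 millimeters, clipping to the representable range."""
    # Replace NaN/inf with 0 so that invalid pixels are stored as "no reading".
    depth_mm = np.nan_to_num(depth_m, nan=0.0, posinf=0.0, neginf=0.0) * 1000.0
    return np.clip(np.rint(depth_mm), 0, 65535).astype(np.uint16)


# Synthetic stand-in for a predicted depth map (made-up values, in meters).
pred = np.array([[0.5, 1.234], [70.0, np.nan]], dtype=np.float32)
depth_mm = depth_to_uint16_mm(pred)
print(depth_mm)
```

Note that uint16 millimeters can only represent depths up to about 65.5 m, so larger values are clipped here. The resulting array can be passed to Pillow's `Image.fromarray` to write a 16-bit grayscale PNG.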

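✨ Key Features

Inspired by the success of prompting in vision-language models (VLMs) and large language models (LLMs), the method uses prompting to unlock the potential of depth foundation models.
Uses the widely available iPhone LiDAR as a prompt, guiding the model to produce accurate metric depth at up to 4K resolution.
Introduces a scalable data pipeline for training the method.
Benefits downstream applications, including 3D reconstruction and generalized robotic grasping.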

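📄 License

This project is released under the Apache-2.0 license.

📚 Detailed Documentation

If you find this project useful, please consider citing the following paper: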

@inproceedings{lin2024promptda,
  title={Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation},
  author={Lin, Haotong and Peng, Sida and Chen, Jingxiao and Peng, Songyou and Sun, Jiaming and Liu, Minghuan and Bao, Hujun and Feng, Jiashi and Zhou, Xiaowei and Kang, Bingyi},
  journal={arXiv},
  year={2024}
}