dpt-dinov2-giant-kitti open-source model: a practical tool for depth estimation

Dpt Dinov2 Giant Kitti

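Developed by Facebook

A DPT-framework model that uses DINOv2 as its backbone network, for depth estimation.

3D vision

Transformers

License: Apache-2.0 #depth-estimation #unsupervised-learning #visual-feature-extraction

Downloads: 56

Released: 11/1/2023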

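Model overview

The model combines DINOv2, which learns visual features without labels, with the DPT (Dense Prediction Transformer) architecture, and targets depth estimation.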

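Model features

DINOv2 backbone

Uses DINOv2, trained without supervision, as the backbone network for visual feature extraction.

Dense Prediction Transformer

Uses the DPT architecture for dense prediction, which makes per-pixel tasks such as depth estimation a natural fit.

High-accuracy depth estimation

Generates a high-quality depth map from a single image.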

Model capabilities

Single-image depth estimation

Visual feature extraction

Use cases

Computer vision

3D scene reconstruction

Estimates depth from 2D images for use in 3D scene reconstruction.

Generates accurate depth maps
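
To make the reconstruction use case concrete, here is a minimal sketch of back-projecting a depth map into a 3D point cloud with a pinhole camera model. The function name `depth_to_points` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical and not part of the model or the transformers library. The sketch also assumes that the depth values are in a usable scale and that the camera intrinsics are known, neither of which this page states.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (H*W, 3) array of camera-frame points."""
    h, w = depth.shape
    # Pixel coordinate grids: u runs along columns (x), v along rows (y).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# Toy 2x2 depth map with every pixel 2 units away; the intrinsics are made-up values.
points = depth_to_points(np.full((2, 2), 2.0), fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(points.shape)
```

In practice the `depth` argument would be the resized prediction from the model, and the intrinsics would come from the camera calibration.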

Autonomous driving

Used for environment perception and distance estimation in autonomous driving systems.
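
One simple way to turn a depth map into a per-object distance estimate is to take the median depth inside a detected bounding box. The sketch below assumes the box comes from a separate detector. The name `region_depth` and the box format `(x_min, y_min, x_max, y_max)` are hypothetical choices for illustration, not part of this model.

```python
import numpy as np

def region_depth(depth, box):
    """Median depth inside box = (x_min, y_min, x_max, y_max), in pixel coordinates."""
    x0, y0, x1, y1 = box
    patch = depth[y0:y1, x0:x1]
    if patch.size == 0:
        raise ValueError("box does not overlap the depth map")
    # The median is less sensitive than the mean to background pixels inside the box.
    return float(np.median(patch))

# Toy 3x3 depth map; the top-left 2x2 region holds a nearby object.
depth_map = np.array([[10.0, 10.0, 30.0],
                      [10.0, 12.0, 30.0],
                      [50.0, 50.0, 50.0]])
nearest = region_depth(depth_map, (0, 0, 2, 2))
print(nearest)
```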

🚀 DPT model with a DINOv2 backbone

This project provides a DPT (Dense Prediction Transformer) model with a DINOv2 backbone for depth estimation. It pairs the DPT framework's dense prediction head with DINOv2's visual features.

🚀 Quick start

Calling the model with the Transformers library

The following example uses the transformers library to run depth estimation with this model:

from transformers import AutoImageProcessor, DPTForDepthEstimation
import torch
import numpy as np
from PIL import Image
import requests

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

image_processor = AutoImageProcessor.from_pretrained("facebook/dpt-dinov2-giant-kitti")
model = DPTForDepthEstimation.from_pretrained("facebook/dpt-dinov2-giant-kitti")

# prepare image for the model
inputs = image_processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    predicted_depth = outputs.predicted_depth

# resize the prediction to the original image size;
# PIL's image.size is (width, height), so reverse it to (height, width)
prediction = torch.nn.functional.interpolate(
    predicted_depth.unsqueeze(1),
    size=image.size[::-1],
    mode="bicubic",
    align_corners=False,
)

# scale the depth map to the 0-255 range and convert it to an 8-bit grayscale image
output = prediction.squeeze().cpu().numpy()
formatted = (output * 255 / np.max(output)).astype("uint8")
depth = Image.fromarray(formatted)
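
The last step above divides by the maximum only, which fails on an all-zero map and leaves the low end of the range unused when the minimum depth is well above zero. A possible alternative is min-max scaling, sketched below. The helper name `depth_to_uint8` is hypothetical, and this is a variation on the snippet above rather than the model card's own method.

```python
import numpy as np

def depth_to_uint8(depth):
    """Min-max scale a depth array to the 0..255 range and return it as uint8."""
    depth = np.asarray(depth, dtype=np.float64)
    lo, hi = depth.min(), depth.max()
    if hi == lo:
        # A constant map has no contrast; return all zeros instead of dividing by zero.
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)

# Toy depth values standing in for the model output.
gray = depth_to_uint8([[1.0, 2.0], [3.0, 5.0]])
print(gray.dtype, gray.min(), gray.max())
```

The result can be passed to `Image.fromarray` in the same way as `formatted` above.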

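✨ Key features

Strong depth estimation: uses the DPT framework with DINOv2 as the backbone to produce accurate depth estimates.
Easy to use: can be called through the transformers library and integrated into other projects.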

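📚 Detailed documentation

Model details

The DPT (Dense Prediction Transformer) model uses DINOv2 as its backbone. DINOv2 was proposed by Oquab et al. in the paper DINOv2: Learning Robust Visual Features without Supervision.

Figure: DPT architecture, taken from the original paper.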
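
As a rough illustration of how a transformer backbone feeds a dense prediction head, the sketch below rearranges a flat sequence of patch tokens into a 2D feature map. It assumes a patch size of 14 (the value used by DINOv2 ViT backbones), an image size that is a multiple of the patch size, and that the class token has already been dropped. The name `tokens_to_feature_map` is hypothetical, and this is not the library's implementation.

```python
import numpy as np

PATCH = 14  # assumed patch size of the backbone

def tokens_to_feature_map(tokens, height, width, patch=PATCH):
    """Reshape (num_patches, channels) tokens into a (channels, H/patch, W/patch) map."""
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    grid_h, grid_w = height // patch, width // patch
    if tokens.shape[0] != grid_h * grid_w:
        raise ValueError("token count does not match the patch grid")
    # Tokens are assumed to be in row-major patch order.
    return tokens.reshape(grid_h, grid_w, -1).transpose(2, 0, 1)

# A 224x448 image gives a 16x32 grid of patches; 8 channels is an arbitrary toy width.
tokens = np.zeros((16 * 32, 8))
fmap = tokens_to_feature_map(tokens, height=224, width=448)
print(fmap.shape)
```

A dense head can then apply convolutions and upsampling to such a map to reach a per-pixel prediction.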

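Model usage

Intended use

The model is intended to demonstrate that using the DPT framework with DINOv2 as the backbone yields a strong depth estimator.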

BibTeX citation

@misc{oquab2023dinov2,
      title={DINOv2: Learning Robust Visual Features without Supervision}, 
      author={Maxime Oquab and Timothée Darcet and Théo Moutakanni and Huy Vo and Marc Szafraniec and Vasil Khalidov and Pierre Fernandez and Daniel Haziza and Francisco Massa and Alaaeldin El-Nouby and Mahmoud Assran and Nicolas Ballas and Wojciech Galuba and Russell Howes and Po-Yao Huang and Shang-Wen Li and Ishan Misra and Michael Rabbat and Vasu Sharma and Gabriel Synnaeve and Hu Xu and Hervé Jegou and Julien Mairal and Patrick Labatut and Armand Joulin and Piotr Bojanowski},
      year={2023},
      eprint={2304.07193},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}