DPT BEiT Base 384

Developed by Intel
DPT (Dense Prediction Transformer) is a model built on a BEiT backbone for monocular depth estimation, trained on 1.4 million images.
Downloads 25.98k
Release Time: 11/28/2023

Model Overview

This model is a vision transformer that predicts depth from a single image. It uses BEiT as its backbone network and adds a dedicated head for depth estimation.

Model Features

BEiT Backbone Network
Uses a pre-trained BEiT model as the feature extractor
Zero-shot Depth Estimation
Predicts depth for new scenes without scene-specific fine-tuning
High-resolution Output
Generates depth maps that match the resolution of the input image
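
The features above can be illustrated with a minimal inference sketch, assuming the model is loaded through the Hugging Face transformers library. The hub identifier "Intel/dpt-beit-base-384" is inferred from the model name and developer listed here, and the function names estimate_depth and depth_to_uint8 are hypothetical helpers, not part of any library. The sketch resizes the raw prediction to the input image size with bicubic interpolation, which is one common way to obtain a depth map at the input resolution.

```python
import numpy as np


def depth_to_uint8(depth):
    """Scale a 2-D depth array to the 0..255 range for visualization."""
    depth = np.asarray(depth, dtype=np.float32)
    span = float(depth.max() - depth.min())
    if span == 0.0:
        # A constant map carries no contrast; return all zeros.
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - depth.min()) / span
    return (scaled * 255.0).round().astype(np.uint8)


def estimate_depth(image, model_id="Intel/dpt-beit-base-384"):
    """Return an H x W float array of predicted depth for a PIL image.

    The model_id default is an assumption based on the model name.
    Heavy dependencies are imported lazily so depth_to_uint8 stays
    usable without them.
    """
    import torch
    from transformers import AutoImageProcessor, AutoModelForDepthEstimation

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModelForDepthEstimation.from_pretrained(model_id)
    model.eval()

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        predicted = model(**inputs).predicted_depth  # shape: (1, h, w)

    # Resize the prediction to the input image size (PIL size is (W, H)).
    resized = torch.nn.functional.interpolate(
        predicted.unsqueeze(1),
        size=image.size[::-1],
        mode="bicubic",
        align_corners=False,
    )
    return resized.squeeze().cpu().numpy()
```

A typical call would be estimate_depth(Image.open("photo.jpg")) followed by depth_to_uint8 on the result to save a grayscale preview; the file name is a placeholder. The first call downloads the model weights, so it needs network access.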

Model Capabilities

Monocular Depth Estimation
Image Depth Prediction
3D Scene Understanding

Use Cases

Computer Vision
3D Scene Reconstruction
Reconstructs 3D scene depth information from a single image
Generates depth maps with the same resolution as the input image
Augmented Reality
Provides scene depth information for AR applications
Robotic Navigation
Offers environmental depth perception for autonomous mobile robots
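
For the reconstruction and navigation use cases, a depth map is usually turned into 3D points. The sketch below shows the standard pinhole back-projection with NumPy. It assumes a depth map already expressed in metric units and known camera intrinsics (fx, fy, cx, cy). Models of this family are generally described as predicting relative depth, so the raw output would likely need scale alignment before this step. The function name backproject is hypothetical.

```python
import numpy as np


def backproject(depth, fx, fy, cx, cy):
    """Convert an H x W depth map into an (H*W) x 3 array of camera-frame points.

    Assumes depth holds distances along the optical axis in metric units
    and that (fx, fy, cx, cy) are pinhole intrinsics in pixels.
    """
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    # Pixel coordinate grids: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)
```

Each output row is one pixel's (x, y, z) position in the camera frame, in row-major pixel order, so the array can be passed directly to a point-cloud viewer or an obstacle map.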