Dust3r ViTLarge BaseDecoder 224 Linear
DUSt3R is a model for geometric 3D vision that reconstructs 3D scenes from one or more images.
Downloads 1,829
Release Time: 6/19/2024
Model Overview
DUSt3R is a 3D vision model built on the ViT architecture that recovers 3D geometric information from 2D images. It uses an asymmetric CroCo3DStereo architecture and accepts single-view or multi-view input, producing the geometric structure of the 3D scene as output.
Model Features
Single-view and Multi-view 3D Reconstruction
Reconstructs 3D geometric structure from one or more images.
Efficient ViT Architecture
Uses a Vision Transformer architecture that, as the model name indicates, pairs a ViT-Large encoder with a smaller ViT-Base decoder and a linear output head.
Fixed Input Resolution
Supports an input resolution of 224x224 pixels.
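Because the model takes 224x224 input, larger photos must be cropped and scaled before inference. The following is a minimal sketch of that geometry only, assuming a centered square crop followed by a uniform resize. The helper names `center_crop_box` and `scale_to_target` are hypothetical, and the official preprocessing may differ.

```python
TARGET = 224  # input side length taken from the model name

def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)

def scale_to_target(width, height, target=TARGET):
    """Return the factor that maps the cropped square onto target x target."""
    return target / min(width, height)

# Example: a 640x480 photo (hypothetical input size)
box = center_crop_box(640, 480)
scale = scale_to_target(640, 480)
print(box, scale)
```

The crop box and scale factor are worth keeping, since they are needed to map predicted geometry back to pixel coordinates in the original image.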
Model Capabilities
3D Scene Reconstruction
Geometric Shape Recovery
Depth Estimation
Point Cloud Generation
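Depth estimation and point cloud generation are closely related: given a depth map and pinhole camera intrinsics, each pixel can be back-projected to a 3D point. The sketch below shows only that standard relationship. It does not use the model's API, and the function name `backproject` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions made for the example.

```python
def backproject(depth, fx, fy, cx, cy):
    """Convert a depth map (list of rows) into a list of (x, y, z) points.

    Uses the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points

# Tiny 2x2 depth map with one invalid pixel (made-up values)
depth = [[2.0, 2.0],
         [0.0, 4.0]]
cloud = backproject(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(len(cloud), cloud)
```

Plain lists keep the example dependency-free. At real image sizes the same arithmetic would normally be vectorized with an array library.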
Use Cases
Computer Vision
3D Scene Reconstruction
Reconstructs 3D scenes from one or more 2D images.
Generates 3D geometric structures and depth information of the scene.
Augmented Reality
Provides 3D environmental understanding for AR applications.
Robotic Vision
Environmental Perception
Helps robots understand the 3D structure of their surroundings.