VGGT-1B Open-source 3D Attribute Inference Model - Obtain Key 3D Attributes from Multiple Views within Seconds

VGGT 1B

Developed by facebook

VGGT is a feedforward neural network capable of inferring all key 3D attributes from one, several, or hundreds of views of a scene within seconds.

3D Vision

Safetensors

English

#Multi-view 3D Reconstruction #Geometric Reasoning #Real-time 3D Modeling

Downloads 196.31k

Release Time: 3/11/2025

Model Overview

Visual Geometry Grounded Transformer (VGGT) is a feedforward neural network that rapidly infers the 3D attributes of a scene from one or more views, including camera parameters, point clouds, depth maps, and 3D point trajectories.

Model Features

Multi-view 3D Reconstruction

Infers 3D attributes from a single view, a few views, or hundreds of views of a scene

Fast Inference

Completes 3D attribute inference within seconds

Comprehensive 3D Attribute Output

Simultaneously outputs camera parameters, point clouds, depth maps, and 3D point trajectories
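
To make the relationship between these outputs concrete, here is a minimal sketch of how a depth map and camera intrinsics together determine a point cloud under a standard pinhole camera model. It is not VGGT's code or API: the function name `unproject_depth`, the parameters `fx`, `fy`, `cx`, `cy`, and the toy depth values are assumptions chosen for illustration, and plain Python lists are used so the example has no dependencies.

```python
def unproject_depth(depth, fx, fy, cx, cy):
    """Back-project a depth map into camera-frame 3D points.

    depth is a list of rows; depth[v][u] is the depth at pixel (u, v).
    fx, fy are focal lengths in pixels and (cx, cy) is the principal point.
    All names are hypothetical and follow the usual pinhole convention.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                # Treat non-positive depth as invalid and skip the pixel.
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map with one invalid pixel (illustrative values only).
depth = [
    [2.0, 2.0],
    [0.0, 4.0],
]
cloud = unproject_depth(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
print(cloud)  # [(-1.0, -1.0, 2.0), (1.0, -1.0, 2.0), (2.0, 2.0, 4.0)]
```

Because the three quantities are tied together this way, a predicted point cloud can be cross-checked against the predicted depth maps and camera parameters.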

Model Capabilities

3D Scene Reconstruction

Camera Parameter Estimation

Depth Map Generation

Point Cloud Generation

3D Point Trajectory Prediction
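
As a rough illustration of how camera parameters and point trajectories relate, the sketch below projects one static 3D point into two views and collects the resulting pixel positions as a track. It is a geometric toy under stated assumptions, not how VGGT predicts trajectories (those are learned outputs of the network): the function `project_point`, the world-to-camera extrinsics convention `X_cam = R * X_world + t`, and all numeric values are hypothetical.

```python
def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a world-frame 3D point to pixel coordinates.

    Assumes world-to-camera extrinsics (X_cam = R * X_world + t) and a
    pinhole camera. Returns None when the point is not in front of the camera.
    """
    cam = [
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    x, y, z = cam
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


identity = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
point = (0.5, -0.5, 2.0)

# Two views with the same orientation; the second camera sits 0.5 units
# further along +x, which gives a translation of (-0.5, 0, 0).
views = [
    (identity, (0.0, 0.0, 0.0)),
    (identity, (-0.5, 0.0, 0.0)),
]

track = [
    project_point(point, R, t, fx=100.0, fy=100.0, cx=50.0, cy=50.0)
    for R, t in views
]
print(track)  # [(75.0, 25.0), (50.0, 25.0)]
```

The point shifts horizontally between the two views while its vertical position stays fixed, which is the kind of cross-view correspondence a point track records.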

Use Cases

Computer Vision

Augmented Reality

Quickly generates 3D scenes from 2D images for AR applications

Robotic Navigation

Gives robots a 3D understanding of their environment

Film Production

Rapid 3D Scene Modeling

Quickly generates 3D scene models from footage
