VGGT 1B
VGGT is a feedforward neural network capable of inferring all key 3D attributes from one, several, or hundreds of views of a scene within seconds.
Downloads 196.31k
Release Time: 3/11/2025
Model Overview
Visual Geometry Grounded Transformer (VGGT) is a feed-forward neural network that rapidly infers the 3D attributes of a scene from one or more views, including camera parameters, point clouds, depth maps, and 3D point trajectories.
Model Features
Multi-view 3D Reconstruction
Infers 3D attributes from a single view, several views, or hundreds of views of the same scene
Fast Inference
Completes 3D attribute inference within seconds
Comprehensive 3D Attribute Output
Simultaneously outputs camera parameters, point clouds, depth maps, and 3D point trajectories
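These outputs are geometrically related: under a pinhole camera model, a depth map together with camera intrinsics is enough to recover a point cloud in the camera frame. The following is a minimal pure-Python sketch of that relationship, not VGGT's own code. The function name unproject_depth, the intrinsics fx, fy, cx, cy, and the sample values are illustrative assumptions, and depth is assumed to be measured along the optical axis.

```python
def unproject_depth(depth, fx, fy, cx, cy):
    """Convert a depth map (a list of rows) into camera-frame 3D points.

    Assumes a pinhole camera: pixel (u, v) with depth z maps to
    x = (u - cx) * z / fx and y = (v - cy) * z / fy.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # no valid depth at this pixel
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Hypothetical 2x2 depth map with unit focal length and
# principal point at the image centre (0.5, 0.5).
cloud = unproject_depth([[2.0, 2.0], [0.0, 4.0]], fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Each valid pixel yields one 3D point, so the resulting cloud has at most as many points as the depth map has pixels.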
Model Capabilities
3D Scene Reconstruction
Camera Parameter Estimation
Depth Map Generation
Point Cloud Generation
3D Point Trajectory Prediction
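To combine per-view point clouds into a single scene reconstruction, estimated camera extrinsics are typically used to move points from each camera frame into a shared world frame. The sketch below illustrates that step under the assumption that extrinsics are given as a world-to-camera rotation R and translation t (X_cam = R X_world + t). The helper name camera_to_world and the example values are hypothetical, and the convention used by any particular VGGT release should be checked against its documentation.

```python
def camera_to_world(points_cam, R, t):
    """Map camera-frame points to the world frame.

    Assumes world-to-camera extrinsics: X_cam = R @ X_world + t,
    so the inverse is X_world = R^T @ (X_cam - t).
    R is a 3x3 rotation given as nested lists, t a length-3 sequence.
    """
    world = []
    for p in points_cam:
        d = [p[i] - t[i] for i in range(3)]
        # Multiply by R transposed: column j of R dotted with d.
        world.append(tuple(sum(R[i][j] * d[i] for i in range(3)) for j in range(3)))
    return world


# Hypothetical camera: rotated 90 degrees about the z axis, shifted along x.
R = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
t = (1.0, 0.0, 0.0)
world_points = camera_to_world([(-1.0, 1.0, 3.0)], R, t)
```

Applying this to the points from every view, each with its own R and t, places them all in one coordinate system.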
Use Cases
Computer Vision
Augmented Reality
Quickly generates 3D scenes from 2D images for AR applications
Robotic Navigation
Provides robots with environmental 3D understanding capabilities
Film Production
Rapid 3D Scene Modeling
Quickly generates 3D scene models from footage
© 2025 AIbase