Sapiens Depth 0.3b Torchscript
Sapiens is a family of vision transformers pre-trained on 300 million human images at 1024 x 1024 resolution. This variant targets depth estimation.
Downloads 69
Release Time: 9/9/2024
Model Overview
Sapiens-0.3B is a vision transformer model designed to estimate relative depth in human images. It runs natively at 1K resolution and generalizes to real-world scenarios.
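Because the model predicts relative rather than metric depth, its raw output is usually rescaled before it is displayed or compared. The sketch below shows one common way to do that: min-max normalization over the human pixels only. It is an illustration, not the model's documented post-processing. The function name, the `mask` argument, and the assumption that a separate foreground mask is available are all hypothetical.

```python
import numpy as np

def normalize_relative_depth(depth, mask=None, eps=1e-8):
    """Rescale a relative depth map to [0, 1] over the masked (human) pixels.

    depth: 2-D array of raw depth predictions (assumed layout: height x width).
    mask:  optional 2-D boolean array, True where a person is present
           (hypothetical input; this page does not say how to obtain it).
    Pixels outside the mask are returned as NaN so they can be skipped when plotting.
    """
    depth = np.asarray(depth, dtype=np.float32)
    if mask is None:
        # No mask given: treat every pixel as foreground.
        mask = np.ones(depth.shape, dtype=bool)
    mask = np.asarray(mask, dtype=bool)

    out = np.full(depth.shape, np.nan, dtype=np.float32)
    if not mask.any():
        return out  # nothing to normalize

    fg = depth[mask]
    lo, hi = fg.min(), fg.max()
    # eps guards against division by zero when the foreground is constant.
    out[mask] = (fg - lo) / max(hi - lo, eps)
    return out

# Usage: a 2 x 2 toy depth map whose bottom-right pixel is background.
raw = np.array([[2.0, 4.0], [6.0, 100.0]])
person = np.array([[True, True], [True, False]])
print(normalize_relative_depth(raw, person))
```

Restricting the minimum and maximum to the masked region keeps a stray background value (100.0 in the toy input) from compressing the range used for the person.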
Model Features
High-resolution support
Natively supports inference at 1K resolution, which suits high-precision depth estimation tasks.
Strong generalization capability
Generalizes well to real-world data even when labeled training data is scarce or entirely synthetic.
Large-scale pre-training
Pre-trained on 300 million human images at 1024 x 1024 resolution, which gives the model strong feature extraction capabilities.
Model Capabilities
Human image depth estimation
High-resolution image processing
Real-world scenario generalization
Use Cases
Computer vision
Virtual reality
Used for human depth estimation in virtual reality applications to enhance scene realism.
Augmented reality
Accurately estimates human depth in augmented reality applications for more natural interactions.