Sapiens Depth 0.6b Torchscript
Sapiens is a family of vision transformer models pre-trained on 300 million human images at 1024 x 1024 resolution and aimed at human-centric vision tasks.
Downloads 34
Release Time: 9/9/2024
Model Overview
This model estimates relative depth in images of people. It supports high-resolution inference and generalizes well to real-world data.
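Relative depth orders pixels within a single image but carries no metric units, so downstream code often rescales it per image. The following is a minimal sketch of that rescaling, assuming the raw prediction is available as a 2D grid of floats and that a foreground (person) mask comes from some other source. The function name `normalize_relative_depth` and its arguments are hypothetical and are not part of the Sapiens release.

```python
from typing import List, Optional, Sequence


def normalize_relative_depth(
    depth: Sequence[Sequence[float]],
    mask: Optional[Sequence[Sequence[bool]]] = None,
) -> List[List[Optional[float]]]:
    """Min-max rescale a 2D depth grid to [0, 1] over the masked pixels.

    Assumption: `depth` is the raw model output and `mask` marks person pixels.
    Pixels outside the mask are returned as None.
    """
    # Collect the values that belong to the person (all pixels if no mask).
    selected = [
        d
        for r, row in enumerate(depth)
        for c, d in enumerate(row)
        if mask is None or mask[r][c]
    ]
    if not selected:
        raise ValueError("mask selects no pixels")

    lo, hi = min(selected), max(selected)
    span = hi - lo

    out: List[List[Optional[float]]] = []
    for r, row in enumerate(depth):
        out_row: List[Optional[float]] = []
        for c, d in enumerate(row):
            if mask is not None and not mask[r][c]:
                out_row.append(None)  # background: no depth reported
            elif span == 0:
                out_row.append(0.0)  # flat map: all selected pixels are equal
            else:
                out_row.append((d - lo) / span)
        out.append(out_row)
    return out
```

After rescaling, 0 corresponds to the smallest raw value among the selected pixels and 1 to the largest. Multiplying the raw grid by any positive constant leaves the result unchanged, which is why the output should be read as an ordering and not as a distance.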
Model Features
High-resolution support
Natively supports inference at 1K resolution, which suits high-quality image processing.
Exceptional generalization capability
Generalizes well to real-world data, even when labeled data is scarce or the training data is entirely synthetic.
Large-scale pre-training
Pre-trained on 300 million human images at 1024 x 1024 resolution, which gives the model strong feature-extraction capability.
Model Capabilities
Human image depth estimation
High-resolution image processing
Use Cases
Computer vision
Human depth estimation
Estimates the relative depth of people in images, for use in virtual reality, augmented reality, and similar applications.
© 2025 AIbase