Sapiens Depth 2b Torchscript
Sapiens is a vision Transformer model pretrained on 300 million human images at 1024×1024 resolution. It is designed specifically for human-centric vision tasks and is reported to generalize exceptionally well.
Downloads 58
Release Time: 9/9/2024
Model Overview
This model estimates relative depth in images of people. It natively supports 1K high-resolution inference and maintains good performance even when annotated data is scarce or the training data is fully synthetic.
Model Features
High-Resolution Support
Native support for 1K high-resolution (1024×768) inference
Strong Generalization Capability
Generalizes exceptionally well to real data, even when annotations are scarce or the training data is fully synthetic
Large-Scale Pretraining
Pretrained on 300 million human images at 1024×1024 resolution
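As a rough illustration of the 1024×768 inference size mentioned above, the sketch below turns an RGB image array into a batched, channels-first float array. The height-by-width order (1024 high, 768 wide), the ImageNet-style mean and standard deviation, and the function names are assumptions that this page does not state. Check the official Sapiens preprocessing before relying on any of them.

```python
import numpy as np

# Assumed input size: 1024 pixels high by 768 pixels wide (portrait).
INPUT_H, INPUT_W = 1024, 768
# Assumed ImageNet-style normalization constants on a 0-255 scale.
MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
STD = np.array([58.395, 57.12, 57.375], dtype=np.float32)


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbor resize of an (H, W, C) array via index lookup."""
    in_h, in_w = image.shape[:2]
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return image[rows[:, None], cols[None, :]]


def prepare_input(image_rgb):
    """Convert an (H, W, 3) uint8 RGB image to a (1, 3, 1024, 768) float32 array."""
    resized = resize_nearest(image_rgb, INPUT_H, INPUT_W).astype(np.float32)
    normalized = (resized - MEAN) / STD
    # HWC -> CHW, then add a leading batch dimension.
    return np.ascontiguousarray(normalized.transpose(2, 0, 1)[None])
```

Nearest-neighbor resizing is used here only to keep the sketch dependency-free. A real pipeline would more likely use bilinear interpolation. If the checkpoint is a TorchScript file, as the model name suggests, it could presumably be loaded with `torch.jit.load` and fed `torch.from_numpy(batch)`. The output shape and layout are not documented on this page.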
Model Capabilities
Human Image Depth Estimation
High-Resolution Image Processing
Use Cases
Computer Vision
Human Depth Estimation
Estimating relative depth information from a single human image
Can generate precise depth maps
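Because the model predicts relative rather than metric depth, a depth map is usually rescaled before it is viewed or compared. Below is a minimal sketch of that step. It assumes the raw output is a 2-D float array and that a foreground mask of the person may be available. Both assumptions, along with the function name, are illustrative and not taken from this page.

```python
import numpy as np


def normalize_relative_depth(depth, mask=None):
    """Rescale a relative depth map to [0, 1] over the masked pixels.

    depth: (H, W) float array of raw relative depth values (assumed layout).
    mask:  optional (H, W) boolean array, True on the person; pixels
           outside the mask are set to 0 in the output.
    """
    depth = np.asarray(depth, dtype=np.float32)
    if mask is None:
        mask = np.ones(depth.shape, dtype=bool)
    out = np.zeros_like(depth)
    if not mask.any():
        return out
    values = depth[mask]
    lo, hi = values.min(), values.max()
    if hi > lo:  # a completely flat map stays all zeros
        out[mask] = (values - lo) / (hi - lo)
    return out
```

Restricting the min and max to the masked region keeps background values from compressing the range available to the person. The normalized values only order pixels from nearest to farthest and do not correspond to physical distances.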