Sapiens Depth 0.6b
Sapiens is a family of Vision Transformer models pre-trained on 300 million human images at 1024x1024 resolution and specialized for human-centric vision tasks.
Downloads 19
Release Time: 9/10/2024
Model Overview
This model estimates relative depth for images of people. It supports inference at 1K resolution and performs well on real-world images.
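Relative depth carries no metric scale, so a predicted map is usually rescaled before it is visualized or compared. The snippet below is a minimal sketch of one common post-processing step, min-max normalization over the human pixels only. It assumes the depth map and a foreground mask are already available as arrays; the function name `normalize_relative_depth` and the mask input are illustrative and are not part of any Sapiens API described here.

```python
import numpy as np


def normalize_relative_depth(depth, mask):
    """Rescale relative depth to [0, 1] over foreground (human) pixels.

    Hypothetical helper: `depth` is an HxW array of relative depth values,
    `mask` is an HxW boolean array marking the person. Background pixels
    are set to NaN so they are easy to ignore downstream.
    """
    depth = np.asarray(depth, dtype=np.float64)
    mask = np.asarray(mask, dtype=bool)
    if depth.shape != mask.shape:
        raise ValueError("depth and mask must have the same shape")

    out = np.full(depth.shape, np.nan)
    if not mask.any():
        return out  # no foreground pixels, nothing to normalize

    fg = depth[mask]
    lo, hi = fg.min(), fg.max()
    if hi == lo:
        out[mask] = 0.0  # flat depth: avoid division by zero
    else:
        out[mask] = (fg - lo) / (hi - lo)
    return out


# Toy 2x2 example: the top-left pixel is background.
depth = np.array([[5.0, 2.0], [4.0, 3.0]])
mask = np.array([[False, True], [True, True]])
normalized = normalize_relative_depth(depth, mask)
```

Restricting the minimum and maximum to the mask keeps background values from compressing the range that is left for the person.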
Model Features
High-Resolution Support
Natively supports inference at 1K resolution, making it suitable for 1024x1024 human images.
Strong Generalization Capability
Generalizes well to real-world data even when labeled training data is scarce or entirely synthetic.
Large-Scale Pre-training
Pre-trained on 300 million human images, giving it strong feature extraction capabilities.
Model Capabilities
Human Image Depth Estimation
High-Resolution Image Processing
Use Cases
Computer Vision
Human Depth Estimation
Estimates relative depth in human images, with applications in virtual reality, augmented reality, and similar scenarios.
Performs well under real-world conditions.