Sapiens Depth 0.3b Bfloat16
Sapiens is a series of vision transformer models pre-trained on 300 million human images at 1024x1024 resolution, focusing on human-centric vision tasks.
Downloads: 22
Release Time: 9/10/2024
Model Overview
This model estimates relative depth in human images. It supports 1K high-resolution inference and generalizes well to real-world data.
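To make the high-resolution input concrete, the sketch below computes how an arbitrary image could be resized and padded to a fixed model input without distorting the person. It assumes the 1024x768 size listed for this model means 1024 pixels tall by 768 pixels wide; the function name `fit_to_model_input` and the centered-padding strategy are illustrative choices, not part of the model's published preprocessing.

```python
def fit_to_model_input(width, height, target_w=768, target_h=1024):
    """Return (new_w, new_h, pad_x, pad_y) for an aspect-preserving resize.

    target_w and target_h are assumptions (768 wide, 1024 tall);
    adjust them if the actual pipeline expects a different layout.
    """
    # Use the smaller scale so the resized image fits inside the target box.
    scale = min(target_w / width, target_h / height)
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    # Split the leftover space evenly (left/top padding shown here).
    pad_x = (target_w - new_w) // 2
    pad_y = (target_h - new_h) // 2
    return new_w, new_h, pad_x, pad_y


print(fit_to_model_input(1536, 2048))  # same aspect ratio, no padding needed
print(fit_to_model_input(1000, 1000))  # square image, padded top and bottom
```

A square image ends up 768 pixels on each side with padding above and below, while an image that already has a 3:4 aspect ratio fills the input exactly.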
Model Features
High-resolution support
Natively supports 1K high-resolution inference, with image sizes up to 1024x768.
Strong generalization capability
Generalizes well to real-world data, even when labeled data is scarce or the training data is fully synthetic.
Efficient computation
Computational load of 1.242 TFLOPs with 336 million parameters, balancing performance and efficiency.
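As a rough sanity check on the parameter count, the following sketch estimates how much memory the weights alone occupy in bfloat16, which stores each value in 2 bytes. The helper name `weight_memory_bytes` is hypothetical, and the estimate ignores activations, buffers, and framework overhead, so actual memory use at inference time will be higher.

```python
def weight_memory_bytes(num_params, bytes_per_param=2):
    """Memory needed to hold the weights only (bfloat16 = 2 bytes each)."""
    return num_params * bytes_per_param


PARAMS = 336_000_000  # parameter count stated for this model

total = weight_memory_bytes(PARAMS)
print(f"{total / 1e6:.0f} MB")      # decimal megabytes
print(f"{total / 2**30:.2f} GiB")   # binary gibibytes
```

For comparison, the same weights in 32-bit floats would take twice as much space (`bytes_per_param=4`).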
Model Capabilities
Depth estimation
High-resolution image processing
Human image analysis
Use Cases
Computer vision
Human image depth estimation
Estimates relative depth in human images, for applications such as virtual reality and augmented reality.
Demonstrates outstanding generalization capabilities in complex scenes.
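Because the depth is relative, the raw values carry ordering rather than metric distance, so a common step before visualization is to rescale them to a fixed range. The sketch below min-max normalizes a depth map over the pixels of interest. It is a generic post-processing example, not the model's documented output handling; the `mask` argument (for instance, a person-foreground mask) and the function name are assumptions made for illustration.

```python
def normalize_relative_depth(depth, mask):
    """Min-max normalize depth to [0, 1] where mask is True.

    depth: 2D list of floats; mask: 2D list of bools (hypothetical
    foreground mask). Pixels outside the mask are returned as None.
    """
    values = [d for row, mrow in zip(depth, mask)
              for d, m in zip(row, mrow) if m]
    if not values:
        return [[None for _ in row] for row in depth]
    lo, hi = min(values), max(values)
    span = hi - lo
    out = []
    for row, mrow in zip(depth, mask):
        out.append([
            # A flat depth map (span == 0) maps to 0.0 to avoid dividing by zero.
            (0.0 if span == 0 else (d - lo) / span) if m else None
            for d, m in zip(row, mrow)
        ])
    return out


depth = [[2.0, 4.0], [6.0, 9.0]]
mask = [[True, True], [True, False]]
print(normalize_relative_depth(depth, mask))
```

The masked-out pixel (9.0) does not affect the range, so the remaining values span exactly 0 to 1.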