Sapiens Depth 0.3b
Sapiens is a Vision Transformer model pre-trained on 300 million high-resolution human images, specializing in human-centric vision tasks.
Downloads 24
Release Time: 9/10/2024
Model Overview
This model estimates relative depth in human images, supports 1K high-resolution inference, and generalizes well to real-world data.
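Because the output is relative depth, the values carry ordering within an image rather than metric distance, so a common post-processing step is to rescale them over the person region before display. The following is a minimal sketch of that step, not the model's own pipeline: the function name `normalize_relative_depth`, the boolean foreground `mask`, and the choice of min-max scaling are all assumptions made for illustration.

```python
import numpy as np

def normalize_relative_depth(depth, mask):
    """Min-max scale depth values inside `mask` to [0, 1].

    Hypothetical helper: `depth` is a 2D array of relative depth values,
    `mask` is a 2D boolean array marking person pixels. Pixels outside
    the mask are set to NaN so they can be skipped when plotting.
    """
    depth = np.asarray(depth, dtype=np.float64)
    mask = np.asarray(mask, dtype=bool)
    out = np.full(depth.shape, np.nan)
    if not mask.any():
        return out  # no foreground: nothing to normalize
    fg = depth[mask]
    lo, hi = fg.min(), fg.max()
    span = hi - lo
    # A constant foreground has no depth variation; map it to 0.
    out[mask] = (fg - lo) / span if span > 0 else 0.0
    return out
```

Restricting the minimum and maximum to the masked pixels keeps a distant or noisy background from compressing the depth range of the person.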
Model Features
High-Resolution Support
Natively supports 1K high-resolution inference at an input size of 1024x768.
Exceptional Generalization
Performs well on real-world data even when annotations are scarce or the training data is fully synthetic.
Efficient Computation
Computational cost of 1.242 TFLOPs (1.242 trillion floating-point operations), balancing performance and efficiency.
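To feed an arbitrary photo to a model with a fixed 1024x768 input, the image has to be resized first. The sketch below computes the geometry for an aspect-preserving resize with centered padding (letterboxing). It assumes that 1024 is the height and 768 the width, and letterboxing is an illustrative choice rather than the preprocessing the model's authors necessarily use. The function name `letterbox_geometry` is hypothetical.

```python
def letterbox_geometry(src_h, src_w, dst_h=1024, dst_w=768):
    """Return (new_h, new_w, pad_top, pad_left) for an aspect-preserving fit.

    Assumption: the target is 1024 (height) x 768 (width). The image is
    scaled by the largest factor that keeps it inside the target, then
    centered; the remaining border is padding.
    """
    if src_h <= 0 or src_w <= 0:
        raise ValueError("source dimensions must be positive")
    scale = min(dst_h / src_h, dst_w / src_w)
    # Clamp so rounding can never push the resized image past the target.
    new_h = min(dst_h, max(1, round(src_h * scale)))
    new_w = min(dst_w, max(1, round(src_w * scale)))
    pad_top = (dst_h - new_h) // 2
    pad_left = (dst_w - new_w) // 2
    return new_h, new_w, pad_top, pad_left
```

For example, a 2048x1536 (height x width) photo scales by exactly 0.5 and needs no padding, while a square 1000x1000 photo becomes 768x768 with 128 rows of padding above and below. The same padding offsets can be used afterwards to crop the predicted depth map back to the original framing.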
Model Capabilities
Human Image Depth Estimation
High-Resolution Image Processing
Use Cases
Computer Vision
Human Depth Perception
Estimates relative depth in human images for applications such as augmented reality and virtual reality.
Generalizes well to real-world scenarios.
© 2025 AIbase