Sapiens Depth 2b

Developed by facebook
Sapiens is a family of vision Transformer models pre-trained on 300 million 1024×1024 resolution human images, focusing on human-centric vision tasks.
Downloads 40
Release Time: 9/10/2024

Model Overview

Depth-Sapiens-2B is a vision Transformer model that estimates relative depth in human images. It natively supports 1K high-resolution inference and generalizes well to real data, even when annotations are scarce or the training data is fully synthetic.
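Because the model predicts relative rather than metric depth, its raw output is usually rescaled before it is visualized or compared across images. The following is a minimal sketch of that step, not the model's own post-processing: the function name `normalize_relative_depth`, the nested-list depth map, and the optional foreground `mask` are all hypothetical, and it assumes min-max scaling over the person pixels is acceptable for the use case.

```python
from typing import List, Optional, Sequence


def normalize_relative_depth(
    depth: Sequence[Sequence[float]],
    mask: Optional[Sequence[Sequence[bool]]] = None,
) -> List[List[Optional[float]]]:
    """Min-max rescale a relative depth map to [0, 1].

    `depth` is a 2D grid of raw relative depth values (hypothetical input).
    `mask` optionally marks the pixels that belong to the person; pixels
    outside the mask are returned as None and do not affect the scaling.
    """
    # Collect only the values that take part in the scaling.
    values = [
        d
        for r, row in enumerate(depth)
        for c, d in enumerate(row)
        if mask is None or mask[r][c]
    ]
    if not values:
        raise ValueError("no pixels selected for normalization")

    lo, hi = min(values), max(values)
    span = hi - lo

    result: List[List[Optional[float]]] = []
    for r, row in enumerate(depth):
        out_row: List[Optional[float]] = []
        for c, d in enumerate(row):
            if mask is not None and not mask[r][c]:
                out_row.append(None)   # background pixel, left undefined
            elif span == 0:
                out_row.append(0.0)    # flat depth map, avoid dividing by zero
            else:
                out_row.append((d - lo) / span)
        result.append(out_row)
    return result
```

Restricting the scaling to masked pixels keeps a distant background value from compressing the depth range of the person.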

Model Features

High-resolution support
Natively supports 1K high-resolution inference with image sizes up to 1024×768.
Large-scale pre-training
Pre-trained on 300 million 1024×1024 resolution human images.
Exceptional generalization
Generalizes well to real data even when annotations are scarce or the training data is fully synthetic.
Efficient architecture
Uses a vision Transformer architecture with 2.163 billion parameters and a computational cost of 8.709 trillion FLOPs.
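The parameter count above gives a rough lower bound on the memory needed just to hold the weights. The sketch below does that arithmetic; the helper name `weight_memory_gb` and the choice of data types are illustrative assumptions, and the estimate ignores activations, which can be substantial at 1K resolution.

```python
# Parameter count as stated in the model features.
NUM_PARAMS = 2.163e9

# Bytes used per parameter for common floating-point formats.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gb(NUM_PARAMS, dtype):.3f} GB")
```

Under these assumptions the weights alone take about 8.65 GB in float32 and about 4.33 GB in a 16-bit format.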

Model Capabilities

Human image depth estimation
High-resolution image processing
Synthetic data generalization

Use Cases

Computer vision
Human depth estimation
Estimates relative depth in human images, suitable for virtual reality, augmented reality, and similar scenarios.