# Real-world scenario generalization
Sapiens Depth 1b Torchscript
Sapiens is a family of vision transformer models pre-trained on 300 million human images at 1024 x 1024 resolution, focusing on human-centric vision tasks.
3D Vision English
facebook
160
0
Sapiens Seg Foreground 1b Torchscript
Sapiens is a vision transformer model pre-trained on 300 million high-resolution human images; this variant is designed for foreground person segmentation.
Image Segmentation English
facebook
25
1
Sapiens Seg 0.3b Torchscript
Sapiens is a family of vision transformer models pre-trained on 300 million human images at 1024 x 1024 resolution. The models support 1K high-resolution inference and generalize well to real-world data, even when labeled data is scarce or entirely synthetic.
Image Segmentation English
facebook
56
0