Sapiens Pretrain 0.6b
Sapiens is a Vision Transformer model pre-trained on 300 million human images at 1024×1024 resolution, excelling in human-centric vision tasks.
Release Time: 9/10/2024
Model Overview
A 600-million-parameter Vision Transformer model with native support for 1K (1024×1024) high-resolution inference. It generalizes well to real-world data even when annotations are scarce or the training data is fully synthetic.
Model Features
High-resolution support
Native support for 1024×1024 resolution image processing
Data efficiency
Maintains good generalization even with scarce annotations or fully synthetic data
Large-scale pretraining
Pre-trained on 300 million human images
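To put the resolution figure in perspective, the short sketch below counts how many patch tokens a Vision Transformer would see for a square input. The patch size of 16 is an assumption (a common ViT choice) and is not stated on this page; `patch_tokens` is a hypothetical helper written for illustration only.

```python
def patch_tokens(image_size: int, patch_size: int = 16) -> int:
    """Count non-overlapping square patches in a square image.

    patch_size=16 is an assumed value, not taken from this page.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


if __name__ == "__main__":
    # Compare a typical low-resolution ViT input with the 1K input.
    for size in (224, 1024):
        print(f"{size}x{size}: {patch_tokens(size)} tokens")
```

Under that assumption, a 1024×1024 image yields 4096 tokens against 196 for a 224×224 image, roughly 21 times as many. Since self-attention compares every token pair, native 1K inference is considerably more expensive than low-resolution inference.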
Model Capabilities
Human image feature extraction
High-resolution image processing
Visual representation learning
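As a rough illustration of the first step in Vision Transformer feature extraction, the sketch below splits an image into flattened, non-overlapping patches. It is a generic pure-Python example on a tiny single-channel grid, not Sapiens' own code; `patchify` and the demo values are hypothetical, and a real model would follow this step with a learned projection and position embeddings.

```python
from typing import List

Image = List[List[float]]  # H x W grid of pixel values (one channel for brevity)


def patchify(image: Image, patch: int) -> List[List[float]]:
    """Split an image into flattened patch x patch tiles in row-major order."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image dimensions must be divisible by patch size")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one tile: rows first, then columns within each row.
            tiles.append([
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ])
    return tiles


# Hypothetical 4x4 demo image whose pixel value equals its row-major index.
demo = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
tiles = patchify(demo, 2)
```

Each entry of `tiles` corresponds to one token position; the encoder's output for that position is the feature vector that downstream heads would consume.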
Use Cases
Computer vision
Human pose estimation
Extracts human pose features from high-resolution images
Virtual avatar generation
Used for generating realistic digital human avatars