Sapiens Pretrain 1B TorchScript

Developed by facebook
Sapiens is a family of vision Transformers pre-trained on 300 million human images at 1024x1024 resolution and designed for human-centric vision tasks.
Downloads 35
Release Time: 9/9/2024

Model Overview

Sapiens-1B is a high-resolution vision Transformer pre-trained on large-scale human image data. It is suitable for feature extraction and fine-tuning, and it performs particularly well when labeled data is scarce or the training data is entirely synthetic.

Model Features

High-resolution support
Native support for 1K high-resolution (1024x1024) image processing
Large-scale pre-training
Pre-trained on 300 million human images with powerful feature extraction capabilities
Real-world generalization
Generalizes well to real-world data even when labeled data is scarce or the training data is entirely synthetic
Efficient architecture
Uses a 40-layer vision Transformer with an embedding dimension of 1536 and 24 attention heads
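
The figures listed above can be sanity-checked with a little arithmetic. The sketch below is a rough estimate, not the model's actual configuration: it assumes a patch size of 16 and an MLP expansion ratio of 4, neither of which is stated on this page, and it counts only the attention and MLP weight matrices (no biases, normalization layers, or patch and position embeddings). The function names are illustrative.

```python
def patch_tokens(image_size: int, patch_size: int) -> int:
    """Number of patch tokens for a square image split into square patches."""
    per_side = image_size // patch_size
    return per_side * per_side


def block_weight_params(embed_dim: int, num_layers: int, mlp_ratio: int = 4) -> int:
    """Weight-matrix parameters in the Transformer blocks (biases ignored)."""
    attention = 4 * embed_dim * embed_dim        # Q, K, V and output projections
    mlp = 2 * mlp_ratio * embed_dim * embed_dim  # two linear layers, assumed ratio 4
    return num_layers * (attention + mlp)


# Values from the model card: 1024x1024 input, 40 layers, 1536 dims, 24 heads.
tokens = patch_tokens(1024, 16)         # patch size 16 is an assumption
head_dim = 1536 // 24                   # per-head dimension
params = block_weight_params(1536, 40)

print(tokens)    # 4096
print(head_dim)  # 64
print(params)    # 1132462080
```

Under these assumptions the blocks alone hold roughly 1.13 billion weights, which is consistent with the "1B" in the model name, and a single 1024x1024 image yields 4096 tokens.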

Model Capabilities

High-resolution image processing
Human image feature extraction
Visual representation learning
Transfer learning

Use Cases

Computer vision
Human image analysis
Used for human-centric vision tasks such as pose estimation and action recognition
Generalizes well to real-world scenarios
Feature extraction
Serves as a pre-trained model for extracting image features for downstream tasks
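
A minimal sketch of what feature extraction from a TorchScript export might look like, assuming PyTorch is installed and the checkpoint has been downloaded locally. The checkpoint path is a placeholder supplied by the caller, the helper names are illustrative, and the output format depends on the exported model, so the sketch makes no claim about the shape of the returned features.

```python
from typing import Tuple

INPUT_SIZE = 1024  # native resolution stated in the model card


def expected_input_shape(batch_size: int) -> Tuple[int, int, int, int]:
    """Shape of an RGB batch in NCHW layout at the model's native resolution."""
    return (batch_size, 3, INPUT_SIZE, INPUT_SIZE)


def extract_features(checkpoint_path: str, batch_size: int = 1):
    """Load a TorchScript file and run one forward pass on a dummy batch."""
    import torch  # imported lazily so the shape helper works without PyTorch

    model = torch.jit.load(checkpoint_path, map_location="cpu")
    model.eval()
    # Random tensor standing in for a preprocessed image; real use needs
    # resizing and normalization that match the pre-training pipeline.
    dummy = torch.rand(expected_input_shape(batch_size))
    with torch.no_grad():
        return model(dummy)
```

Calling `extract_features` with the path to the downloaded file returns whatever the exported module outputs; a forward pass at this resolution on a model of this size is memory-intensive, so a GPU is likely preferable to the CPU used here.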