Sapiens Pose 0.3b Torchscript
Sapiens is a vision Transformer pre-trained on 300 million high-resolution human images. This TorchScript release is designed for pose estimation and detects 308 keypoints.
Downloads 55
Release Time: 9/13/2024
Model Overview
This model estimates full-body keypoints (body, face, hands, and feet) from a single image and is designed for 1024x768 input.
Model Features
High-resolution support
Natively supports 1024x768 high-resolution input, suitable for fine-grained pose analysis
Multi-part keypoint detection
Simultaneously detects 308 keypoints across body, face, hands, and feet
Strong generalization capability
Pre-training on 300 million human images helps the model perform well in real-world scenarios
Efficient inference
1.242 trillion FLOPs computational load, balancing accuracy and efficiency
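Because the model works at a fixed 1024x768 input, source images of other sizes have to be fitted to that canvas, and predicted keypoints mapped back afterwards. The following is a minimal sketch of that geometry only. It assumes the input is 1024 pixels tall by 768 pixels wide and that an aspect-preserving, centred fit (letterboxing) is acceptable; neither detail is stated on this page, and the names Letterbox, fit_to_input, and to_source_coords are illustrative, not part of the model.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumption: the model input is 1024 px tall by 768 px wide.
INPUT_H, INPUT_W = 1024, 768


@dataclass(frozen=True)
class Letterbox:
    scale: float  # factor applied to the source image
    pad_x: float  # left padding, in model-input pixels
    pad_y: float  # top padding, in model-input pixels


def fit_to_input(src_w: int, src_h: int) -> Letterbox:
    """Scale an image to fit inside the input canvas, centred, keeping aspect ratio."""
    if src_w <= 0 or src_h <= 0:
        raise ValueError("image dimensions must be positive")
    scale = min(INPUT_W / src_w, INPUT_H / src_h)
    pad_x = (INPUT_W - src_w * scale) / 2
    pad_y = (INPUT_H - src_h * scale) / 2
    return Letterbox(scale, pad_x, pad_y)


def to_source_coords(x: float, y: float, box: Letterbox) -> tuple[float, float]:
    """Map a keypoint from model-input pixels back to source-image pixels."""
    return (x - box.pad_x) / box.scale, (y - box.pad_y) / box.scale
```

For example, a 768x512 image needs no scaling and is padded by 256 pixels above and below, so a keypoint predicted at (100, 266) in input space corresponds to (100, 10) in the source image.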
Model Capabilities
Full-body pose estimation
Multi-part keypoint detection
High-resolution image processing
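Keypoint detectors of this kind commonly emit one heatmap per keypoint, and the position is read off the strongest cell. The sketch below shows that decoding step under the assumption that this model also outputs per-keypoint heatmaps, which this page does not confirm. It uses plain Python lists to stay dependency-free (an array library would be typical in practice), and the function names and default input size are illustrative.

```python
def decode_heatmap(heatmap, input_w=768, input_h=1024):
    """Return (x, y, score) for the strongest cell of one keypoint heatmap.

    `heatmap` is a non-empty list of equal-length rows of floats.
    Coordinates are returned in model-input pixels (assumed 768 wide, 1024 tall).
    """
    rows, cols = len(heatmap), len(heatmap[0])
    # Find the maximum value together with its row and column index.
    score, row, col = max(
        (value, r, c)
        for r, line in enumerate(heatmap)
        for c, value in enumerate(line)
    )
    # Take the cell centre, then rescale from heatmap cells to input pixels.
    x = (col + 0.5) * input_w / cols
    y = (row + 0.5) * input_h / rows
    return x, y, score


def decode_pose(heatmaps, input_w=768, input_h=1024):
    """Decode a sequence of heatmaps (one per keypoint) into (x, y, score) triples."""
    return [decode_heatmap(h, input_w, input_h) for h in heatmaps]
```

With 308 heatmaps as input, decode_pose would return 308 triples, one per keypoint, in the same order as the model's output channels.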
Use Cases
Motion analysis
Sports pose analysis
Used for athlete motion capture and posture correction
Can accurately identify 308 keypoints
Human-computer interaction
Gesture recognition
Recognizes complex hand movements
Includes hand keypoint detection
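For a use case such as gesture recognition, only the hand subset of the 308 keypoints is needed. The sketch below filters a decoded pose down to one body part and drops low-confidence points. The index ranges in PART_SLICES are hypothetical placeholders, since the real keypoint ordering is not given on this page, and the 0.3 score threshold is an arbitrary illustrative default.

```python
NUM_KEYPOINTS = 308

# HYPOTHETICAL index ranges for illustration only; replace them with the
# ordering documented for the actual model before use.
PART_SLICES = {
    "left_hand": slice(100, 121),
    "right_hand": slice(121, 142),
}


def select_part(keypoints, part, min_score=0.3):
    """Return (x, y) pairs for one body part, keeping only confident points.

    `keypoints` is a list of 308 (x, y, score) triples.
    """
    if len(keypoints) != NUM_KEYPOINTS:
        raise ValueError(f"expected {NUM_KEYPOINTS} keypoints, got {len(keypoints)}")
    return [
        (x, y)
        for x, y, score in keypoints[PART_SLICES[part]]
        if score >= min_score
    ]
```

A downstream gesture classifier would then consume the filtered hand points, typically after normalizing them relative to the wrist or the hand's bounding box.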