Sapiens Pose 1b Bfloat16
Sapiens is a family of vision transformer models pre-trained on 300 million human images at 1024x1024 resolution and designed for human-centric vision tasks.
Downloads 31
Release Time: 9/10/2024
Model Overview
This model estimates 308 keypoints (body, face, hands, and feet) from a single image, supports inference at 1K resolution, and generalizes well to real-world data.
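As a rough illustration of how 308 keypoints might be read out of a model like this, the sketch below decodes one (x, y, score) triple per keypoint from a stack of heatmaps. It assumes the network output is an array of shape (308, h, w) with one heatmap per keypoint; that output layout, the heatmap size, and the function name `decode_heatmaps` are assumptions made for this example and are not stated in this listing.

```python
import numpy as np

def decode_heatmaps(heatmaps, image_size):
    """Turn per-keypoint heatmaps into image-space coordinates.

    heatmaps:   array of shape (K, h, w), one heatmap per keypoint (assumed layout).
    image_size: (height, width) of the image the coordinates should refer to.
    Returns an array of shape (K, 3) holding x, y, and the peak score.
    """
    k, h, w = heatmaps.shape
    flat = heatmaps.reshape(k, -1)

    # Location and value of the strongest response in each heatmap.
    peak_index = flat.argmax(axis=1)
    scores = flat[np.arange(k), peak_index]
    ys, xs = np.unravel_index(peak_index, (h, w))

    # Map heatmap cell centres to pixel coordinates in the image.
    img_h, img_w = image_size
    x_img = (xs + 0.5) * img_w / w
    y_img = (ys + 0.5) * img_h / h
    return np.stack([x_img, y_img, scores], axis=1)
```

Taking the plain argmax keeps the example short; sub-pixel refinement around the peak is a common extension but is left out here.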
Model Features
High-resolution support
Runs inference natively at 1K resolution, on 1024x768 input images.
Large-scale pre-training
Pre-trained on 300 million human images, giving it strong feature extraction capabilities.
Multi-keypoint detection
Detects 308 keypoints for the body, face, hands, and feet simultaneously.
Exceptional generalization
Generalizes well to real-world data, even when labeled training data is scarce or entirely synthetic.
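The 1024x768 input size mentioned above means an arbitrary photo has to be fitted to that shape before inference. The following is a minimal sketch of one way to do this, computing an aspect-preserving scale and padding and then mapping a predicted point back to the original image. It assumes 1024 is the height and 768 the width, and that padding (rather than cropping or stretching) is acceptable; neither is confirmed by this listing, and the function names are hypothetical.

```python
def letterbox_params(src_w, src_h, dst_w=768, dst_h=1024):
    """Scale and padding needed to fit a src image inside dst without distortion.

    The default dst size (768 wide, 1024 high) is an assumed reading of "1024x768".
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (dst_w - new_w) // 2  # left padding in pixels
    pad_y = (dst_h - new_h) // 2  # top padding in pixels
    return scale, new_w, new_h, pad_x, pad_y

def to_source_coords(x, y, scale, pad_x, pad_y):
    """Map a point from the padded model input back to the original image."""
    return (x - pad_x) / scale, (y - pad_y) / scale
```

For example, a 1920x1080 frame would be scaled by 0.4 to 768x432 and padded by 296 pixels at the top, so a keypoint predicted at the centre of the model input maps back to the centre of the original frame.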
Model Capabilities
Human pose estimation
Facial keypoint detection
Hand keypoint detection
Foot keypoint detection
Use Cases
Computer vision
Human pose analysis
Estimates human pose for applications such as motion analysis and fitness guidance.
The 308 detected keypoints provide detailed pose information.
Virtual reality
Enables precise human motion capture in VR/AR applications.
High-precision keypoint detection enhances virtual reality experiences.
Healthcare
Rehabilitation training monitoring
Checks whether patients perform their rehabilitation exercises correctly.
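Use cases such as fitness guidance and rehabilitation monitoring typically reduce to comparing joint angles against a target. The sketch below shows one simple way to compute the angle at a joint from three 2D keypoints and check it against a tolerance. The function names, the 10-degree default tolerance, and the choice of which keypoints form a joint are illustrative assumptions; this listing does not describe how keypoints are indexed or how exercises should be scored.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("joint_angle needs three distinct points")
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))

def within_target(angle, target, tolerance=10.0):
    """True if the measured angle is within `tolerance` degrees of the target."""
    return abs(angle - target) <= tolerance
```

A knee angle, for instance, could be measured by passing the hip, knee, and ankle coordinates as `a`, `b`, and `c`, then checking the result with `within_target` against the angle an exercise calls for.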