Sapiens-Seg-1B-bfloat16: Open-Source Vision Model for Human-Centric Vision Tasks

Sapiens Seg 1b Bfloat16

Developed by facebook

Sapiens is a Vision Transformer model pre-trained on 300 million high-resolution human images and specialized for human-centric vision tasks

Tags: Image Segmentation, English, High-Resolution Human Segmentation, 28-Part Recognition, Synthetic Data Generalization

Downloads 42

Release Time: 9/10/2024

Model Overview

This model segments the human body into 28 part classes, supports 1K high-resolution inference, and generalizes well to real-world images

Model Features

High-Resolution Support

Natively supports 1K-resolution input (the segmentation model's listed image size is 1024 x 768, H x W), suited to high-precision segmentation tasks

Large-Scale Pre-training

Pre-trained on 300 million human images, learning rich visual features

Real-World Generalization

Maintains strong performance on real-world data even when labeled data is scarce or the training data is entirely synthetic

Efficient Inference

Uses the bfloat16 format to balance accuracy against memory and compute cost
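
To make the bfloat16 trade-off concrete, the sketch below estimates weight storage from the parameter count listed in the property table (1.169 B) and illustrates the precision lost when a float32 value keeps only its top 16 bits. The helper names are hypothetical, and the conversion uses simple truncation as an assumption; real hardware and libraries typically round to nearest instead.

```python
import struct

# Parameter count taken from the property table ("Num Parameters 1.169 B").
NUM_PARAMS = 1.169e9
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight storage in gigabytes (10^9 bytes), weights only."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def truncate_to_bfloat16(x: float) -> float:
    """Illustrative float32 -> bfloat16 conversion by dropping the low 16 bits.

    bfloat16 keeps float32's sign bit and 8 exponent bits but only 7 mantissa
    bits. Truncation is an assumption made for simplicity here.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFF0000  # zero the 16 least significant mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


fp32_gb = weight_memory_gb(NUM_PARAMS, "float32")
bf16_gb = weight_memory_gb(NUM_PARAMS, "bfloat16")
print(f"float32 weights:  {fp32_gb:.3f} GB")
print(f"bfloat16 weights: {bf16_gb:.3f} GB")
print("3.14159 as bfloat16 (truncated):", truncate_to_bfloat16(3.14159))
```

Under these assumptions the weights need roughly half the storage of a float32 copy, at the cost of about two to three significant decimal digits of precision per value. Activation memory is not included in the estimate.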

Model Capabilities

Human body part segmentation

High-resolution image processing

Multi-class semantic segmentation
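
As an illustration of multi-class semantic segmentation, the following minimal sketch converts per-class score maps into a label map by taking the highest-scoring class at each pixel. It assumes the model produces one score map per class in a `scores[class][y][x]` layout; that layout, the function name, and the class indices used in the toy input are hypothetical, since the page does not list the mapping from indices to body parts.

```python
NUM_CLASSES = 28  # number of body part classes stated in the model overview


def scores_to_label_map(scores):
    """Turn scores[c][y][x] into labels[y][x] via a per-pixel argmax."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            # Index of the class with the highest score at pixel (y, x);
            # ties resolve to the lowest class index.
            best = max(range(num_classes), key=lambda c: scores[c][y][x])
            row.append(best)
        labels.append(row)
    return labels


# Toy 2 x 3 "image": all scores zero except two boosted pixels.
H, W = 2, 3
scores = [[[0.0] * W for _ in range(H)] for _ in range(NUM_CLASSES)]
scores[5][0][0] = 2.0    # hypothetical class 5 wins at pixel (0, 0)
scores[27][1][2] = 1.5   # hypothetical class 27 wins at pixel (1, 2)

labels = scores_to_label_map(scores)
print(labels)  # [[5, 0, 0], [0, 0, 27]]
```

A real pipeline would run the same reduction on a tensor (for example with an array library's argmax along the class axis) rather than nested lists; the pure-Python version only shows the logic.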

Use Cases

Medical Imaging

Surgical Planning Assistance

Used for precise segmentation of human body parts ahead of surgery

Provides accurate segmentation results for 28 body parts

Virtual Reality

Virtual Avatar Creation

Used to generate high-fidelity body part segmentation for virtual characters

Supports realistic body part recognition for virtual avatars

Property	Details
Image Size	1024 x 768 (H x W)
Num Parameters	1.169 B
FLOPs	4.647 TFLOPs
Patch Size	16 x 16
Embedding Dimensions	1536
Num Layers	40
Num Heads	24
Feedforward Channels	6144
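
The figures in the table can be cross-checked with a little arithmetic. The sketch below derives the token count, per-head dimension, and feedforward expansion ratio, assuming non-overlapping patches and no extra class token. It also applies the common rule-of-thumb estimate of about 12·d²·L parameters for a Transformer encoder whose feedforward width is 4·d; that estimate is an approximation introduced here, not a figure from the model card, and it ignores embeddings, biases, normalization layers, and the segmentation head.

```python
# Values copied from the property table.
IMAGE_H, IMAGE_W = 1024, 768
PATCH = 16
EMBED_DIM = 1536
NUM_LAYERS = 40
NUM_HEADS = 24
FFN_CHANNELS = 6144

# Patch grid and sequence length (assumes non-overlapping 16 x 16 patches).
patches_h = IMAGE_H // PATCH          # 64
patches_w = IMAGE_W // PATCH          # 48
num_tokens = patches_h * patches_w    # 3072

# Attention head width and feedforward expansion ratio.
head_dim = EMBED_DIM // NUM_HEADS         # 64
ffn_ratio = FFN_CHANNELS // EMBED_DIM     # 4

# Rule-of-thumb encoder size: 4*d^2 for attention plus 8*d^2 for the
# feedforward block per layer (valid when the feedforward width is 4*d).
approx_params = 12 * EMBED_DIM**2 * NUM_LAYERS

print(f"patch grid: {patches_h} x {patches_w} = {num_tokens} tokens")
print(f"head dim: {head_dim}, feedforward ratio: {ffn_ratio}")
print(f"approx encoder params: {approx_params / 1e9:.3f} B")
```

The estimate comes out near 1.13 B, which is in line with the listed 1.169 B once the components omitted from the approximation are accounted for.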
