Mask2Former Swin Base IN21k COCO Instance
Mask2Former is a Transformer-based universal image segmentation model. This checkpoint uses a Swin Base backbone pre-trained on ImageNet-21k and is fine-tuned on the COCO dataset for instance segmentation.
Downloads 26
Release Time: 1/16/2023
Model Overview
Adopts a unified architecture for instance, semantic, and panoptic segmentation. All three tasks are treated the same way: the model predicts a set of masks together with a class label for each mask.
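To make the "set of masks plus labels" idea concrete, here is a minimal pure-Python sketch of how per-query outputs could be turned into instances. The function name `queries_to_instances`, the `min_score` threshold, and the scoring rule (class probability times mean foreground mask probability) are illustrative assumptions, not the model's actual post-processing code.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def queries_to_instances(class_logits, mask_logits, min_score=0.5):
    """Combine per-query class logits and mask logits into instances.

    class_logits: one list per query with C + 1 logits (last = "no object").
    mask_logits:  one 2-D list per query with a logit for every pixel.
    """
    instances = []
    for cls, mask in zip(class_logits, mask_logits):
        probs = softmax(cls)
        # Best real class, ignoring the trailing "no object" slot.
        label = max(range(len(probs) - 1), key=probs.__getitem__)
        mask_probs = [[sigmoid(v) for v in row] for row in mask]
        binary = [[p > 0.5 for p in row] for row in mask_probs]
        foreground = [p for row in mask_probs for p in row if p > 0.5]
        if not foreground:
            continue  # the query predicted an empty mask
        # Assumed scoring rule: class confidence x mean foreground confidence.
        score = probs[label] * sum(foreground) / len(foreground)
        if score >= min_score:
            instances.append({"label": label, "score": score, "mask": binary})
    return instances


# Two queries, two real classes plus "no object", 2x2 masks.
example = queries_to_instances(
    class_logits=[[4.0, 0.0, 0.0], [0.0, 0.0, 4.0]],
    mask_logits=[[[5.0, 5.0], [-5.0, -5.0]], [[-5.0, 5.0], [-5.0, -5.0]]],
)
```

In this toy input the second query is dominated by the "no object" class, so only the first query survives as an instance.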
Model Features
Unified Segmentation Architecture
Uses the same model architecture to handle three segmentation tasks: instance, semantic, and panoptic
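Masked Attention Mechanism
The Transformer decoder uses masked attention, which restricts each query's cross-attention to the region of its previously predicted mask. This improves performance without increasing computational cost.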
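The following is a minimal sketch of masked attention in plain Python, assuming a single head and no learned projections. Each query only attends to pixels that were foreground in the mask predicted for it by the previous decoder layer, and all other pixels receive a score of negative infinity before the softmax. The function name and the list-based data layout are illustrative assumptions.

```python
import math


def masked_attention(queries, keys, values, prev_masks):
    """Cross-attention restricted to each query's previously predicted mask.

    queries:    N x d  (one vector per object query)
    keys:       P x d  (one vector per pixel / image feature)
    values:     P x d
    prev_masks: N x P booleans, True where the previous layer's mask
                prediction for that query is foreground.
    """
    outputs = []
    for q, allowed in zip(queries, prev_masks):
        # Guard used in this sketch: an empty mask would mask out every
        # pixel, so fall back to ordinary attention over all pixels.
        if not any(allowed):
            allowed = [True] * len(keys)
        scores = [
            sum(a * b for a, b in zip(q, k)) if ok else -math.inf
            for k, ok in zip(keys, allowed)
        ]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # exp(-inf) == 0.0
        total = sum(weights)
        dim = len(values[0])
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) / total for j in range(dim)]
        )
    return outputs


# One query; the third pixel lies outside its previous mask and is ignored.
attended = masked_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    values=[[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]],
    prev_masks=[[True, True, False]],
)
```

The masking only changes which scores enter the softmax, so it adds no parameters to the attention layer.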
Efficient Training Strategy
Improves training efficiency by computing the mask loss on a set of sampled points rather than on entire masks
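A rough sketch of the point-sampling idea, using only the standard library: the binary cross-entropy is averaged over a random subset of pixel locations instead of over every pixel. As a simplification, this sketch samples integer pixel positions uniformly. The function names and the sample count are illustrative assumptions, and the real training code may sample differently.

```python
import math
import random


def bce_with_logits(logit, target):
    # Numerically stable binary cross-entropy for a single logit.
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def full_mask_loss(pred_logits, target_mask):
    """Mean loss over every pixel of the mask."""
    losses = [
        bce_with_logits(p, t)
        for pred_row, target_row in zip(pred_logits, target_mask)
        for p, t in zip(pred_row, target_row)
    ]
    return sum(losses) / len(losses)


def sampled_mask_loss(pred_logits, target_mask, num_points, rng):
    """Mean loss over num_points uniformly sampled pixel locations."""
    height, width = len(pred_logits), len(pred_logits[0])
    points = [(rng.randrange(height), rng.randrange(width)) for _ in range(num_points)]
    total = sum(bce_with_logits(pred_logits[y][x], target_mask[y][x]) for y, x in points)
    return total / num_points


# 64x64 toy mask: the object is the top-left 32x32 square, but the
# prediction overshoots it by 16 columns.
size = 64
target = [[1.0 if y < 32 and x < 32 else 0.0 for x in range(size)] for y in range(size)]
pred = [[4.0 if y < 32 and x < 48 else -4.0 for x in range(size)] for y in range(size)]

exact = full_mask_loss(pred, target)                               # visits 4096 pixels
approx = sampled_mask_loss(pred, target, 512, random.Random(0))    # visits 512 points
```

The sampled estimate touches one eighth of the pixels here while staying close to the full-mask value, which is where the training-time saving comes from.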
Model Capabilities
Image Instance Segmentation
Multi-object Recognition and Segmentation
Complex Scene Parsing
Use Cases
Computer Vision
Object Instance Segmentation
Accurately segments each object instance in an image
The Mask2Former paper reported state-of-the-art instance segmentation results on the COCO dataset at the time of publication
Scene Understanding
Parses objects and their spatial relationships in complex scenes
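For the instance segmentation use case, a minimal inference sketch with the Hugging Face `transformers` library might look like the following. It assumes the checkpoint is available on the Hugging Face Hub under the ID `facebook/mask2former-swin-base-IN21k-coco-instance` and that `torch`, `transformers`, and `Pillow` are installed. This page does not state any of that, and the helper names are illustrative.

```python
def describe_segments(segments_info, id2label, min_score=0.5):
    """Turn post-processed segment records into (instance id, label, score) rows."""
    return [
        (seg["id"], id2label[seg["label_id"]], round(seg["score"], 3))
        for seg in segments_info
        if seg["score"] >= min_score
    ]


def segment_instances(
    image_path,
    checkpoint="facebook/mask2former-swin-base-IN21k-coco-instance",  # assumed Hub ID
):
    # Heavy dependencies are imported here so that describe_segments
    # can be used without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # target_sizes expects (height, width); PIL reports (width, height).
    result = processor.post_process_instance_segmentation(
        outputs, target_sizes=[image.size[::-1]]
    )[0]
    rows = describe_segments(result["segments_info"], model.config.id2label)
    return result["segmentation"], rows
```

Calling `segment_instances("photo.jpg")` (a hypothetical file path) would return a per-pixel instance ID map and a list of the detected instances with their labels and scores.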