Mask2Former Swin-Base IN21k Cityscapes Panoptic
Mask2Former is a general-purpose image segmentation model built on the Transformer architecture, capable of handling instance segmentation, semantic segmentation, and panoptic segmentation tasks.
Downloads: 140
Release Date: 1/5/2023
Model Overview
This model uses Swin Transformer as the backbone network and is fine-tuned for panoptic segmentation on the Cityscapes dataset. It achieves segmentation tasks by predicting a set of masks and their corresponding labels.
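As a minimal illustration of this mask-and-label prediction workflow, the sketch below runs panoptic inference with the Hugging Face transformers API. The checkpoint name and image URL are assumptions made for the example; substitute the actual checkpoint path this card ships with.

```python
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

# Checkpoint name is an assumption; replace with the path provided by this card.
checkpoint = "facebook/mask2former-swin-base-IN21k-cityscapes-panoptic"
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint)

# Any street-scene image works; this URL is only a placeholder.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)  # per-query class logits and mask logits

# Merge the predicted masks and labels into a single panoptic map.
result = processor.post_process_panoptic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
panoptic_map = result["segmentation"]    # (H, W) tensor of segment ids
segments_info = result["segments_info"]  # per-segment label_id, score, id
```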
Model Features
Unified Segmentation Paradigm
Unifies instance segmentation, semantic segmentation, and panoptic segmentation under a single mask classification formulation: every task is solved by predicting a set of masks and their class labels.
Efficient Attention Mechanism
Improves performance with a multi-scale deformable attention Transformer and a masked attention mechanism in the Transformer decoder.
Training Optimization
Significantly improves training efficiency by computing loss on subsampled points rather than entire masks.
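As a rough illustration of that idea (not the model's exact loss, which uses importance sampling of uncertain points rather than uniform sampling), the sketch below evaluates binary cross-entropy on a few thousand randomly sampled points per mask instead of every pixel. The function name and sampling scheme are assumptions for brevity.

```python
import torch
import torch.nn.functional as F

def point_sampled_mask_loss(pred_mask_logits, gt_masks, num_points=12544):
    """Illustrative sketch: BCE on randomly sampled points instead of full masks.

    pred_mask_logits, gt_masks: (N, H, W) tensors for N predicted/ground-truth masks.
    """
    n = pred_mask_logits.shape[0]
    # Random normalized (x, y) coordinates in [-1, 1], one set per mask.
    coords = torch.rand(n, num_points, 1, 2, device=pred_mask_logits.device) * 2 - 1
    # grid_sample expects (N, C, H, W) inputs; sample both prediction and target
    # at the same point locations.
    pred_pts = F.grid_sample(
        pred_mask_logits.unsqueeze(1), coords, align_corners=False
    ).squeeze(1).squeeze(-1)
    gt_pts = F.grid_sample(
        gt_masks.unsqueeze(1).float(), coords, align_corners=False
    ).squeeze(1).squeeze(-1)
    # Loss is computed only on the sampled points, not the entire masks.
    return F.binary_cross_entropy_with_logits(pred_pts, gt_pts)
```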
Model Capabilities
Image Segmentation
Panoptic Segmentation
Instance Segmentation
Semantic Segmentation
Use Cases
Autonomous Driving
Street Scene Understanding
Used for comprehensive understanding of urban street scenes in autonomous driving systems.
Accurately identifies elements such as roads, vehicles, and pedestrians, along with their spatial relationships.
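Continuing the assumed inference sketch above (the names model, panoptic_map, and segments_info carry over from that example), one way to inspect which street elements were detected is to map each segment's label id to its Cityscapes class name:

```python
# Map each predicted segment to its Cityscapes class name and pixel area.
id2label = model.config.id2label  # e.g. 0 -> "road", 11 -> "person", 13 -> "car"

for segment in segments_info:
    class_name = id2label[segment["label_id"]]
    area_px = int((panoptic_map == segment["id"]).sum())
    print(f"segment {segment['id']:>2}: {class_name:<12} "
          f"score={segment['score']:.2f} area={area_px}px")
```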
Urban Mapping
Urban Element Segmentation
Used for urban map creation and updates.
Automatically identifies urban elements such as buildings, roads, and green spaces.