Mask2Former Swin Large Cityscapes Panoptic
A Mask2Former model with a Swin Large backbone, trained for panoptic segmentation on the Cityscapes dataset.
Downloads 772
Release Time: 1/3/2023
Model Overview
Mask2Former is a universal image segmentation model that handles instance, semantic, and panoptic segmentation within a single framework. It predicts a set of masks together with a class label for each mask, so all three tasks are treated as if they were instance segmentation.
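To make the "set of masks with labels" idea concrete, here is a minimal pure-Python sketch of how a handful of predicted (label, score, mask) triples could be merged into one panoptic map. The function name `merge_masks`, both thresholds, and the toy inputs are illustrative assumptions; this is a simplified picture, not the model's actual post-processing code.

```python
def merge_masks(predictions, height, width, score_threshold=0.5, mask_threshold=0.5):
    """Merge (label, score, mask) predictions into a single segment map.

    Each mask is a height x width grid of probabilities in [0, 1].
    Returns (segment_map, segments): segment_map stores a segment id per
    pixel (0 = unassigned) and segments maps segment id -> label.
    """
    # Drop low-confidence predictions before merging.
    kept = [p for p in predictions if p[1] >= score_threshold]
    segment_map = [[0] * width for _ in range(height)]
    segments = {}
    for y in range(height):
        for x in range(width):
            best_id, best_value = 0, -1.0
            for seg_id, (label, score, mask) in enumerate(kept, start=1):
                value = score * mask[y][x]  # class confidence times mask probability
                if value > best_value:
                    best_id, best_value = seg_id, value
            # Assign the pixel only if the winning mask is foreground there.
            if best_id and kept[best_id - 1][2][y][x] >= mask_threshold:
                segment_map[y][x] = best_id
                segments[best_id] = kept[best_id - 1][0]
    return segment_map, segments


# Toy 2 x 3 image with three hypothetical predictions.
predictions = [
    ("road", 0.9, [[0.1, 0.2, 0.1], [0.9, 0.9, 0.8]]),
    ("car", 0.8, [[0.9, 0.1, 0.2], [0.2, 0.1, 0.95]]),
    ("person", 0.3, [[0.9, 0.9, 0.9], [0.9, 0.9, 0.9]]),  # dropped: score too low
]
segment_map, segments = merge_masks(predictions, height=2, width=3)
```

Because every pixel ends up with at most one segment id, the same routine covers "stuff" classes such as road and "thing" classes such as car, which is the sense in which the three tasks share one formulation.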
Model Features
Unified Segmentation Framework
Handles instance, semantic, and panoptic segmentation with one architecture by treating all three as instance segmentation
Efficient Attention Mechanism
Replaces the pixel decoder with a multi-scale deformable attention Transformer, improving computational efficiency
Masked Attention Decoder
Adds a Transformer decoder with masked attention, which improves performance without adding computation
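As a rough illustration of masked attention, the sketch below lets a single query attend only to image locations that an earlier mask prediction marked as foreground. It is pure Python for readability; the function name `masked_attention`, the toy vectors, and the fallback to full attention when the mask is empty are assumptions of this sketch rather than a copy of the model's implementation.

```python
import math


def masked_attention(query, keys, values, foreground):
    """Scaled dot-product attention for one query, restricted by a mask.

    query: list of d floats.
    keys, values: one list of floats per image location.
    foreground: list of bools, True where the previous mask prediction
    says the location belongs to this query's segment.
    Returns (weights, output).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    if any(foreground):
        # Background locations get -inf, so softmax gives them zero weight.
        scores = [s if keep else -math.inf for s, keep in zip(scores, foreground)]
    # If the mask is empty, the sketch falls back to ordinary full attention.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Three image locations; the middle one is background for this query.
weights, output = masked_attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    values=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    foreground=[True, False, True],
)
```

The mask only changes which scores enter the softmax, so the arithmetic is the same as in ordinary cross-attention.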
Efficient Training Strategy
Computes the loss on subsampled points rather than on whole masks, which significantly improves training efficiency
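The point-subsampling idea can be sketched as follows: instead of comparing every pixel of a predicted mask with the ground truth, the loss is averaged over a small random set of locations. This is a simplified, hypothetical version that samples integer pixel positions uniformly; the function name `point_sampled_bce` and the toy masks are assumptions, and the sampling scheme used in actual training may differ.

```python
import math
import random


def point_sampled_bce(pred_logits, target, num_points, rng):
    """Binary cross-entropy averaged over randomly sampled pixels.

    pred_logits: grid of raw mask logits.
    target: grid of 0/1 ground-truth labels with the same shape.
    num_points: number of pixels to sample (with replacement).
    rng: a random.Random instance, passed in for reproducibility.
    """
    height, width = len(target), len(target[0])
    total = 0.0
    for _ in range(num_points):
        y, x = rng.randrange(height), rng.randrange(width)
        logit, label = pred_logits[y][x], target[y][x]
        # Numerically stable BCE with logits:
        # max(z, 0) - z * t + log(1 + exp(-|z|))
        total += max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))
    return total / num_points


rng = random.Random(0)
target = [[1, 0], [0, 1]]
good_logits = [[10.0, -10.0], [-10.0, 10.0]]   # agrees with the target everywhere
bad_logits = [[-10.0, 10.0], [10.0, -10.0]]    # disagrees with the target everywhere
good_loss = point_sampled_bce(good_logits, target, num_points=3, rng=rng)
bad_loss = point_sampled_bce(bad_logits, target, num_points=3, rng=rng)
```

The cost of the loss then scales with `num_points` rather than with the mask resolution, which is where the training-time saving comes from.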
Model Capabilities
Image Segmentation
Panoptic Segmentation
Instance Segmentation
Semantic Segmentation
Use Cases
Autonomous Driving
Street Scene Understanding
Identifies and segments objects and regions in urban street scenes
Can serve as part of the environmental perception module of an autonomous driving system
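For street-scene use, a sketch along the following lines shows how the model might be run through the Hugging Face `transformers` library. It assumes the checkpoint is published on the Hugging Face Hub as `facebook/mask2former-swin-large-cityscapes-panoptic` and that `torch`, `transformers`, and `Pillow` are installed; the function names `segment_street_scene` and `count_by_class` are hypothetical helpers introduced here for illustration.

```python
from collections import Counter

# Assumed Hugging Face Hub identifier for this checkpoint.
CHECKPOINT = "facebook/mask2former-swin-large-cityscapes-panoptic"


def segment_street_scene(image_path, checkpoint=CHECKPOINT):
    """Run panoptic segmentation on one image.

    Returns (segmentation, labels): a per-pixel segment-id tensor and the
    class name of each detected segment.
    """
    # Imported lazily so count_by_class works without the heavy dependencies.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # target_sizes expects (height, width); PIL's image.size is (width, height).
    result = processor.post_process_panoptic_segmentation(
        outputs, target_sizes=[image.size[::-1]]
    )[0]
    labels = [model.config.id2label[s["label_id"]] for s in result["segments_info"]]
    return result["segmentation"], labels


def count_by_class(labels):
    """Count how many segments were found per class name."""
    return dict(Counter(labels))
```

A call such as `segment_street_scene("street.jpg")` (the file name is a placeholder) would download the weights on first use, and `count_by_class(labels)` then gives a quick summary of what was found in the scene.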
Intelligent Surveillance
Scene Analysis
Accurately segments and identifies objects in surveillance video
Enhances the analysis capabilities of surveillance systems