Mask2Former Swin Tiny Cityscapes Instance
Mask2Former is a general-purpose image segmentation model built on a Transformer architecture; this version is fine-tuned for instance segmentation on the Cityscapes dataset.
Downloads: 67
Release Time: 1/5/2023
Model Overview
This model adopts a unified paradigm for image segmentation, performing instance segmentation by predicting a set of masks and their corresponding class labels, with improvements in both performance and efficiency over previous models.
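As a usage illustration, the following is a minimal inference sketch using the Hugging Face transformers library. The checkpoint id and the input file name are assumptions for the example, not values taken from this page.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

# assumed checkpoint id for this Swin-Tiny Cityscapes instance-segmentation variant
checkpoint = "facebook/mask2former-swin-tiny-cityscapes-instance"
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint)

image = Image.open("street_scene.png")  # hypothetical Cityscapes-like street image
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# the model predicts a fixed set of (mask, class) pairs; the processor assembles
# them into an instance segmentation map plus per-instance metadata
result = processor.post_process_instance_segmentation(
    outputs, target_sizes=[image.size[::-1]]  # PIL size is (W, H); target is (H, W)
)[0]
segmentation = result["segmentation"]  # (H, W) tensor of instance ids
for info in result["segments_info"]:
    print(info["id"], model.config.id2label[info["label_id"]], info["score"])
```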
Model Features
Unified Segmentation Architecture
Adopts a unified paradigm for instance segmentation, semantic segmentation, and panoptic segmentation, treating all three as mask classification (predicting a set of masks with class labels)
Efficient Attention Mechanism
Uses a multi-scale deformable attention Transformer as the pixel decoder, improving computational efficiency over earlier pixel-decoder designs
Masked Attention Decoder
Employs a Transformer decoder with masked attention, which restricts each query's cross-attention to its predicted foreground region, improving performance without increasing computational load (a conceptual sketch follows this feature list)
Efficient Training Strategy
Significantly improves training efficiency by computing losses on sampled points rather than entire masks
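As referenced above, here is a conceptual, self-contained sketch of masked cross-attention, not the model's actual implementation: attention logits are computed as usual and then restricted to the foreground region of each query's mask prediction from the previous decoder layer.

```python
import torch
import torch.nn.functional as F

def masked_cross_attention(queries, keys, values, mask_logits, threshold=0.5):
    """Restrict each query's attention to its predicted foreground region.

    queries: (Q, d) object queries; keys/values: (N, d) flattened image features;
    mask_logits: (Q, N) mask prediction from the previous decoder layer.
    """
    scale = queries.shape[-1] ** 0.5
    attn = queries @ keys.T / scale                    # (Q, N) attention logits
    foreground = mask_logits.sigmoid() > threshold     # (Q, N) per-query mask region
    # queries whose predicted mask is empty fall back to full attention,
    # otherwise the softmax over an all -inf row would produce NaNs
    empty = ~foreground.any(dim=-1, keepdim=True)
    masked = attn.masked_fill(~foreground & ~empty, float("-inf"))
    return F.softmax(masked, dim=-1) @ values          # (Q, d) updated queries

# toy example: 4 queries attending over 16 flattened feature locations
q, k, v = torch.randn(4, 32), torch.randn(16, 32), torch.randn(16, 32)
prev_mask_logits = torch.randn(4, 16)
print(masked_cross_attention(q, k, v, prev_mask_logits).shape)  # torch.Size([4, 32])
```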
Model Capabilities
Image Instance Segmentation
Multi-object Detection and Segmentation
Scene Understanding
Use Cases
Autonomous Driving
Road Scene Analysis
Identify and segment elements such as vehicles, pedestrians, and traffic signs on the road
Can be used to build high-precision environmental perception systems (see the instance-filtering sketch after the use cases below)
Urban Management
Urban Infrastructure Monitoring
Automatically identify and segment urban elements such as buildings, roads, and green belts
Assists in urban planning and management decisions
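Building on the inference sketch above (reusing its result and segmentation variables), the snippet below shows one hypothetical way to keep only road-user instances for a road-scene analysis pipeline; the class names assume the Cityscapes instance label set exposed through model.config.id2label.

```python
# hypothetical post-filtering step for road scene analysis
ROAD_USERS = {"person", "rider", "car", "truck", "bus", "train", "motorcycle", "bicycle"}

road_users = [
    info for info in result["segments_info"]
    if model.config.id2label[info["label_id"]] in ROAD_USERS and info["score"] > 0.8
]
for info in road_users:
    instance_mask = segmentation == info["id"]  # boolean (H, W) mask for this instance
    label = model.config.id2label[info["label_id"]]
    print(label, f"pixels={int(instance_mask.sum())}")
```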