Mask2Former Swin Base ADE Semantic
A general-purpose image segmentation model trained on the ADE20k dataset, using a unified framework to handle instance, semantic, and panoptic segmentation tasks.
Downloads: 2,811
Release Time: 1/5/2023
Model Overview
Mask2Former is a Transformer-based general-purpose image segmentation model that unifies instance segmentation, semantic segmentation, and panoptic segmentation by predicting a set of masks and their corresponding labels. Compared to its predecessor MaskFormer, it shows significant improvements in both performance and efficiency.
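For reference, below is a minimal inference sketch using the Hugging Face transformers API. It assumes the checkpoint is published as facebook/mask2former-swin-base-ade-semantic and that PyTorch, transformers, and Pillow are installed; the image path is a placeholder.

```python
# Minimal semantic-segmentation sketch (assumes the facebook/mask2former-swin-base-ade-semantic
# checkpoint and the transformers + PyTorch + Pillow stack).
import torch
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

checkpoint = "facebook/mask2former-swin-base-ade-semantic"
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint)

image = Image.open("scene.jpg")                 # placeholder path: any RGB scene image
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)                   # per-query class logits and mask logits

# Combine the predicted masks and labels into an (H, W) map of ADE20k class ids.
semantic_map = processor.post_process_semantic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
print(semantic_map.shape)
```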
Model Features
Unified Segmentation Framework
Treats instance segmentation, semantic segmentation, and panoptic segmentation under a single mask-classification paradigm, handling all three as if they were instance segmentation
Efficient Attention Mechanism
Replaces the traditional pixel decoder with a multi-scale deformable attention Transformer
Masked Attention Decoder
Introduces a Transformer decoder with masked attention that improves performance without extra computational cost (see the sketch after this list)
Efficient Training Strategy
Significantly improves training efficiency by computing loss on sampled points rather than entire masks
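To illustrate the masked attention idea referenced above, the sketch below restricts each query's cross-attention to the foreground region predicted by the previous decoder layer. It is a simplified, hypothetical rendering (single head, no positional embeddings, made-up tensor shapes), not the reference implementation.

```python
import torch
import torch.nn.functional as F

def masked_cross_attention(queries, keys, values, mask_logits):
    """Single-head sketch: queries (Q, d), keys/values (P, d),
    mask_logits (Q, P) = mask predictions from the previous decoder layer."""
    scores = queries @ keys.T / keys.shape[-1] ** 0.5   # (Q, P) attention logits
    attn_mask = mask_logits.sigmoid() < 0.5             # True = outside the predicted mask
    # If a query's predicted mask is empty, fall back to full attention
    # so its softmax row is not entirely -inf.
    empty = attn_mask.all(dim=-1, keepdim=True)
    attn_mask = attn_mask & ~empty
    scores = scores.masked_fill(attn_mask, float("-inf"))
    weights = F.softmax(scores, dim=-1)                  # attend only inside the mask
    return weights @ values                              # (Q, d) updated query features

# Toy usage: 4 queries attending over 16 pixel features of dimension 32.
q, k = torch.randn(4, 32), torch.randn(16, 32)
out = masked_cross_attention(q, k, k, mask_logits=torch.randn(4, 16))
print(out.shape)  # torch.Size([4, 32])
```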
Model Capabilities
Instance Segmentation
Semantic Segmentation
Panoptic Segmentation
Multi-scale Image Analysis
Use Cases
Computer Vision
Scene Understanding
Accurate segmentation and classification of objects in complex scenes
Can recognize the 150 semantic categories of the ADE20k dataset (see the label-inspection sketch after these use cases)
Autonomous Driving
Real-time semantic segmentation of road scenes
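Each pixel of the predicted map is one of the 150 ADE20k class ids. A quick way to inspect the label names is to read them off the checkpoint's configuration, as in the sketch below; it assumes the same facebook/mask2former-swin-base-ade-semantic checkpoint name used earlier.

```python
# Sketch: list the ADE20k class names carried in the checkpoint's config
# (assumes the facebook/mask2former-swin-base-ade-semantic checkpoint, as above).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("facebook/mask2former-swin-base-ade-semantic")
print(len(config.id2label))               # expected: 150 semantic categories
print(list(config.id2label.items())[:5])  # first few (id, label) pairs
```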