Mask2Former Swin Tiny Cityscapes Semantic

Developed by facebook
Mask2Former is a unified image segmentation framework capable of handling instance segmentation, semantic segmentation, and panoptic segmentation tasks. This model is based on the Swin-Tiny backbone network and has been fine-tuned for semantic segmentation on the Cityscapes dataset.
Downloads 55.98k
Release Time: 1/5/2023

Model Overview

Mask2Former unifies instance segmentation, semantic segmentation, and panoptic segmentation by predicting a set of masks and their corresponding labels, treating all three tasks as instance segmentation problems. Compared to its predecessor MaskFormer, Mask2Former shows significant improvements in both performance and efficiency.
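To make the "set of masks and their corresponding labels" idea concrete, here is a minimal pure-Python sketch of how a semantic map can be derived from per-query predictions. It follows the MaskFormer-style rule of weighting each query's mask by its class probabilities and taking the best class per pixel. The function names and the nested-list layout are illustrative assumptions, not part of any library API.

```python
import math


def softmax(values):
    """Softmax over a flat list of logits."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def semantic_map(class_logits, mask_logits):
    """Combine per-query predictions into an H x W map of class ids.

    class_logits: Q x (C + 1) nested list; the last column is assumed to be
                  the "no object" class and is dropped.
    mask_logits:  Q x H x W nested list of per-pixel mask logits.
    """
    num_classes = len(class_logits[0]) - 1
    class_probs = [softmax(row)[:num_classes] for row in class_logits]
    height, width = len(mask_logits[0]), len(mask_logits[0][0])

    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # score[c] = sum over queries of P(class c | query) * P(pixel in mask | query)
            scores = [0.0] * num_classes
            for q, probs in enumerate(class_probs):
                mask_prob = sigmoid(mask_logits[q][y][x])
                for c in range(num_classes):
                    scores[c] += probs[c] * mask_prob
            row.append(max(range(num_classes), key=scores.__getitem__))
        result.append(row)
    return result
```

With two queries that each claim one pixel of a 1 x 2 image, the left pixel takes the first query's class and the right pixel takes the second's.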

Model Features

Unified Segmentation Framework
Unifies instance segmentation, semantic segmentation, and panoptic segmentation into a single framework
Efficient Attention Mechanism
Replaces the earlier pixel decoder with a more advanced multi-scale deformable attention Transformer
Masked Attention Mechanism
Introduces Transformer decoder with masked attention mechanism, improving performance without increasing computational cost
Efficient Training Strategy
Significantly improves training efficiency by computing loss on sampled points rather than entire masks
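Two of the features above, masked attention and the sampled-point loss, can be sketched in a few lines of pure Python. This is a simplified illustration under stated assumptions, not the reference implementation: the 0.5 threshold, the fallback to full attention when a mask is empty, and uniform sampling of integer pixel locations are all assumptions made here, and every function name is hypothetical.

```python
import math
import random


def masked_attention_weights(scores, prev_mask_probs, threshold=0.5):
    """Softmax over one query's attention scores, restricted to the
    locations that the previous layer's mask marked as foreground."""
    keep = [p >= threshold for p in prev_mask_probs]
    if not any(keep):
        # Assumed fallback: an empty mask would leave nothing to attend to.
        keep = [True] * len(scores)
    masked = [s if k else -math.inf for s, k in zip(scores, keep)]
    peak = max(masked)
    exps = [math.exp(s - peak) for s in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy for a single logit."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def point_sampled_mask_loss(pred_logits, target_mask, num_points, seed=0):
    """Mean BCE over num_points sampled pixels instead of the whole mask."""
    height, width = len(pred_logits), len(pred_logits[0])
    coords = [(y, x) for y in range(height) for x in range(width)]
    rng = random.Random(seed)
    sampled = rng.sample(coords, min(num_points, len(coords)))
    losses = [bce_with_logits(pred_logits[y][x], target_mask[y][x]) for y, x in sampled]
    return sum(losses) / len(losses)
```

The mask only changes which scores survive the softmax, so the attention computation keeps the same shape. In the loss, the cost scales with `num_points` rather than with the mask resolution, which is where the training savings come from.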

Model Capabilities

Image Segmentation
Semantic Segmentation
Instance Segmentation
Panoptic Segmentation

Use Cases

Autonomous Driving
Street Scene Semantic Segmentation
Performs semantic segmentation on urban street scenes to identify elements such as roads, buildings, and pedestrians
Fine-tuned on the Cityscapes dataset, so predictions use the Cityscapes semantic classes
Medical Imaging
Medical Image Analysis
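The architecture can be fine-tuned for organ or lesion segmentation in medical images; this Cityscapes checkpoint does not predict medical classes without such fine-tuning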
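For the street-scene use case above, the following is a minimal sketch of running the checkpoint with the Hugging Face `transformers` library and summarizing which classes cover the image. The model ID is inferred from the page title and developer name and should be treated as an assumption, and both function names are hypothetical. The heavy imports sit inside the function so the summary helper can be used without them.

```python
from collections import Counter

# Assumed Hub identifier, inferred from the model name and developer above.
MODEL_ID = "facebook/mask2former-swin-tiny-cityscapes-semantic"


def segment_image(image_path):
    """Return (H x W nested list of class ids, id-to-label dict) for one image."""
    # Imported lazily: these require torch, Pillow, and transformers to be installed.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

    processor = AutoImageProcessor.from_pretrained(MODEL_ID)
    model = Mask2FormerForUniversalSegmentation.from_pretrained(MODEL_ID)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # PIL reports (width, height); the post-processor expects (height, width).
    label_map = processor.post_process_semantic_segmentation(
        outputs, target_sizes=[image.size[::-1]]
    )[0]
    return label_map.tolist(), model.config.id2label


def class_pixel_fractions(label_map, id2label):
    """Fraction of pixels assigned to each class name in a label map."""
    counts = Counter(class_id for row in label_map for class_id in row)
    total = sum(counts.values())
    return {
        id2label.get(class_id, f"class_{class_id}"): count / total
        for class_id, count in counts.items()
    }
```

A typical call would be `labels, id2label = segment_image("street.png")` followed by `class_pixel_fractions(labels, id2label)`, where `street.png` is a placeholder path. The class names come from the checkpoint's own `id2label` mapping rather than from a hard-coded list.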