Mask2Former Swin Base IN21k Cityscapes Panoptic

Developed by facebook
Mask2Former is a general-purpose image segmentation model based on the Transformer architecture. It handles instance, semantic, and panoptic segmentation.
Downloads 140
Release Time: 1/5/2023

Model Overview

This model uses a Swin Transformer backbone and is fine-tuned for panoptic segmentation on the Cityscapes dataset. It performs segmentation by predicting a set of masks and their corresponding class labels.
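To make the "set of masks and labels" idea concrete, here is a minimal pure-Python sketch of how such predictions could be merged into a single panoptic map. It is an illustration under stated assumptions, not the model's actual post-processing: the function name `merge_into_panoptic`, the 0.5 threshold, and the toy class ids are all hypothetical.

```python
def merge_into_panoptic(mask_probs, class_probs, class_ids, threshold=0.5):
    """Combine per-query masks and labels into one panoptic map.

    mask_probs:  Q masks, each an H x W grid of probabilities in [0, 1]
    class_probs: Q confidence scores, one per query's predicted class
    class_ids:   Q predicted class ids
    Returns (segment_map, segments): segment_map[y][x] is a segment id
    (0 = unassigned) and segments maps segment id -> class id.
    """
    num_queries = len(mask_probs)
    height, width = len(mask_probs[0]), len(mask_probs[0][0])
    segment_map = [[0] * width for _ in range(height)]
    segments = {}
    query_to_segment = {}
    for y in range(height):
        for x in range(width):
            # Score each query here: class confidence times mask probability.
            scores = [class_probs[q] * mask_probs[q][y][x] for q in range(num_queries)]
            best = max(range(num_queries), key=scores.__getitem__)
            # Leave the pixel unassigned if the winning mask barely covers it.
            if mask_probs[best][y][x] < threshold:
                continue
            if best not in query_to_segment:
                query_to_segment[best] = len(query_to_segment) + 1
                segments[query_to_segment[best]] = class_ids[best]
            segment_map[y][x] = query_to_segment[best]
    return segment_map, segments


# Toy 2 x 3 image with two queries; class ids 0 and 13 are placeholders.
road_mask = [[0.9, 0.8, 0.1], [0.9, 0.7, 0.2]]
car_mask = [[0.1, 0.3, 0.9], [0.0, 0.6, 0.8]]
segment_map, segments = merge_into_panoptic(
    [road_mask, car_mask], class_probs=[0.95, 0.90], class_ids=[0, 13]
)
print(segment_map)  # [[1, 1, 2], [1, 1, 2]]
print(segments)     # {1: 0, 2: 13}
```

Each pixel goes to the query with the highest combined score, so overlapping masks resolve into non-overlapping segments, each carrying a single class label.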

Model Features

Unified Segmentation Paradigm
Treats instance, semantic, and panoptic segmentation alike as instance segmentation: each task is solved by predicting a set of masks and their labels.
Efficient Attention Mechanism
Improves performance with a multi-scale deformable attention Transformer and a masked attention mechanism.
Training Optimization
Significantly improves training efficiency by computing loss on subsampled points rather than entire masks.
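The training optimization above can be illustrated with a small sketch that compares a loss computed over every pixel of a mask with the same loss computed over a random subset of pixels. This is a simplified assumption-laden example: it uses uniform sampling and binary cross-entropy only, the function names are hypothetical, and the sampling strategy used in the actual training code may be more elaborate.

```python
import math
import random


def bce(prob, target):
    """Binary cross-entropy for one pixel, clamped for numerical safety."""
    eps = 1e-7
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


def all_pixels(mask):
    """List every (y, x) coordinate of a rectangular grid."""
    return [(y, x) for y in range(len(mask)) for x in range(len(mask[0]))]


def full_mask_loss(pred, target):
    """Mean BCE over every pixel of the mask."""
    pixels = all_pixels(pred)
    return sum(bce(pred[y][x], target[y][x]) for y, x in pixels) / len(pixels)


def sampled_mask_loss(pred, target, num_points, rng):
    """Mean BCE over num_points pixels drawn uniformly without replacement."""
    chosen = rng.sample(all_pixels(pred), num_points)
    return sum(bce(pred[y][x], target[y][x]) for y, x in chosen) / num_points


# Toy 2 x 2 predicted mask and ground truth (illustrative values only).
pred = [[0.9, 0.2], [0.6, 0.1]]
target = [[1, 0], [1, 0]]
rng = random.Random(0)
print(full_mask_loss(pred, target))
print(sampled_mask_loss(pred, target, num_points=2, rng=rng))
```

The sampled loss touches only `num_points` pixels per mask, so its cost does not grow with image resolution. The trade-off is a noisier estimate of the full-mask loss.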

Model Capabilities

Image Segmentation
Panoptic Segmentation
Instance Segmentation
Semantic Segmentation
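Because every task is expressed as a set of masks with labels, a semantic map can in principle be derived from the same kind of predictions. The sketch below shows one common way to do this, assumed here for illustration rather than taken from this model's code: each query votes for every class at every pixel, weighted by its mask probability. The function name `semantic_map` and the two-class toy data are hypothetical.

```python
def semantic_map(mask_probs, class_dists):
    """Per-pixel class ids from Q mask grids and Q class distributions.

    mask_probs:  Q grids of H x W probabilities
    class_dists: Q lists of length C, each a distribution over classes
    """
    num_queries = len(mask_probs)
    num_classes = len(class_dists[0])
    height, width = len(mask_probs[0]), len(mask_probs[0][0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Sum each query's vote for every class at this pixel.
            votes = [
                sum(class_dists[q][c] * mask_probs[q][y][x] for q in range(num_queries))
                for c in range(num_classes)
            ]
            row.append(max(range(num_classes), key=votes.__getitem__))
        result.append(row)
    return result


# One road query and two separate car queries on a 1 x 4 strip.
# Class 0 = road, class 1 = car (placeholder ids).
masks = [
    [[0.9, 0.1, 0.8, 0.1]],  # road
    [[0.1, 0.9, 0.1, 0.0]],  # first car
    [[0.0, 0.0, 0.2, 0.9]],  # second car
]
dists = [[0.9, 0.1], [0.1, 0.9], [0.2, 0.8]]
print(semantic_map(masks, dists))  # [[0, 1, 0, 1]]
```

The two car queries stay distinct in an instance or panoptic output, but here they collapse into the same class id, which is the difference between semantic segmentation and the other two tasks.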

Use Cases

Autonomous Driving
Street Scene Understanding
Used for comprehensive understanding of urban street scenes in autonomous driving systems.
Accurately identifies elements such as roads, vehicles, pedestrians, and their relationships.
Urban Mapping
Urban Element Segmentation
Used for urban map creation and updates.
Automatically identifies urban elements such as buildings, roads, and green spaces.