
Mask2Former Swin-Base (IN21k) Cityscapes Instance

Developed by Facebook
Mask2Former is a Transformer-based general-purpose image segmentation model that unifies instance, semantic, and panoptic segmentation tasks.
Downloads: 53
Release Time: 1/5/2023

Model Overview

This model performs instance segmentation by predicting a set of masks and their corresponding class labels. It uses a Swin-Base backbone pretrained on ImageNet-21k (IN21k) and is fine-tuned on the Cityscapes dataset.
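A minimal inference sketch using the Hugging Face Transformers API is shown below; the checkpoint id is assumed from this model card's title, and the image path is a placeholder for any local street-scene photo.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

# Assumed checkpoint id, derived from this model card's title.
checkpoint = "facebook/mask2former-swin-base-IN21k-cityscapes-instance"
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint)

# Any local street-scene image (placeholder path).
image = Image.open("street_scene.jpg")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert the raw mask/class predictions into per-instance masks and labels
# at the original image resolution.
result = processor.post_process_instance_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
print(result["segmentation"].shape)  # (H, W) map of instance ids
for segment in result["segments_info"]:
    print(segment["id"], model.config.id2label[segment["label_id"]], segment["score"])
```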

Model Features

Unified Segmentation Architecture
Unifies instance, semantic, and panoptic segmentation as a mask prediction problem.
Efficient Attention Mechanism
Uses multi-scale deformable attention in the pixel decoder and masked attention in the Transformer decoder to improve computational efficiency and convergence.
Training Optimization
Enhances training efficiency by computing the mask loss on sampled points rather than entire masks (see the sketch after this list).
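
The point-sampled loss can be illustrated with the following sketch. It is a simplified, hypothetical version (uniform random points only, whereas the actual training recipe also uses importance sampling) meant to show the idea, not the exact implementation.

```python
import torch
import torch.nn.functional as F

def point_sampled_mask_loss(pred_masks, gt_masks, num_points=12544):
    """BCE loss evaluated at randomly sampled points instead of full masks.

    pred_masks: (num_masks, H, W) logits; gt_masks: (num_masks, H, W) in {0, 1}.
    Simplified sketch: uniform random sampling only.
    """
    num_masks = pred_masks.shape[0]
    # Random sampling locations in [-1, 1] x [-1, 1], one grid per mask.
    coords = torch.rand(num_masks, num_points, 1, 2, device=pred_masks.device) * 2 - 1
    # Bilinearly sample predictions and targets at the same points.
    pred_pts = F.grid_sample(pred_masks.unsqueeze(1), coords, align_corners=False)
    gt_pts = F.grid_sample(gt_masks.unsqueeze(1).float(), coords, align_corners=False)
    pred_pts = pred_pts.flatten(1)  # (num_masks, num_points)
    gt_pts = gt_pts.flatten(1)
    return F.binary_cross_entropy_with_logits(pred_pts, gt_pts)

# Toy usage: 5 predicted masks against 5 ground-truth masks of size 256x256.
loss = point_sampled_mask_loss(torch.randn(5, 256, 256),
                               (torch.rand(5, 256, 256) > 0.5).float())
```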

Model Capabilities

Image Instance Segmentation
Multi-scale Feature Extraction
Efficient Mask Prediction

Use Cases

Computer Vision
Street Scene Analysis
Performs instance segmentation on objects in street scene datasets like Cityscapes (see the snippet after this list).
Accurately identifies and segments object instances such as cars, buses, and pedestrians.
Object Recognition
Identifies and segments specific object instances in images.
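
Building on the inference sketch in the Model Overview section, the snippet below extracts a binary mask for a single detected instance; the "car" label name is assumed to be present in the checkpoint's Cityscapes label map.

```python
# `result` and `model` come from the inference sketch above.
segmentation = result["segmentation"].numpy()          # (H, W) instance-id map
car_segments = [s for s in result["segments_info"]
                if model.config.id2label[s["label_id"]] == "car"]
if car_segments:
    car_mask = segmentation == car_segments[0]["id"]   # boolean mask for one car
    print("car pixels:", int(car_mask.sum()))
```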