
Mask2former Swin Small Cityscapes Semantic

Developed by facebook
Small-sized Mask2Former model with a Swin backbone, trained on Cityscapes for semantic segmentation
Downloads 952
Release Time: 1/5/2023

Model Overview

Mask2Former is a universal image segmentation model that uses a unified paradigm to handle instance segmentation, semantic segmentation, and panoptic segmentation tasks. It achieves segmentation by predicting a set of masks and their corresponding labels.
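To make "a set of masks and their corresponding labels" concrete, here is a minimal pure-Python sketch of one common way to turn such predictions into a semantic map: weight each query's mask probability by its class probability, sum over queries, and take the best class per pixel. The function name semantic_map, the toy inputs, and the convention that the last class column means "no object" are assumptions for illustration, not the model's actual code.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def semantic_map(class_logits, mask_logits):
    """Combine per-query predictions into a per-pixel class map.

    class_logits: [Q][C + 1] scores per query; the last entry is assumed
                  to mean "no object" and is dropped.
    mask_logits:  [Q][H][W] mask scores per query.
    Returns an [H][W] grid of class indices in range(C).
    """
    num_queries = len(class_logits)
    num_classes = len(class_logits[0]) - 1
    height, width = len(mask_logits[0]), len(mask_logits[0][0])
    # Class probabilities per query, without the "no object" column.
    class_probs = [softmax(row)[:num_classes] for row in class_logits]
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Score of class c at this pixel: sum over queries of
            # P(class c | query) * P(pixel belongs to the query's mask).
            scores = [
                sum(class_probs[q][c] * sigmoid(mask_logits[q][y][x])
                    for q in range(num_queries))
                for c in range(num_classes)
            ]
            row.append(max(range(num_classes), key=scores.__getitem__))
        result.append(row)
    return result

# Toy input (hypothetical): 2 queries, 2 real classes, a 1x2 image.
class_logits = [[5.0, 0.0, 0.0],   # query 0 is confident about class 0
                [0.0, 5.0, 0.0]]   # query 1 is confident about class 1
mask_logits = [[[4.0, -4.0]],      # query 0 covers the left pixel
               [[-4.0, 4.0]]]      # query 1 covers the right pixel
print(semantic_map(class_logits, mask_logits))  # [[0, 1]]
```

In this toy case the left pixel is assigned class 0 and the right pixel class 1, because each pixel is dominated by the query whose mask covers it.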

Model Features

Unified Segmentation Paradigm
Handles instance, semantic, and panoptic segmentation with one approach: every task is treated as if it were instance segmentation, that is, as predicting a set of masks with a label for each
Efficient Attention Mechanism
Replaces the conventional pixel decoder with a multi-scale deformable attention Transformer
Masked Attention Decoder
Introduces a Transformer decoder with masked attention, improving performance without increasing computational cost
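The idea behind masked attention can be sketched for a single query: attention logits are computed as usual, but locations outside the mask predicted by the previous decoder layer are set to negative infinity before the softmax, so the query only attends inside its own predicted region. The function below is a simplified pure-Python illustration; its name, the 0.5 threshold default, the scaled dot-product form, and the "attend everywhere if the mask is empty" fallback are assumptions of this sketch rather than a copy of the model's implementation.

```python
import math

def masked_attention(query, keys, values, prev_mask_probs, threshold=0.5):
    """Masked cross-attention for one query over N image locations.

    query: [D]; keys: [N][D]; values: [N][Dv]
    prev_mask_probs: [N] mask probabilities from the previous layer.
    Returns (output [Dv], attention weights [N]).
    """
    allowed = [p >= threshold for p in prev_mask_probs]
    # Assumed fallback: an empty mask would leave nothing to attend to,
    # so in that case the query attends to every location.
    if not any(allowed):
        allowed = [True] * len(keys)
    scale = 1.0 / math.sqrt(len(query))
    logits = [
        scale * sum(q * k for q, k in zip(query, key)) if ok else -math.inf
        for key, ok in zip(keys, allowed)
    ]
    m = max(logits)  # finite, because at least one location is allowed
    weights = [math.exp(l - m) for l in logits]  # masked entries become 0.0
    total = sum(weights)
    weights = [w / total for w in weights]
    dim_v = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values))
              for d in range(dim_v)]
    return output, weights

# Toy input (hypothetical): 3 locations, the middle one lies outside the mask.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
values = [[1.0], [2.0], [3.0]]
output, weights = masked_attention(query, keys, values, [0.9, 0.2, 0.8])
print(weights)
```

The second location receives exactly zero weight even though its key matches the query as well as the first one does, which is the effect of the mask.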
Efficient Training Method
Calculates loss via sampled points rather than entire masks, significantly improving training efficiency
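As a rough illustration of computing the loss on sampled points, the sketch below evaluates a binary cross-entropy mask loss at a fixed number of randomly chosen pixels instead of at every pixel. It is a simplification under stated assumptions: points are drawn uniformly at integer pixel positions, whereas the actual training procedure may use a different sampling strategy; the function names and the toy mask are hypothetical.

```python
import math
import random

def bce_with_logits(logit, target):
    # Numerically stable binary cross-entropy for a single logit.
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def sampled_point_loss(mask_logits, target_mask, num_points, rng):
    """Average BCE over num_points uniformly sampled pixels.

    mask_logits, target_mask: [H][W] grids (targets are 0.0 or 1.0).
    rng: a random.Random instance, passed in for reproducibility.
    """
    height, width = len(mask_logits), len(mask_logits[0])
    total = 0.0
    for _ in range(num_points):
        y = rng.randrange(height)
        x = rng.randrange(width)
        total += bce_with_logits(mask_logits[y][x], target_mask[y][x])
    return total / num_points

# Toy 64x64 mask (hypothetical): left half is foreground, and the
# prediction is confidently correct everywhere.
size = 64
target_mask = [[1.0 if x < size // 2 else 0.0 for x in range(size)]
               for _ in range(size)]
mask_logits = [[3.0 if x < size // 2 else -3.0 for x in range(size)]
               for _ in range(size)]

rng = random.Random(0)
loss = sampled_point_loss(mask_logits, target_mask, num_points=256, rng=rng)
print(loss)
```

Here the loss touches 256 pixel positions instead of all 4096, which is where the saving in memory and computation comes from; on real predictions the sampled value is an estimate of the full-mask loss rather than an exact match.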

Model Capabilities

Image Semantic Segmentation
Multi-category Object Recognition
High-precision Mask Prediction

Use Cases

Autonomous Driving
Street Scene Semantic Segmentation
Performs pixel-level classification of urban road scenes to identify elements such as roads, vehicles, and pedestrians
Trained specifically on the Cityscapes dataset, so it is best matched to urban street scenes of that kind
Remote Sensing Image Analysis
Land Cover Classification
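Segments land cover types in satellite or aerial images; because this checkpoint was trained on Cityscapes street scenes, such use would require fine-tuning on suitable data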