Mask2former Swin Small Cityscapes Instance

Developed by facebook
Mask2Former is a unified, Transformer-based image segmentation model that uses a masked attention mechanism to improve performance
Downloads 43
Release Time: 1/5/2023

Model Overview

This model is the small-size variant of Mask2Former. It uses a Swin Transformer (Swin-Small) backbone and is fine-tuned for instance segmentation on the Cityscapes dataset. The underlying Mask2Former architecture is unified, so the same design handles instance, semantic, and panoptic segmentation; this particular checkpoint targets the instance task.
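As a rough illustration of how a checkpoint like this is usually loaded, the sketch below uses the Hugging Face transformers library. The checkpoint id `facebook/mask2former-swin-small-cityscapes-instance` is inferred from the model name and developer listed above and should be treated as an assumption. `segment_instances` and `summarize_segments` are hypothetical helper names introduced only for this example.

```python
def segment_instances(image,
                      checkpoint="facebook/mask2former-swin-small-cityscapes-instance",
                      threshold=0.5):
    """Run instance segmentation on a PIL image.

    Returns the post-processed result (a dict with "segmentation" and
    "segments_info") together with the model's id-to-label mapping.
    The default checkpoint id is an assumption, not confirmed by this page.
    """
    # Imported lazily so the helpers below can be used without the heavy dependencies.
    import torch
    from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

    processor = AutoImageProcessor.from_pretrained(checkpoint)
    model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint)
    model.eval()

    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # target_sizes expects (height, width); PIL's image.size is (width, height).
    result = processor.post_process_instance_segmentation(
        outputs, threshold=threshold, target_sizes=[image.size[::-1]]
    )[0]
    return result, model.config.id2label


def summarize_segments(segments_info, id2label):
    """List (class name, score) pairs for the detected instances, best score first."""
    pairs = [(id2label[s["label_id"]], s["score"]) for s in segments_info]
    return sorted(pairs, key=lambda p: p[1], reverse=True)
```

With a PIL image of a street scene, `result, id2label = segment_instances(image)` followed by `summarize_segments(result["segments_info"], id2label)` would give a quick overview of which instances were found. Loading the processor and model once and reusing them is preferable when processing many images; the sketch reloads them on every call only to stay short.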

Model Features

Unified Segmentation Architecture
Uses a unified paradigm to handle instance segmentation, semantic segmentation, and panoptic segmentation tasks
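Masked Attention Mechanism
Introduces a Transformer decoder with masked attention, in which each query's cross-attention is restricted to the foreground region of its previously predicted mask, improving performance without increasing computational cost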
Efficient Training Strategy
Computes the mask loss on a set of sampled points rather than on entire masks, which lowers memory use and significantly improves training efficiency
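The masked attention and point-sampled loss described above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the model's actual implementation: the function names are hypothetical, the attention is single-head with the usual 1/sqrt(d) scaling, and points are drawn uniformly from pixel indices, whereas the Mask2Former paper also describes importance sampling for the final loss.

```python
import numpy as np


def masked_attention(queries, keys, values, prev_mask_probs, threshold=0.5):
    """Cross-attention restricted to each query's previously predicted foreground.

    queries:         (Q, d) object queries
    keys, values:    (N, d) flattened image features
    prev_mask_probs: (Q, N) mask probabilities from the previous decoder layer
    """
    logits = queries @ keys.T / np.sqrt(queries.shape[-1])  # (Q, N)
    foreground = prev_mask_probs >= threshold
    # A query whose previous mask is empty falls back to attending everywhere;
    # otherwise its softmax row would be undefined.
    empty = ~foreground.any(axis=1, keepdims=True)
    allowed = foreground | empty
    # Disallowed positions get -inf, so they receive exactly zero weight.
    logits = np.where(allowed, logits, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ values, weights


def point_sampled_bce(mask_logits, target_mask, num_points, rng):
    """Binary cross-entropy on `num_points` random pixels instead of the full mask."""
    flat_logits = mask_logits.ravel()
    flat_target = target_mask.ravel().astype(float)
    idx = rng.choice(flat_logits.size, size=num_points, replace=False)
    z, y = flat_logits[idx], flat_target[idx]
    # Numerically stable BCE with logits: max(z, 0) - z*y + log(1 + exp(-|z|))
    return float(np.mean(np.maximum(z, 0) - z * y + np.log1p(np.exp(-np.abs(z)))))
```

Masking only changes which attention logits are kept, so it adds no extra matrix multiplications, which is consistent with the claim that it does not increase computational cost. The sampled loss touches `num_points` values per mask instead of every pixel, which is where the training savings come from.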

Model Capabilities

Image Instance Segmentation
Multi-scale Feature Extraction
High-precision Object Boundary Recognition

Use Cases

Autonomous Driving
Street Scene Object Recognition
Identifies instances such as vehicles and pedestrians in urban street scenes
Fine-tuned on the Cityscapes dataset, so it is best matched to urban driving imagery of that kind
Smart Surveillance
Scene Analysis
Performs precise segmentation and recognition of objects in surveillance footage
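For use cases like these, a common follow-up step is to keep only the instances of interest, such as vehicles and pedestrians. The sketch below assumes the output layout of the transformers instance-segmentation post-processing (a segment-id map plus a `segments_info` list with `id`, `label_id`, and `score` entries). The helper name `masks_for_classes` and the toy label mapping are hypothetical; in practice the mapping should come from the model's own configuration.

```python
import numpy as np


def masks_for_classes(segmentation, segments_info, id2label, wanted, min_score=0.5):
    """Return (label, score, binary_mask) for instances whose class is in `wanted`."""
    seg_map = np.asarray(segmentation)
    selected = []
    for seg in segments_info:
        label = id2label[seg["label_id"]]
        if label in wanted and seg["score"] >= min_score:
            # Pixels carrying this instance's id form its binary mask.
            selected.append((label, seg["score"], seg_map == seg["id"]))
    return selected


# Toy data for illustration only: a 2x3 segment-id map with two instances.
toy_map = np.array([[-1, 1, 1],
                    [2, 2, -1]])
toy_info = [
    {"id": 1, "label_id": 0, "score": 0.9},
    {"id": 2, "label_id": 1, "score": 0.4},
]
toy_labels = {0: "person", 1: "car"}  # illustrative mapping, not the model's

for label, score, mask in masks_for_classes(toy_map, toy_info, toy_labels, {"person", "car"}):
    print(label, score, int(mask.sum()))  # prints: person 0.9 2
```

The low-scoring car is dropped by the `min_score` filter; lowering that threshold trades fewer missed objects for more false positives, which matters differently in driving and surveillance settings.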