
Mask2Former Swin Base IN21k ADE Semantic

Developed by Facebook AI Research
Mask2Former is a universal image segmentation model capable of handling instance segmentation, semantic segmentation, and panoptic segmentation tasks by predicting a set of masks and their corresponding labels.
Downloads: 879
Release Time: 1/5/2023

Model Overview

This model uses a Swin Transformer backbone pre-trained on ImageNet-21k and is fine-tuned on the ADE20K dataset for semantic segmentation, providing efficient and accurate segmentation through an improved Transformer architecture.
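
For concreteness, below is a minimal inference sketch using the Hugging Face transformers library. It assumes the checkpoint is available on the Hugging Face Hub as facebook/mask2former-swin-base-IN21k-ade-semantic (matching the model name above); the sample image URL is only a placeholder.

```python
# Minimal semantic segmentation inference sketch with Hugging Face transformers.
import requests
import torch
from PIL import Image
from transformers import AutoImageProcessor, Mask2FormerForUniversalSegmentation

# Assumed Hub id for this checkpoint.
checkpoint = "facebook/mask2former-swin-base-IN21k-ade-semantic"
processor = AutoImageProcessor.from_pretrained(checkpoint)
model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint)

# Any RGB image works; this URL is just an illustrative placeholder.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Merge the predicted masks and class logits into a per-pixel label map
# at the original image resolution (height, width).
semantic_map = processor.post_process_semantic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
print(semantic_map.shape)  # tensor of ADE20K class indices, one per pixel
```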

Model Features

Unified Segmentation Architecture
Handles instance segmentation, semantic segmentation, and panoptic segmentation tasks with a single model architecture.
Improved Transformer Design
Uses a multi-scale deformable attention Transformer encoder and a masked-attention Transformer decoder to improve performance and efficiency.
Efficient Training Method
Significantly improves training efficiency by computing the mask loss on sampled points rather than on entire masks (see the sketch after this list).
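
To illustrate the point-sampling idea, here is a toy sketch of a mask loss evaluated only at randomly sampled point coordinates instead of over every pixel. It is a simplified stand-in, not the exact Mask2Former training code, which combines cross-entropy and dice terms and samples points by uncertainty rather than uniformly.

```python
# Toy sketch of a point-sampled mask loss (simplified illustration only).
import torch
import torch.nn.functional as F

def point_sampled_loss(pred_mask_logits, gt_masks, num_points=112 * 112):
    """pred_mask_logits, gt_masks: (N, H, W). Loss is computed on sampled points only."""
    n = pred_mask_logits.shape[0]
    # Random normalized (x, y) coordinates in [-1, 1], as expected by grid_sample.
    point_coords = torch.rand(n, num_points, 1, 2) * 2 - 1
    # Read predictions and targets at the sampled points (bilinear interpolation).
    pred_points = F.grid_sample(
        pred_mask_logits.unsqueeze(1), point_coords, align_corners=False
    ).squeeze(1).squeeze(-1)            # (N, num_points)
    gt_points = F.grid_sample(
        gt_masks.unsqueeze(1).float(), point_coords, align_corners=False
    ).squeeze(1).squeeze(-1)            # (N, num_points)
    # Binary cross-entropy on num_points samples instead of all H*W pixels.
    return F.binary_cross_entropy_with_logits(pred_points, gt_points)

# Example: 3 predicted masks at 256x256 resolution with random binary targets.
loss = point_sampled_loss(torch.randn(3, 256, 256), torch.randint(0, 2, (3, 256, 256)))
print(loss.item())
```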

Model Capabilities

Image Semantic Segmentation
Image Instance Segmentation
Image Panoptic Segmentation
Multi-scale Image Analysis
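
As a small follow-up to the semantic segmentation capability listed above, the per-pixel class indices returned by the model can be mapped to human-readable ADE20K label names through the checkpoint configuration. The sketch below uses a random placeholder map in place of a real prediction and assumes the same Hub id as the earlier example.

```python
# Minimal sketch: turn predicted class indices into ADE20K label names.
import torch
from transformers import AutoConfig

config = AutoConfig.from_pretrained("facebook/mask2former-swin-base-IN21k-ade-semantic")

# Placeholder label map standing in for the output of
# post_process_semantic_segmentation in the earlier sketch.
semantic_map = torch.randint(0, config.num_labels, (480, 640))

# Report a few of the classes present and how many pixels each covers.
values, counts = torch.unique(semantic_map, return_counts=True)
for class_id, count in zip(values.tolist()[:5], counts.tolist()[:5]):
    print(f"{config.id2label[class_id]}: {count} pixels")
```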

Use Cases

Computer Vision
Scene Understanding
Identify and segment different objects in complex scenes.
Provide accurate object-level segmentation of complex scenes.
Autonomous Driving
Analyze road scenes to identify vehicles, pedestrians, road signs, etc.
Provide precise environmental perception for autonomous driving systems.
Medical Imaging
Medical Image Analysis
Segment organs or lesion areas in medical images.
Assist doctors in diagnosis and treatment planning.