MaskFormer-Swin-Base-COCO Open-Source Panoptic Segmentation Model: Unified Handling of Instance, Semantic, and Panoptic Segmentation Tasks

MaskFormer Swin Base COCO

Developed by facebook

A panoptic segmentation model built on a Swin backbone and trained on the COCO dataset. It handles instance, semantic, and panoptic segmentation with a single approach.

Image Segmentation

Transformers

Open Source License: Other

Tags: #Panoptic Segmentation #Swin Backbone Network #Mask Prediction

Downloads 3,855

Release Time: 3/2/2022

Model Overview

MaskFormer unifies segmentation tasks by predicting a set of masks and their corresponding labels, treating instance, semantic, and panoptic segmentation alike as instance segmentation. This checkpoint was trained on COCO panoptic segmentation.

Model Features

Unified Segmentation Paradigm

Unifies instance/semantic/panoptic segmentation as a mask prediction problem, simplifying task processing

Swin Backbone Network

Uses the efficient Swin Transformer as the feature extraction backbone, balancing global context and local details

End-to-End Training

Directly predicts binary masks and class labels without relying on ROI operations or post-processing grouping

Model Capabilities

Image Semantic Segmentation

Instance-Level Object Recognition

Panoptic Scene Parsing

Use Cases

Computer Vision

Scene Understanding

Performs pixel-level classification and segmentation of objects in complex scenes

Can output segmentation mask images with semantic labels

Autonomous Driving

Real-time parsing of drivable areas, vehicles, and pedestrians in road scenes
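As a rough illustration of the "segmentation mask images with semantic labels" mentioned under Scene Understanding, the sketch below turns an integer label map into an RGB image by palette lookup. The `colorize` helper, the three-colour palette, and the toy label map are all hypothetical and chosen only for the example; they are not part of the model or of the transformers library.

```python
import numpy as np


def colorize(label_map, palette):
    """Turn an (H, W) integer label map into an (H, W, 3) uint8 RGB image."""
    label_map = np.asarray(label_map)
    palette = np.asarray(palette, dtype=np.uint8)
    # Fancy indexing looks up one RGB triple per pixel.
    return palette[label_map]


# Hypothetical three-class palette: index = label id, value = RGB colour.
palette = [(0, 0, 0), (220, 20, 60), (0, 128, 255)]
label_map = [[0, 1],
             [2, 1]]
rgb = colorize(label_map, palette)
print(rgb.shape)
```

The resulting array can be handed to `PIL.Image.fromarray(rgb)` to save or display the mask image.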

🚀 MaskFormer

The MaskFormer model is trained on COCO panoptic segmentation (base-sized version, Swin backbone). It offers a unified approach to address instance, semantic, and panoptic segmentation.

🚀 Quick Start

This checkpoint was trained for COCO panoptic segmentation. You can use it as follows:

import requests
import torch
from PIL import Image
from transformers import MaskFormerForInstanceSegmentation, MaskFormerImageProcessor

# load MaskFormer fine-tuned on COCO panoptic segmentation
# (MaskFormerImageProcessor supersedes the deprecated MaskFormerFeatureExtractor in recent transformers releases)
image_processor = MaskFormerImageProcessor.from_pretrained("facebook/maskformer-swin-base-coco")
model = MaskFormerForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-base-coco")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = image_processor(images=image, return_tensors="pt")

# inference only, so gradients are not needed
with torch.no_grad():
    outputs = model(**inputs)
# model predicts class_queries_logits of shape `(batch_size, num_queries, num_labels + 1)`
# (the extra label is the "no object" class)
# and masks_queries_logits of shape `(batch_size, num_queries, height, width)`
class_queries_logits = outputs.class_queries_logits
masks_queries_logits = outputs.masks_queries_logits

# you can pass the outputs to image_processor for postprocessing
# target_sizes expects (height, width); PIL's image.size is (width, height)
result = image_processor.post_process_panoptic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]
# we refer to the demo notebooks for visualization (see "Resources" section in the MaskFormer docs)
predicted_panoptic_map = result["segmentation"]

For more code examples, refer to the documentation.
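To make the post-processed output easier to inspect, here is a minimal sketch that summarizes each predicted segment. It assumes the `result` dictionary from the quick start holds a per-pixel map of segment ids under `"segmentation"` and a list of per-segment dictionaries (with `"id"`, `"label_id"`, and `"score"` keys) under `"segments_info"`. The `summarize_segments` helper and the toy values are hypothetical stand-ins, not real model output.

```python
import numpy as np


def summarize_segments(segmentation, segments_info, id2label):
    """Return one summary dict per predicted segment, largest first.

    segmentation:  2-D array of segment ids, one per pixel.
    segments_info: list of dicts with "id", "label_id" and "score" keys.
    id2label:      mapping from label id to class name.
    """
    segmentation = np.asarray(segmentation)
    summary = []
    for info in segments_info:
        pixel_count = int((segmentation == info["id"]).sum())
        summary.append({
            "segment_id": info["id"],
            "label": id2label.get(info["label_id"], "unknown"),
            "score": info["score"],
            "area_fraction": pixel_count / segmentation.size,
        })
    return sorted(summary, key=lambda s: s["area_fraction"], reverse=True)


# Stand-in data laid out like the quick-start `result`
# (hypothetical ids, labels and scores).
toy_segmentation = [[1, 1, 2],
                    [1, 1, 2]]
toy_segments_info = [
    {"id": 1, "label_id": 17, "score": 0.98},
    {"id": 2, "label_id": 65, "score": 0.91},
]
toy_id2label = {17: "cat", 65: "remote"}

for row in summarize_segments(toy_segmentation, toy_segments_info, toy_id2label):
    print(row)
```

With the real model, a call along the lines of `summarize_segments(result["segmentation"].numpy(), result["segments_info"], model.config.id2label)` should produce the same kind of summary, assuming the tensor lives on the CPU.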

✨ Features

Unified Paradigm: MaskFormer addresses instance, semantic and panoptic segmentation with the same paradigm by predicting a set of masks and corresponding labels, treating all 3 tasks as instance segmentation.
Trained on COCO: This model is trained on the COCO panoptic segmentation dataset.

📚 Documentation

Model description

MaskFormer addresses instance, semantic and panoptic segmentation with the same paradigm: by predicting a set of masks and corresponding labels. Hence, all 3 tasks are treated as if they were instance segmentation.
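To make the "set of masks and corresponding labels" idea concrete, the sketch below shows one way per-query predictions can be collapsed into a per-pixel class map: each query's class probabilities are weighted by its mask probabilities and summed over queries. This follows the semantic inference rule described in the MaskFormer paper as far as I understand it. The function names and toy tensors are illustrative assumptions, and the library's own post-processing may differ in details such as resizing.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def semantic_map_from_queries(class_logits, mask_logits):
    """Combine per-query predictions into a per-pixel class map.

    class_logits: (num_queries, num_labels + 1); the last column is the
                  "no object" class and is dropped.
    mask_logits:  (num_queries, height, width)
    Returns an integer array of shape (height, width).
    """
    class_probs = softmax(class_logits, axis=-1)[:, :-1]  # (Q, C)
    mask_probs = sigmoid(mask_logits)                      # (Q, H, W)
    # Sum over queries of p_q(c) * m_q[h, w]: one score per class and pixel.
    scores = np.einsum("qc,qhw->chw", class_probs, mask_probs)
    return scores.argmax(axis=0)


# Toy example: 2 queries, 2 real classes (+ "no object"), a 2x2 image.
class_logits = np.array([[5.0, 0.0, 0.0],    # query 0 votes for class 0
                         [0.0, 5.0, 0.0]])   # query 1 votes for class 1
mask_logits = np.array([[[4.0, 4.0], [-4.0, -4.0]],   # query 0 covers the top row
                        [[-4.0, -4.0], [4.0, 4.0]]])  # query 1 covers the bottom row
semantic_map = semantic_map_from_queries(class_logits, mask_logits)
print(semantic_map)
```

In the toy data, the top row ends up labelled with class 0 and the bottom row with class 1, because each query is confident about one class and one region.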


Intended uses & limitations

You can use this particular checkpoint for panoptic segmentation. See the model hub to look for other fine-tuned versions on a task that interests you.

How to use

As shown in the quick start section, you can use the model for panoptic segmentation tasks.

Technical Information

Property	Details
Model Type	MaskFormer model trained on COCO panoptic segmentation (base-sized version, Swin backbone)
Training Data	COCO panoptic segmentation dataset

Disclaimer

The team releasing MaskFormer did not write a model card for this model, so this model card has been written by the Hugging Face team.

📄 License

License: other
