Open-source MaskFormer-Swin-Small-Coco Model - A Practical Choice for Panoptic Segmentation Tasks

MaskFormer Swin Small COCO

Developed by facebook

A small MaskFormer model based on the Swin backbone network, trained on the COCO dataset for panoptic segmentation tasks.

Image Segmentation

Transformers

Open Source License: Other

Tags: #Panoptic Segmentation #Swin Backbone Network #Unified Segmentation Paradigm

Downloads 2,293

Release Time: 3/2/2022

Model Overview

MaskFormer adopts a unified paradigm to handle instance segmentation, semantic segmentation, and panoptic segmentation tasks by predicting a set of masks and their corresponding labels.

Model Features

Unified Segmentation Paradigm

Treats instance, semantic, and panoptic segmentation as the same problem: predicting a set of masks, each with a corresponding class label.

Swin Backbone Network

Uses the efficient Swin Transformer as the backbone network.

Trained on COCO Dataset

Trained on the COCO panoptic segmentation dataset.

Model Capabilities

Image Segmentation

Semantic Segmentation

Instance Segmentation

Panoptic Segmentation

Use Cases

Computer Vision

Object Recognition and Segmentation

Identifies objects in images and generates precise pixel-level segmentation masks.

Targets the object and region categories of the COCO dataset on which it was trained.
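
As a rough illustration of what a pixel-level mask provides, the sketch below pulls a single segment out of a panoptic map, that is, a 2-D array in which each pixel holds the id of the segment it belongs to, and derives a bounding box from it. The function name segment_mask_and_box and the toy array are assumptions made for this example; they are not part of the Transformers library.

```python
import numpy as np


def segment_mask_and_box(panoptic_map, segment_id):
    """Return (binary_mask, box) for one segment id.

    box is (x_min, y_min, x_max, y_max) in pixel coordinates,
    or None when the segment id does not occur in the map.
    """
    panoptic_map = np.asarray(panoptic_map)
    # True wherever the pixel belongs to the requested segment.
    binary_mask = panoptic_map == segment_id
    if not binary_mask.any():
        return binary_mask, None
    # Row and column indices of every pixel inside the segment.
    rows, cols = np.nonzero(binary_mask)
    box = (int(cols.min()), int(rows.min()), int(cols.max()), int(rows.max()))
    return binary_mask, box


# Toy panoptic map (made-up values): segment 3 occupies three pixels.
toy_map = np.array([[0, 0, 0, 0],
                    [0, 3, 3, 0],
                    [0, 3, 0, 0]])
mask, box = segment_mask_and_box(toy_map, segment_id=3)
print(box)
```

The same function should work on a real panoptic map once it has been converted to a NumPy array.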

Scene Understanding

Conducts comprehensive semantic and instance analysis of complex scenes.

🚀 MaskFormer

This MaskFormer model is trained on COCO panoptic segmentation (small-sized version, Swin backbone). It offers a unified approach to instance, semantic, and panoptic segmentation.

🚀 Quick Start

This checkpoint is trained on COCO panoptic segmentation and can be used directly for panoptic segmentation, as in the usage example below. Other fine-tuned versions are available on the model hub.

✨ Features

Unified Paradigm: MaskFormer addresses instance, semantic, and panoptic segmentation using the same approach, by predicting a set of masks and corresponding labels.
Visual Representation: [Figure: MaskFormer model architecture]

📦 Installation

The original model card lists no installation steps. The usage example below needs the transformers, torch, Pillow, and requests Python packages.

💻 Usage Examples

Basic Usage

from transformers import MaskFormerFeatureExtractor, MaskFormerForInstanceSegmentation
from PIL import Image
import requests

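# load MaskFormer trained on COCO panoptic segmentation
# (recent Transformers releases deprecate MaskFormerFeatureExtractor in favor of MaskFormerImageProcessor)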
feature_extractor = MaskFormerFeatureExtractor.from_pretrained("facebook/maskformer-swin-small-coco")
model = MaskFormerForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-small-coco")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = feature_extractor(images=image, return_tensors="pt")

outputs = model(**inputs)
# model predicts class_queries_logits of shape `(batch_size, num_queries, num_labels + 1)`
# (the extra label is the "no object" class)
# and masks_queries_logits of shape `(batch_size, num_queries, height, width)`
class_queries_logits = outputs.class_queries_logits
masks_queries_logits = outputs.masks_queries_logits

# you can pass them to feature_extractor for postprocessing
result = feature_extractor.post_process_panoptic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]
# we refer to the demo notebooks for visualization (see "Resources" section in the MaskFormer docs)
predicted_panoptic_map = result["segmentation"]

For more code examples, refer to the documentation.
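
The post-processed result in the basic usage example is a dictionary: "segmentation" holds the panoptic map and "segments_info" holds one entry per predicted segment, including its "id" and "label_id". The sketch below shows one way to turn that into a readable summary. It runs on small stand-in data so that it works without downloading the model; the helper name summarize_segments, the toy values, and the toy label mapping are assumptions made for illustration.

```python
import numpy as np


def summarize_segments(segmentation, segments_info, id2label):
    """Return one (segment_id, label, pixel_count) tuple per predicted segment."""
    segmentation = np.asarray(segmentation)
    summary = []
    for segment in segments_info:
        # Count the pixels assigned to this segment id in the panoptic map.
        pixel_count = int((segmentation == segment["id"]).sum())
        # Fall back to "unknown" when the label id has no name in the mapping.
        label = id2label.get(segment["label_id"], "unknown")
        summary.append((segment["id"], label, pixel_count))
    return summary


# Stand-in data shaped like the post-processed result (all values are made up).
toy_segmentation = np.array([[1, 1, 2],
                             [1, 2, 2]])
toy_segments_info = [{"id": 1, "label_id": 15}, {"id": 2, "label_id": 57}]
toy_id2label = {15: "cat", 57: "couch"}  # hypothetical label mapping

toy_summary = summarize_segments(toy_segmentation, toy_segments_info, toy_id2label)
for segment_id, label, pixel_count in toy_summary:
    print(segment_id, label, pixel_count)
```

With the real model, the three arguments would be result["segmentation"] converted to a NumPy array, result["segments_info"], and model.config.id2label.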

📚 Documentation

The MaskFormer model was introduced in the paper "Per-Pixel Classification is Not All You Need for Semantic Segmentation" and first released in this repository.

Disclaimer: The team releasing MaskFormer did not write a model card for this model, so this model card has been written by the Hugging Face team.

🔧 Technical Details

MaskFormer addresses instance, semantic, and panoptic segmentation with the same paradigm: by predicting a set of masks and corresponding labels. Hence, all 3 tasks are treated as if they were instance segmentation.
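
To make the "set of masks and corresponding labels" idea concrete, the sketch below shows one way such predictions can be combined into a per-pixel semantic map. Each query's class probabilities, with the extra "no object" class dropped, weight that query's mask probabilities, and each pixel takes the highest-scoring class. It uses NumPy on small made-up arrays rather than real model outputs; the function name semantic_map_from_queries and the toy shapes are assumptions made for illustration.

```python
import numpy as np


def semantic_map_from_queries(class_logits, mask_logits):
    """Combine per-query predictions into a per-pixel class map.

    class_logits: (num_queries, num_labels + 1); the last column is "no object".
    mask_logits:  (num_queries, height, width).
    Returns an integer array of shape (height, width) holding class indices.
    """
    # Numerically stable softmax over classes, then drop the "no object" column.
    shifted = class_logits - class_logits.max(axis=-1, keepdims=True)
    class_probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)
    class_probs = class_probs[:, :-1]

    # Sigmoid turns each query's mask logits into per-pixel probabilities.
    mask_probs = 1.0 / (1.0 + np.exp(-mask_logits))

    # scores[c, h, w] = sum over queries q of class_probs[q, c] * mask_probs[q, h, w]
    scores = np.einsum("qc,qhw->chw", class_probs, mask_probs)
    return scores.argmax(axis=0)


# Toy input: 2 queries, 2 real classes plus "no object", 2x2 image.
class_logits = np.array([[5.0, 0.0, 0.0],   # query 0 is confident about class 0
                         [0.0, 5.0, 0.0]])  # query 1 is confident about class 1
mask_logits = np.array([[[4.0, 4.0], [-4.0, -4.0]],   # query 0 covers the top row
                        [[-4.0, -4.0], [4.0, 4.0]]])  # query 1 covers the bottom row
semantic_map = semantic_map_from_queries(class_logits, mask_logits)
print(semantic_map)
```

In this toy case the top row is assigned class 0 and the bottom row class 1, because each query is confident about one class and covers one row.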

📄 License

The license for this model is "other".

Property	Details
Model Type	MaskFormer (small-sized version, Swin backbone)
Training Data	COCO
Tags	vision, image-segmentation
Widget Example 1	Cats
Widget Example 2	Castle
