MaskFormer
The MaskFormer model is trained on COCO panoptic segmentation (small-sized version, Swin backbone). It offers a unified approach for instance, semantic, and panoptic segmentation.
Quick Start
The MaskFormer model trained on COCO panoptic segmentation can be used for semantic segmentation tasks. You can find other fine-tuned versions on the model hub.
Features
- Unified Paradigm: MaskFormer addresses instance, semantic, and panoptic segmentation using the same approach, by predicting a set of masks and corresponding labels.
- Visual Representation: The original model card illustrates the model architecture with a diagram, which is not reproduced here.

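Installation
The original model card lists no installation steps. The usage example below imports transformers, PIL (Pillow), and requests, and asks for PyTorch tensors (return_tensors="pt"), so those packages need to be available in your environment.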
Usage Examples
Basic Usage
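```python
import requests
import torch
from PIL import Image
from transformers import MaskFormerFeatureExtractor, MaskFormerForInstanceSegmentation

# Load the preprocessor and the model weights from the Hugging Face Hub.
feature_extractor = MaskFormerFeatureExtractor.from_pretrained("facebook/maskformer-swin-small-coco")
model = MaskFormerForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-small-coco")

# Download a sample image from the COCO validation set.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Preprocess the image and run a forward pass without tracking gradients.
inputs = feature_extractor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Per-query predictions: one class distribution and one mask per query.
class_queries_logits = outputs.class_queries_logits  # (batch_size, num_queries, num_classes + 1)
masks_queries_logits = outputs.masks_queries_logits  # (batch_size, num_queries, height, width)

# Merge the queries into a panoptic map; PIL reports (width, height), so reverse it.
result = feature_extractor.post_process_panoptic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]
predicted_panoptic_map = result["segmentation"]
```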
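The panoptic map above stores a segment id per pixel, so it is not very readable on its own. The following is a minimal sketch of one way to summarize it. It assumes the post-processing result also carries a "segments_info" list whose entries have "id" and "label_id" keys, and that class names come from a mapping such as model.config.id2label. The helper name describe_segments and the toy label ids are hypothetical and exist only for illustration.

```python
from collections import Counter


def describe_segments(panoptic_rows, segments_info, id2label):
    """Return one readable line per segment, largest segment first.

    panoptic_rows: nested lists of segment ids (e.g. predicted_panoptic_map.tolist())
    segments_info: list of dicts with "id" and "label_id" keys (assumed layout)
    id2label: mapping from label id to class name (e.g. model.config.id2label)
    """
    # Count how many pixels carry each segment id.
    pixel_counts = Counter(seg_id for row in panoptic_rows for seg_id in row)
    lines = []
    for segment in sorted(segments_info, key=lambda s: pixel_counts[s["id"]], reverse=True):
        name = id2label.get(segment["label_id"], "unknown")
        lines.append(f"segment {segment['id']}: {name} ({pixel_counts[segment['id']]} px)")
    return lines


# Toy stand-ins (made-up ids and labels) so the sketch runs without downloading the model.
toy_rows = [[1, 1, 2],
            [1, 2, 2],
            [2, 2, 2]]
toy_info = [{"id": 1, "label_id": 15}, {"id": 2, "label_id": 57}]
toy_labels = {15: "cat", 57: "couch"}
summary = describe_segments(toy_rows, toy_info, toy_labels)
```

With real outputs, the equivalent call would be describe_segments(predicted_panoptic_map.tolist(), result["segments_info"], model.config.id2label), provided the assumed key names match your installed transformers version.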
For more code examples, refer to the documentation.
Documentation
The MaskFormer model was introduced in the paper Per-Pixel Classification is Not All You Need for Semantic Segmentation and first released in this repository.
Disclaimer: The team releasing MaskFormer did not write a model card for this model, so this model card has been written by the Hugging Face team.
Technical Details
MaskFormer addresses instance, semantic, and panoptic segmentation with the same paradigm: it predicts a set of masks and a corresponding class label for each mask. Hence, all three tasks are treated as if they were instance segmentation.
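To make the "set of masks and corresponding labels" idea concrete, here is a toy, dependency-free sketch of how per-query predictions could be combined into a semantic label map. It assumes each query yields class logits whose last entry means "no object" plus a grid of mask logits, and it scores class c at a pixel as the sum over queries of p_q(c) * sigmoid(mask_q). The function names and toy numbers are illustrative assumptions, not the library's implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def semantic_map(class_logits, mask_logits):
    """Combine per-query class and mask predictions into per-pixel labels.

    class_logits: [num_queries][num_classes + 1], last entry = "no object"
    mask_logits:  [num_queries][height][width]
    """
    # Class probabilities per query, dropping the trailing "no object" entry.
    class_probs = [softmax(row)[:-1] for row in class_logits]
    num_classes = len(class_probs[0])
    height, width = len(mask_logits[0]), len(mask_logits[0][0])

    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            # Score of class c here: sum over queries of p_q(c) * mask probability.
            scores = [
                sum(probs[c] * sigmoid(mask[y][x])
                    for probs, mask in zip(class_probs, mask_logits))
                for c in range(num_classes)
            ]
            row.append(max(range(num_classes), key=scores.__getitem__))
        labels.append(row)
    return labels


# Toy input: 2 queries, 2 real classes (+ "no object"), a 1x2 image.
toy_class_logits = [[5.0, 0.0, 0.0],   # query 0 is confident in class 0
                    [0.0, 5.0, 0.0]]   # query 1 is confident in class 1
toy_mask_logits = [[[4.0, -4.0]],      # query 0 covers the left pixel
                   [[-4.0, 4.0]]]      # query 1 covers the right pixel
toy_labels = semantic_map(toy_class_logits, toy_mask_logits)
```

In this toy case the left pixel is assigned class 0 and the right pixel class 1, because each query contributes its class probability only where its mask is active. In practice the same combination would be computed with batched tensor operations rather than Python loops.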
License
The license for this model is "other".
| Property | Details |
|---|---|
| Model Type | MaskFormer (small-sized version, Swin backbone) |
| Training Data | COCO |
| Tags | vision, image-segmentation |
| Widget Example 1 | Cats |
| Widget Example 2 | Castle |