Finetune-instance-segmentation-ade20k-mini-mask2former Open-Source Model - Accurately Identify Categories of People and Cars in Images

Finetune Instance Segmentation Ade20k Mini Mask2former

Developed by qubvel-hf

This is a Mask2Former model fine-tuned on the ADE20K-mini dataset for instance segmentation. It identifies and segments individual instances of two categories, 'person' and 'car', in images.

Image Segmentation

Transformers

#Instance Segmentation #ADE20K Fine-tuning #Small Image Processing

Downloads 4,500

Release Time: 5/26/2024

Model Overview

This model uses the Mask2Former architecture with a Swin-Tiny backbone, starting from the facebook/mask2former-swin-tiny-coco-instance checkpoint. It was fine-tuned on ADE20K-mini, a subset of the ADE20K dataset, to segment individual person and car instances in images.

Model Features

Efficient Instance Segmentation

Capable of accurately identifying and segmenting person and car objects in images.

Lightweight Model

Based on the Swin-Tiny architecture, it reduces computational resource requirements while maintaining performance.

Fast Inference

The small Swin-Tiny backbone and the 256×256 training resolution keep the cost of inference comparatively low, although no latency figures are reported for this model.

Model Capabilities

Image Segmentation

Object Recognition

Instance Segmentation

Computer Vision

Use Cases

Smart Surveillance

Crowd Detection

Detect and segment crowds in surveillance videos.

Can accurately identify human contours in images.

Autonomous Driving

Vehicle Recognition

Identify and segment vehicle contours on roads.

Can accurately segment different vehicle instances.

🚀 Instance Segmentation Example

This project demonstrates instance segmentation using a fine-tuned Mask2Former model on a subset of the ADE20K dataset.

🚀 Quick Start

This README provides a step-by-step guide to fine-tuning a Mask2Former model for instance segmentation and performing inference.

✨ Features

Fine-tune a Mask2Former model on a custom dataset.
Use the 🤗 Trainer API for automatic training management.
Perform inference on new images and visualize the results.

📦 Installation

To run the code below, install the required libraries:

pip install transformers datasets torch requests matplotlib pillow

💻 Usage Examples

Basic Usage

Fine-Tuning the Model

This model was trained with the script [run_instance_segmentation.py](https://github.com/huggingface/transformers/blob/main/examples/pytorch/instance-segmentation/run_instance_segmentation.py), which fine-tunes a Mask2Former model on a subsample of the ADE20K dataset.

Here is the label2id mapping for this model:

label2id = {
    "person": 0,
    "car": 1,
}
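The segment dictionaries returned by post-processing report only a numeric `label_id`, so an inverse mapping is handy for readable output. The following is a minimal sketch: `id2label` and the helper `describe_segment` are illustrative names, not part of the original README, and the example dictionary simply mirrors the shape of the segments printed in the inference section.

```python
# Mapping from the model card.
label2id = {"person": 0, "car": 1}

# Inverse mapping, built here for illustration.
id2label = {idx: name for name, idx in label2id.items()}


def describe_segment(segment):
    """Format one segment dictionary as a readable line (hypothetical helper)."""
    name = id2label[segment["label_id"]]
    return f"instance {segment['id']}: {name} (score {segment['score']:.2f})"


example = {"id": 1, "label_id": 1, "was_fused": False, "score": 0.961582}
print(describe_segment(example))  # instance 1: car (score 0.96)
```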

The training was done with the following command:

python run_instance_segmentation.py \
    --model_name_or_path facebook/mask2former-swin-tiny-coco-instance \
    --output_dir finetune-instance-segmentation-ade20k-mini-mask2former \
    --dataset_name qubvel-hf/ade20k-mini \
    --do_reduce_labels \
    --image_height 256 \
    --image_width 256 \
    --do_train \
    --fp16 \
    --num_train_epochs 40 \
    --learning_rate 1e-5 \
    --lr_scheduler_type constant \
    --per_device_train_batch_size 8 \
    --gradient_accumulation_steps 2 \
    --dataloader_num_workers 8 \
    --dataloader_persistent_workers \
    --dataloader_prefetch_factor 4 \
    --do_eval \
    --evaluation_strategy epoch \
    --logging_strategy epoch \
    --save_strategy epoch \
    --save_total_limit 2 \
    --push_to_hub
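Because the command combines a per-device batch size with gradient accumulation, the number of samples contributing to each optimizer step is larger than the batch size alone suggests. The arithmetic can be sketched as follows. The number of devices is not stated in the source, so `num_devices=1` is an assumption, and `effective_batch_size` is an illustrative helper rather than a Trainer function.

```python
def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Samples that contribute to a single optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the training command; a single device is assumed.
print(effective_batch_size(8, 2))  # 16
```

On a multi-device run the same flags would scale the effective batch accordingly, so the learning rate may need revisiting.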

Advanced Usage

Reload and Perform Inference

You can easily load this trained model and perform inference as follows:

import torch
import requests
import matplotlib.pyplot as plt

from PIL import Image
from transformers import Mask2FormerForUniversalSegmentation, Mask2FormerImageProcessor

# Load image
image = Image.open(requests.get("http://farm4.staticflickr.com/3017/3071497290_31f0393363_z.jpg", stream=True).raw)

# Load model and image processor
device = "cuda" if torch.cuda.is_available() else "cpu"  # fall back to CPU when no GPU is present
checkpoint = "qubvel-hf/finetune-instance-segmentation-ade20k-mini-mask2former"

model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint, device_map=device)
image_processor = Mask2FormerImageProcessor.from_pretrained(checkpoint)

# Run inference on image
inputs = image_processor(images=[image], return_tensors="pt").to(device)
with torch.no_grad():
    outputs = model(**inputs)

# Post-process outputs
outputs = image_processor.post_process_instance_segmentation(outputs, target_sizes=[image.size[::-1]])

print("Mask shape: ", outputs[0]["segmentation"].shape)
print("Mask values: ", outputs[0]["segmentation"].unique())
for segment in outputs[0]["segments_info"]:
    print("Segment: ", segment)

The output will be similar to:

Mask shape:  torch.Size([427, 640])
Mask values:  tensor([-1.,  0.,  1.,  2.,  3.,  4.,  5.,  6.])
Segment:  {'id': 0, 'label_id': 0, 'was_fused': False, 'score': 0.946127}
Segment:  {'id': 1, 'label_id': 1, 'was_fused': False, 'score': 0.961582}
Segment:  {'id': 2, 'label_id': 1, 'was_fused': False, 'score': 0.968367}
Segment:  {'id': 3, 'label_id': 1, 'was_fused': False, 'score': 0.819527}
Segment:  {'id': 4, 'label_id': 1, 'was_fused': False, 'score': 0.655761}
Segment:  {'id': 5, 'label_id': 1, 'was_fused': False, 'score': 0.531299}
Segment:  {'id': 6, 'label_id': 1, 'was_fused': False, 'score': 0.929477}
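In this output, each pixel of the segmentation map holds the `id` of the instance it belongs to, and the value -1 appears to mark pixels that were not assigned to any instance. A per-instance binary mask can therefore be obtained by comparing the map with a segment id. The sketch below also drops low-scoring segments. It runs on a tiny synthetic array instead of real model output, and `instance_masks` and `min_score` are illustrative names that do not come from the original README.

```python
import numpy as np


def instance_masks(segmentation, segments_info, min_score=0.7):
    """Return (segment, boolean mask) pairs for segments scoring at least min_score."""
    results = []
    for segment in segments_info:
        if segment["score"] < min_score:
            continue  # skip low-confidence instances
        mask = segmentation == segment["id"]  # True where the pixel belongs to this instance
        results.append((segment, mask))
    return results


# Synthetic stand-in for outputs[0]["segmentation"] and outputs[0]["segments_info"].
segmentation = np.array([[-1, 0, 0],
                         [1, 1, -1],
                         [2, -1, -1]])
segments_info = [
    {"id": 0, "label_id": 0, "score": 0.95},
    {"id": 1, "label_id": 1, "score": 0.96},
    {"id": 2, "label_id": 1, "score": 0.53},
]

for segment, mask in instance_masks(segmentation, segments_info):
    print(segment["id"], int(mask.sum()))  # instance id and its pixel count
```

With real model output, convert the tensor first, for example with `outputs[0]["segmentation"].cpu().numpy()`.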

Visualize the Results

Use the following code to visualize the results:

import numpy as np
import matplotlib.pyplot as plt

# Move the tensor to the CPU first; .numpy() fails on a CUDA tensor
segmentation = outputs[0]["segmentation"].cpu().numpy()

plt.figure(figsize=(10, 10))
plt.subplot(1, 2, 1)
plt.imshow(np.array(image))
plt.axis("off")
plt.subplot(1, 2, 2)
plt.imshow(segmentation)
plt.axis("off")
plt.show()
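Passing the raw id map to `plt.imshow` colors instances along a continuous colormap, so instances with neighbouring ids can look alike. One possible alternative, sketched here under the assumption that negative values mean "no instance", is to give each id its own random color and leave the background black. `colorize_instances` is an illustrative helper, not part of the original README.

```python
import numpy as np


def colorize_instances(segmentation, seed=0):
    """Map an instance-id array to an RGB image, one random color per instance."""
    rng = np.random.default_rng(seed)
    colored = np.zeros((*segmentation.shape, 3), dtype=np.uint8)
    for instance_id in np.unique(segmentation):
        if instance_id < 0:
            continue  # negative ids are treated as background and stay black
        # Colors are drawn from 64..255 so instances stand out from the background.
        colored[segmentation == instance_id] = rng.integers(64, 256, size=3)
    return colored


demo = np.array([[-1, 0],
                 [1, 1]])
print(colorize_instances(demo).shape)  # (2, 2, 3)
```

The result can be shown with `plt.imshow(colorize_instances(segmentation))` in place of the raw map.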

📄 License

This project is licensed under the Apache License 2.0.
