Finetune-instance-segmentation-ade20k-mini-mask2former Open-Source Model


Finetune Instance Segmentation Ade20k Mini Mask2former

Developed by qubvel-hf

This is a Mask2Former model fine-tuned on the ADE20K-mini dataset for instance segmentation. It recognizes the 'person' and 'car' classes in images.

Image Segmentation

Transformers

#Instance segmentation #ADE20K fine-tuning #Small-image processing

Downloads: 4,500

Release date: 5/26/2024

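Model Overview

The model uses the Mask2Former architecture and was fine-tuned on a subset of the ADE20K dataset. It targets instance segmentation and segments person and car objects in images.

Model Features

Instance segmentation

Identifies and segments person and car objects in images

Lightweight model

Built on a Swin-Tiny backbone, which reduces compute requirements relative to larger backbones while preserving performance

Fast inference

The fine-tuned model completes instance segmentation in a reasonable amount of time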

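Model Capabilities

Image segmentation

Object recognition

Instance segmentation

Computer vision

Use Cases

Intelligent surveillance

Crowd detection

Detects and segments people in surveillance video

Outlines the human figures found in each image

Autonomous driving

Vehicle recognition

Identifies vehicles on the road and segments their outlines

Separates individual vehicle instances from one another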

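🚀 Instance Segmentation Example

This project covers image segmentation. Using a pretrained model and a specific dataset, it shows how to fine-tune and run inference for an instance segmentation task, and it provides practical example code for related research and applications.

🚀 Quick Start

✨ Key Features

Training is managed with the 🤗 Trainer API, which supports distributed environments.
A Mask2Former model is fine-tuned on a subsample of the ADE20K dataset.
Complete training and inference code examples are provided.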


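💻 Usage Examples

Basic Usage

PyTorch version with Trainer

This model was trained with the script run_instance_segmentation.py, which uses the 🤗 Trainer API to manage training automatically, including in distributed environments.

Here we fine-tune a Mask2Former model on a subsample of the ADE20K dataset. We created a small dataset of about 2,000 images that contains annotations only for "person" and "car"; all other pixels are labeled "background".

The label2id mapping for this model is: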

label2id = {
    "person": 0,
    "car": 1,
}
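When reading predictions, the inverse mapping is usually more convenient, because the post-processed segments carry a numeric label_id. A minimal sketch, assuming the two-class mapping shown above; the dict named id2label is built locally here for illustration and is not read from the checkpoint:

```python
# Mapping from the model card: class name -> class index.
label2id = {
    "person": 0,
    "car": 1,
}

# Invert it so a predicted label_id can be turned back into a class name.
id2label = {index: name for name, index in label2id.items()}

print(id2label[0])  # person
print(id2label[1])  # car
```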

Train the model with the following command:

python run_instance_segmentation.py \
    --model_name_or_path facebook/mask2former-swin-tiny-coco-instance \
    --output_dir finetune-instance-segmentation-ade20k-mini-mask2former \
    --dataset_name qubvel-hf/ade20k-mini \
    --do_reduce_labels \
    --image_height 256 \
    --image_width 256 \
    --do_train \
    --fp16 \
    --num_train_epochs 40 \
    --learning_rate 1e-5 \
    --lr_scheduler_type constant \
    --per_device_train_batch_size 8 \
    --gradient_accumulation_steps 2 \
    --dataloader_num_workers 8 \
    --dataloader_persistent_workers \
    --dataloader_prefetch_factor 4 \
    --do_eval \
    --evaluation_strategy epoch \
    --logging_strategy epoch \
    --save_strategy epoch \
    --save_total_limit 2 \
    --push_to_hub
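The batch size and gradient accumulation flags in the command above combine into a larger effective batch per optimizer step. The arithmetic can be sketched as follows, assuming a single training process (num_devices is a hypothetical value, not something the command specifies):

```python
# Values taken from the training command above.
per_device_train_batch_size = 8
gradient_accumulation_steps = 2

# Assumption: one GPU/process. With more devices, this factor grows accordingly.
num_devices = 1

# Number of images that contribute to each optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

print(effective_batch_size)  # 16
```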

Advanced Usage

Reloading and inference

You can load the trained model and run inference as follows:

import torch
import requests
import matplotlib.pyplot as plt

from PIL import Image
from transformers import Mask2FormerForUniversalSegmentation, Mask2FormerImageProcessor

# Load the image
image = Image.open(requests.get("http://farm4.staticflickr.com/3017/3071497290_31f0393363_z.jpg", stream=True).raw)

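# Load the model and image processor (fall back to the CPU when no GPU is available)
device = "cuda" if torch.cuda.is_available() else "cpu"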
checkpoint = "qubvel-hf/finetune-instance-segmentation-ade20k-mini-mask2former"

model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint, device_map=device)
image_processor = Mask2FormerImageProcessor.from_pretrained(checkpoint)

# Run inference on the image
inputs = image_processor(images=[image], return_tensors="pt").to(device)
with torch.no_grad():
    outputs = model(**inputs)

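# Post-process the outputs. PIL's image.size is (width, height),
# so it is reversed to the (height, width) order that target_sizes expects.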
outputs = image_processor.post_process_instance_segmentation(outputs, target_sizes=[image.size[::-1]])

print("Mask shape: ", outputs[0]["segmentation"].shape)
print("Mask values: ", outputs[0]["segmentation"].unique())
for segment in outputs[0]["segments_info"]:
    print("Segment: ", segment)

Running the code above produces output like the following:

Mask shape:  torch.Size([427, 640])
Mask values:  tensor([-1.,  0.,  1.,  2.,  3.,  4.,  5.,  6.])
Segment:  {'id': 0, 'label_id': 0, 'was_fused': False, 'score': 0.946127}
Segment:  {'id': 1, 'label_id': 1, 'was_fused': False, 'score': 0.961582}
Segment:  {'id': 2, 'label_id': 1, 'was_fused': False, 'score': 0.968367}
Segment:  {'id': 3, 'label_id': 1, 'was_fused': False, 'score': 0.819527}
Segment:  {'id': 4, 'label_id': 1, 'was_fused': False, 'score': 0.655761}
Segment:  {'id': 5, 'label_id': 1, 'was_fused': False, 'score': 0.531299}
Segment:  {'id': 6, 'label_id': 1, 'was_fused': False, 'score': 0.929477}
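The segment list can be filtered by score and summarized per class. The sketch below works on a copy of the sample output above rather than on live model output; the 0.6 cutoff (score_threshold) and the local id2label dict are illustrative assumptions. The post-processing method also appears to accept its own threshold argument, which may be the better place to filter in practice.

```python
from collections import Counter

# segments_info copied from the sample output above.
segments_info = [
    {"id": 0, "label_id": 0, "was_fused": False, "score": 0.946127},
    {"id": 1, "label_id": 1, "was_fused": False, "score": 0.961582},
    {"id": 2, "label_id": 1, "was_fused": False, "score": 0.968367},
    {"id": 3, "label_id": 1, "was_fused": False, "score": 0.819527},
    {"id": 4, "label_id": 1, "was_fused": False, "score": 0.655761},
    {"id": 5, "label_id": 1, "was_fused": False, "score": 0.531299},
    {"id": 6, "label_id": 1, "was_fused": False, "score": 0.929477},
]

# Inverse of the model card's label2id mapping.
id2label = {0: "person", 1: "car"}

# Hypothetical cutoff chosen for illustration only.
score_threshold = 0.6

# Keep only the segments the model is reasonably confident about.
confident = [s for s in segments_info if s["score"] >= score_threshold]

# Count the remaining instances per class name.
counts = Counter(id2label[s["label_id"]] for s in confident)

print(len(confident))  # 6
print(dict(counts))    # {'person': 1, 'car': 5}
```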

Visualize the result with the following code:

import numpy as np
import matplotlib.pyplot as plt

segmentation = outputs[0]["segmentation"].numpy()

plt.figure(figsize=(10, 10))
plt.subplot(1, 2, 1)
plt.imshow(np.array(image))
plt.axis("off")
plt.subplot(1, 2, 2)
plt.imshow(segmentation)
plt.axis("off")
plt.show()
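Beyond plotting the whole map, individual instances can be pulled out of the segmentation array. The sketch below uses a small hand-made array in place of the real output and assumes, as the sample output suggests, that -1 marks pixels belonging to no instance. The helpers instance_masks and bounding_box are hypothetical names introduced here, not part of Transformers.

```python
import numpy as np

# Toy stand-in for outputs[0]["segmentation"].numpy():
# -1 means "no instance"; every other value is a segment id.
segmentation = np.array(
    [
        [-1, -1, 0, 0],
        [-1, 1, 0, 0],
        [1, 1, -1, -1],
    ],
    dtype=np.float32,
)


def instance_masks(segmentation):
    """Return {segment_id: boolean mask} for every id except -1."""
    ids = [int(i) for i in np.unique(segmentation) if i >= 0]
    return {i: segmentation == i for i in ids}


def bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the True pixels, inclusive."""
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


masks = instance_masks(segmentation)
for segment_id, mask in masks.items():
    # Print the id, the pixel area, and the bounding box of each instance.
    print(segment_id, int(mask.sum()), bounding_box(mask))
```

Each boolean mask has the same height and width as the segmentation map, so it can be passed to plt.imshow directly or used to index into the original image array.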


📄 License

Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.