Finetune-instance-segmentation-ade20k-mini-mask2former: an open-source model for identifying the person and car classes in images


Finetune Instance Segmentation Ade20k Mini Mask2former

Developed by qubvel-hf

This is a Mask2Former model fine-tuned on the ADE20K-mini dataset for instance segmentation. It is restricted to two categories, "person" and "car".

Image Segmentation

Transformers

#InstanceSegmentation #ADE20KFineTuning #SmallImageProcessing

Downloads: 4,500

Release date: 5/26/2024

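Model overview

This model is based on the Mask2Former architecture and was fine-tuned on a subset of the ADE20K dataset. It is specialized for instance segmentation of person and car objects in images.

Model features

Efficient instance segmentation

Identifies and segments person and car objects in images

Lightweight model

Built on the Swin-Tiny backbone, which lowers compute requirements while maintaining performance

Fast inference

Completes instance segmentation in a reasonable time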

Model capabilities

Image segmentation

Object recognition

Instance segmentation

Computer vision

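Use cases

Intelligent surveillance

Crowd detection

Detect and segment crowds in surveillance footage

Can identify human outlines in images

Autonomous driving

Vehicle identification

Identify vehicles on the road and segment their outlines

Can segment distinct vehicle instances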

🚀 Instance segmentation example

This project focuses on image segmentation: it fine-tunes a Mask2Former model on a subset of the ADE20K dataset to perform instance segmentation.

🚀 Quick start

Contents:

PyTorch version with Trainer
Reload and run inference
Notes on custom data

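✨ Key features

This model was trained with the run_instance_segmentation.py script.
The script uses the 🤗 Trainer API, which manages training automatically, including in distributed environments.
It fine-tunes a Mask2Former model on a subsample of the ADE20K dataset.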

📦 Installation

Run training with the following command:

python run_instance_segmentation.py \
    --model_name_or_path facebook/mask2former-swin-tiny-coco-instance \
    --output_dir finetune-instance-segmentation-ade20k-mini-mask2former \
    --dataset_name qubvel-hf/ade20k-mini \
    --do_reduce_labels \
    --image_height 256 \
    --image_width 256 \
    --do_train \
    --fp16 \
    --num_train_epochs 40 \
    --learning_rate 1e-5 \
    --lr_scheduler_type constant \
    --per_device_train_batch_size 8 \
    --gradient_accumulation_steps 2 \
    --dataloader_num_workers 8 \
    --dataloader_persistent_workers \
    --dataloader_prefetch_factor 4 \
    --do_eval \
    --evaluation_strategy epoch \
    --logging_strategy epoch \
    --save_strategy epoch \
    --save_total_limit 2 \
    --push_to_hub
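
The batch-related flags in this command interact: with gradient accumulation, the optimizer takes one step per `per_device_train_batch_size × gradient_accumulation_steps × number of devices` samples. A minimal sketch of that arithmetic follows. The helper name `effective_batch_size` and its `num_devices` argument are illustrative and not part of the script, and the single-GPU setting is an assumption.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples that contribute to each optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the command above, assuming a single GPU.
print(effective_batch_size(8, 2))  # 16
```

When the same command is launched on several devices, the per-step batch grows proportionally, so the learning rate may need to be revisited.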

💻 Usage examples

Basic usage

The model's label2id mapping is:

label2id = {
    "person": 0,
    "car": 1,
}

Advanced usage

The trained model can be loaded and used for inference:

import torch
import requests
import matplotlib.pyplot as plt

from PIL import Image
from transformers import Mask2FormerForUniversalSegmentation, Mask2FormerImageProcessor

# Load the image
image = Image.open(requests.get("http://farm4.staticflickr.com/3017/3071497290_31f0393363_z.jpg", stream=True).raw)

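# Load the model and image processor (fall back to CPU when no GPU is available)
device = "cuda" if torch.cuda.is_available() else "cpu"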
checkpoint = "qubvel-hf/finetune-instance-segmentation-ade20k-mini-mask2former"

model = Mask2FormerForUniversalSegmentation.from_pretrained(checkpoint, device_map=device)
image_processor = Mask2FormerImageProcessor.from_pretrained(checkpoint)

# Run inference on the image
inputs = image_processor(images=[image], return_tensors="pt").to(device)
with torch.no_grad():
    outputs = model(**inputs)

# Post-process the outputs; PIL's image.size is (width, height), so it is reversed to (height, width)
outputs = image_processor.post_process_instance_segmentation(outputs, target_sizes=[image.size[::-1]])

print("Mask shape: ", outputs[0]["segmentation"].shape)
print("Mask values: ", outputs[0]["segmentation"].unique())
for segment in outputs[0]["segments_info"]:
    print("Segment: ", segment)

Mask shape:  torch.Size([427, 640])
Mask values:  tensor([-1.,  0.,  1.,  2.,  3.,  4.,  5.,  6.])
Segment:  {'id': 0, 'label_id': 0, 'was_fused': False, 'score': 0.946127}
Segment:  {'id': 1, 'label_id': 1, 'was_fused': False, 'score': 0.961582}
Segment:  {'id': 2, 'label_id': 1, 'was_fused': False, 'score': 0.968367}
Segment:  {'id': 3, 'label_id': 1, 'was_fused': False, 'score': 0.819527}
Segment:  {'id': 4, 'label_id': 1, 'was_fused': False, 'score': 0.655761}
Segment:  {'id': 5, 'label_id': 1, 'was_fused': False, 'score': 0.531299}
Segment:  {'id': 6, 'label_id': 1, 'was_fused': False, 'score': 0.929477}
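
Each entry in `segments_info` carries a `label_id` and a confidence `score`, so low-confidence instances can be dropped and the rest counted per class. The sketch below uses the segments printed above. `id2label` is simply the inverse of the `label2id` mapping, while the helper name `count_instances` and the 0.9 threshold are illustrative choices, not part of the model card.

```python
from collections import Counter

label2id = {"person": 0, "car": 1}
id2label = {idx: name for name, idx in label2id.items()}

# Segments copied from the printed output (the 'was_fused' field is omitted).
segments_info = [
    {"id": 0, "label_id": 0, "score": 0.946127},
    {"id": 1, "label_id": 1, "score": 0.961582},
    {"id": 2, "label_id": 1, "score": 0.968367},
    {"id": 3, "label_id": 1, "score": 0.819527},
    {"id": 4, "label_id": 1, "score": 0.655761},
    {"id": 5, "label_id": 1, "score": 0.531299},
    {"id": 6, "label_id": 1, "score": 0.929477},
]


def count_instances(segments, min_score=0.0):
    """Count instances per class name, keeping only scores >= min_score."""
    return Counter(
        id2label[seg["label_id"]] for seg in segments if seg["score"] >= min_score
    )


print(count_instances(segments_info))       # Counter({'car': 6, 'person': 1})
print(count_instances(segments_info, 0.9))  # Counter({'car': 3, 'person': 1})
```

With a live model, the same function can be applied directly to `outputs[0]["segments_info"]`.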

To visualize the results, use the following code:

import numpy as np
import matplotlib.pyplot as plt

# .cpu() is a no-op for CPU tensors and avoids an error if the map lives on the GPU
segmentation = outputs[0]["segmentation"].cpu().numpy()

plt.figure(figsize=(10, 10))
plt.subplot(1, 2, 1)
plt.imshow(np.array(image))
plt.axis("off")
plt.subplot(1, 2, 2)
plt.imshow(segmentation)
plt.axis("off")
plt.show()
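
Beyond plotting, the segmentation map can be split into one boolean mask per instance: each pixel holds the `id` of the segment it belongs to, and the printed mask values suggest that `-1` marks pixels assigned to no instance. The following is a minimal sketch on a toy array standing in for `outputs[0]["segmentation"]`. The helper name `instance_masks` is hypothetical.

```python
import numpy as np


def instance_masks(segmentation: np.ndarray) -> dict:
    """Map each segment id to a boolean mask; negative ids (no instance) are skipped."""
    ids = np.unique(segmentation)
    return {int(i): segmentation == i for i in ids if i >= 0}


# Toy 3x4 map: -1 is background, 0 and 1 are two instances.
toy = np.array([
    [-1.0, 0.0, 0.0, -1.0],
    [-1.0, 0.0, 1.0, 1.0],
    [-1.0, -1.0, 1.0, 1.0],
])

masks = instance_masks(toy)
for seg_id, mask in masks.items():
    # Pixel area covered by each instance
    print(seg_id, int(mask.sum()))
```

With the real model output, the input would be `outputs[0]["segmentation"].cpu().numpy()`, and each resulting mask has the same height and width as the original image.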

Result

📄 License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.