Open-source sam2-hiera-tiny model: an efficient promptable visual segmentation tool for images and videos

Sam2 Hiera Tiny

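Developed by facebook

SAM 2 is a foundation model developed by FAIR for promptable visual segmentation in images and videos, with support for efficient segmentation.

Image Segmentation | Open-source license: Apache-2.0 | #Promptable segmentation #Video object tracking #Zero-shot learning

Downloads: 41.88k

Release date: 8/2/2024

Model Overview

SAM 2 is an advanced visual segmentation model that quickly generates high-quality segmentation masks for images and videos from user-provided prompts such as points and boxes.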

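Model Features

Multi-prompt support

Supports interactive segmentation driven by different prompt types, such as points and boxes.

Works on both images and videos

A single model architecture handles both image and video segmentation tasks.

Efficient inference

Supports bfloat16 precision and CUDA acceleration for fast inference.

Real-time propagation

Propagates prompts across frames in real time during video processing, so objects can be tracked.

Model Capabilities

Image segmentation

Video object segmentation

Interactive segmentation

Mask generation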

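Use Cases

Computer vision

Image editing

Quickly isolate objects in an image for editing

High-quality object segmentation masks

Video analysis

Track specific objects in a video

Consistent object segmentation across frames

Augmented reality

AR content overlay

Segment objects in real-world scenes in real time

Provides accurate object boundaries for AR applications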
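
As an illustration of the image-editing use case, the sketch below applies a binary mask to an image so that only the segmented object remains. It is a minimal, dependency-free example: the `cut_out` helper and the tiny 2x2 image are hypothetical, and nested lists stand in for the NumPy arrays that a real pipeline would normally use.

```python
def cut_out(image, mask, fill=(0, 0, 0)):
    """Keep pixels where the mask is True and replace all others with `fill`.

    `image` is a list of rows of RGB tuples; `mask` is a list of rows of bools.
    Hypothetical helper for illustration only, not part of SAM 2.
    """
    same_height = len(image) == len(mask)
    same_width = all(len(r) == len(m) for r, m in zip(image, mask))
    if not (same_height and same_width):
        raise ValueError("image and mask must have the same height and width")
    return [
        [pixel if keep else fill for pixel, keep in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


# A 2x2 toy image and a mask that selects the diagonal pixels.
image = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 0)]]
mask = [[True, False], [False, True]]
result = cut_out(image, mask)
print(result)
```

With array libraries the same operation is usually a single masked assignment; the explicit loops here only make the per-pixel logic visible.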

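🚀 SAM 2: Segment Anything in Images and Videos

SAM 2 is a foundation model from FAIR for promptable visual segmentation in images and videos. See the SAM 2 paper for details.

The official code is publicly released in the official SAM 2 repository.

🚀 Quick Start

💻 Usage Examples

Basic usage

Image prediction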

import torch
from sam2.sam2_image_predictor import SAM2ImagePredictor

predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-tiny")

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    predictor.set_image(<your_image>)
    masks, _, _ = predictor.predict(<input_prompts>)
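
The `<input_prompts>` placeholder above stands for point and/or box prompts. The sketch below shows one way to assemble them as keyword arguments. It assumes the `point_coords`, `point_labels`, and `box` argument names of the predictor's `predict` method, with points given as (x, y) pixel coordinates, label 1 for foreground and 0 for background, and boxes in (x0, y0, x1, y1) order. The `build_prompts` helper is hypothetical, and plain lists are used so the example runs without extra dependencies.

```python
from typing import Optional, Sequence

Point = tuple[float, float]               # (x, y) in pixels
Box = tuple[float, float, float, float]   # (x0, y0, x1, y1) in pixels


def build_prompts(
    fg_points: Sequence[Point] = (),
    bg_points: Sequence[Point] = (),
    box: Optional[Box] = None,
) -> dict:
    """Assemble prompt keyword arguments (hypothetical helper, not part of SAM 2)."""
    if not fg_points and not bg_points and box is None:
        raise ValueError("provide at least one point or a box")
    prompts: dict = {}
    if fg_points or bg_points:
        prompts["point_coords"] = [list(p) for p in (*fg_points, *bg_points)]
        # Label 1 marks a foreground click, 0 marks a background click.
        prompts["point_labels"] = [1] * len(fg_points) + [0] * len(bg_points)
    if box is not None:
        x0, y0, x1, y1 = box
        if x1 <= x0 or y1 <= y0:
            raise ValueError("box must satisfy x0 < x1 and y0 < y1")
        prompts["box"] = [x0, y0, x1, y1]
    return prompts


prompts = build_prompts(
    fg_points=[(320, 240)],
    bg_points=[(10, 10)],
    box=(200, 120, 460, 380),
)
print(prompts)
```

With the `sam2` package installed, converting each value with `np.asarray` and calling `predictor.predict(**prompts)` would be the likely next step; treat that as an assumption to check against the library's documentation.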

Video prediction

import torch
from sam2.sam2_video_predictor import SAM2VideoPredictor

predictor = SAM2VideoPredictor.from_pretrained("facebook/sam2-hiera-tiny")

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    state = predictor.init_state(<your_video>)

    # add new prompts and instantly get the output on the same frame
    frame_idx, object_ids, masks = predictor.add_new_points_or_box(state, <your_prompts>)

    # propagate the prompts to get masklets throughout the video
    for frame_idx, object_ids, masks in predictor.propagate_in_video(state):
        ...

See the demo notebooks for details.

📄 License

This project is released under the Apache-2.0 license.

📚 Citation

To cite the paper, model, or software, use the following:
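
The propagation loop above yields one `(frame_idx, object_ids, masks)` tuple per frame. A common follow-up is to collect these into a lookup from frame index to object id to binary mask. The sketch below shows that bookkeeping under two assumptions: the per-object outputs are logits, so values above zero count as foreground, and a hypothetical `fake_propagation` generator stands in for `predictor.propagate_in_video(state)` so that the example runs without a model or GPU. Nested lists replace the tensors the library would return.

```python
from typing import Iterable, Iterator

Mask = list[list[bool]]
Logits = list[list[float]]


def fake_propagation() -> Iterator[tuple[int, list[int], list[Logits]]]:
    """Hypothetical stand-in for propagate_in_video: two frames, two objects."""
    yield 0, [1, 2], [[[2.5, -1.0], [0.3, -4.0]], [[-3.0, 1.2], [-0.5, 0.0]]]
    yield 1, [1, 2], [[[1.8, 0.4], [-0.2, -2.0]], [[-1.0, 2.2], [-0.1, 3.0]]]


def collect_masks(
    outputs: Iterable[tuple[int, list[int], list[Logits]]],
) -> dict[int, dict[int, Mask]]:
    """Map frame index -> object id -> binary mask (logit > 0 means foreground)."""
    segments: dict[int, dict[int, Mask]] = {}
    for frame_idx, object_ids, logits in outputs:
        segments[frame_idx] = {
            obj_id: [[value > 0.0 for value in row] for row in logits[i]]
            for i, obj_id in enumerate(object_ids)
        }
    return segments


segments = collect_masks(fake_propagation())
print(segments[0][1])
```

Keying by frame and then by object id makes it straightforward to look up the mask of one tracked object on any frame, for example when rendering overlays.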

@article{ravi2024sam2,
  title={SAM 2: Segment Anything in Images and Videos},
  author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and Mintun, Eric and Pan, Junting and Alwala, Kalyan Vasudev and Carion, Nicolas and Wu, Chao-Yuan and Girshick, Ross and Doll{\'a}r, Piotr and Feichtenhofer, Christoph},
  journal={arXiv preprint arXiv:2408.00714},
  url={https://arxiv.org/abs/2408.00714},
  year={2024}
}