🚀 SAM 2: Segment Anything in Images and Videos
SAM 2 is a foundation model developed by FAIR that tackles promptable visual segmentation in images and videos. For more information, see the SAM 2 paper.
The official code is publicly available in this repository.
🚀 Quick Start
💻 Usage Examples
Basic Usage
Here is a code example for image prediction:
import torch
from sam2.sam2_image_predictor import SAM2ImagePredictor

predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-small")

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    predictor.set_image(<your_image>)
    masks, _, _ = predictor.predict(<input_prompts>)
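To make the placeholders concrete, here is a minimal sketch of a single-point prompt. The file name truck.jpg, the click coordinates, and the point_coords / point_labels / multimask_output keyword arguments are illustrative assumptions, not part of the snippet above:

import numpy as np
import torch
from PIL import Image
from sam2.sam2_image_predictor import SAM2ImagePredictor

predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-small")

# Assumption: the image is loaded as an HxWx3 uint8 RGB array.
image = np.array(Image.open("truck.jpg").convert("RGB"))

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    predictor.set_image(image)
    # Assumption: one positive click at pixel (x=500, y=375); label 1 = foreground.
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[500, 375]]),
        point_labels=np.array([1]),
        multimask_output=True,  # return several candidate masks with quality scores
    )

# masks has shape (num_masks, H, W); keep the highest-scoring candidate.
best_mask = masks[np.argmax(scores)]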
Advanced Usage
Here is a code example for video prediction:
import torch
from sam2.sam2_video_predictor import SAM2VideoPredictor

predictor = SAM2VideoPredictor.from_pretrained("facebook/sam2-hiera-small")

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    state = predictor.init_state(<your_video>)

    # Add new prompts and instantly get the output on the same frame.
    frame_idx, object_ids, masks = predictor.add_new_points_or_box(state, <your_prompts>)

    # Propagate the prompts to get masklets throughout the video.
    for frame_idx, object_ids, masks in predictor.propagate_in_video(state):
        ...
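For concreteness, here is the same loop with values filled in: a minimal sketch assuming a local file video.mp4, a single positive click on frame 0 for one object, and the frame_idx / obj_id / points / labels keyword arguments (the file name, coordinates, object id, and the logit threshold are illustrative assumptions):

import numpy as np
import torch
from sam2.sam2_video_predictor import SAM2VideoPredictor

predictor = SAM2VideoPredictor.from_pretrained("facebook/sam2-hiera-small")

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    # Assumption: init_state accepts a path to an MP4 file
    # (or a directory of JPEG frames).
    state = predictor.init_state("video.mp4")

    # Prompt object 1 with one positive click (x=210, y=350) on frame 0.
    frame_idx, object_ids, masks = predictor.add_new_points_or_box(
        state,
        frame_idx=0,
        obj_id=1,
        points=np.array([[210, 350]], dtype=np.float32),
        labels=np.array([1], dtype=np.int32),  # 1 = foreground, 0 = background
    )

    # Track the object through the remaining frames, collecting one binary
    # mask per object per frame (masks are logits; threshold at 0).
    video_segments = {}
    for frame_idx, object_ids, masks in predictor.propagate_in_video(state):
        video_segments[frame_idx] = {
            obj_id: (masks[i] > 0.0).cpu().numpy()
            for i, obj_id in enumerate(object_ids)
        }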
For details, refer to the demo notebooks.
📄 License
This project is released under the Apache-2.0 license.
📚 Citation
To cite the paper, model, or software, please use the BibTeX entry below:
@article{ravi2024sam2,
title={SAM 2: Segment Anything in Images and Videos},
author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and Mintun, Eric and Pan, Junting and Alwala, Kalyan Vasudev and Carion, Nicolas and Wu, Chao-Yuan and Girshick, Ross and Doll{\'a}r, Piotr and Feichtenhofer, Christoph},
journal={arXiv preprint arXiv:2408.00714},
url={https://arxiv.org/abs/2408.00714},
year={2024}
}