license: apache-2.0
pipeline_tag: mask-generation
library_name: sam2
Repository for SAM 2 (Segment Anything in Images and Videos), a foundation model from FAIR for promptable visual segmentation in images and videos. See the SAM 2 paper for more details.
The official code is publicly released in this repository.
Usage
Image prediction:
import torch
from sam2.sam2_image_predictor import SAM2ImagePredictor

predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-base-plus")

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    predictor.set_image(<your_image>)
    masks, _, _ = predictor.predict(<input_prompts>)
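As a concrete illustration of the placeholders above, here is a minimal sketch assuming the image is loaded with PIL and the prompt is a single foreground point; the file name and click coordinates are made up for this example.

import numpy as np
import torch
from PIL import Image
from sam2.sam2_image_predictor import SAM2ImagePredictor

predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-base-plus")

# Hypothetical input image; any RGB image works.
image = np.array(Image.open("example.jpg").convert("RGB"))

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    predictor.set_image(image)
    # A single (x, y) click; label 1 marks it as foreground.
    masks, scores, logits = predictor.predict(
        point_coords=np.array([[500, 375]]),
        point_labels=np.array([1]),
        multimask_output=True,
    )

With multimask_output=True the predictor returns several candidate masks along with confidence scores, which helps when a single click is ambiguous.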
Video prediction:
import torch
from sam2.sam2_video_predictor import SAM2VideoPredictor

predictor = SAM2VideoPredictor.from_pretrained("facebook/sam2-hiera-base-plus")

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    state = predictor.init_state(<your_video>)

    # add new prompts and instantly get the output on the same frame
    frame_idx, object_ids, masks = predictor.add_new_points_or_box(state, <your_prompts>)

    # propagate the prompts to get masklets throughout the video
    for frame_idx, object_ids, masks in predictor.propagate_in_video(state):
        ...
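Filling in the same placeholders for video, the hedged sketch below assumes the video is available as a directory of JPEG frames and tracks one object from a single click on the first frame; the path, object id, and click coordinates are illustrative, not part of the model card.

import numpy as np
import torch
from sam2.sam2_video_predictor import SAM2VideoPredictor

predictor = SAM2VideoPredictor.from_pretrained("facebook/sam2-hiera-base-plus")

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    # Hypothetical path: a directory of video frames extracted as JPEGs.
    state = predictor.init_state("video_frames/")

    # Click once on frame 0 to start tracking object id 1.
    frame_idx, object_ids, masks = predictor.add_new_points_or_box(
        state,
        frame_idx=0,
        obj_id=1,
        points=np.array([[210, 350]], dtype=np.float32),
        labels=np.array([1], dtype=np.int32),
    )

    # Propagate the prompt; masks are logits, so threshold at 0 to binarize.
    segments = {}
    for frame_idx, object_ids, masks in predictor.propagate_in_video(state):
        segments[frame_idx] = (masks > 0.0).cpu().numpy()

Keeping the per-frame results in a dict keyed by frame index makes it easy to overlay the masklets on the source frames afterwards.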
Refer to the demo notebooks for details.
Citation
To cite the paper, model, or software, please use the below:
@article{ravi2024sam2,
title={SAM 2: Segment Anything in Images and Videos},
author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and Mintun, Eric and Pan, Junting and Alwala, Kalyan Vasudev and Carion, Nicolas and Wu, Chao-Yuan and Girshick, Ross and Doll{\'a}r, Piotr and Feichtenhofer, Christoph},
journal={arXiv preprint arXiv:2408.00714},
url={https://arxiv.org/abs/2408.00714},
year={2024}
}