Sam2-hiera-base-plus Open-source Model - Free Deployment, Supports Efficient Segmentation with Image and Video Prompts

Sam2 Hiera Base Plus

Developed by facebook

SAM 2 is a foundation model for promptable visual segmentation in images and videos, developed by FAIR. Given prompts such as points or boxes, it produces segmentation masks for the indicated objects.

Category: Image Segmentation
Open Source License: Apache-2.0
Tags: #Promptable Segmentation #Video Object Tracking #Multimodal Input

Downloads 18.17k

Release Time: 8/2/2024

Model Overview

SAM 2 is a foundational model for image and video segmentation, capable of quickly generating high-quality segmentation masks based on user-provided prompts (such as points or boxes).

Model Features

Promptable Segmentation

Supports interactive segmentation through prompts such as points or boxes.
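
As a minimal sketch of what such prompts can look like, the helper below assembles point and box prompts into keyword arguments. The key names `point_coords`, `point_labels`, and `box` follow the keyword arguments of `SAM2ImagePredictor.predict` in the official repo. The helper `build_prompts` itself is hypothetical and not part of sam2, and it uses plain lists where the predictor expects NumPy arrays.

```python
from typing import Optional, Sequence


def build_prompts(
    points: Sequence[tuple] = (),
    box: Optional[Sequence[float]] = None,
) -> dict:
    """Assemble point/box prompts (hypothetical helper, not part of sam2).

    points: (x, y, label) triples in pixel coordinates; label 1 marks the
            object (foreground), label 0 marks background.
    box:    optional (x_min, y_min, x_max, y_max) in pixel coordinates.
    """
    prompts = {}
    if points:
        for _, _, label in points:
            if label not in (0, 1):
                raise ValueError(f"label must be 0 or 1, got {label}")
        prompts["point_coords"] = [[float(x), float(y)] for x, y, _ in points]
        prompts["point_labels"] = [int(label) for _, _, label in points]
    if box is not None:
        x0, y0, x1, y1 = box
        if x1 <= x0 or y1 <= y0:
            raise ValueError("box must satisfy x_min < x_max and y_min < y_max")
        prompts["box"] = [float(x0), float(y0), float(x1), float(y1)]
    if not prompts:
        raise ValueError("provide at least one point or a box")
    return prompts


# One foreground click, one background click, and a rough box around the object.
prompts = build_prompts(points=[(210, 350, 1), (40, 40, 0)], box=(100, 200, 400, 500))
```

Before calling the predictor, each value would typically be converted with `numpy.asarray` and passed as `predictor.predict(**prompts)`.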

Video Segmentation

Capable of processing video sequences, supporting mask propagation across frames.

Efficient Inference

Runs inference under torch.autocast with bfloat16 precision on CUDA GPUs, as shown in the usage examples.
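
The bfloat16 path only applies when a CUDA GPU with bfloat16 support is present. The sketch below shows one possible way to decide the autocast settings; `choose_autocast` is a hypothetical helper, and falling back to full precision with autocast disabled is an assumption of this sketch, not something the model card prescribes.

```python
def choose_autocast(cuda_available: bool, bf16_supported: bool) -> dict:
    """Pick autocast settings (hypothetical helper, not part of sam2).

    Returns the arguments one might pass to torch.autocast. When CUDA or
    bfloat16 support is missing, autocast is disabled so the model runs
    in its default precision.
    """
    if cuda_available and bf16_supported:
        return {"device_type": "cuda", "dtype": "bfloat16", "enabled": True}
    if cuda_available:
        return {"device_type": "cuda", "dtype": "bfloat16", "enabled": False}
    return {"device_type": "cpu", "dtype": "bfloat16", "enabled": False}


# With PyTorch installed, the inputs could come from:
#   torch.cuda.is_available() and torch.cuda.is_bf16_supported()
# and the result could be used as:
#   torch.autocast(cfg["device_type"], dtype=getattr(torch, cfg["dtype"]),
#                  enabled=cfg["enabled"])
cfg = choose_autocast(cuda_available=True, bf16_supported=True)
```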

Model Capabilities

Image Segmentation

Video Segmentation

Interactive Segmentation

Mask Generation

Use Cases

Computer Vision

Image Editing

Quickly isolate objects in images for editing.

High-quality object segmentation masks.
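
To illustrate how a mask can be used to isolate an object for editing, here is a small sketch that blanks out everything outside the mask and finds the tight bounding box for cropping. Both helpers are hypothetical, and the image and mask are plain nested lists standing in for real arrays.

```python
def cut_out(image, mask, fill=0):
    """Keep pixels where the mask is truthy; replace all others with `fill`."""
    return [
        [pixel if keep else fill for pixel, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


def mask_bounding_box(mask):
    """Return (x_min, y_min, x_max, y_max) of the masked pixels, or None if empty."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    return cols[0], rows[0], cols[-1], rows[-1]


# Toy 3x4 grayscale image and a binary mask covering three pixels.
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
mask = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0]]

isolated = cut_out(image, mask)
bbox = mask_bounding_box(mask)
```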

Video Analysis

Track object movements in videos.

Consistent object segmentation across frames.
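
One simple way to turn per-frame masks into a motion track is to follow the mask centroid from frame to frame. The sketch below assumes the masks have already been binarized and are given as nested lists keyed by frame index; `mask_centroid` and `track_centroids` are hypothetical helpers.

```python
def mask_centroid(mask):
    """Return the (x, y) centroid of the truthy pixels, or None for an empty mask."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return sum_x / count, sum_y / count


def track_centroids(masks_by_frame):
    """Map each frame index to the centroid of its mask, in frame order."""
    return {
        frame_idx: mask_centroid(mask)
        for frame_idx, mask in sorted(masks_by_frame.items())
    }


# Toy example: a 2x2 object moves one pixel right and one pixel down.
masks_by_frame = {
    0: [[1, 1, 0], [1, 1, 0], [0, 0, 0]],
    1: [[0, 0, 0], [0, 1, 1], [0, 1, 1]],
}
track = track_centroids(masks_by_frame)
```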

🚀 SAM 2: Segment Anything in Images and Videos

This repository is for SAM 2, a foundation model developed by FAIR to solve promptable visual segmentation in images and videos. For more information, please refer to the SAM 2 paper.

The official code is publicly released in this repo.

🚀 Quick Start

✨ Features

Image and Video Segmentation: Capable of performing promptable visual segmentation in both images and videos.

💻 Usage Examples

Basic Usage

Image Prediction

import torch
from sam2.sam2_image_predictor import SAM2ImagePredictor

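# Load the pretrained checkpoint from the Hugging Face Hub
predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2-hiera-base-plus")

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    # Compute the image embedding once; replace <your_image> with your image
    predictor.set_image(<your_image>)
    # Predict masks for the prompts; replace <input_prompts> with your prompts
    masks, _, _ = predictor.predict(<input_prompts>)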
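
The snippet above discards the second and third return values of `predict`. In the official repo the second value holds a predicted quality score per mask, so when several candidate masks are returned, one reasonable choice is to keep the highest-scoring one. The sketch below shows that selection with stand-in values; `pick_best_mask` is a hypothetical helper.

```python
def pick_best_mask(masks, scores):
    """Return (index, mask) for the candidate with the highest score."""
    if len(masks) == 0 or len(masks) != len(scores):
        raise ValueError("masks and scores must be non-empty and the same length")
    best = max(range(len(scores)), key=lambda i: float(scores[i]))
    return best, masks[best]


# Stand-in values; with SAM 2 these would come from
#   masks, scores, _ = predictor.predict(...)
candidate_masks = ["mask_a", "mask_b", "mask_c"]
candidate_scores = [0.71, 0.93, 0.88]

best_index, best_mask = pick_best_mask(candidate_masks, candidate_scores)
```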

Video Prediction

import torch
from sam2.sam2_video_predictor import SAM2VideoPredictor

predictor = SAM2VideoPredictor.from_pretrained("facebook/sam2-hiera-base-plus")

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    state = predictor.init_state(<your_video>)

    # add new prompts and instantly get the output on the same frame
    frame_idx, object_ids, masks = predictor.add_new_points_or_box(state, <your_prompts>)

    # propagate the prompts to get masklets throughout the video
    for frame_idx, object_ids, masks in predictor.propagate_in_video(state):
        ...

Refer to the demo notebooks for details.
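
The body of the propagation loop is left open above. A common pattern is to collect the results into a dictionary keyed by frame and object ID. The sketch below does this for any iterable that yields `(frame_idx, object_ids, masks)` tuples, assuming `masks` is ordered the same way as `object_ids`. `collect_masklets` and `fake_propagation` are hypothetical and stand in for `predictor.propagate_in_video(state)`.

```python
def collect_masklets(propagation):
    """Build {frame_idx: {object_id: mask}} from (frame_idx, object_ids, masks) tuples."""
    video_segments = {}
    for frame_idx, object_ids, masks in propagation:
        # Assumption: masks[i] belongs to object_ids[i].
        video_segments[frame_idx] = {
            obj_id: mask for obj_id, mask in zip(object_ids, masks)
        }
    return video_segments


def fake_propagation():
    """Stand-in generator mimicking the shape of the propagation output."""
    yield 0, [1, 2], ["frame0_obj1", "frame0_obj2"]
    yield 1, [1, 2], ["frame1_obj1", "frame1_obj2"]


segments = collect_masklets(fake_propagation())
```

With the real predictor, the call would be `collect_masklets(predictor.propagate_in_video(state))`, run inside the same `torch.inference_mode()` and `torch.autocast` context as above.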

📄 License

The project is licensed under the Apache-2.0 license.

Citation

To cite the paper, model, or software, please use the following:

@article{ravi2024sam2,
  title={SAM 2: Segment Anything in Images and Videos},
  author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and Mintun, Eric and Pan, Junting and Alwala, Kalyan Vasudev and Carion, Nicolas and Wu, Chao-Yuan and Girshick, Ross and Doll{\'a}r, Piotr and Feichtenhofer, Christoph},
  journal={arXiv preprint arXiv:2408.00714},
  url={https://arxiv.org/abs/2408.00714},
  year={2024}
}
