Sam-vit-large Open-source Image Segmentation Model - Generate High-quality Object Masks Based on Input Points

Sam Vit Large

Developed by Xenova

Large-scale image segmentation model based on Vision Transformer architecture, capable of generating high-quality object masks from input points

Image Segmentation

Transformers

Other #Image Segmentation #ONNX Compatible #Zero-shot Learning

Downloads 34

Release Time: 5/31/2023

Model Overview

Segment Anything Model (SAM) is a versatile image segmentation model that automatically generates precise object masks based on user-provided input points (e.g., clicks). Built on Vision Transformer architecture, it exhibits strong zero-shot transfer capabilities.

Model Features

Zero-shot Segmentation Capability

Handles various image segmentation tasks without domain-specific training

Interactive Segmentation

Generates precise object masks guided by simple input points

High-quality Output

Produces refined object boundaries and multiple candidate masks

Web Compatibility

Provides ONNX-format weights for browser environment execution

Model Capabilities

Interactive Image Segmentation

Object Mask Generation

Multi-candidate Output

Zero-shot Image Understanding

Use Cases

Image Editing

Object Removal & Replacement

Generates precise masks via simple clicks for photo editing

Achieves accurate object isolation effects

Computer Vision Annotation

Semi-automatic Data Labeling

Significantly reduces manual annotation workload

3-5x improvement in labeling efficiency

AR/VR Applications

Real-time Object Segmentation

Separates foreground objects in augmented reality scenarios

🚀 Segment Anything Model with Transformers.js

This project adapts the facebook/sam-vit-large model with ONNX weights to be compatible with Transformers.js, enabling seamless mask generation in a JavaScript environment.

🚀 Quick Start

The model from https://huggingface.co/facebook/sam-vit-large is adapted with ONNX weights to work with Transformers.js.

📦 Installation

If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

npm i @huggingface/transformers

💻 Usage Examples

Basic Usage

import { SamModel, AutoProcessor, RawImage } from "@huggingface/transformers";

// Load model and processor
const model = await SamModel.from_pretrained("Xenova/sam-vit-large");
const processor = await AutoProcessor.from_pretrained("Xenova/sam-vit-large");

// Prepare image and input points
const img_url = "https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/corgi.jpg";
const raw_image = await RawImage.read(img_url);
const input_points = [[[340, 250]]];

// Process inputs and perform mask generation
const inputs = await processor(raw_image, { input_points });
const outputs = await model(inputs);

// Post-process masks
const masks = await processor.post_process_masks(outputs.pred_masks, inputs.original_sizes, inputs.reshaped_input_sizes);
console.log(masks);
// [
//   Tensor {
//     dims: [ 1, 3, 410, 614 ],
//     type: 'bool',
//     data: Uint8Array(755220) [ ... ],
//     size: 755220
//   }
// ]
const scores = outputs.iou_scores;
console.log(scores);
// Tensor {
//   dims: [ 1, 1, 3 ],
//   type: 'float32',
//   data: Float32Array(3) [
//     1.0122944116592407,
//     0.9184409976005554,
//     0.9796935319900513
//   ],
//   size: 3
// }
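
The flat `data` array in the mask tensor holds the three candidate masks one after another, which is why its length is 3 × 410 × 614 = 755220. As a minimal sketch, assuming a row-major `[batch, candidate, height, width]` layout with nonzero values marking masked pixels, the helper below reports the fraction of the image each candidate covers. `maskCoverage` is a hypothetical name used for illustration and is not part of Transformers.js.

```javascript
// Hypothetical helper, not part of Transformers.js.
// Assumes `data` is a flat, row-major array for dims [1, numMasks, height, width],
// with nonzero values marking pixels inside the mask.
function maskCoverage(data, dims) {
  const [, numMasks, height, width] = dims;
  const pixelsPerMask = height * width;
  const coverage = [];
  for (let m = 0; m < numMasks; ++m) {
    const offset = m * pixelsPerMask;
    let inside = 0;
    for (let i = 0; i < pixelsPerMask; ++i) {
      if (data[offset + i]) ++inside;
    }
    coverage.push(inside / pixelsPerMask);
  }
  return coverage;
}

// Toy example: three 2x2 candidate masks.
const toyMaskData = Uint8Array.from([
  1, 0, 0, 0, // candidate 0: 1 of 4 pixels
  1, 1, 0, 0, // candidate 1: 2 of 4 pixels
  1, 1, 1, 1, // candidate 2: 4 of 4 pixels
]);
console.log(maskCoverage(toyMaskData, [1, 3, 2, 2])); // [ 0.25, 0.5, 1 ]
```

With the real output, this would be called as `maskCoverage(masks[0].data, masks[0].dims)`.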

Advanced Usage

You can then visualize the generated mask with:

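// masks[0][0] has dims [3, H, W]; scaling the boolean values to 0/255 gives a 3-channel image
const image = RawImage.fromTensor(masks[0][0].mul(255));
await image.save('mask.png');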


Next, select the channel with the highest IoU score, which in this case is the first (red) channel. Intersecting this channel with the original image gives an isolated version of the subject.
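
The selection and intersection steps can be sketched in plain JavaScript. This is an illustration under stated assumptions, not the code used to produce the original result: `argmax`, `selectMask`, and `applyMask` are hypothetical helpers, the mask data is assumed to be a flat typed array laid out as `[1, numMasks, height, width]`, and the image is assumed to be RGBA pixel data such as `ImageData.data` from a canvas.

```javascript
// Hypothetical helpers, not part of Transformers.js.

// Index of the largest value, e.g. over outputs.iou_scores.data.
function argmax(values) {
  let best = 0;
  for (let i = 1; i < values.length; ++i) {
    if (values[i] > values[best]) best = i;
  }
  return best;
}

// View of candidate `index` within flat mask data of dims [1, numMasks, height, width].
// Assumes `maskData` is a typed array (subarray returns a view, not a copy).
function selectMask(maskData, dims, index) {
  const pixels = dims[2] * dims[3];
  return maskData.subarray(index * pixels, (index + 1) * pixels);
}

// Copy of the RGBA pixels in which every pixel outside the mask is made transparent.
function applyMask(rgba, mask) {
  const out = Uint8ClampedArray.from(rgba);
  for (let i = 0; i < mask.length; ++i) {
    if (!mask[i]) out[4 * i + 3] = 0; // zero the alpha channel
  }
  return out;
}

// IoU scores copied from the example output in Basic Usage.
const exampleScores = [1.0122944116592407, 0.9184409976005554, 0.9796935319900513];
console.log(argmax(exampleScores)); // 0
```

Setting alpha to zero is only one way to isolate the subject. Painting the background a solid color would work just as well, depending on how the result is displayed.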

📚 Documentation

An online demo is also available.

🔧 Technical Details

Note: Having a separate repo for ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with ONNX weights located in a subfolder named onnx).

Property	Details
Model Type	Adapted `facebook/sam-vit-large` with ONNX weights for Transformers.js
Training Data	Not specified
