🚀 CLIP-ViT-Base-Patch16 for Transformers.js
This project converts the openai/clip-vit-base-patch16 model to ONNX weights for compatibility with the Transformers.js library, making it easy to use in web environments.
🚀 Quick Start
Install dependencies

If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

```bash
npm i @xenova/transformers
```
💻 Usage Examples
Basic usage

Perform zero-shot image classification with the `pipeline` API:

```js
import { pipeline } from '@xenova/transformers';

// Create a zero-shot image classification pipeline
const classifier = await pipeline('zero-shot-image-classification', 'Xenova/clip-vit-base-patch16');
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
const output = await classifier(url, ['tiger', 'horse', 'dog']);
```
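The pipeline resolves to an array of `{ score, label }` objects, one per candidate label:

```js
console.log(output);
// e.g. [ { score: 0.99..., label: 'tiger' }, ... ] (scores shown are illustrative)
```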
Advanced usage
Perform zero-shot image classification with `CLIPModel`:

```js
import { AutoTokenizer, AutoProcessor, CLIPModel, RawImage } from '@xenova/transformers';

// Load tokenizer, processor, and model
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/clip-vit-base-patch16');
const processor = await AutoProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
const model = await CLIPModel.from_pretrained('Xenova/clip-vit-base-patch16');

// Run tokenization
const texts = ['a photo of a car', 'a photo of a football match'];
const text_inputs = tokenizer(texts, { padding: true, truncation: true });

// Read image and run processor
const image = await RawImage.read('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
const image_inputs = await processor(image);

// Run model with both text and pixel inputs
const output = await model({ ...text_inputs, ...image_inputs });
```
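The model output mirrors the original CLIP model, so alongside the projected embeddings it should expose `logits_per_image`, the image-to-text similarity scores. A minimal sketch for turning those logits into per-label probabilities with a plain softmax (assuming a single image, so `.data` holds one score per input text):

```js
// Softmax over the image-to-text logits (shape [num_images, num_texts]).
// Assumes the ONNX export preserves CLIP's `logits_per_image` output.
const logits = Array.from(output.logits_per_image.data);
const maxLogit = Math.max(...logits);          // subtract max for numerical stability
const exps = logits.map(l => Math.exp(l - maxLogit));
const total = exps.reduce((a, b) => a + b, 0);
const probs = exps.map(e => e / total);
console.log(probs); // probability that the image matches each input text
```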
Compute text embeddings with `CLIPTextModelWithProjection`:

```js
import { AutoTokenizer, CLIPTextModelWithProjection } from '@xenova/transformers';

// Load tokenizer and text model
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/clip-vit-base-patch16');
const text_model = await CLIPTextModelWithProjection.from_pretrained('Xenova/clip-vit-base-patch16');

// Run tokenization
const texts = ['a photo of a car', 'a photo of a football match'];
const text_inputs = tokenizer(texts, { padding: true, truncation: true });

// Compute embeddings
const { text_embeds } = await text_model(text_inputs);
```
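`text_embeds` holds one embedding per input text. To compare them, you can use the `cos_sim` helper exported by Transformers.js; a short sketch using `.tolist()` to convert the tensor into plain arrays:

```js
import { cos_sim } from '@xenova/transformers';

// One embedding per input text; `.tolist()` yields plain nested arrays
const [carEmbed, footballEmbed] = text_embeds.tolist();
console.log(cos_sim(carEmbed, footballEmbed)); // cosine similarity of the two texts
```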
Compute vision embeddings with `CLIPVisionModelWithProjection`:

```js
import { AutoProcessor, CLIPVisionModelWithProjection, RawImage } from '@xenova/transformers';

// Load processor and vision model
const processor = await AutoProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
const vision_model = await CLIPVisionModelWithProjection.from_pretrained('Xenova/clip-vit-base-patch16');

// Read image and run processor
const image = await RawImage.read('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
const image_inputs = await processor(image);

// Compute embeddings
const { image_embeds } = await vision_model(image_inputs);
```
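Since CLIP projects text and images into a shared embedding space, the two projection-model examples can be combined into a do-it-yourself zero-shot classifier: rank candidate texts by cosine similarity to the image embedding. A sketch, assuming `texts` and `text_embeds` from the text-embedding example above are in scope alongside `image_embeds`:

```js
import { cos_sim } from '@xenova/transformers';

// Compare the image embedding against each candidate text embedding
const [imageEmbed] = image_embeds.tolist();
const ranked = text_embeds.tolist()
  .map((textEmbed, i) => ({ text: texts[i], similarity: cos_sim(imageEmbed, textEmbed) }))
  .sort((a, b) => b.similarity - a.similarity);
console.log(ranked); // best-matching description first
```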
📚 Documentation

Notes

⚠️ Important Note
Having a separate repository for the ONNX weights is intended to be a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting to ONNX using 🤗 Optimum and structuring your repo like this one (with the ONNX weights located in a subfolder named `onnx`).
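For reference, a typical Optimum export looks like the following (the output directory name `clip_onnx/` is illustrative; see the Optimum documentation for model- and task-specific options):

```bash
pip install optimum[exporters]
optimum-cli export onnx --model openai/clip-vit-base-patch16 clip_onnx/
```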