# 🚀 CLIP-ViT-Base-Patch16 for Transformers.js
This project converts the openai/clip-vit-base-patch16 model to ONNX weights for compatibility with the Transformers.js library, making it easy to use in web environments.
## 🚀 Quick Start
### Install Dependencies

If you haven't yet installed the Transformers.js JavaScript library, you can install it from NPM with:
```bash
npm i @xenova/transformers
```
## 💻 Usage Examples
### Basic Usage

Perform zero-shot image classification with the `pipeline` API:
```js
import { pipeline } from '@xenova/transformers';

// Create a zero-shot image classification pipeline
const classifier = await pipeline('zero-shot-image-classification', 'Xenova/clip-vit-base-patch16');
const url = 'https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg';
const output = await classifier(url, ['tiger', 'horse', 'dog']);
```
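The pipeline resolves to an array of `{ label, score }` objects, one per candidate label. As a minimal sketch (hypothetical post-processing, not part of the pipeline API itself), you could pick out the highest-scoring label like this:

```js
// Select the candidate label with the highest score
// (avoids assuming the result array is pre-sorted).
const best = output.reduce((a, b) => (a.score > b.score ? a : b));
console.log(`${best.label}: ${best.score.toFixed(3)}`);
```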
### Advanced Usage

Perform zero-shot image classification with `CLIPModel`:
```js
import { AutoTokenizer, AutoProcessor, CLIPModel, RawImage } from '@xenova/transformers';

// Load tokenizer, processor, and model
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/clip-vit-base-patch16');
const processor = await AutoProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
const model = await CLIPModel.from_pretrained('Xenova/clip-vit-base-patch16');

// Run tokenization
const texts = ['a photo of a car', 'a photo of a football match'];
const text_inputs = tokenizer(texts, { padding: true, truncation: true });

// Read image and run processor
const image = await RawImage.read('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
const image_inputs = await processor(image);

// Run model with both text and pixel inputs
const output = await model({ ...text_inputs, ...image_inputs });
```
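The output includes `logits_per_image` (similarity scores between the image and each text prompt). As a minimal sketch, assuming the tensor exposes its raw values through a `.data` typed array as Transformers.js tensors do, you can turn these logits into probabilities with a plain softmax:

```js
// Numerically stable softmax over the raw logits
// (one image vs. two prompts, so two scores here)
const logits = Array.from(output.logits_per_image.data);
const max = Math.max(...logits);
const exps = logits.map((x) => Math.exp(x - max));
const sum = exps.reduce((a, b) => a + b, 0);
const probs = exps.map((e) => e / sum);
console.log(probs); // probability per text prompt
```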
Compute text embeddings with `CLIPTextModelWithProjection`:
```js
import { AutoTokenizer, CLIPTextModelWithProjection } from '@xenova/transformers';

// Load tokenizer and text model
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/clip-vit-base-patch16');
const text_model = await CLIPTextModelWithProjection.from_pretrained('Xenova/clip-vit-base-patch16');

// Run tokenization
const texts = ['a photo of a car', 'a photo of a football match'];
const text_inputs = tokenizer(texts, { padding: true, truncation: true });

// Compute embeddings
const { text_embeds } = await text_model(text_inputs);
```
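A common way to compare embeddings is cosine similarity. Below is a minimal sketch, assuming `text_embeds` is a 2D tensor that exposes its shape via `.dims` and its raw values via `.data` (as Transformers.js tensors do):

```js
// Split the flat typed array into one vector per input text
const [numTexts, dim] = text_embeds.dims;
const vecs = Array.from({ length: numTexts }, (_, i) =>
  text_embeds.data.slice(i * dim, (i + 1) * dim));

// Cosine similarity between the two text embeddings
const dot = (a, b) => a.reduce((s, x, i) => s + x * b[i], 0);
const cosine = (a, b) => dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
console.log(cosine(vecs[0], vecs[1]));
```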
Compute visual embeddings with `CLIPVisionModelWithProjection`:
```js
import { AutoProcessor, CLIPVisionModelWithProjection, RawImage } from '@xenova/transformers';

// Load processor and vision model
const processor = await AutoProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
const vision_model = await CLIPVisionModelWithProjection.from_pretrained('Xenova/clip-vit-base-patch16');

// Read image and run processor
const image = await RawImage.read('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/football-match.jpg');
const image_inputs = await processor(image);

// Compute embeddings
const { image_embeds } = await vision_model(image_inputs);
```
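Because CLIP projects text and images into a shared embedding space, you can score image-text similarity directly. A minimal sketch, continuing from the text-embedding snippet above (it reuses the `vecs` array and `cosine` helper defined there, under the same assumptions about `.dims` and `.data`):

```js
// Extract the single image vector (batch size 1)
const [, embedDim] = image_embeds.dims;
const imageVec = image_embeds.data.slice(0, embedDim);

// Score the image against each text embedding from the previous snippet
for (const textVec of vecs) console.log(cosine(imageVec, textVec));
```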
## 📚 Documentation

### Notes
⚠️ **Important:** Having a separate repository for the ONNX weights is intended as a temporary workaround until WebML gains wider adoption. If you want to make your models web-ready, we recommend converting them to ONNX using 🤗 Optimum and structuring your repository like this one (with the ONNX weights placed in a subfolder named `onnx`).
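For reference, a conversion along those lines might look like the following sketch, using Optimum's `optimum-cli export onnx` command (the output directory name is an arbitrary choice):

```bash
# Export the model to ONNX with 🤗 Optimum (requires `pip install optimum[exporters]`)
optimum-cli export onnx --model openai/clip-vit-base-patch16 clip_onnx/
```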