MegaDescriptor-L-224: an open-source image feature model for animal re-identification

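MegaDescriptor-L-224

Developed by BVRA

MegaDescriptor-L-224 is an image feature model built on the Swin-L architecture and designed for animal re-identification. It was pre-trained with supervision on animal re-identification datasets.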

Image classification

PyTorch

#Animal re-identification #Swin-L architecture #High-dimensional feature extraction

Downloads: 1,181

Released: 11/6/2023

Model overview

The model generates image embeddings for animal re-identification. The embeddings can then be compared to identify and match individuals.

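Model features

Feature extraction

A Swin-L backbone extracts image features suited to animal re-identification.

Large-scale pre-training

Pre-trained on multiple animal re-identification datasets to improve generalization.

Fixed 224 x 224 input

Accepts 224 x 224 pixel inputs; larger images must be resized to this size before inference.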
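
Because the backbone takes a fixed 224 x 224 input, non-square photos have to be fitted to that size first. Stretching both sides is the simplest option. As a minimal sketch of one alternative, the helper below computes the geometry for an aspect-preserving resize followed by centred padding. The name `letterbox_geometry` and the decision to pad are assumptions made for illustration; the model card does not prescribe them.

```python
def letterbox_geometry(width, height, target=224):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving fit.

    Hypothetical helper: the model card itself does not prescribe padding.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Scale so that the longer side becomes exactly `target` pixels.
    scale = target / max(width, height)
    # Round to the nearest pixel, but never collapse a side to zero.
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    # Centre the resized image on the square canvas.
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top


# Geometry for a 1920 x 1080 camera-trap frame.
print(letterbox_geometry(1920, 1080))
```

Whether padding or stretching gives better embeddings is not stated in the source and would need to be checked empirically.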

Model capabilities

Image feature extraction

Animal re-identification

Image embedding generation

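Use cases

Wildlife conservation

Individual animal identification

Identifies and tracks individual wild animals in support of conservation and research.

Pet management

Pet identification

Identifies individual pets in support of pet management and lost-pet recovery services.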
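
Both use cases come down to deciding whether a new photo shows an individual that is already known. A minimal sketch of that decision is below, assuming embeddings are compared by cosine similarity and that a match below some threshold is treated as a new individual. The function name `identify`, the labels, and the threshold value of 0.5 are illustrative assumptions, not values from the model card.

```python
import numpy as np


def identify(query, gallery, labels, threshold=0.5):
    """Return the best-matching label, or None if no match is close enough.

    `threshold` is an illustrative value, not a recommendation from the
    model card; it would need tuning on held-out data.
    """
    query = np.asarray(query, dtype=np.float64)
    gallery = np.asarray(gallery, dtype=np.float64)
    # L2-normalise so the dot product equals cosine similarity.
    query = query / np.linalg.norm(query)
    gallery = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    scores = gallery @ query
    best = int(np.argmax(scores))
    if scores[best] < threshold:
        return None  # treat as a previously unseen individual
    return labels[best]


# Toy 3-dimensional vectors standing in for real embeddings.
gallery = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
labels = ["luna", "milo"]
print(identify([0.9, 0.1, 0.0], gallery, labels))
print(identify([0.0, 0.0, 1.0], gallery, labels))
```

The first query is close to a known individual and returns its label; the second is orthogonal to every gallery entry and is reported as unknown.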

🚀 MegaDescriptor-L-224 model card

MegaDescriptor-L-224 is a Swin-L image feature model, pre-trained with supervision on animal re-identification datasets, for use in animal re-identification tasks.

🚀 Quick start

The model produces image embeddings. The example below loads it with timm and runs a single image through it with torch:

import timm
import torch
import torchvision.transforms as T

from PIL import Image
from urllib.request import urlopen

# Load the pretrained backbone from the Hugging Face Hub and switch to inference mode
model = timm.create_model("hf-hub:BVRA/MegaDescriptor-L-224", pretrained=True)
model = model.eval()

# The backbone expects exactly 224 x 224 inputs, so resize both sides;
# T.Resize(224) would only resize the shorter side
train_transforms = T.Compose([T.Resize((224, 224)),
                              T.ToTensor(),
                              T.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])

img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
)).convert("RGB")  # drop any alpha channel so the tensor has 3 channels

# Gradients are not needed for inference
with torch.no_grad():
    output = model(train_transforms(img).unsqueeze(0))
# output is a (1, num_features) shaped tensor
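
Once embeddings are available, re-identification usually reduces to a nearest-neighbour search. The sketch below ranks a database of embeddings by cosine similarity to each query. It works on NumPy arrays, so a tensor such as `output` above would first be converted with `output.detach().cpu().numpy()`. The function name `top_k_matches` and the use of cosine similarity are assumptions for illustration; the model card does not specify a matching procedure.

```python
import numpy as np


def top_k_matches(queries, database, k=1):
    """Rank database embeddings by cosine similarity to each query.

    queries: (n_queries, dim), database: (n_database, dim).
    Returns an (n_queries, k) array of database indices, best first.
    """
    q = np.asarray(queries, dtype=np.float64)
    d = np.asarray(database, dtype=np.float64)
    # L2-normalise rows so the matrix product gives cosine similarities.
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    similarity = q @ d.T  # (n_queries, n_database)
    # argsort is ascending, so negate to get the highest similarity first.
    return np.argsort(-similarity, axis=1)[:, :k]


# Toy 4-dimensional vectors standing in for real embeddings.
database = np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [0.7, 0.7, 0.0, 0.0]])
queries = np.array([[0.9, 0.1, 0.0, 0.0]])
print(top_k_matches(queries, database, k=2))
```

For large databases a dedicated nearest-neighbour index would likely replace the dense matrix product, but the ranking logic stays the same.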

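✨ Main features

Model type: animal re-identification / feature backbone
Model stats:
- Params (M): 228.6
- Image size: 224 x 224
- Architecture: swin_large_patch4_window7_224
Related paper: WildlifeDatasets: An open-source toolkit for animal re-identification (WACV 2024)
Pre-training dataset: all available re-identification datasets (see WildlifeDatasets)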
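
The architecture name encodes the patch size (4), window size (7) and input size (224). As a rough sketch of what that implies, the arithmetic below traces the token grid and channel width through the stages. It assumes the standard Swin-L configuration, with a base embedding width of 192 and four stages; neither value appears in the card. The helper name `swin_stage_shapes` is hypothetical.

```python
def swin_stage_shapes(img_size=224, patch_size=4, embed_dim=192, num_stages=4):
    """Return [(grid_side, channels), ...] for each stage of a Swin backbone.

    embed_dim=192 and num_stages=4 are assumed from the standard Swin-L
    configuration, not taken from the model card.
    """
    side = img_size // patch_size
    channels = embed_dim
    shapes = [(side, channels)]
    for _ in range(num_stages - 1):
        # Patch merging halves each spatial side and doubles the channel width.
        side //= 2
        channels *= 2
        shapes.append((side, channels))
    return shapes


for stage, (side, channels) in enumerate(swin_stage_shapes(), start=1):
    print(f"stage {stage}: {side}x{side} tokens, {channels} channels")
```

Under these assumptions the last stage is a 7 x 7 grid, which is exactly one attention window, and the pooled embedding would have 1536 dimensions.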

📄 License

This project is licensed under CC BY-NC 4.0.

📚 Citation

If you use this model, please cite the following paper:

@inproceedings{vcermak2024wildlifedatasets,
  title={WildlifeDatasets: An open-source toolkit for animal re-identification},
  author={{\v{C}}erm{\'a}k, Vojt{\v{e}}ch and Picek, Lukas and Adam, Luk{\'a}{\v{s}} and Papafitsoros, Kostas},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={5953--5963},
  year={2024}
}