pmc_vit-l-14_hf: an open-source vision-language model fine-tuned on a domain-specific dataset for image-text association

Pmc Vit L 14 Hf

Developed by ryanyip7777

A vision-language model fine-tuned from CLIP-ViT-L/14 on the PMC-OA dataset

Text-to-Image

Transformers

#medical-image-text-alignment #PMC-literature-adaptation #multimodal-retrieval

Downloads: 260

Released: 9/7/2023

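Model Overview

This model is a fine-tuned version of OpenAI CLIP-ViT-L/14, optimized for matching images and text from the biomedical literature.

Model Features

Biomedical domain optimization

Fine-tuned on PMC-OA, a dataset drawn from the biomedical literature, to improve handling of medical images and their accompanying text

Multimodal understanding

Accepts image and text inputs together and models the semantic relationship between them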

Model Capabilities

Image feature extraction

Text feature extraction

Cross-modal similarity computation

Image-text matching
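
In CLIP-style models, cross-modal similarity is the cosine similarity between an image embedding and a text embedding. The sketch below shows that computation in plain Python, assuming the embeddings have already been extracted. The toy vectors and the helper names (`l2_normalize`, `cosine_similarity`, `similarity_matrix`) are illustrative assumptions, not outputs or APIs of this model.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))

def similarity_matrix(image_embs, text_embs):
    """Rows index images, columns index texts."""
    return [[cosine_similarity(i, t) for t in text_embs] for i in image_embs]

# Hypothetical toy embeddings (real CLIP ViT-L/14 embeddings are much larger).
image_embs = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
text_embs = [[3.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
sims = similarity_matrix(image_embs, text_embs)
```

Each row of `sims` can then be compared across columns to find the text that best matches a given image.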

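Use Cases

Medical research

Medical literature image retrieval

Retrieve relevant medical images from a text description

Medical image annotation

Pair medical images with descriptive text by ranking candidate captions (a CLIP model scores existing text; it does not generate captions)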
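
The retrieval use case can be sketched as a ranking over precomputed image embeddings. This is a minimal illustration: it assumes all embeddings are already unit-normalized, so a dot product equals cosine similarity. The file names, vectors, and the `rank_images` helper are hypothetical.

```python
def rank_images(query_emb, image_embs, top_k=3):
    """Return (image_id, score) pairs sorted by descending dot product."""
    scored = [
        (image_id, sum(q * v for q, v in zip(query_emb, emb)))
        for image_id, emb in image_embs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical unit-length embeddings standing in for real model outputs.
index = {
    "fig_ct.png": [1.0, 0.0],
    "fig_xray.png": [0.0, 1.0],
    "fig_mri.png": [0.6, 0.8],
}
query = [0.0, 1.0]  # embedding of a hypothetical text query
results = rank_images(query, index, top_k=2)
```

For a large collection, the same idea is usually implemented with batched matrix multiplication or an approximate nearest-neighbor index rather than a Python loop.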

🚀 clip-vit-l-14-pmc-finetuned

This model is a fine-tuned version of openai/clip-vit-large-patch14 on the pmc_oa (https://huggingface.co/datasets/axiong/pmc_oa) dataset. It achieves the following results on the evaluation set:

Loss: 1.0125
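
The source does not say how this loss is defined. Assuming it is the standard symmetric CLIP contrastive loss (the value `CLIPModel` returns when called with `return_loss=True`), it can be sketched in plain Python as the average of two cross-entropy terms over a batch similarity matrix. The function names and toy logits below are illustrative.

```python
import math

def cross_entropy_diagonal(logits):
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)

def clip_contrastive_loss(logits_per_text):
    """Average of text-to-image and image-to-text cross-entropy."""
    logits_per_image = [list(col) for col in zip(*logits_per_text)]
    text_loss = cross_entropy_diagonal(logits_per_text)
    image_loss = cross_entropy_diagonal(logits_per_image)
    return (text_loss + image_loss) / 2.0

# Uninformative logits: every pairing looks equally likely.
uniform_loss = clip_contrastive_loss([[0.0, 0.0], [0.0, 0.0]])
# Well-separated logits: matching pairs score far above mismatches.
separated_loss = clip_contrastive_loss([[10.0, 0.0], [0.0, 10.0]])
```

Under this definition, uninformative logits over a batch of N pairs give a loss of ln N, so the reported value is only meaningful relative to the batch size used during evaluation.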

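🚀 Quick Start

Fine-tuning the model

The model can be fine-tuned with the run_clip.py (https://github.com/huggingface/transformers/tree/main/examples/pytorch/contrastive-image-text) script. The example command below starts from the base openai/clip-vit-large-patch14 checkpoint: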

python -W ignore run_clip.py --model_name_or_path openai/clip-vit-large-patch14 \
      --output_dir ./clip-vit-l-14-pmc-finetuned \
      --train_file data/pmc_roco_train.csv \
      --validation_file data/pmc_roco_valid.csv \
      --image_column image --caption_column caption \
      --max_seq_length 77 \
      --do_train --do_eval \
      --per_device_train_batch_size 16 --per_device_eval_batch_size 8 \
      --remove_unused_columns=False \
      --learning_rate="5e-5" --warmup_steps="0" --weight_decay 0.1 \
      --overwrite_output_dir  \
      --num_train_epochs 10 \
      --logging_dir ./pmc_vit_logs \
      --save_total_limit 2 \
      --report_to  tensorboard
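
The command above reads CSV files with an `image` column and a `caption` column. The sketch below shows one way to produce such a file. It assumes, as in the upstream example script, that the `image` column holds image file paths. The rows and the `write_pairs_csv` helper are hypothetical.

```python
import csv
import io

def write_pairs_csv(pairs, fileobj):
    """Write (image_path, caption) rows under an 'image,caption' header."""
    writer = csv.writer(fileobj)
    writer.writerow(["image", "caption"])
    for image_path, caption in pairs:
        writer.writerow([image_path, caption])

# Hypothetical rows; real data would come from the PMC-OA image-caption pairs.
pairs = [
    ("images/fig_0001.jpg", "Axial CT scan of the chest."),
    ("images/fig_0002.jpg", 'Histology slide, "H&E" stain, 40x.'),
]
buffer = io.StringIO()
write_pairs_csv(pairs, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead, pass a file opened with `open(path, "w", newline="", encoding="utf-8")`. The `csv` module quotes commas and double quotes inside captions automatically.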

Using the model

The following example shows how to use the model:

from PIL import Image
import requests

from transformers import CLIPProcessor, CLIPModel

model = CLIPModel.from_pretrained("ryanyip7777/pmc_vit-l-14_hf")
processor = CLIPProcessor.from_pretrained("ryanyip7777/pmc_vit-l-14_hf")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text=["a photo of a cat", "a photo of a dog"], images=image, return_tensors="pt", padding=True)

outputs = model(**inputs)
logits_per_image = outputs.logits_per_image  # image-text similarity scores
probs = logits_per_image.softmax(dim=1)  # softmax over the text labels gives per-label probabilities
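
The final softmax step turns one row of similarity scores into probabilities over the candidate labels. The sketch below reproduces it in plain Python. The logit values are made up for illustration and are not outputs of this model.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["a photo of a cat", "a photo of a dog"]
logits_row = [24.0, 19.0]  # hypothetical scores for one image
probs = softmax(logits_row)
best_label = labels[probs.index(max(probs))]
```

Because softmax normalizes only over the labels supplied, the probabilities are relative to that candidate set and do not indicate absolute confidence.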