MedCSP_clip
Model Overview
This model is a variant of the OpenAI CLIP architecture, optimized for medical image classification. It performs zero-shot image classification: new categories can be classified without any task-specific training.
Model Features
Medical-domain optimization
Tuned specifically for the characteristics of medical images, making it well suited to medical imaging data
Zero-shot learning
Classifies new categories without task-specific training
Multimodal understanding
Understands image and text inputs jointly, building vision-language associations
Model Capabilities
Medical image classification
Cross-modal retrieval
Zero-shot learning (see the classification sketch below)
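Zero-shot classification with a CLIP-style model works by embedding the image and a set of candidate label prompts into the same space, then picking the label whose embedding is most similar to the image's. The following is a minimal sketch built on the open_clip API used in the quick start below; the candidate labels and the softmax-over-scaled-similarities recipe are standard CLIP practice and illustrative assumptions, not something specified by this model card.

import torch
from urllib.request import urlopen
from PIL import Image
from open_clip import create_model_from_pretrained, get_tokenizer

model, processor = create_model_from_pretrained('hf-hub:xcwangpsu/MedCSP_clip')
tokenizer = get_tokenizer('hf-hub:xcwangpsu/MedCSP_clip')

# Hypothetical candidate labels; replace with the classes relevant to your task.
labels = ["a chest X-ray showing pneumonia", "a normal chest X-ray"]

image = Image.open(urlopen("https://huggingface.co/xcwangpsu/MedCSP_clip/resolve/main/image_sample.jpg"))

with torch.no_grad():
    image_features = model.encode_image(processor(image).unsqueeze(0))  # (1, D)
    text_features = model.encode_text(tokenizer(labels))                # (num_labels, D)

    # L2-normalize so the dot product becomes cosine similarity.
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)

    # Scaled similarities -> probabilities over the candidate labels.
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")

Prompt wording matters for CLIP-style models: templated clinical phrasings (e.g. "a chest X-ray showing ...") typically score better than bare class names.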
Use Cases
Medical imaging analysis
Medical image classification
Classify X-ray, CT, and other medical images
Pathology image analysis
Identify abnormal tissue in pathology slides
Medical research
Medical image retrieval
Retrieve relevant medical images from a text description (see the retrieval sketch after this list)
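Text-to-image retrieval uses the same two encoders: embed a text query and every candidate image, L2-normalize, and rank the images by cosine similarity to the query. A minimal sketch under the same open_clip setup; the local image paths are placeholders for your own collection, not files shipped with the model.

import torch
from PIL import Image
from open_clip import create_model_from_pretrained, get_tokenizer

model, processor = create_model_from_pretrained('hf-hub:xcwangpsu/MedCSP_clip')
tokenizer = get_tokenizer('hf-hub:xcwangpsu/MedCSP_clip')

# Placeholder paths to a local image collection.
image_paths = ["scan_001.jpg", "scan_002.jpg", "scan_003.jpg"]
query = "Chest X-ray showing increased lung opacity"

with torch.no_grad():
    # Embed and L2-normalize every candidate image.
    images = torch.stack([processor(Image.open(p)) for p in image_paths])
    image_features = model.encode_image(images)
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)

    # Embed and normalize the query, then score by cosine similarity.
    text_features = model.encode_text(tokenizer([query]))
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    scores = (image_features @ text_features.T).squeeze(-1)  # (num_images,)

# Print images from most to least similar to the query.
for rank, idx in enumerate(scores.argsort(descending=True).tolist(), start=1):
    print(f"{rank}. {image_paths[idx]} (score: {scores[idx].item():.3f})")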
🚀 MedCSP_clip Model Card
MedCSP_clip is a CLIP-based model for zero-shot image classification. It encodes both images and text, with important applications in medical image analysis and related fields.
🚀 Quick Start
The following example shows how to encode images and text with the model:
from open_clip import create_model_from_pretrained, get_tokenizer
import torch
from urllib.request import urlopen
from PIL import Image

# load model, image processor and tokenizer
model, processor = create_model_from_pretrained('hf-hub:xcwangpsu/MedCSP_clip')
tokenizer = get_tokenizer('hf-hub:xcwangpsu/MedCSP_clip')

# encode image:
# load a raw radiological image:
image = Image.open(urlopen("https://huggingface.co/xcwangpsu/MedCSP_clip/resolve/main/image_sample.jpg"))

# preprocess the image; the final tensor should have 4 dimensions (B, C, H, W)
processed_image = processor(image)
processed_image = torch.unsqueeze(processed_image, 0)
print("Input size:", processed_image.shape)

# encode to a single embedding
image_embedding = model.encode_image(processed_image)
print("Individual image embedding size:", image_embedding.shape)

# sequential encoding (per-patch features)
seq_image_embedding = model.visual.trunk.forward_features(processed_image)
print("Sequential image embedding size:", seq_image_embedding.shape)

# encode text:
text = "Chest X-ray reveals increased lung opacity, indicating potential fluid buildup or infection."
tokens = tokenizer(text)

# encode to a single embedding
text_embedding = model.encode_text(tokens)
print("Individual text embedding size:", text_embedding.shape)

# sequential encoding (per-token hidden states)
seq_text_embedding = model.text.transformer(tokens, output_hidden_states=True).hidden_states[-1]
print("Sequential text embedding size:", seq_text_embedding.shape)
📄 License
This project is released under the MIT License.
Acknowledgements
If you find any resources in this repository or our paper useful, please cite our paper with the following BibTeX:
@inproceedings{wang2024unity,
  title={Unity in Diversity: Collaborative Pre-training Across Multimodal Medical Sources},
  author={Wang, Xiaochen and Luo, Junyu and Wang, Jiaqi and Zhong, Yuan and Zhang, Xiaokun and Wang, Yaqing and Bhatia, Parminder and Xiao, Cao and Ma, Fenglong},
  booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  pages={3644--3656},
  year={2024}
}