MedCSP_clip
Model Overview
This model is a variant of OpenAI's CLIP architecture, optimized specifically for medical image classification. It supports zero-shot image classification: it can assign new categories without any task-specific training.
Key Features
Medical-domain optimization: tuned to the characteristics of medical images, making it well suited to medical imaging data.
Zero-shot learning: classifies new categories without task-specific training.
Multimodal understanding: encodes both images and text, building vision-language associations.
Capabilities
Medical image classification
Cross-modal retrieval
Zero-shot learning (a classification sketch follows the Quick Start code)
Use Cases
Medical imaging analysis
Medical image classification: classifying radiological images such as X-rays and CT scans.
Pathology image analysis: identifying abnormal tissue in pathology slides.
Medical research
Medical image retrieval: retrieving relevant medical images from a text description, as in the sketch below.
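To make the retrieval use case concrete, here is a minimal sketch that ranks a small image collection against a text query by cosine similarity. It assumes the model loads through open_clip exactly as in the Quick Start below; the image file names are hypothetical placeholders.

from open_clip import create_model_from_pretrained, get_tokenizer
from PIL import Image
import torch
# load the model as in the Quick Start below
model, processor = create_model_from_pretrained('hf-hub:xcwangpsu/MedCSP_clip')
tokenizer = get_tokenizer('hf-hub:xcwangpsu/MedCSP_clip')
model.eval()
# hypothetical local files standing in for an image collection
image_paths = ["xray_001.jpg", "ct_002.jpg", "xray_003.jpg"]
with torch.no_grad():
    # encode and L2-normalize every image in the collection
    images = torch.stack([processor(Image.open(p)) for p in image_paths])
    image_emb = model.encode_image(images)
    image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
    # encode and normalize the text query
    query = tokenizer("chest X-ray showing increased lung opacity")
    text_emb = model.encode_text(query)
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
# after normalization, cosine similarity is a plain dot product
scores = (text_emb @ image_emb.T).squeeze(0)
for path, score in sorted(zip(image_paths, scores.tolist()), key=lambda x: -x[1]):
    print(f"{score:.3f}  {path}")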
🚀 MedCSP_clip Model Card
MedCSP_clip is a CLIP-based model for zero-shot image classification. It encodes both images and text, and has significant applications in medical image analysis and related fields.
🚀 Quick Start
The following example shows how to encode images and text with the model:
from open_clip import create_model_from_pretrained, get_tokenizer
import torch
from urllib.request import urlopen
from PIL import Image
# load the model, preprocessing transform, and tokenizer
model, processor = create_model_from_pretrained('hf-hub:xcwangpsu/MedCSP_clip')
tokenizer = get_tokenizer('hf-hub:xcwangpsu/MedCSP_clip')
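# switch to inference mode before encoding (a sensible default; not part of the original snippet)
model.eval()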
# encode image:
# load a raw radiological image:
image = Image.open(urlopen("https://huggingface.co/xcwangpsu/MedCSP_clip/resolve/main/image_sample.jpg"))
# preprocess the image, the final tensor should have 4 dimensions (B, C, H, W)
processed_image = processor(image)
processed_image = torch.unsqueeze(processed_image, 0)
print("Input size:", processed_image.shape)
# encode to a single embedding
image_embedding = model.encode_image(processed_image)
print("Individual image embedding size:",image_embedding.shape)
# sequential encoding
seq_image_embedding = model.visual.trunk.forward_features(processed_image)
print("Sequential image embedding size:",seq_image_embedding.shape)
# encode text:
text = "Chest X-ray reveals increased lung opacity, indicating potential fluid buildup or infection."
tokens = tokenizer(text)
# encode to a single embedding
text_embedding = model.encode_text(tokens)
print("Individual text embedding size:",text_embedding.shape)
# sequential encoding
seq_text_embedding = model.text.transformer(tokens, output_hidden_states=True).hidden_states[-1]
print("Sequential text embedding size:", seq_text_embedding.shape)
📄 License
This project is licensed under the MIT License.
Citation
If you find any of the resources in this repository or our paper useful, please cite our paper with the following BibTeX:
@inproceedings{wang2024unity,
title={Unity in Diversity: Collaborative Pre-training Across Multimodal Medical Sources},
author={Wang, Xiaochen and Luo, Junyu and Wang, Jiaqi and Zhong, Yuan and Zhang, Xiaokun and Wang, Yaqing and Bhatia, Parminder and Xiao, Cao and Ma, Fenglong},
booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
pages={3644--3656},
year={2024}
}