🚀 MedCSP_clip Model Card
This repository demonstrates how to use the MedCSP CLIP model to encode medical images and text, which is useful for tasks such as zero-shot image classification.
🚀 Quick Start
Here is a demo of how to use the CLIP model for encoding:
from urllib.request import urlopen

import torch
from open_clip import create_model_from_pretrained, get_tokenizer
from PIL import Image

# Load the pretrained MedCSP CLIP model, its image preprocessor, and tokenizer.
model, processor = create_model_from_pretrained('hf-hub:xcwangpsu/MedCSP_clip')
tokenizer = get_tokenizer('hf-hub:xcwangpsu/MedCSP_clip')

# Download and preprocess a sample image, then add a batch dimension.
image = Image.open(urlopen("https://huggingface.co/xcwangpsu/MedCSP_clip/resolve/main/image_sample.jpg"))
processed_image = processor(image)
processed_image = torch.unsqueeze(processed_image, 0)
print("Input size:", processed_image.shape)

# Pooled image embedding (one vector per image).
image_embedding = model.encode_image(processed_image)
print("Individual image embedding size:", image_embedding.shape)

# Sequential (patch-level) image features from the vision trunk.
seq_image_embedding = model.visual.trunk.forward_features(processed_image)
print("Sequential image embedding size:", seq_image_embedding.shape)

# Tokenize a report sentence and compute the pooled text embedding.
text = "Chest X-ray reveals increased lung opacity, indicating potential fluid buildup or infection."
tokens = tokenizer(text)
text_embedding = model.encode_text(tokens)
print("Individual text embedding size:", text_embedding.shape)

# Sequential (token-level) text features from the last hidden layer of the text encoder.
seq_text_embedding = model.text.transformer(tokens, output_hidden_states=True).hidden_states[-1]
print("Sequential text embedding size:", seq_text_embedding.shape)
💻 Usage Examples
Basic Usage
The basic workflow for encoding images and text is identical to the Quick Start example above.
Advanced Usage
from urllib.request import urlopen

import torch
from open_clip import create_model_from_pretrained, get_tokenizer
from PIL import Image

# Load the pretrained MedCSP CLIP model, its image preprocessor, and tokenizer.
model, processor = create_model_from_pretrained('hf-hub:xcwangpsu/MedCSP_clip')
tokenizer = get_tokenizer('hf-hub:xcwangpsu/MedCSP_clip')

# Encode a sample image.
image = Image.open(urlopen("https://huggingface.co/xcwangpsu/MedCSP_clip/resolve/main/image_sample.jpg"))
processed_image = torch.unsqueeze(processor(image), 0)
image_embedding = model.encode_image(processed_image)

# Encode a report sentence.
text = "Chest X-ray reveals increased lung opacity, indicating potential fluid buildup or infection."
tokens = tokenizer(text)
text_embedding = model.encode_text(tokens)

# Measure image-text alignment with cosine similarity.
cos_sim = torch.nn.functional.cosine_similarity(image_embedding, text_embedding)
print("Cosine similarity:", cos_sim)
📄 License
This project is licensed under the MIT license.
Acknowledgement
If you find the resources provided in this repository or our paper useful, please cite our paper using the following BibTeX:
@inproceedings{wang2024unity,
  title={Unity in Diversity: Collaborative Pre-training Across Multimodal Medical Sources},
  author={Wang, Xiaochen and Luo, Junyu and Wang, Jiaqi and Zhong, Yuan and Zhang, Xiaokun and Wang, Yaqing and Bhatia, Parminder and Xiao, Cao and Ma, Fenglong},
  booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  pages={3644--3656},
  year={2024}
}