Prompt2MedImage Open Source Text-to-Image Model - Free Deployment for Generating High-Quality Medical Images

Prompt2MedImage

Developed by Nihirc

Latent space text-to-image diffusion model fine-tuned specifically for medical image generation

Text-to-Image | English | #Medical image generation #Diffusion model #Clinical auxiliary diagnosis

Downloads 1,223

Release Time: 5/12/2023

Model Overview

This latent text-to-image diffusion model generates high-quality medical images from text prompts, using a fixed, pretrained text encoder (CLIP ViT-L/14).

Model Features

Medical imaging specialization

Fine-tuned on the ROCO medical imaging dataset, optimized for the medical field

High-quality generation

Capable of generating high-quality images that match medical descriptions

Easy integration

Seamless integration with the Hugging Face Diffusers library

Model Capabilities

Generate medical images from text descriptions

Generate X-ray images

Generate MRI images

Generate medical images of specific conditions

Use Cases

Medical education

Teaching case generation

Generate imaging materials of typical cases for medical students

Examples include imaging of residual paralysis after poliomyelitis, optic glioma, and a subtrochanteric fracture

Medical research

Data augmentation

Generate supplementary imaging data for rare case studies

🚀 Prompt2MedImage - Diffusion for Medical Images

Prompt2MedImage is a latent text-to-image diffusion model fine-tuned on medical images from the ROCO dataset, enabling high-quality medical image generation based on text prompts.

🚀 Quick Start

This latent text-to-image diffusion model generates high-quality medical images from text prompts. The weights are intended to be used with the 🧨 Diffusers library. The model was trained using Amazon SageMaker and the Hugging Face Deep Learning container.

Installation

pip install diffusers transformers torch

Usage

Running the pipeline with the default PNDM scheduler:

import torch
from diffusers import StableDiffusionPipeline

model_id = "Nihirc/Prompt2MedImage"
device = "cuda"  # requires a CUDA-capable GPU

# Load the fine-tuned weights in half precision to reduce GPU memory use.
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)

# The pipeline returns a list of PIL images; take the first one.
prompt = "Showing the subtrochanteric fracture in the porotic bone."
image = pipe(prompt).images[0]

image.save("porotic_bone_fracture.png")
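The snippet above hardcodes `device = "cuda"` and half precision. As a minimal sketch, assuming you also want the script to run on machines without a GPU, the helpers below pick a device and dtype before loading. The names `choose_device_and_dtype` and `load_pipeline` are hypothetical and not part of the model card, and the float32 fallback on CPU is an assumption rather than a documented requirement.

```python
from typing import Tuple


def choose_device_and_dtype(cuda_available: bool) -> Tuple[str, str]:
    """Return (device, torch dtype name) for the pipeline.

    Assumption: float16 is used only on CUDA; CPU falls back to float32.
    """
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "float32"


def load_pipeline(model_id: str = "Nihirc/Prompt2MedImage"):
    """Load the pipeline on the best available device (downloads the weights)."""
    # Imported lazily so choose_device_and_dtype works without torch installed.
    import torch
    from diffusers import StableDiffusionPipeline

    device, dtype_name = choose_device_and_dtype(torch.cuda.is_available())
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=getattr(torch, dtype_name)
    )
    return pipe.to(device)
```

Calling `load_pipeline()` returns a pipeline that can be used exactly like `pipe` in the Usage snippet. Generation on CPU is expected to be much slower than on a GPU.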

✨ Features

High-Quality Image Generation: Generate high-quality medical images based on text prompts.
Fixed Pretrained Text Encoder: Uses a fixed, pretrained text encoder (CLIP ViT-L/14) as suggested in the Imagen paper.
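One practical consequence of the CLIP text encoder is its limited context length: Stable Diffusion pipelines in Diffusers truncate prompts that exceed the tokenizer's `model_max_length` (77 tokens for CLIP ViT-L/14, including the start and end tokens). The sketch below checks a prompt before generation. The function names are hypothetical, and it assumes a Hugging Face tokenizer object such as `pipe.tokenizer` is passed in.

```python
def count_prompt_tokens(tokenizer, prompt: str) -> int:
    """Count the token ids produced for `prompt`, without truncation."""
    return len(tokenizer(prompt).input_ids)


def fits_text_encoder(tokenizer, prompt: str) -> bool:
    """Return True if the prompt fits within the tokenizer's context length."""
    return count_prompt_tokens(tokenizer, prompt) <= tokenizer.model_max_length
```

With a loaded pipeline, `fits_text_encoder(pipe.tokenizer, prompt)` indicates whether a long radiology caption would be cut off before it reaches the text encoder.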

📚 Documentation

Model Details

Property	Details
Developed by	Nihir Chadderwala
Model Type	Diffusion-based text-to-medical-image generation model
Language	English
License	WTFPL
Model Description	This latent text-to-image diffusion model can be used to generate high-quality medical images based on text prompts. It uses a fixed, pretrained text encoder (CLIP ViT-L/14) as suggested in the Imagen paper.

Examples

The patient had residual paralysis of the hand after poliomyelitis. It was necessary to stabilize the thumb with reference to the index finger. This was accomplished by placing a graft from the bone bank between the first and second metacarpals. The roentgenogram shows the complete healing of the graft one year later.

[Image: hand]

A 3-year-old child with visual difficulties. Axial FLAIR image show a supra-sellar lesion extending to the temporal lobes along the optic tracts (arrows) with moderate mass effect, compatible with optic glioma. FLAIR hyperintensity is also noted in the left mesencephalon from additional tumoral involvement

[Image: 3_tumor]

Showing the subtrochanteric fracture in the porotic bone.

[Image: porotic bone]
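To render several example prompts reproducibly, one option is to loop over prompts and fixed random seeds. The following is a sketch under stated assumptions: `prompt_to_filename` and `generate_images` are hypothetical helpers, `pipe` is a pipeline loaded as in the Usage section, and `make_generator` is a caller-supplied function that returns a seeded `torch.Generator`.

```python
import re


def prompt_to_filename(prompt: str, seed: int, max_words: int = 6) -> str:
    """Build a filesystem-safe PNG name from the first words of a prompt."""
    words = re.findall(r"[a-z0-9]+", prompt.lower())[:max_words]
    return f"{'_'.join(words)}_seed{seed}.png"


def generate_images(pipe, prompts, seeds, make_generator):
    """Yield (filename, image) for every prompt/seed combination.

    `make_generator(seed)` is expected to return a seeded torch.Generator,
    so the same seed reproduces the same image for a given prompt.
    """
    for prompt in prompts:
        for seed in seeds:
            image = pipe(prompt, generator=make_generator(seed)).images[0]
            yield prompt_to_filename(prompt, seed), image


# Example usage (needs torch, a GPU, and a loaded `pipe`):
#   import torch
#   make_generator = lambda seed: torch.Generator("cuda").manual_seed(seed)
#   prompts = ["Showing the subtrochanteric fracture in the porotic bone."]
#   for name, image in generate_images(pipe, prompts, [0, 1], make_generator):
#       image.save(name)
```

Encoding the seed in the file name keeps each output traceable to the settings that produced it.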

📄 License

This model is open access and available to all, with the Do What the F*ck You Want To Public License (WTFPL) further specifying rights and usage.

You can't use the model to deliberately produce or share illegal or harmful outputs or content.
The author claims no rights on the outputs you generate; you are free to use them and are accountable for their use.
You may redistribute the weights and use the model commercially and/or as a service.

🔧 Technical Details

This model was trained using Amazon SageMaker and the Hugging Face Deep Learning container.

📖 Citation

O. Pelka, S. Koitka, J. Rückert, F. Nensa, C.M. Friedrich,
"Radiology Objects in COntext (ROCO): A Multimodal Image Dataset".
MICCAI Workshop on Large-scale Annotation of Biomedical Data and Expert Label Synthesis (LABELS) 2018, September 16, 2018, Granada, Spain. Lecture Notes in Computer Science (LNCS), vol. 11043, pp. 180-189, Springer, Cham, 2018.
doi: 10.1007/978-3-030-01364-6_20
