QuiltNet-B-16-PMB
QuiltNet-B-16-PMB is a vision-language foundation model that can perform various vision-language processing tasks, trained on the Quilt-1M dataset curated from histopathology videos.
Quick Start
QuiltNet-B-16-PMB is a vision-language foundation model with a ViT-B/16 image tower and a PubMedBERT text tower. It is trained on the Quilt-1M dataset curated from representative histopathology videos. It can handle various vision-language processing (VLP) tasks such as cross-modal retrieval, image classification, and visual question answering. QuiltNet sets new state-of-the-art results on a wide range of standard datasets and significantly outperforms previous VLP approaches.
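A minimal zero-shot classification sketch, assuming the checkpoint is published on the Hugging Face Hub and loadable through the `open_clip_torch` library (the Hub identifier `wisdomik/QuiltNet-B-16-PMB`, the image path, and the candidate labels below are assumptions for illustration):

```python
# Zero-shot classification sketch (assumes open_clip_torch is installed and the
# checkpoint is available as hf-hub:wisdomik/QuiltNet-B-16-PMB).
import torch
import open_clip
from PIL import Image

model, preprocess = open_clip.create_model_from_pretrained(
    "hf-hub:wisdomik/QuiltNet-B-16-PMB"
)
tokenizer = open_clip.get_tokenizer("hf-hub:wisdomik/QuiltNet-B-16-PMB")

image = preprocess(Image.open("patch.png")).unsqueeze(0)  # one histology patch
texts = tokenizer(["a photo of adenocarcinoma", "a photo of benign tissue"])

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(texts)
    # Normalize and take cosine similarity, softmax over candidate labels
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # probability per candidate label
```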

Features
Direct Use
Zero-shot image classification, image and text retrieval, among others.
Downstream Use
Fine-tuning for image classification and other image tasks, linear-probe image classification, image generation guiding and conditioning, among others.
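A rough linear-probe sketch for the downstream use above, assuming the same Hub identifier as in the quick-start example and using scikit-learn as the probe; the file paths and labels are hypothetical placeholders, not part of the original card:

```python
# Linear-probe sketch: freeze the image tower, fit a logistic-regression probe
# on its embeddings. Paths and labels below are placeholders for your own data.
import numpy as np
import torch
import open_clip
from PIL import Image
from sklearn.linear_model import LogisticRegression

model, preprocess = open_clip.create_model_from_pretrained(
    "hf-hub:wisdomik/QuiltNet-B-16-PMB"  # assumed Hub identifier
)
model.eval()

def embed(paths):
    feats = []
    with torch.no_grad():
        for p in paths:
            x = preprocess(Image.open(p)).unsqueeze(0)
            f = model.encode_image(x)
            feats.append((f / f.norm(dim=-1, keepdim=True)).squeeze(0).numpy())
    return np.stack(feats)

# Replace these placeholders with your own labeled patches
train_paths, train_labels = ["t0.png", "t1.png"], [0, 1]
test_paths, test_labels = ["v0.png"], [0]

probe = LogisticRegression(max_iter=1000).fit(embed(train_paths), train_labels)
print("probe accuracy:", probe.score(embed(test_paths), test_labels))
```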
Intended Use
- Primary Users: The primary intended users of these models are AI researchers.
- Research Purposes: The model is for research communities. It aims to help researchers better understand and explore zero-shot, arbitrary image classification and can be used for interdisciplinary studies of the potential impact of such models.
Out-of-Scope Use Cases
- Deployment: Any deployed use case of the model (commercial or not) is currently out of scope. Non-deployed use cases like image search in a constrained environment are not recommended without thorough in-domain testing with a specific, fixed class taxonomy.
- Language Limitation: Since the model is trained and evaluated only in English, its use should be limited to English language use cases.
Installation
The original model card does not list explicit installation steps.
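Assuming the model is consumed through OpenCLIP as in the sketches above, installing `open_clip_torch` (plus `scikit-learn` for the linear-probe sketch) should be sufficient, e.g. `pip install open_clip_torch scikit-learn`.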
Documentation
Training Data
This model was trained with QUILT-1M, an image-text dataset for histopathology. Curated from educational videos on YouTube, QUILT-1M is the largest dataset for vision-language modeling in histopathology.
Important Note
The motivation behind dataset creation is to democratize research and experimentation around large-scale multi-modal model training and the handling of uncurated, large-scale histopathology datasets crawled from the public internet. Our recommendation is to use the dataset for research purposes.
Evaluation
Evaluation was done with the code in the [CLIP Benchmark suite](https://github.com/LAION-AI/CLIP_benchmark). Results across a range of histology tasks and datasets can be found in the paper.
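As a rough illustration of the zero-shot classification protocol used in such benchmarks (class names, prompt template, and file paths below are hypothetical placeholders, not the paper's exact setup):

```python
# Zero-shot evaluation sketch: build a text classifier from class-name prompts
# and score accuracy over a small labeled set of patches. Placeholder data only.
import torch
import open_clip
from PIL import Image

model, preprocess = open_clip.create_model_from_pretrained(
    "hf-hub:wisdomik/QuiltNet-B-16-PMB"  # assumed Hub identifier
)
tokenizer = open_clip.get_tokenizer("hf-hub:wisdomik/QuiltNet-B-16-PMB")

class_names = ["adenocarcinoma", "benign tissue"]        # hypothetical classes
samples = [("patch_0.png", 0), ("patch_1.png", 1)]       # hypothetical (path, label)

with torch.no_grad():
    text_feats = model.encode_text(
        tokenizer([f"a histopathology image of {c}" for c in class_names])
    )
    text_feats /= text_feats.norm(dim=-1, keepdim=True)

    correct = 0
    for path, label in samples:
        img_feat = model.encode_image(preprocess(Image.open(path)).unsqueeze(0))
        img_feat /= img_feat.norm(dim=-1, keepdim=True)
        correct += int((img_feat @ text_feats.T).argmax(dim=-1).item() == label)

print("zero-shot accuracy:", correct / len(samples))
```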
Disclaimer
It is important to note that the results obtained from this model are not intended to constitute medical advice or replace consultation with a qualified medical professional. The use of this model is solely at your own risk and should be consistent with applicable laws, regulations, and ethical considerations. We do not warrant or guarantee the accuracy, completeness, suitability, or usefulness of this model for any particular purpose, and we hereby disclaim any liability arising from any reliance placed on this model or any results obtained from its use.
Privacy
In accordance with YouTube's privacy policy, we redistribute only video IDs. It is strictly prohibited to redistribute any content apart from the video IDs. Any distribution carried out must adhere to the laws and regulations applicable in your jurisdiction, including export control laws and embargoes.
License
This project is licensed under the MIT license.
Citation
@misc{ikezogwo2023quilt1m,
title={Quilt-1M: One Million Image-Text Pairs for Histopathology},
author={Wisdom Oluchi Ikezogwo and Mehmet Saygin Seyfioglu and Fatemeh Ghezloo and Dylan Stefan Chan Geva and Fatwir Sheikh Mohammed and Pavan Kumar Anand and Ranjay Krishna and Linda Shapiro},
year={2023},
eprint={2306.11207},
archivePrefix={arXiv},
primaryClass={cs.CV}
}