QuiltNet-B-32
QuiltNet-B-32 is a CLIP ViT-B/32 vision-language foundation model. It's trained on the Quilt-1M dataset, which is curated from representative histopathology videos. This model can handle various vision-language processing (VLP) tasks, such as cross-modal retrieval, image classification, and visual question answering. QuiltNet sets new state-of-the-art results on a wide range of standard datasets and significantly outperforms previous VLP approaches.

Features
- Versatile VLP tasks: performs cross-modal retrieval, image classification, and visual question answering.
- State-of-the-art performance: sets new state-of-the-art results on a wide range of standard datasets and outperforms prior VLP approaches.
Usage Examples
There are interactive examples on the model card:
- Tissue phenotyping:
  - Image source: https://quilt1m.github.io/img/BREST092.jpg
  - Candidate labels: adipose tissue, debris tissue, lymphocytes tissue, mucus tissue, smooth muscle tissue, normal colon mucosa tissue, cancer-associated stroma tissue, colorectal adenocarcinoma epithelium tissue
- Squamous cell carcinoma histopathology:
  - Image source: [https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224/resolve/main/example_data/biomed_image_classification_example_data/squamous_cell_carcinoma_histopathology.jpeg](https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224/resolve/main/example_data/biomed_image_classification_example_data/squamous_cell_carcinoma_histopathology.jpeg)
  - Candidate labels: adenocarcinoma histopathology, squamous cell carcinoma histopathology
- Adenocarcinoma histopathology:
  - Image source: [https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224/resolve/main/example_data/biomed_image_classification_example_data/adenocarcinoma_histopathology.jpg](https://huggingface.co/microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224/resolve/main/example_data/biomed_image_classification_example_data/adenocarcinoma_histopathology.jpg)
  - Candidate labels: adenocarcinoma histopathology, squamous cell carcinoma histopathology
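The examples above pair one image with a set of candidate labels. The sketch below shows how a CLIP-style model could turn that pairing into label probabilities: cosine similarity between the image embedding and each label embedding, followed by a softmax. It does not load QuiltNet. The embedding vectors are made-up placeholders, the function names `cosine` and `zero_shot_probs` are hypothetical, and the scale of 100 is an assumption based on typical CLIP logit scales.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_probs(image_emb, label_embs, scale=100.0):
    """Softmax over scaled cosine similarities (scale value is an assumption)."""
    logits = [scale * cosine(image_emb, emb) for emb in label_embs]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

labels = [
    "adenocarcinoma histopathology",
    "squamous cell carcinoma histopathology",
]

# Placeholder embeddings for illustration only. In practice these would come
# from the model's image encoder and text encoder respectively.
image_emb = [0.9, 0.1, 0.3]
label_embs = [
    [0.2, 0.9, 0.1],  # stands in for the first label's text embedding
    [0.8, 0.2, 0.4],  # stands in for the second label's text embedding
]

probs = zero_shot_probs(image_emb, label_embs)
best = labels[max(range(len(labels)), key=probs.__getitem__)]
for label, p in zip(labels, probs):
    print(f"{label}: {p:.4f}")
```

With real embeddings, the label with the highest probability is the zero-shot prediction for the image.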
Documentation
Uses
Intended Use
This model is a research output for research communities. It aims to help researchers better understand and explore zero-shot, arbitrary image classification. It can also be used for interdisciplinary studies of the potential impact of such models.
- Primary intended users: AI researchers.
- Primary use scenarios: To understand the robustness, generalization, and other capabilities, biases, and constraints of computer vision histopathology models.
Direct Use
Zero-shot image classification, image and text retrieval, etc.
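As a rough illustration of the retrieval use case, the sketch below ranks a small gallery of image embeddings against a text query embedding by cosine similarity and returns the top matches. The gallery IDs, the vectors, and the `retrieve` helper are hypothetical. Real embeddings would come from the model's encoders.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def retrieve(query_emb, gallery, k=2):
    """Return the k gallery items most similar to the query, best first."""
    q = normalize(query_emb)
    scored = []
    for item_id, emb in gallery.items():
        e = normalize(emb)
        score = sum(a * b for a, b in zip(q, e))
        scored.append((score, item_id))
    scored.sort(reverse=True)  # highest similarity first
    return [(item_id, score) for score, item_id in scored[:k]]

# Placeholder image embeddings keyed by hypothetical patch IDs.
gallery = {
    "patch_a": [1.0, 0.0, 0.0],
    "patch_b": [0.7, 0.7, 0.0],
    "patch_c": [0.0, 0.0, 1.0],
}
# Placeholder embedding standing in for an encoded text query.
query = [0.9, 0.1, 0.0]

top = retrieve(query, gallery, k=2)
for item_id, score in top:
    print(f"{item_id}: {score:.3f}")
```

The same ranking works in the other direction (image query, text gallery), since both modalities share one embedding space in CLIP-style models.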
Downstream Use
Image classification and other image task fine-tuning, linear probe image classification, image generation guiding and conditioning, etc.
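Linear probing means training only a linear classifier on top of frozen image embeddings. The following is a minimal sketch of that idea using plain gradient-descent logistic regression on toy two-dimensional features. The features, labels, hyperparameters, and the `train_linear_probe` and `predict` helpers are all illustrative assumptions, not the evaluation setup used for QuiltNet.

```python
import math

def train_linear_probe(features, labels, lr=0.5, epochs=200):
    """Fit binary logistic regression on fixed (frozen) feature vectors."""
    dim = len(features[0])
    w = [0.0] * dim
    b = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))  # sigmoid
            err = p - y                      # gradient of log loss w.r.t. z
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        # Full-batch gradient step using the mean gradient.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Return class 1 if the linear score is positive, else class 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Toy stand-ins for frozen image embeddings and their binary class labels.
features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = [0, 0, 1, 1]

w, b = train_linear_probe(features, labels)
preds = [predict(w, b, x) for x in features]
print(preds)
```

Only `w` and `b` are learned here; the encoder that would produce the features stays untouched, which is what distinguishes a linear probe from fine-tuning.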
Out-of-Scope Use Cases
- Any deployed use case of the model (commercial or non-commercial) is currently out of scope.
- Non-deployed use cases like image search in a constrained environment are not recommended without thorough in-domain testing with a specific, fixed class taxonomy.
- Since the model is only trained and evaluated in English, its use should be limited to English language use cases.
Training Data
This model is trained on Quilt-1M, an image-text dataset for histopathology. It's curated from educational videos on YouTube and is the largest dataset for vision-language modeling in histopathology.
Important Note
The dataset is created to democratize research and experimentation around large-scale multi-modal model training and handling of uncurated, large-scale histopathology datasets crawled from the public internet. It's recommended to use the dataset for research purposes.
Evaluation
Evaluation is done with code from the [CLIP Benchmark suite](https://github.com/LAION-AI/CLIP_benchmark). Results across a range of histology tasks and datasets are reported in the paper.
Disclaimer
The results from this function are not intended to be medical advice or replace consultation with a qualified medical professional. Using this function is at your own risk and should comply with applicable laws, regulations, and ethical considerations. There's no warranty or guarantee for the accuracy, completeness, suitability, or usefulness of this function for any particular purpose, and no liability is assumed for any reliance on this function or its results.
Privacy
In line with YouTube's privacy policy, only video IDs are redistributed. Redistributing any content other than video IDs is strictly prohibited. Any distribution must comply with the laws and regulations in your jurisdiction, including export control laws and embargoes.
License
The model is released under the MIT license.
Citation
@misc{ikezogwo2023quilt1m,
title={Quilt-1M: One Million Image-Text Pairs for Histopathology},
author={Wisdom Oluchi Ikezogwo and Mehmet Saygin Seyfioglu and Fatemeh Ghezloo and Dylan Stefan Chan Geva and Fatwir Sheikh Mohammed and Pavan Kumar Anand and Ranjay Krishna and Linda Shapiro},
year={2023},
eprint={2306.11207},
archivePrefix={arXiv},
primaryClass={cs.CV}
}