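🚀 Phikon-v2 Model Card
Phikon-v2 is a large Vision Transformer pretrained with the DINOv2 self-supervised method on PANCAN-XL, a dataset of roughly 450 million histology images at 20x magnification sampled from 60k whole slide images (WSIs). PANCAN-XL incorporates only publicly available datasets: CPTAC (6,193 WSIs) and TCGA (29,502 WSIs) for malignant tissue, and GTEx (13,302 WSIs) for normal tissue.
Compared with Phikon, our previous foundation model pretrained with iBOT on 40 million histology images from TCGA (6k WSIs), Phikon-v2 performs better on a variety of weakly supervised tasks tailored for biomarker discovery. To avoid data contamination with the PANCAN-XL pretraining dataset, Phikon-v2 is evaluated on external cohorts and benchmarked against a range of representation learning and foundation models.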
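🚀 Quick Start
Model description
- Developed by: Owkin, Inc
- Model type: pretrained vision backbone (ViT-L/16 via DINOv2)
- Pretraining dataset: PANCAN-XL, sourced from public histology collections (TCGA, CPTAC, GTEx, TCIA, and others)
- Paper: arXiv (https://arxiv.org/abs/2409.09173)
- License: Owkin non-commercial license
How to use (feature extraction)
The following code snippet shows how to extract features from a histology image with Phikon-v2 (CLS token). These features can then be used for downstream applications such as region-of-interest (ROI) classification (via linear or KNN probing), slide classification (via multiple instance learning), and segmentation (e.g., via ViT-Adapter).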
```python
from PIL import Image
import requests
import torch
from transformers import AutoImageProcessor, AutoModel

# Load an example histology tile from the HistoSSLscaling repository
image = Image.open(
    requests.get(
        "https://github.com/owkin/HistoSSLscaling/blob/main/assets/example.tif?raw=true",
        stream=True
    ).raw
)

# Load the Phikon-v2 image processor and model weights
processor = AutoImageProcessor.from_pretrained("owkin/phikon-v2")
model = AutoModel.from_pretrained("owkin/phikon-v2")
model.eval()

# Preprocess the image into model inputs
inputs = processor(image, return_tensors="pt")

# Take the CLS token of the last hidden state as the tile feature
with torch.inference_mode():
    outputs = model(**inputs)
    features = outputs.last_hidden_state[:, 0, :]  # (1, 1024) shape

assert features.shape == (1, 1024)
```
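To illustrate the KNN probing mentioned above, here is a minimal sketch that classifies tiles by majority vote among their nearest neighbours under cosine similarity. The function `knn_probe` and the toy data are hypothetical and not part of the Phikon-v2 release; random clusters stand in for real `(N, 1024)` CLS features.

```python
import numpy as np

def knn_probe(train_feats, train_labels, test_feats, k=5):
    """Predict a label for each test feature by majority vote among its
    k nearest training features under cosine similarity."""
    train_labels = np.asarray(train_labels)
    # L2-normalise so that the dot product equals cosine similarity
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    test = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    sims = test @ train.T                       # (n_test, n_train)
    nearest = np.argsort(-sims, axis=1)[:, :k]  # indices of the k most similar
    votes = train_labels[nearest]               # (n_test, k)
    return np.array([np.bincount(v).argmax() for v in votes])

# Toy stand-in for Phikon-v2 CLS features: two well-separated clusters
rng = np.random.default_rng(0)
centers = rng.normal(size=(2, 1024))
train_labels = np.repeat([0, 1], 20)
train_feats = centers[train_labels] + 0.1 * rng.normal(size=(40, 1024))
test_labels = np.array([0, 1, 1, 0])
test_feats = centers[test_labels] + 0.1 * rng.normal(size=(4, 1024))

preds = knn_probe(train_feats, train_labels, test_feats, k=5)
```

In practice `train_feats` and `test_feats` would be built by stacking the `features` tensors produced by the extraction snippet, converted with `.numpy()`.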
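Direct use (with pre-extracted and frozen features)
Phikon-v2 can be used with or without fine-tuning on different downstream applications. For example, slide classification can be performed on frozen features with a multiple instance learning algorithm such as ABMIL.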
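As an illustration of slide classification on frozen features, the sketch below implements a simplified, non-gated attention pooling head in the spirit of ABMIL. The class name, hidden size, and number of classes are assumptions chosen for the example, and the random bag stands in for tile features extracted from one slide.

```python
import torch
import torch.nn as nn

class ABMIL(nn.Module):
    """Attention-based MIL pooling over a bag of tile features (non-gated)."""

    def __init__(self, in_dim=1024, hidden_dim=128, n_classes=2):
        super().__init__()
        # Scores each tile with a small two-layer network
        self.attention = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )
        self.classifier = nn.Linear(in_dim, n_classes)

    def forward(self, bag):
        # bag: (n_tiles, in_dim) frozen features from a single slide
        scores = self.attention(bag)                  # (n_tiles, 1)
        weights = torch.softmax(scores, dim=0)        # sums to 1 over tiles
        slide_embedding = (weights * bag).sum(dim=0)  # (in_dim,)
        logits = self.classifier(slide_embedding)     # (n_classes,)
        return logits, weights.squeeze(-1)

torch.manual_seed(0)
bag = torch.randn(50, 1024)  # placeholder for 50 tile features
model = ABMIL()
logits, weights = model(bag)
```

The attention weights give one score per tile, which is often inspected to see which regions drove the slide-level prediction.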
Downstream use (fine-tuning)
You can fine-tune the model on tile-level downstream tasks. This Colab notebook lets you fine-tune Phikon and Phikon-v2 with LoRA through the Hugging Face API.
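The notebook's code is not reproduced here. To convey the idea behind LoRA, the following from-scratch sketch wraps a frozen linear layer with a trainable low-rank update. `LoRALinear`, the rank `r`, and the scaling `alpha` are illustrative assumptions, and the 1024×1024 layer merely stands in for one projection inside the backbone.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen nn.Linear plus a trainable low-rank update:
    y = W x + (alpha / r) * B(A x)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay fixed
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # the update starts at zero
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

torch.manual_seed(0)
base = nn.Linear(1024, 1024)  # stand-in for one attention projection
lora = LoRALinear(base, r=8)
x = torch.randn(4, 1024)

trainable = sum(p.numel() for p in lora.parameters() if p.requires_grad)
total = sum(p.numel() for p in lora.parameters())
print(f"trainable parameters: {trainable:,} of {total:,}")
```

Because the second low-rank matrix is initialised to zero, the wrapped layer reproduces the pretrained output exactly before training, and only the two small matrices receive gradients.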
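✨ Key Features
- Feature extraction: built on the Vision Transformer architecture and the DINOv2 self-supervised method to extract high-quality features from histology images.
- Broad data coverage: pretrained on the large-scale PANCAN-XL dataset, which combines several publicly available histology collections.
- Downstream performance: outperforms the earlier Phikon model on a range of weakly supervised tasks, which makes it suited to applications such as biomarker discovery.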
📦 Installation
Software dependencies
Python packages
- torch>=2.0.0: https://pytorch.org
- torchvision>=0.15.0: https://pytorch.org/vision/stable/index.html
- xformers>=0.0.18: https://github.com/facebookresearch/xformers
Repositories
- DINOv2 (self-supervised learning): https://github.com/facebookresearch/dinov2
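📚 Documentation
Training details
- Training data: PANCAN-XL, a pretraining dataset of 456 million [224×224] histology images at 20x resolution, sampled from 60k H&E WSIs.
- Training regime: fp16 with PyTorch-FSDP mixed precision.
- Training objective: the DINOv2 self-supervised method, with the following losses:
  - DINO self-distillation loss with multi-crop
  - iBOT masked-image modeling loss
  - KoLeo regularization on [CLS] tokens
- Training length: 100,000 iterations with a batch size of 4,096
- Model architecture: ViT-Large (0.3B parameters): patch size 16, embedding dimension 1024, 16 heads, MLP FFN
- Hardware used: 32×4 Nvidia V100 32GB GPUs
- Total GPU hours: about 4,300 GPU hours (33 hours in total)
- Platform: French supercluster Jean-Zay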
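The figures above can be cross-checked with a few lines of arithmetic. This is a back-of-the-envelope sketch, not a description of the actual training schedule: it counts images drawn per iteration rather than multi-crop views, assumes all GPUs ran concurrently for the whole job, and counts tokens for a single 224×224 input.

```python
# Figures quoted in the training details
iterations = 100_000
batch_size = 4_096
n_gpus = 32 * 4
gpu_hours = 4_300
dataset_size = 456_000_000  # about 456 million tiles
image_size, patch_size = 224, 16

samples_seen = iterations * batch_size          # images drawn during training
passes_over_data = samples_seen / dataset_size  # fraction of one full pass
wall_clock_hours = gpu_hours / n_gpus           # if all GPUs run concurrently
tokens_per_image = (image_size // patch_size) ** 2 + 1  # patches + [CLS]

print(f"GPUs: {n_gpus}")
print(f"samples seen: {samples_seen:,}")
print(f"passes over PANCAN-XL: {passes_over_data:.2f}")
print(f"wall-clock estimate: {wall_clock_hours:.1f} h")
print(f"tokens per 224x224 image: {tokens_per_image}")
```

Under these assumptions the run draws about 410 million images, slightly less than one pass over PANCAN-XL, and the wall-clock estimate of roughly 33.6 hours agrees with the reported 33 hours.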
Third-party licenses
The Vision Transformer architecture is derived from facebookresearch/dino (Apache License 2.0) and huggingface/pytorch-image-models (Apache License 2.0). This code builds on the DINOv2 repository (Apache License 2.0).
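Pretraining dataset licenses
🔧 Technical Details
Phikon-v2 builds on the Vision Transformer architecture and is pretrained on large-scale histology data with the DINOv2 self-supervised method. Its objective combines several terms (the DINO self-distillation loss, the iBOT masked-image modeling loss, and the KoLeo regularizer), from which the model learns robust visual features. Training used PyTorch-FSDP mixed precision to improve efficiency.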
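Of the three terms, the KoLeo regularizer is the most compact to write down: it penalises the negative log distance from each L2-normalised feature to its nearest neighbour in the batch, which pushes features to spread out. The sketch below is an illustrative re-implementation under that definition, not the code from the DINOv2 repository; `koleo_loss` and `eps` are names chosen for the example.

```python
import torch
import torch.nn.functional as F

def koleo_loss(cls_tokens, eps=1e-8):
    """KoLeo regulariser: minus the mean log distance from each
    L2-normalised feature to its nearest neighbour in the batch."""
    x = F.normalize(cls_tokens, dim=-1)
    dists = torch.cdist(x, x)  # (n, n) pairwise Euclidean distances
    # Mask the diagonal so a point is never its own nearest neighbour
    mask = torch.eye(x.shape[0], dtype=torch.bool, device=x.device)
    dists = dists.masked_fill(mask, float("inf"))
    nearest = dists.min(dim=1).values
    return -torch.log(nearest + eps).mean()

torch.manual_seed(0)
spread = torch.randn(16, 1024)  # well-spread features
collapsed = torch.ones(16, 1024) + 1e-3 * torch.randn(16, 1024)  # near-identical

loss_spread = koleo_loss(spread)
loss_collapsed = koleo_loss(collapsed)
```

The near-identical batch receives a much larger penalty than the well-spread one, which is the behaviour the regularizer relies on to discourage feature collapse.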
📄 License
This model is released under the Owkin non-commercial license.
Contact
For any additional questions or comments, contact Alexandre Filiot (alexandre.filiot@owkin.com).
How to cite
```bibtex
@misc{filiot2024phikonv2largepublicfeature,
      title={Phikon-v2, A large and public feature extractor for biomarker prediction},
      author={Alexandre Filiot and Paul Jacob and Alice Mac Kain and Charlie Saillard},
      year={2024},
      eprint={2409.09173},
      archivePrefix={arXiv},
      primaryClass={eess.IV},
      url={https://arxiv.org/abs/2409.09173},
}
```
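Acknowledgments
We thank the DINOv2 authors for their outstanding contribution [1].
Computing resources
This work was granted access to the high-performance computing resources of IDRIS under the allocation 2023-A0141012519 made by GENCI.
Datasets
The results published here are in part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga. The Genotype-Tissue Expression (GTEx) Project was supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS. The data used for the analyses described in this study were obtained from the GTEx Portal on July 1, 2023.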
References
- Oquab, M., Darcet, T., Moutakanni, T., Vo, H., Szafraniec, M., Khalidov, V., Fernandez, P., Haziza, D., Massa, F., El-Nouby, A., Assran, M., Ballas, N., Galuba, W., Howes, R., Huang, P.-Y., Li, S.-W., Misra, I., Rabbat, M., Sharma, V., Synnaeve, G., Xu, H., Jegou, H., Mairal, J., Labatut, P., Joulin, A., & Bojanowski, P. (2024). DINOv2: Learning robust visual features without supervision. arXiv.
- Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., Tarbox, L., & Prior, F. (2013). The Cancer Imaging Archive (TCIA): Maintaining and operating a public information repository. Journal of Digital Imaging, 26(6), 1045–1057. Springer Science and Business Media LLC. https://doi.org/10.1007/s10278-013-9622-7
- National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). (2019). The Clinical Proteomic Tumor Analysis Consortium Acute Myeloid Leukemia Collection (CPTAC-AML) (Version 4) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.2019.B6FOE619
- National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). (2018). The Clinical Proteomic Tumor Analysis Consortium Glioblastoma Multiforme Collection (CPTAC-GBM) (Version 15) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2018.3RJE41Q1
- National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). (2020). The Clinical Proteomic Tumor Analysis Consortium Breast Invasive Carcinoma Collection (CPTAC-BRCA) (Version 1) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.CAEM-YS80
- National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). (2020). The Clinical Proteomic Tumor Analysis Consortium Colon Adenocarcinoma Collection (CPTAC-COAD) (Version 1) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.YZWQ-ZZ63
- National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). (2018). The Clinical Proteomic Tumor Analysis Consortium Head and Neck Squamous Cell Carcinoma Collection (CPTAC-HNSCC) (Version 16) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2018.UW45NH81
- National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). (2018). The Clinical Proteomic Tumor Analysis Consortium Clear Cell Renal Cell Carcinoma Collection (CPTAC-CCRCC) (Version 13) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2018.OBLAMN27
- National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). (2018). The Clinical Proteomic Tumor Analysis Consortium Lung Squamous Cell Carcinoma Collection (CPTAC-LSCC) (Version 15) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2018.6EMUB5L2
- National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). (2019). The Clinical Proteomic Tumor Analysis Consortium Sarcomas Collection (CPTAC-SAR) (Version 10) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.2019.9BT23R95
- National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). (2020). The Clinical Proteomic Tumor Analysis Consortium Ovarian Serous Cystadenocarcinoma Collection (CPTAC-OV) (Version 3) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.ZS4A-JD58
- National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). (2018). The Clinical Proteomic Tumor Analysis Consortium Pancreatic Ductal Adenocarcinoma Collection (CPTAC-PDA) (Version 14) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2018.SC20FO18
- National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). (2018). The Clinical Proteomic Tumor Analysis Consortium Cutaneous Melanoma Collection (CPTAC-CM) (Version 11) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2018.ODU24GZE
- National Cancer Institute Clinical Proteomic Tumor Analysis Consortium (CPTAC). (2019). The Clinical Proteomic Tumor Analysis Consortium Uterine Corpus Endometrial Carcinoma Collection (CPTAC-UCEC) (Version 12) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2018.3R3JUISW
- Cancer Moonshot Biobank. (2022). Cancer Moonshot Biobank – Colorectal Cancer Collection (CMB-CRC) (Version 5) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/DJG7-GZ87
- Cancer Moonshot Biobank. (2022). Cancer Moonshot Biobank – Melanoma Collection (CMB-MEL) (Version 5) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/GWSP-WH72
- Cancer Moonshot Biobank. (2022). Cancer Moonshot Biobank – Gastroesophageal Cancer Collection (CMB-GEC) (Version 2) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/E7KH-R486
- Cancer Moonshot Biobank. (2022). Cancer Moonshot Biobank – Lung Cancer Collection (CMB-LCA) (Version 5) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/3CX3-S132
- Cancer Moonshot Biobank. (2022). Cancer Moonshot Biobank – Multiple Myeloma Collection (CMB-MML) (Version 4) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/SZKB-SW39
- Bakas, S., Sako, C., Akbari, H., Bilello, M., Sotiras, A., Shukla, G., Rudie, J. D., Flores Santamaria, N., Fathi Kazerooni, A., Pati, S., Rathore, S., Mamourian, E., Ha, S. M., Parker, W., Doshi, J., Baid, U., Bergman, M., Binder, Z. A., Verma, R., … Davatzikos, C. (2021). Multi-parametric magnetic resonance imaging (mpMRI) scans for de novo Glioblastoma (GBM) patients from the University of Pennsylvania Health System (UPENN-GBM) (Version 2) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.709X-DN49
- Martel, A. L., Nofech-Mozes, S., Salama, S., Akbar, S., & Peikari, M. (2019). Assessment of residual breast cancer cellularity after neoadjuvant chemotherapy using digital pathology [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.2019.4YIBTJNO
- Campanella, G., Hanna, M. G., Brogi, E., & Fuchs, T. J. (2019). Breast metastases to axillary lymph nodes [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.2019.3XBN2JCC
- Farahmand, S., Fernandez, A. I., Ahmed, F. S., Rimm, D. L., Chuang, J. H., Reisenbichler, E., & Zarringhalam, K. (2022). HER2 and trastuzumab treatment response H&E slides with tumor ROI annotations (Version 3) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/E65C-AM96
- Pataki, B. A., Olar, A., Ribli, D., Pesti, A., Kontsek, E., Gyongyosi, B., Bilecz, A., Kovács, T., Kovács, K. A., Kiss, Z., Szócska, M., Pollner, P., & Csabai, I. (2021). Digital pathological slides from Hungarian (Europe) colorectal cancer screening (Version 2) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.9CJF-0127
- Pennycuick, A., Teixeira, V. H., AbdulJabbar, K., Raza, S. E. A., Lund, T., Akarca, A. U., Rosenthal, R., Kalinke, L., Chandrasekharan, D. P., Pipinikas, C. P., Lee-Six, H., Hynds, R. E., Gowers, K. H. C., Henry, J. Y., Millar, F. R., Hagos, Y. B., Denais, C., Falzon, M., Moore, D. A., Antoniou, S., Durrenberger, P. F., Furness, A. J., Carroll, B., Marceaux, C., Asselin-Labat, M. L., Larson, W., Betts, C., Coussens, L. M., Thakrar, R. M., George, J., Swanton, C., Thirlwell, C., Campbell, P. J., Marafioti, T., Yuan, Y., Quezada, S. A., McGranahan, N., & Janes, S. M. (2020). Immune surveillance in clinical regression of preinvasive squamous cell lung cancer. Cancer Discovery, 10(10), 1489–1499. https://doi.org/10.1158/2159-8290.CD-19-1366
- National Lung Screening Trial Research Team. (2013). Data from the National Lung Screening Trial (NLST) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.HMQ8-J677
- Wang, C.-W., Chang, C.-C., Lo, S.-C., Lin, Y.-J., Liou, Y.-A., Hsu, P.-C., Lee, Y.-C., & Chao, T.-K. (2021). A dataset of histopathological whole slide images for classification of treatment effectiveness to ovarian cancer (Ovarian Bevacizumab Response) (Version 2) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/TCIA.985G-EY35
- Chowdhury, S., Kennedy, J. J., Ivey, R. G., Murillo, O., Hosseini, N., Song, X., Petralia, F., Calinawan, A., Voytovich, U. J., Savage, S. R., Berry, A., Reva, B., Ozbek, U., Krek, A., Ma, W., da Veiga Leprevost, F., Ji, J., Yoo, S., Lin, C., … Paulovich, A. G. (2023). Proteogenomic analysis of chemo-refractory high grade serous ovarian cancer (PTRC-HGSOC) (Version 1) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/6RDA-P940
- Hodis, E., Torlai Triglia, E., Kwon, J. Y. H., Biancalani, T., Zakka, L. R., Parkar, S., Hütter, J. C., Buffoni, L., Delorey, T. M., Phillips, D., Dionne, D., Nguyen, L. T., Schapiro, D., Maliga, Z., Jacobson, C. A., Hendel, A., Rozenblatt-Rosen, O., Mihm, M. C. Jr., Garraway, L. A., & Regev, A. (2022). Stepwise-edited, human melanoma models reveal mutations' effect on tumor and microenvironment. Science, 376(6592), eabi8175. https://doi.org/10.1126/science.abi8175









