llava-med-v1.5-mistral-7b-chest-xray: an open-source model for chest X-ray analysis and pneumonia detection


Llava Med V1.5 Mistral 7b Chest Xray

Developed by YuchengShi

A multimodal foundation model fine-tuned from LLaVA-Med v1.5 Mistral-7B and optimized for analyzing chest X-ray images and detecting pneumonia.

Image-to-Text

Transformers

License: MIT #PneumoniaXrayDiagnosis #SelfSynthesizedDataAugmentation #ExplainableAI

Downloads: 466

Released: 2/20/2025

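Model Overview

This multimodal foundation model is optimized for medical imaging, in particular for analyzing chest X-ray images and detecting pneumonia. Training on self-synthesized data improves its explainability, so it can provide detailed diagnostic insights.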

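Model Features

Self-synthesized data augmentation

Improves the model's explainability by generating diagnostic insights that humans can understand

Domain-specific fine-tuning

Optimized specifically for medical imaging to achieve accurate pneumonia classification

Iterative training

Uses rejection sampling to improve diagnostic accuracy and explanation quality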

Model Capabilities

Chest X-ray image analysis

Pneumonia detection

Diagnostic explanation generation

Multimodal understanding

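Use Cases

Medical diagnosis

Pneumonia screening

Analyzes chest X-ray images to detect whether pneumonia is present

Provides pneumonia classification together with a diagnostic explanation

Medical education

Serves as a teaching tool that demonstrates how an X-ray is analyzed

Generates detailed diagnostic insights to support learning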

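🚀 LLaVA-Med v1.5 Mistral for Chest X-ray Analysis

This project optimizes a multimodal foundation model for chest X-ray image analysis, using the Kaggle "Chest X-Ray Images (Pneumonia)" dataset to detect pneumonia. It produces detailed, explainable outputs to support pneumonia diagnosis.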

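🚀 Quick Start

This is a multimodal foundation model fine-tuned from LLaVA-Med v1.5 Mistral-7B. It can be used to analyze chest X-ray images and detect pneumonia.

Project page: SelfSynthX

Paper: Enhancing Cognition and Explainability of Multimodal Foundation Models with Self-Synthesized Data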

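✨ Key Features

Base model: LLaVA-Med v1.5 Mistral-7B
Dataset: Chest X-Ray Images (Pneumonia)
Innovations:
- Self-synthesized data: improves the model's explainability by generating diagnostic insights that humans can understand.
- Domain-specific fine-tuning: optimized on medical images to achieve accurate pneumonia classification.
- Iterative training: uses rejection sampling to improve diagnostic accuracy and explanation quality.
Intended use: assisting pneumonia diagnosis from chest X-ray images with detailed, explainable outputs.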

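📦 Installation

Before using this model, make sure the required libraries are installed: requests, Pillow, torch, and transformers. You can install them with the following command: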

pip install requests pillow torch transformers
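
To confirm that the installation worked before loading a 7B model, a quick importability check can help. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the list uses import names rather than pip names (Pillow is imported as `PIL`).

```python
from importlib.util import find_spec

# Import names (not pip names): the Pillow package is imported as "PIL".
REQUIRED_MODULES = ["requests", "PIL", "torch", "transformers"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the top-level modules from `modules` that cannot be found."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in modules if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

The check only locates the packages; it does not import them or verify that a CUDA-capable GPU is available.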

💻 Usage Example

Basic usage

import requests
from PIL import Image
import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration

# Load the fine-tuned model in half precision and move it to the GPU.
model_id = "YuchengShi/llava-med-v1.5-mistral-7b-chest-xray"
model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
).to("cuda")
processor = AutoProcessor.from_pretrained(model_id)

# Build a single-turn conversation with one text prompt and one image slot.
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Can you analyze this chest X-ray?"},
            {"type": "image"},
        ],
    },
]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)

# image_file may be a local path (as here) or an http(s) URL.
image_file = "chest-xray/test1.png"
if image_file.startswith(("http://", "https://")):
    raw_image = Image.open(requests.get(image_file, stream=True).raw)
else:
    raw_image = Image.open(image_file)
inputs = processor(images=raw_image, text=prompt, return_tensors="pt").to("cuda", torch.float16)

# Greedy decoding; the first two output tokens are skipped when decoding, as in the original example.
output = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(processor.decode(output[0][2:], skip_special_tokens=True))
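
The model returns free text, so a screening workflow needs some way to turn that text into a class label. The sketch below is one rough way to do it with keyword matching. It is an assumption, not part of the model or its paper: the phrase lists and the function name `extract_label` are hypothetical, and the model card does not specify the wording of the output, so the patterns should be adjusted after inspecting real generations.

```python
import re

# Hypothetical phrase lists: the model card does not specify the output wording.
NEGATIVE_PATTERNS = [
    r"\bno (?:signs? |evidence )?(?:of )?pneumonia\b",
    r"\bwithout pneumonia\b",
    r"\bnormal\b",
]
POSITIVE_PATTERNS = [r"\bpneumonia\b"]


def extract_label(generated_text):
    """Map free-text output to 'normal', 'pneumonia', or 'unknown'."""
    text = generated_text.lower()
    # Check negations first, so "no evidence of pneumonia" is not read as positive.
    if any(re.search(pattern, text) for pattern in NEGATIVE_PATTERNS):
        return "normal"
    if any(re.search(pattern, text) for pattern in POSITIVE_PATTERNS):
        return "pneumonia"
    return "unknown"


print(extract_label("Clear lung fields, no evidence of pneumonia."))
```

This heuristic is deliberately crude: a phrase such as "not normal" would still be mapped to "normal", so anything beyond a quick experiment calls for a more careful parser or a constrained prompt.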

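🔧 Technical Details

Training: fine-tuned with LoRA on the Chest X-Ray Images (Pneumonia) dataset, with iterative rejection sampling.
Evaluation: achieves robust pneumonia classification and provides explainable diagnostic descriptions.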
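
The rejection sampling step mentioned above can be illustrated with a generic sketch. This is not the authors' training code: the acceptance rule used here (keep a sampled explanation only if its predicted label matches the ground truth) is an assumption, and `rejection_sample`, `toy_generate`, and `toy_label` are hypothetical names. The toy generator stands in for sampling from the model.

```python
import random


def rejection_sample(generate, extract_label, true_label, num_candidates=4, seed=0):
    """Sample candidates and keep those whose predicted label matches `true_label`.

    `generate(rng)` is a hypothetical callable returning one candidate explanation;
    `extract_label(text)` maps an explanation to a class label.
    """
    rng = random.Random(seed)
    candidates = [generate(rng) for _ in range(num_candidates)]
    return [text for text in candidates if extract_label(text) == true_label]


# Stand-in generator for illustration; a real setup would sample from the model.
def toy_generate(rng):
    return rng.choice([
        "Lobar consolidation is visible, consistent with pneumonia.",
        "Lung fields are clear; findings are normal.",
    ])


def toy_label(text):
    return "pneumonia" if "pneumonia" in text else "normal"


accepted = rejection_sample(toy_generate, toy_label, true_label="pneumonia", num_candidates=8)
# In an iterative scheme, accepted samples would be added to the next round's training data.
print(len(accepted), "of 8 candidates accepted")
```

In an iterative setup, each round would fine-tune on the accepted samples and then sample again from the updated model; how the original work scores or filters explanations is described in the paper rather than on this page.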

📄 License

This project is released under the MIT License.

📚 Documentation

Citation

If you use this model, please cite the following paper:

@inproceedings{
  shi2025enhancing,
  title={Enhancing Cognition and Explainability of Multimodal Foundation Models with Self-Synthesized Data},
  author={Yucheng Shi and Quanzheng Li and Jin Sun and Xiang Li and Ninghao Liu},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=lHbLpwbEyt}
}