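🚀 Medical Visual Question Answering Model - Florence-2_FT_Lung-Cancer-detection
A lung cancer detection model fine-tuned from microsoft/Florence-2-base-ft. Given a lung image and a question, it answers with the type of lung cancer and is intended to support medical diagnosis.
🚀 Quick Start
Install dependencies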
! pip install -q "flash_attn==2.6.3" "timm==1.0.8" "einops==0.8.0" "transformers==4.44.0"
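Because the install command pins exact versions, it can help to confirm what was actually installed before loading the model. The following is a minimal sketch and not part of the original model card: `check_pins` is a hypothetical helper name, and it relies only on the standard-library `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned by the install command above.
PINS = {
    "flash_attn": "2.6.3",
    "timm": "1.0.8",
    "einops": "0.8.0",
    "transformers": "4.44.0",
}


def check_pins(pins):
    """Return {package: (expected, installed)}; installed is None if the package is missing."""
    report = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (expected, installed)
    return report


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINS).items():
        status = "ok" if installed == expected else "MISMATCH"
        print(f"{name}: expected {expected}, installed {installed} [{status}]")
```

A mismatch does not necessarily mean the model will fail to load; it only signals that the environment differs from the one the card was written against.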
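Environment setup
import requests
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

# Use the first GPU with half precision if available, otherwise CPU with float32.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32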
Load the model and processor
model = AutoModelForCausalLM.from_pretrained("nirusanan/Florence-2_FT_Lung-Cancer-detection", torch_dtype=torch_dtype, trust_remote_code=True).to(device)
processor = AutoProcessor.from_pretrained("nirusanan/Florence-2_FT_Lung-Cancer-detection", trust_remote_code=True)
Run the example
# Florence-2 prompts start with a task token; this model is queried with "<DocVQA>".
prompt = "<DocVQA>" + "What is the type of lung cancer?"
url = "https://www.uab.edu/news/images/ct_scan.jpg"
image = Image.open(requests.get(url, stream=True).raw)
# Tokenize the prompt and preprocess the image, then move both to the target device.
inputs = processor(text=prompt, images=image, return_tensors="pt").to(device, torch_dtype)
# Deterministic beam search (no sampling).
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    do_sample=False,
    num_beams=3,
)
# Keep special tokens so post-processing can parse the task output.
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
parsed_answer = processor.post_process_generation(generated_text, task="<DocVQA>", image_size=(image.width, image.height))
print(parsed_answer)
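`print(parsed_answer)` shows the raw return value of `post_process_generation`. For free-text tasks the Florence-2 processor is expected to return a dictionary keyed by the task token, so the answer string can be pulled out roughly as sketched here. The helper name `extract_answer` and the sample dictionary (including the label in it) are illustrative assumptions, not output from this model.

```python
def extract_answer(parsed_answer, task="<DocVQA>"):
    """Pull the answer text out of a {task: answer} dictionary.

    Assumes the processor returned a dict keyed by the task token;
    returns None if the key is missing or the value is not a string.
    """
    answer = parsed_answer.get(task)
    if not isinstance(answer, str):
        return None
    return answer.strip()


# Hypothetical example value; the real labels depend on the fine-tuning data.
sample = {"<DocVQA>": " adenocarcinoma "}
print(extract_answer(sample))
```

Returning `None` instead of raising keeps downstream code simple when the model produces an unexpected structure; stricter handling may be preferable in a clinical pipeline.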
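✨ Key Features
- Fine-tuned model: a fine-tuned version of microsoft/Florence-2-base-ft, optimized specifically for lung cancer detection.
- Visual question answering: supports visual question answering, answering questions about a given lung image.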
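📚 Documentation
Model information
| Attribute | Details |
| --- | --- |
| Model type | Visual question answering model fine-tuned from microsoft/Florence-2-base-ft |
| Task type | Visual Question Answering |
| Application | Lung cancer detection |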
Evaluation metrics
Test accuracy: 99.17%
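The model card does not say how the test accuracy was computed. One common choice for a classification-style VQA task is exact-match accuracy: the share of test questions whose predicted answer equals the reference answer. The sketch below illustrates that definition only; `exact_match_accuracy`, the case-insensitive comparison, and the toy labels are all assumptions.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference after trimming and lowercasing."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return correct / len(references)


# Toy data for illustration only (not from the model's test set).
preds = ["Adenocarcinoma", "normal", "squamous cell carcinoma", "normal"]
refs = ["adenocarcinoma", "normal", "large cell carcinoma", "normal"]
print(f"{exact_match_accuracy(preds, refs):.2%}")  # 3 of 4 match
```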
Developer information
- Developer: Nirusanan
- License: not provided
- Base model: microsoft/Florence-2-base-ft