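🚀 Medical Visual Question Answering Model - Florence-2_FT_Lung-Cancer-detection
A lung cancer detection model fine-tuned from microsoft/Florence-2-base-ft. It identifies the type of lung cancer shown in a lung image, with the aim of supporting medical diagnosis.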
🚀 Quick start
Install dependencies
! pip install -q "flash_attn==2.6.3" "timm==1.0.8" "einops==0.8.0" "transformers==4.44.0"
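Because the versions are pinned, it can help to confirm what is actually installed before loading the model. The following is a minimal sketch using only the standard library; the helper name `find_mismatches` is hypothetical, and the pins are copied from the install command above.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the install command above.
PINS = {
    "flash_attn": "2.6.3",
    "timm": "1.0.8",
    "einops": "0.8.0",
    "transformers": "4.44.0",
}


def find_mismatches(pins):
    """Map each package whose installed version differs from its pin to the
    installed version (None when the package is not installed)."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


# Print one line per package that is missing or at a different version.
for name, installed in find_mismatches(PINS).items():
    print(f"{name}: expected {PINS[name]}, found {installed or 'not installed'}")
```

If nothing is printed, every pinned package is present at the expected version.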
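Environment setup
import torch

# Use the first GPU with half precision when CUDA is available; otherwise fall back to CPU with float32.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32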
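Load the model and processor
from transformers import AutoModelForCausalLM, AutoProcessor

# trust_remote_code=True lets transformers run the custom Florence-2 code shipped with the checkpoint.
model = AutoModelForCausalLM.from_pretrained("nirusanan/Florence-2_FT_Lung-Cancer-detection", torch_dtype=torch_dtype, trust_remote_code=True).to(device)
processor = AutoProcessor.from_pretrained("nirusanan/Florence-2_FT_Lung-Cancer-detection", trust_remote_code=True)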
Run an example
import requests
from PIL import Image

# The prompt is the task token followed by the question.
prompt = "<DocVQA>" + "What is the type of lung cancer?"
url = "https://www.uab.edu/news/images/ct_scan.jpg"
# Convert to RGB in case the downloaded scan is grayscale or has an alpha channel.
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")
inputs = processor(text=prompt, images=image, return_tensors="pt").to(device, torch_dtype)
generated_ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=1024,
    do_sample=False,
    num_beams=3,
)
# Keep special tokens so post_process_generation can parse the raw decoded text.
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
parsed_answer = processor.post_process_generation(generated_text, task="<DocVQA>", image_size=(image.width, image.height))
print(parsed_answer)
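For plain-text tasks such as `<DocVQA>`, `post_process_generation` is expected to return a dictionary keyed by the task token. The following is a minimal sketch of building the prompt and pulling out the answer string under that assumption; the helper names `build_prompt` and `extract_answer` are hypothetical and not part of the model's API, and the sample dictionary is a placeholder rather than real model output.

```python
TASK = "<DocVQA>"


def build_prompt(question: str, task: str = TASK) -> str:
    """Prefix the question with the task token, as in the example above."""
    return task + question.strip()


def extract_answer(parsed: dict, task: str = TASK) -> str:
    """Return the answer text from the parsed output.

    Assumes `parsed` looks like {"<DocVQA>": "some answer"}; falls back to an
    empty string if the task key is missing.
    """
    answer = parsed.get(task, "")
    # Drop special tokens that may have survived decoding (an assumption about
    # what can appear; adjust to the tokens you actually observe).
    for token in ("<s>", "</s>", "<pad>"):
        answer = answer.replace(token, "")
    return answer.strip()


# Illustrative input only; not real model output.
example = {"<DocVQA>": "</s><s>placeholder answer</s>"}
print(build_prompt("What is the type of lung cancer?"))
print(extract_answer(example))
```

Keeping the task token in one constant avoids a mismatch between the prompt prefix and the `task` argument passed to post-processing.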
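✨ Key features
- Fine-tuned model: a fine-tuned version of microsoft/Florence-2-base-ft, optimized specifically for lung cancer detection.
- Visual question answering: supports visual question answering, so it can answer questions about a lung image.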
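📚 Documentation
Model information
| Property | Details |
|---|---|
| Model type | Visual question answering model fine-tuned from microsoft/Florence-2-base-ft |
| Task type | Visual Question Answering |
| Application | Lung cancer detection |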
Evaluation metrics
Test accuracy: 99.17%
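The source does not say how the test accuracy was computed. One plausible definition for a question-answering classifier is exact-match accuracy over the test set, sketched below; the normalization step and the function name `exact_match_accuracy` are assumptions, and the strings are hypothetical placeholders rather than results from this model.

```python
def exact_match_accuracy(predictions, labels):
    """Fraction of predictions equal to their label after normalization."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty test set")

    def normalize(text):
        # Lowercase and collapse whitespace so trivial differences do not count as errors.
        return " ".join(text.lower().split())

    correct = sum(normalize(p) == normalize(t) for p, t in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical placeholder strings, not results from this model.
preds = ["class a", "Class B", "class a", "class c"]
gold = ["class a", "class b", "class c", "class c"]
print(f"{exact_match_accuracy(preds, gold):.2%}")  # 75.00%
```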
Developer information
- Developer: Nirusanan
- License: not provided
- Base model: microsoft/Florence-2-base-ft