Florence 2 FT DocVQA
A document visual question answering model fine-tuned from Florence-2-base, designed for question answering over document images.
Downloads 4,928
Release Time: 11/2/2024
Model Overview
This model is fine-tuned on the DocumentVQA dataset. It reads the content of a document image and answers questions about it, which makes it suitable for a range of document analysis scenarios.
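As a rough illustration of how a Florence-2 fine-tune like this one is typically queried, the sketch below loads a checkpoint with the Hugging Face transformers library and asks one question about one image. Several things here are assumptions and not taken from this page: the `<DocVQA>` task prefix (a convention used by some Florence-2 DocVQA fine-tunes, which this checkpoint may or may not share), the helper names `build_prompt` and `answer_question`, and the `model_id` argument, which must be replaced with the actual repository id of the checkpoint.

```python
DOCVQA_PREFIX = "<DocVQA>"  # assumed task token; check the checkpoint's own card


def build_prompt(question: str) -> str:
    """Prepend the (assumed) task token to a whitespace-trimmed question."""
    return DOCVQA_PREFIX + question.strip()


def answer_question(model_id: str, image_path: str, question: str) -> str:
    """Ask one question about one document image.

    model_id is a placeholder for the checkpoint's repository id.
    """
    # Heavy imports are deferred so the prompt helper works without them.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    # Florence-2 checkpoints ship custom modeling code, hence trust_remote_code.
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(
        text=build_prompt(question), images=image, return_tensors="pt"
    )
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=64,
    )
    # Decode the first (only) sequence and drop special tokens.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
```

Calling `answer_question("<repository id>", "invoice.png", "What is the invoice date?")` would download the weights on first use; the file name and question are placeholders.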
Model Features
Document Image Understanding
Capable of parsing and understanding content and structure in document images.
Question Answering Capability
Answers natural-language questions about the content of a document.
Multimodal Processing
Simultaneously processes visual and textual information for cross-modal understanding.
Model Capabilities
Document Image Analysis
Visual Question Answering
Text Extraction
Cross-modal Understanding
Use Cases
Document Processing
Contract Analysis
Extract key terms and conditions from contract documents
Invoice Processing
Identify amounts, dates, and supplier information in invoices
Education
Exam Paper Grading
Automatically grade student answer sheets and extract answers
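Use cases such as invoice processing can be framed as a set of questions, one per field to extract. The sketch below shows that framing only; the field names, the question wording, and the `extract_fields` helper are hypothetical, and `ask` stands for any function that answers a question about a fixed document image.

```python
from typing import Callable, Dict

# Hypothetical field-to-question mapping; the wording is illustrative only.
INVOICE_QUESTIONS: Dict[str, str] = {
    "amount": "What is the total amount due?",
    "date": "What is the invoice date?",
    "supplier": "What is the name of the supplier?",
}


def extract_fields(
    ask: Callable[[str], str],
    questions: Dict[str, str] = INVOICE_QUESTIONS,
) -> Dict[str, str]:
    """Ask one question per field and collect the whitespace-trimmed answers."""
    return {field: ask(question).strip() for field, question in questions.items()}
```

Keeping the questions in a plain dictionary makes it easy to swap in a different set for contracts or answer sheets without changing the extraction loop.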