Florence 2 FT DocVQA

Developed by sahilnishad
A document visual question answering (DocVQA) model fine-tuned from Florence-2-base, designed for answering questions about document images.
Downloads 4,928
Release Time: 11/2/2024

Model Overview

This model is fine-tuned on the DocumentVQA dataset. It reads the content of a document image and answers questions about it, which makes it suitable for a range of document analysis scenarios.

Model Features

Document Image Understanding: Parses the content and layout of document images.
Question Answering Capability: Answers natural-language questions about a document's content.
Multimodal Processing: Processes the image and the question text together for cross-modal understanding.

Model Capabilities

Document Image Analysis
Visual Question Answering
Text Extraction
Cross-modal Understanding
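
As a rough illustration of how these capabilities are typically exercised, the sketch below loads a Florence-2 style checkpoint with Hugging Face transformers and asks one question about a document image. It is not taken from the model's own documentation: the Hub id `sahilnishad/Florence-2-FT-DocVQA`, the `<DocVQA>` task prefix, and the helper names `build_prompt` and `answer_question` are assumptions, so check them against the actual model card before relying on them.

```python
MODEL_ID = "sahilnishad/Florence-2-FT-DocVQA"  # assumed Hub id; verify before use
TASK_TOKEN = "<DocVQA>"  # assumed task prefix used during fine-tuning


def build_prompt(question: str) -> str:
    """Prefix the question with the (assumed) DocVQA task token."""
    return f"{TASK_TOKEN}{question.strip()}"


def answer_question(image_path: str, question: str, model_id: str = MODEL_ID) -> str:
    """Answer one question about one document image (hypothetical helper)."""
    # Heavy dependencies are imported lazily so build_prompt works without them.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    # Florence-2 checkpoints ship custom modeling code, hence trust_remote_code.
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=build_prompt(question), images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=64,
        num_beams=3,
    )
    # Decode the generated tokens to plain text, dropping special tokens.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
```

A call such as `answer_question("invoice.png", "What is the invoice date?")` would return the model's answer as a string; the file name here is only a placeholder.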

Use Cases

Document Processing
  Contract Analysis: Extract key terms and conditions from contract documents.
  Invoice Processing: Identify amounts, dates, and supplier information in invoices.
Education
  Exam Paper Grading: Automatically grade student answer sheets and extract answers.
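
The invoice-processing use case above can be approximated by asking one question per field and collecting the answers. The sketch below shows only that data flow: `ask` is a hypothetical callable that wraps whatever model call you use, and the question wording in `INVOICE_QUESTIONS` is illustrative, not something prescribed by the model.

```python
from typing import Callable, Dict

# Hypothetical field-to-question mapping; adjust the wording to your documents.
INVOICE_QUESTIONS: Dict[str, str] = {
    "amount": "What is the total amount due?",
    "date": "What is the invoice date?",
    "supplier": "Who is the supplier?",
}


def extract_invoice_fields(ask: Callable[[str], str]) -> Dict[str, str]:
    """Ask one question per field and collect the stripped answers."""
    return {
        field: ask(question).strip()
        for field, question in INVOICE_QUESTIONS.items()
    }


if __name__ == "__main__":
    # Stand-in for a real model call, used only to show the data flow.
    canned = {
        "What is the total amount due?": " 1,250.00 USD ",
        "What is the invoice date?": "2024-03-14",
        "Who is the supplier?": "Example Supplies Ltd.",
    }
    print(extract_invoice_fields(lambda question: canned[question]))
```

Keeping the model call behind a plain callable makes the extraction logic easy to test with canned answers before wiring in a real model.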