
Florence 2 DocVQA

Developed by impactframes
A version of Microsoft's Florence-2 model, fine-tuned for 1 day on 5% of the Docmatix dataset, suited to image-text understanding tasks
Downloads 30
Release Time: 10/4/2024

Model Overview

This model is a fine-tuned version of Florence-2-large-ft. It targets tasks that require joint understanding of images and text, and uses domain-specific training data to improve performance on them.

Model Features

Domain-Adaptive Fine-tuning
Targeted fine-tuning on the Docmatix dataset to improve performance on document-focused tasks
Multimodal Understanding
Capable of processing both image and text inputs to achieve cross-modal understanding

Model Capabilities

Image-text understanding
Cross-modal reasoning
Visual question answering
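
As a rough illustration of visual question answering with a Florence-2 style checkpoint, the sketch below loads a model through the Hugging Face transformers library and asks one question about a document image. Several details are assumptions rather than facts from this page: the repository id is not given here, so model_id is a parameter; the "<DocVQA>" task prefix follows a common Florence-2 fine-tuning convention and should be checked against the model's own documentation; build_prompt and answer_question are hypothetical helper names.

```python
DOCVQA_TASK = "<DocVQA>"  # assumed task prefix; confirm against the model's documentation


def build_prompt(question: str, task: str = DOCVQA_TASK) -> str:
    """Prepend the (assumed) task token to a whitespace-normalized question."""
    question = " ".join(question.split())
    if not question:
        raise ValueError("question must not be empty")
    return f"{task}{question}"


def answer_question(model_id: str, image_path: str, question: str) -> str:
    """Ask one question about one document image and return the decoded answer."""
    # Heavy imports stay local so build_prompt can be used without them installed.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    # Florence-2 checkpoints ship custom modeling code, hence trust_remote_code=True.
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=build_prompt(question), images=image, return_tensors="pt")
    generated = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=128,
        num_beams=3,
    )
    return processor.batch_decode(generated, skip_special_tokens=True)[0].strip()
```

A call might look like answer_question(model_id, "invoice.png", "What is the invoice total?"), where model_id is the repository id of the checkpoint and "invoice.png" is a placeholder path. Loading the model inside the function keeps the sketch short; for repeated questions it would make more sense to load the processor and model once and reuse them.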

Use Cases

Document Understanding
Document Image Parsing
Extract structured information from scanned document images
Educational Technology
Textbook Content Analysis
Analyze the content of textbooks, including images and text, and generate summaries
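
One way to approach the document-parsing use case above is to ask a question-answering model one question per field and collect the answers into a record. The sketch below shows only that collection step; it is not a documented feature of this model. The extract_fields helper, the field names, and the ask callable (any function that maps a question string to an answer string for a fixed document image) are all hypothetical.

```python
from typing import Callable, Dict, Optional


def extract_fields(
    ask: Callable[[str], str],
    questions: Dict[str, str],
) -> Dict[str, Optional[str]]:
    """Build a record by asking one question per field.

    `ask` is a hypothetical callable wrapping a VQA model for one document image.
    Empty answers are stored as None so missing fields are easy to spot.
    """
    record: Dict[str, Optional[str]] = {}
    for field, question in questions.items():
        answer = ask(question).strip()
        record[field] = answer or None
    return record


if __name__ == "__main__":
    # Stand-in for a real model call, used only to show the data flow.
    canned = {"What is the invoice number?": "INV-001", "What is the due date?": ""}
    result = extract_fields(
        lambda q: canned.get(q, ""),
        {"invoice_number": "What is the invoice number?", "due_date": "What is the due date?"},
    )
    print(result)
```

Passing the model call in as a function keeps the extraction logic independent of any particular checkpoint or inference backend.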