Florence-2-DocVQA Open-Source Model - Free Deployment to Boost Image-Text Understanding Tasks

Florence 2 DocVQA

Developed by impactframes

A version of Microsoft's Florence-2 model fine-tuned for 1 day on 5% of the Docmatix dataset, suited to image-text understanding tasks such as document visual question answering

Image-Text-to-Text

Transformers

#Document Image Understanding #Few-shot Fine-tuning #Multimodal Processing

Release Time : 10/4/2024

Model Overview

This model is a fine-tuned version of Florence-2-large-ft. It targets tasks that require joint understanding of images and text, and uses domain-specific document data to improve performance on them.

Model Features

Domain-Adaptive Fine-tuning

Targeted fine-tuning on the Docmatix document question-answering dataset to improve performance on document images

Multimodal Understanding

Capable of processing both image and text inputs to achieve cross-modal understanding

Model Capabilities

Image-text understanding

Cross-modal reasoning

Visual question answering

Use Cases

Document Understanding

Document Image Parsing

Extract structured information from scanned document images

Educational Technology

Textbook Content Analysis

Analyze the content of textbooks, including images and text, and generate summaries

🚀 Model Card for Florence-2 Model

This is Microsoft's Florence-2 model. It was trained for 1 day with Docmatix (5% of the data) at a learning rate of 1e-6. The fine-tuning code can be found at [GitHub Repository](https://github.com/andimarafioti/florence2-finetuning), and a blog explaining how to fine-tune Florence is available at [Hugging Face Blog](https://huggingface.co/blog/finetune-florence2).
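The two hyperparameters stated above (a 5% data subset and a learning rate of 1e-6) can be sketched as follows. This is a minimal illustration, not the author's training script: the card does not say how the 5% was selected, so the seeded random sampling, the helper names `sample_subset_indices` and `build_optimizer`, and the choice of AdamW are all assumptions.

```python
import random

LEARNING_RATE = 1e-6     # stated in the card
SUBSET_FRACTION = 0.05   # stated in the card (5% of Docmatix)


def sample_subset_indices(num_examples, fraction=SUBSET_FRACTION, seed=0):
    """Return a sorted, reproducible random sample of example indices.

    Assumption: uniform random sampling; the card does not describe
    how the 5% subset was actually chosen.
    """
    subset_size = int(num_examples * fraction)
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_examples), subset_size))


def build_optimizer(model):
    """Create an optimizer with the learning rate from the card.

    Assumption: AdamW is used here for illustration only.
    """
    import torch  # imported lazily so the sampling helper works without PyTorch

    return torch.optim.AdamW(model.parameters(), lr=LEARNING_RATE)
```

The index list from `sample_subset_indices` could then be used to select rows from whatever dataset object holds the training examples.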

🚀 Quick Start

Use the code below to get started with the model. [More Information Needed]
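The card leaves this section unfilled. The following is a minimal inference sketch based on the usual Florence-2 loading pattern in 🤗 transformers. Several names are assumptions not given in this card: the Hub repository ID in `MODEL_ID`, the `<DocVQA>` task prefix (taken from the fine-tuning blog post), the generation settings, and the helper names `build_prompt` and `answer_question`.

```python
# Assumption: this repository ID does not appear in the card; replace it
# with the checkpoint you actually want to load.
MODEL_ID = "HuggingFaceM4/Florence-2-DocVQA"
# Assumption: task token as used in the fine-tuning blog post.
TASK_PROMPT = "<DocVQA>"


def build_prompt(question: str) -> str:
    """Prefix the question with the DocVQA task token."""
    return TASK_PROMPT + question.strip()


def answer_question(image_path: str, question: str, model_id: str = MODEL_ID) -> str:
    """Load the model, run one question against one document image, return the answer text."""
    # Heavy imports are kept local so build_prompt can be used without them.
    import torch
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Florence-2 ships custom modeling code, hence trust_remote_code=True.
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    model = model.to(device).eval()
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(
        text=build_prompt(question), images=image, return_tensors="pt"
    ).to(device)

    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=128,  # illustrative value
            num_beams=3,         # illustrative value
        )

    # Decode and drop special tokens to get the plain answer string.
    return processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
```

Calling `answer_question("invoice.png", "What is the invoice number?")`, where `invoice.png` is a hypothetical local file, would return the decoded answer as a string. The model is reloaded on every call here for simplicity; for repeated questions, load the model and processor once and reuse them.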

✨ Features

This is the model card of a 🤗 transformers model that has been pushed on the Hub. It has been automatically generated.

📚 Documentation

Model Details

Model Description

Developed by: Andi Marafioti
Funded by [optional]: Hugging Face 🤗
Language(s) (NLP): English
License: MIT
Finetuned from model: [Florence-2-large-ft](https://huggingface.co/microsoft/Florence-2-large-ft)

Model Sources [optional]

Repository: [More Information Needed]
Demo [optional]: [More Information Needed]

Uses

Direct Use

[More Information Needed]

Downstream Use [optional]

[More Information Needed]

Out-of-Scope Use

[More Information Needed]

Bias, Risks, and Limitations

[More Information Needed]

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

Training Details

Training Data

Docmatix (5% of the data).

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware Type: [More Information Needed]
Hours used: [More Information Needed]
Cloud Provider: [More Information Needed]
Compute Region: [More Information Needed]
Carbon Emitted: [More Information Needed]
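Since the fields above are unfilled, no emission figure can be reported. As a rough sketch of the kind of estimate such a calculator produces, emissions can be approximated as energy consumed multiplied by the carbon intensity of the grid. The function name `estimate_co2_kg` and all of its parameters are hypothetical; the only value supported by this card is the training duration of 1 day (24 hours).

```python
def estimate_co2_kg(power_watts, hours, carbon_intensity_kg_per_kwh, num_devices=1):
    """Approximate emissions as energy (kWh) times grid carbon intensity (kg CO2eq/kWh).

    All inputs must be supplied by the user: the card does not state the
    hardware type, power draw, provider, or compute region.
    """
    energy_kwh = power_watts / 1000 * hours * num_devices
    return energy_kwh * carbon_intensity_kg_per_kwh
```

For example, `estimate_co2_kg(power_watts=400, hours=24, carbon_intensity_kg_per_kwh=0.4)` uses the 24 hours from the card together with a purely hypothetical 400 W device and a hypothetical grid intensity. The result illustrates the formula only and is not a measurement for this model.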

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]

📄 License

This model is licensed under the MIT license.
