LayoutLMv3 Base MP-DocVQA

Developed by rubentito
This model is a document visual question answering model fine-tuned on the Multi-page Document VQA (MP-DocVQA) dataset, based on Microsoft's pre-trained LayoutLMv3 model.
Downloads 664
Release Time: 2/21/2023

Model Overview

This model is designed for document visual question answering. It answers questions over multi-page documents by combining textual and visual information to predict the answer.

Model Features

Multimodal processing capability
Combines textual and visual information for document understanding, suitable for complex document visual QA tasks.
Multi-page document support
Answers questions over multi-page documents and predicts the page that contains the answer.
Efficient performance
Achieves good document QA performance with about 125M parameters.
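
The multi-page behaviour described above (an answer plus the page that contains it) can be illustrated with a small sketch. It assumes each page is run through the model separately and yields start and end logits for an extractive answer span, and that the page whose best span has the highest summed logit wins. That selection rule, and the names best_span and select_answer_page, are assumptions made for illustration, not a confirmed description of how this model is run.

```python
from typing import List, Optional, Sequence, Tuple


def best_span(
    start_logits: Sequence[float],
    end_logits: Sequence[float],
    max_len: int = 30,
) -> Tuple[float, int, int]:
    """Return (score, start, end) for the highest-scoring span with start <= end."""
    best = (float("-inf"), 0, 0)
    for s, s_logit in enumerate(start_logits):
        # Only consider spans of at most max_len tokens starting at s.
        for e in range(s, min(s + max_len, len(end_logits))):
            score = s_logit + end_logits[e]
            if score > best[0]:
                best = (score, s, e)
    return best


def select_answer_page(
    per_page_logits: List[Tuple[Sequence[float], Sequence[float]]],
) -> Optional[Tuple[int, int, int, float]]:
    """Pick the page whose best span scores highest (assumed selection rule).

    Returns (page_index, start, end, score), or None when there are no pages.
    """
    winner = None
    for page_idx, (start_logits, end_logits) in enumerate(per_page_logits):
        score, s, e = best_span(start_logits, end_logits)
        if winner is None or score > winner[3]:
            winner = (page_idx, s, e, score)
    return winner


# Toy logits for a two-page document (made-up numbers, not model output).
pages = [
    ([0.1, 0.2, 0.0], [0.0, 0.3, 0.1]),
    ([0.5, 2.0, 0.1], [0.2, 0.1, 3.0]),
]
result = select_answer_page(pages)
```

In practice the logits would come from a forward pass per page, and raw logits from different pages are not calibrated against each other, so treat this rule as a simple baseline rather than the model's documented procedure.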

Model Capabilities

Document visual QA
Multi-page document processing
Text and visual information fusion

Use Cases

Document processing
Contract document QA
Extract specific clause information from multi-page contract documents
Reported metrics: ANLS (Average Normalized Levenshtein Similarity) 0.4538, APPA (Answer Page Prediction Accuracy) 51.9426
Report document analysis
Analyze key data in multi-page report documents
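
The two metrics reported in the use cases above, ANLS and APPA, can be sketched in a few lines. The version below follows the commonly used ANLS definition (normalized edit distance, with similarities zeroed once the distance reaches a threshold of 0.5) and treats APPA as plain page-level accuracy in percent. The lowercasing and whitespace stripping, and the function names, are assumptions; the official evaluation script may differ in such details.

```python
from typing import List, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def anls(predictions: Sequence[str], references: Sequence[List[str]],
         threshold: float = 0.5) -> float:
    """Average over questions of the best thresholded similarity to any reference."""
    if not predictions:
        return 0.0
    total = 0.0
    for pred, golds in zip(predictions, references):
        best = 0.0
        p = pred.strip().lower()
        for gold in golds:
            g = gold.strip().lower()
            denom = max(len(p), len(g))
            nl = levenshtein(p, g) / denom if denom else 0.0
            # Answers that are too far from the reference score zero.
            sim = 1.0 - nl if nl < threshold else 0.0
            best = max(best, sim)
        total += best
    return total / len(predictions)


def page_accuracy(pred_pages: Sequence[int], gold_pages: Sequence[int]) -> float:
    """Percentage of questions whose predicted answer page matches the gold page."""
    if not gold_pages:
        return 0.0
    correct = sum(p == g for p, g in zip(pred_pages, gold_pages))
    return 100.0 * correct / len(gold_pages)
```

For example, a prediction of "hallo" against the reference "hello" has edit distance 1 over length 5, so its similarity is 0.8, while a prediction that differs in every character scores 0.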