TestDocumentQuestionAnswering: Open-Source Document Visual Question Answering Model

Developed by Dhineshk

A document visual question answering model based on the LayoutLMv2 architecture, fine-tuned for DocVQA tasks

Image-to-Text

Transformers

#Document Visual Question Answering #Multimodal Understanding #Layout Awareness

Downloads 16

Release Time: 9/27/2023

Model Overview

This model is a fine-tuned version of the LayoutLMv2 base model (microsoft/layoutlmv2-base-uncased) for Document Visual Question Answering (DocVQA). It uses both the text content of a document and its layout to answer questions about the document.
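LayoutLMv2 receives each word together with a bounding box whose coordinates are scaled to a 0-1000 range relative to the page size. The sketch below illustrates that normalization step. The function name `normalize_box` and the example page size are illustrative assumptions, not taken from this model's training code.

```python
def normalize_box(box, page_width, page_height):
    """Scale a pixel-space box (x0, y0, x1, y1) to the 0-1000 range."""
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]


# Hypothetical word box on an 850 x 1100 pixel page scan.
print(normalize_box((85, 110, 425, 165), 850, 1100))
```

When the Transformers processor for LayoutLMv2 runs its built-in OCR, it typically performs this scaling itself, so a manual step like this mainly matters when words and boxes come from an external OCR engine.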

Model Features

Multimodal Understanding Capability

Combines textual content and visual layout information for document comprehension

Document Structure Awareness

Capable of recognizing structured elements in documents such as tables and paragraphs

Question Answering Ability

Answers user questions regarding document content

Model Capabilities

Document content understanding

Visual question answering

Document layout analysis

Text and visual information fusion processing

Use Cases

Document Processing

Contract Analysis

Automatically answers questions about contract terms

Table Data Extraction

Extracts specific information from structured documents

Education

Automatic Test Grading

Identifies student answer content and evaluates answer correctness

🚀 layoutlmv2-base-uncased_finetuned_docvqa

This model is a fine-tuned version of microsoft/layoutlmv2-base-uncased on an unknown dataset, intended for document question answering. It achieves a loss of 5.3353 on the evaluation set.
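A minimal inference sketch follows, assuming the checkpoint is published on the Hugging Face Hub under the ID `Dhineshk/TestDocumentQuestionAnswering` (inferred from the page title and author name, not confirmed by the card) and that the `document-question-answering` pipeline can load it. The helper names `top_answer` and `ask_document` are hypothetical.

```python
def top_answer(predictions):
    """Return the highest-scoring answer dict from pipeline output."""
    if isinstance(predictions, dict):  # defensive: accept a single result too
        return predictions
    return max(predictions, key=lambda p: p["score"])


def ask_document(image_path, question,
                 model_id="Dhineshk/TestDocumentQuestionAnswering"):  # assumed Hub ID
    # Imported lazily: this path needs transformers, Pillow, pytesseract
    # (for OCR) and, for LayoutLMv2 models, detectron2.
    from transformers import pipeline
    from PIL import Image

    qa = pipeline("document-question-answering", model=model_id)
    predictions = qa(image=Image.open(image_path), question=question)
    return top_answer(predictions)
```

A call might look like `ask_document("invoice.png", "What is the invoice number?")`, where the file name is a placeholder. Given the reported evaluation loss, answers from this checkpoint should be treated as experimental.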

📚 Documentation

Model Information

Property	Details
Model Type	A fine-tuned version of microsoft/layoutlmv2-base-uncased
License	CC-BY-NC-SA-4.0

Training and Evaluation

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 20
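
With `lr_scheduler_type: linear`, the learning rate decays from 5e-05 toward zero over the run. The sketch below shows that schedule under two assumptions that the card does not state: zero warmup steps, and a total of about 4,520 optimizer steps (estimated from the results table, where 4,500 steps correspond to 19.91 epochs, or roughly 226 steps per epoch over 20 epochs). The function name `linear_lr` is illustrative.

```python
def linear_lr(step, base_lr=5e-05, total_steps=4520, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    total_steps and warmup_steps are assumptions, not values from the card.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, midpoint, and end of the estimated run.
for step in (0, 2260, 4520):
    print(step, linear_lr(step))
```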

Training results

Training Loss	Epoch	Step	Validation Loss
0.153	0.22	50	5.3909
0.2793	0.44	100	5.0150
0.2634	0.66	150	4.6620
0.5192	0.88	200	4.7826
0.3096	1.11	250	4.9532
0.2638	1.33	300	5.2584
0.4727	1.55	350	4.0943
0.2763	1.77	400	4.8408
1.0425	1.99	450	5.0344
0.4477	2.21	500	4.9084
0.3266	2.43	550	5.0996
0.3085	2.65	600	4.4858
0.4648	2.88	650	4.0630
0.1845	3.1	700	5.3969
0.1616	3.32	750	4.8225
0.1752	3.54	800	5.2945
0.1877	3.76	850	5.2358
0.3172	3.98	900	5.2205
0.1627	4.2	950	4.9991
0.2548	4.42	1000	4.6917
0.1566	4.65	1050	5.1266
0.2616	4.87	1100	4.3241
0.1199	5.09	1150	4.9821
0.1372	5.31	1200	5.0838
0.1198	5.53	1250	5.0156
0.0558	5.75	1300	4.8638
0.1331	5.97	1350	4.9492
0.0689	6.19	1400	4.6926
0.0912	6.42	1450	4.5153
0.0495	6.64	1500	4.6969
0.0853	6.86	1550	4.7690
0.1072	7.08	1600	4.6783
0.034	7.3	1650	4.7351
0.2999	7.52	1700	4.5185
0.0763	7.74	1750	4.5825
0.0799	7.96	1800	4.7218
0.0343	8.19	1850	5.1508
0.0396	8.41	1900	5.4893
0.033	8.63	1950	5.5167
0.0295	8.85	2000	5.6252
0.2303	9.07	2050	4.7031
0.088	9.29	2100	4.7323
0.0666	9.51	2150	4.8688
0.0597	9.73	2200	5.6007
0.0615	9.96	2250	5.5403
0.1003	10.18	2300	5.3198
0.0457	10.4	2350	5.4828
0.0391	10.62	2400	5.5312
0.0325	10.84	2450	5.7410
0.0147	11.06	2500	5.8749
0.1013	11.28	2550	5.6522
0.001	11.5	2600	5.7776
0.0002	11.73	2650	5.8431
0.03	11.95	2700	5.9751
0.0452	12.17	2750	5.6928
0.0002	12.39	2800	5.6264
0.0109	12.61	2850	5.2688
0.0801	12.83	2900	5.2780
0.0216	13.05	2950	5.3691
0.0002	13.27	3000	5.5237
0.0092	13.5	3050	5.3662
0.0124	13.72	3100	5.4474
0.0515	13.94	3150	5.3623
0.0032	14.16	3200	5.4168
0.0051	14.38	3250	5.2897
0.0002	14.6	3300	5.3205
0.014	14.82	3350	5.2114
0.0004	15.04	3400	5.2342
0.0104	15.27	3450	5.2562
0.0107	15.49	3500	5.1112
0.0002	15.71	3550	5.1515
0.0002	15.93	3600	5.2054
0.0002	16.15	3650	5.1968
0.0003	16.37	3700	5.3196
0.0246	16.59	3750	5.3111
0.0054	16.81	3800	5.3335
0.0001	17.04	3850	5.3488
0.0243	17.26	3900	5.2597
0.0217	17.48	3950	5.2834
0.0002	17.7	4000	5.2947
0.0002	17.92	4050	5.3131
0.0001	18.14	4100	5.3240
0.0016	18.36	4150	5.3129
0.0133	18.58	4200	5.3241
0.0002	18.81	4250	5.3382
0.0159	19.03	4300	5.3764
0.003	19.25	4350	5.3776
0.0516	19.47	4400	5.3389
0.016	19.69	4450	5.3275
0.0105	19.91	4500	5.3353
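
The lowest validation loss in the table is 4.0630 at step 650, while the final checkpoint reports 5.3353 even though the training loss has fallen to roughly 0.01. That pattern usually suggests overfitting. The sketch below shows one way to pick the best checkpoint from such a log. Only a handful of rows are copied from the table, and the variable names are illustrative.

```python
# (training_loss, epoch, step, validation_loss) rows copied from the table above
rows = [
    (0.1530, 0.22, 50, 5.3909),
    (0.4727, 1.55, 350, 4.0943),
    (0.4648, 2.88, 650, 4.0630),
    (0.2616, 4.87, 1100, 4.3241),
    (0.0002, 11.73, 2650, 5.8431),
    (0.0300, 11.95, 2700, 5.9751),
    (0.0105, 19.91, 4500, 5.3353),
]

best = min(rows, key=lambda r: r[3])  # row with the lowest validation loss
final = rows[-1]                      # last logged checkpoint
gap = final[3] - best[3]              # how much worse the final checkpoint is

print(f"best step: {best[2]} (val loss {best[3]:.4f})")
print(f"final step: {final[2]} (val loss {final[3]:.4f}, +{gap:.4f} vs best)")
```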

Framework versions

Transformers 4.33.2
Pytorch 2.0.1+cpu
Datasets 2.14.5
Tokenizers 0.13.3

📄 License

This model is licensed under CC-BY-NC-SA-4.0.
