Detr Layout Detection

Developed by cmarkea
A document layout detection model based on the DETR architecture, capable of identifying various layout elements in documents.
Downloads 13.21k
Release Time: 7/29/2024

Model Overview

This model is a version of detr-resnet-50 fine-tuned on the DocLayNet dataset. It simultaneously predicts masks and bounding boxes for document objects, which makes it well suited to processing document corpora for import into Open-Domain Question Answering (ODQA) systems.

Model Features

Multi-Class Detection
Identifies 11 types of document entities, including headings, footnotes, formulas, and list items.
Dual-Task Output
Simultaneously predicts masks and bounding boxes for document objects.
High Performance
Achieves an F1 score of 91.27 on the DocLayNet evaluation dataset.
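
As an illustration of the bounding-box output, DETR-style detectors typically express raw boxes as normalized (center x, center y, width, height) values. The sketch below, assuming that convention, converts one such box to pixel corner coordinates. The function name and the clamping step are illustrative choices, not part of the model's published API.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def cxcywh_to_pixel_corners(box: Box, image_width: int, image_height: int) -> Box:
    """Convert a normalized (cx, cy, w, h) box to pixel (x_min, y_min, x_max, y_max).

    Assumes all four input values are fractions of the image size.
    """
    cx, cy, w, h = box
    x_min = (cx - w / 2) * image_width
    y_min = (cy - h / 2) * image_height
    x_max = (cx + w / 2) * image_width
    y_max = (cy + h / 2) * image_height
    # Clamp to the image bounds so slightly out-of-range predictions stay valid.
    return (
        max(0.0, x_min),
        max(0.0, y_min),
        min(float(image_width), x_max),
        min(float(image_height), y_max),
    )


if __name__ == "__main__":
    # A box centered on a 1000x800 page, covering half of each dimension.
    print(cxcywh_to_pixel_corners((0.5, 0.5, 0.5, 0.5), 1000, 800))
```

In practice a library's post-processing step usually performs this conversion, so a helper like this is mainly useful when working with raw model outputs.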

Model Capabilities

Document Layout Analysis
Object Detection
Semantic Segmentation

Use Cases

Document Processing

Open-Domain QA System Preprocessing
Prepares document corpora for ODQA systems by identifying their layout elements, separating elements such as text, images, and tables.
Document Digitization
Converts scanned documents into structured digital formats by identifying document elements and their positions.
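
The ODQA preprocessing use case can be sketched as a small post-processing step: keep confident detections, group them by element type, and order each group top to bottom. This is a minimal sketch under stated assumptions. The `Detection` record shape, the `group_by_label` helper, the 0.5 score cutoff, and the label strings are all hypothetical and only stand in for whatever the detection step actually returns.

```python
from collections import defaultdict
from typing import Dict, List, Tuple, TypedDict


class Detection(TypedDict):
    """Hypothetical record for one detected layout element."""
    label: str                              # e.g. "Text", "Table", "Picture"
    score: float                            # confidence in [0, 1]
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def group_by_label(
    detections: List[Detection], min_score: float = 0.5
) -> Dict[str, List[Detection]]:
    """Drop low-confidence detections and group the rest by label.

    Within each label, elements are sorted top to bottom, then left to right,
    as a rough approximation of reading order on a single-column page.
    """
    grouped: Dict[str, List[Detection]] = defaultdict(list)
    for det in detections:
        if det["score"] >= min_score:
            grouped[det["label"]].append(det)
    for items in grouped.values():
        items.sort(key=lambda d: (d["box"][1], d["box"][0]))
    return dict(grouped)


if __name__ == "__main__":
    page = [
        {"label": "Text", "score": 0.90, "box": (50, 300, 500, 400)},
        {"label": "Table", "score": 0.80, "box": (50, 450, 500, 700)},
        {"label": "Text", "score": 0.95, "box": (50, 100, 500, 200)},
        {"label": "Picture", "score": 0.20, "box": (10, 10, 40, 40)},
    ]
    for label, items in group_by_label(page).items():
        print(label, [d["box"] for d in items])
```

Grouping by type lets a downstream pipeline route each kind of element differently, for example sending text regions to OCR and tables to a table parser. Multi-column pages would need a more careful ordering than the simple sort used here.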