DiT Base Layout Detection

Developed by cmarkea
A document image layout detection model fine-tuned from microsoft/dit-base that recognizes 11 types of document elements
Downloads 704
Release Time: 7/18/2024

Model Overview

This model can extract different layout elements (such as text, images, headings, footnotes, etc.) from document images, making it particularly suitable for processing document collections that need to be imported into Open-domain Question Answering (ODQA) systems.

Model Features

Multi-category Document Element Recognition
Recognizes 11 types of document elements, including image captions, footnotes, formulas, list items, and page headers and footers
Fine-tuned on DocLayNet
Trained on the DocLayNet dataset and optimized for document layout analysis
Dual Evaluation Metrics
Evaluated both as a semantic segmentation task and as an object detection task, giving a more complete picture of performance
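
The two evaluation views can be illustrated with intersection over union (IoU), a measure commonly used for both bounding boxes and per-pixel masks. The page does not say which metrics the authors report, so the sketch below is an assumption for illustration only; the function names `box_iou` and `mask_iou` are hypothetical.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def mask_iou(pred, target, label):
    """Pixel-level IoU for one class over two same-sized 2D label maps."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            in_pred, in_target = p == label, t == label
            inter += in_pred and in_target  # pixel counted by both maps
            union += in_pred or in_target   # pixel counted by either map
    return inter / union if union else 0.0
```

The box form suits the object detection view, where a prediction is matched to a ground-truth region; the mask form suits the semantic segmentation view, where every pixel carries a class label.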

Model Capabilities

Document Image Analysis
Layout Element Recognition
Semantic Segmentation
Object Detection
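
Semantic segmentation and object detection are related: a per-pixel label map can be turned into one bounding box per connected region. The sketch below shows one way to do that in plain Python. It is not the authors' post-processing; the function name `label_map_to_boxes` and the choice of 0 as the background label are assumptions.

```python
from collections import deque


def label_map_to_boxes(label_map, background=0):
    """Return (label, x_min, y_min, x_max, y_max) for each 4-connected region.

    Coordinates are inclusive pixel indices. `background` is an assumed
    label value for pixels that belong to no layout element.
    """
    height = len(label_map)
    width = len(label_map[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            label = label_map[y][x]
            if label == background or seen[y][x]:
                continue
            # Breadth-first flood fill over pixels sharing this label.
            seen[y][x] = True
            queue = deque([(x, y)])
            x_min = x_max = x
            y_min = y_max = y
            while queue:
                cx, cy = queue.popleft()
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and label_map[ny][nx] == label):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((label, x_min, y_min, x_max, y_max))
    return boxes
```

On real page images a library routine for connected components or contours would be much faster; the pure-Python version only makes the idea explicit.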

Use Cases

Document Processing
Open-domain QA System Document Preprocessing
Automatically identifies and classifies different elements in documents when preparing them for ODQA systems
Improves document structuring and enhances the comprehension capabilities of QA systems
Document Digitization
Automatically identifies various region types when converting scanned documents into structured digital formats
Enhances the efficiency and accuracy of document digitization
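
As a rough illustration of the ODQA preprocessing use case, detected regions can be filtered by type and put in reading order before their text is indexed. The sketch below assumes label names taken from the DocLayNet category list, a single-column page, and region text supplied by a separate OCR step; the `Region` class and `select_passages` function are hypothetical and are not part of the model.

```python
from dataclasses import dataclass

# Assumed label names (DocLayNet categories) worth indexing for QA.
# Page headers, page footers, and pictures are left out on purpose.
CONTENT_LABELS = frozenset({
    "Title", "Section-header", "Text", "List-item",
    "Caption", "Footnote", "Formula", "Table",
})


@dataclass
class Region:
    label: str
    box: tuple      # (x_min, y_min, x_max, y_max) in pixels
    text: str = ""  # filled by an OCR step that this sketch does not cover


def select_passages(regions, keep=CONTENT_LABELS):
    """Keep content regions and sort them top to bottom, then left to right."""
    kept = [r for r in regions if r.label in keep]
    return sorted(kept, key=lambda r: (r.box[1], r.box[0]))
```

Sorting by the top-left corner is only adequate for simple single-column layouts; multi-column pages would need a column-aware ordering.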