Lilt Xlm Roberta Base Finetuned With DocLayNet Base At Paragraphlevel Ml512

Developed by pierreguillou
This is a document understanding model for analyzing document layout and content. It performs token classification at the paragraph level.
Downloads 126
Release Time: 2/15/2023

Model Overview

This model is based on the LiLT architecture and was fine-tuned on the DocLayNet base dataset at the paragraph level. It identifies the type of each paragraph in a document (headings, text, tables, etc.).
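As a rough illustration of how a LiLT-based token classifier like this one might be used, the sketch below loads a checkpoint through the Hugging Face transformers Auto classes and prepares layout boxes. LiLT-style models take word bounding boxes scaled to a 0-1000 grid alongside the text. The Hub id in `MODEL_ID` is an assumption inferred from the developer name and model title on this page, and `normalize_box` and `load_model` are hypothetical helper names, not part of any library.

```python
from typing import List, Tuple

# Assumption: Hub id inferred from the developer name and model title above.
MODEL_ID = (
    "pierreguillou/"
    "lilt-xlm-roberta-base-finetuned-with-DocLayNet-base-at-paragraphlevel-ml512"
)


def normalize_box(
    box: Tuple[float, float, float, float],
    page_width: float,
    page_height: float,
) -> List[int]:
    """Scale a pixel box (x0, y0, x1, y1) to the 0-1000 grid used by LiLT-style models."""
    x0, y0, x1, y1 = box
    scaled = [
        int(1000 * x0 / page_width),
        int(1000 * y0 / page_height),
        int(1000 * x1 / page_width),
        int(1000 * y1 / page_height),
    ]
    # Clamp so slightly out-of-page OCR boxes stay inside the valid range.
    return [max(0, min(1000, v)) for v in scaled]


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and token-classification model (downloads weights on first use)."""
    # Imported lazily so normalize_box works without transformers installed.
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)
    return tokenizer, model
```

For example, `normalize_box((0, 0, 500, 250), page_width=1000, page_height=500)` returns `[0, 0, 500, 500]`. Extracting words and boxes from a page (OCR or PDF parsing) is outside this sketch.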

Model Features

Multilingual Support
The model supports understanding and analyzing documents in multiple languages.
Paragraph-level Analysis
Identifies the functional type of each paragraph in a document.
High-precision Classification
Achieves an F1 score of 86.34% on the test set.

Model Capabilities

Document Layout Analysis
Paragraph Type Recognition
Multilingual Document Processing
Token Classification
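Because the model classifies tokens while the result of interest is a label per paragraph, some aggregation step is needed. The sketch below shows one plausible approach, a majority vote over the tokens of each paragraph. It is not confirmed to be the author's method. `paragraph_labels` is a hypothetical helper, and the label strings in the example are DocLayNet category names, which may differ from the exact strings in the model's configuration.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def paragraph_labels(
    token_labels: Sequence[str],
    paragraph_ids: Sequence[Hashable],
) -> Dict[Hashable, str]:
    """Assign each paragraph the most frequent label among its tokens."""
    if len(token_labels) != len(paragraph_ids):
        raise ValueError("token_labels and paragraph_ids must have the same length")

    votes: Dict[Hashable, Counter] = {}
    for label, pid in zip(token_labels, paragraph_ids):
        votes.setdefault(pid, Counter())[label] += 1

    # most_common(1) returns the top-voted label; ties go to the label seen first.
    return {pid: counts.most_common(1)[0][0] for pid, counts in votes.items()}


# Hypothetical predictions: two tokens in paragraph 0, four in paragraph 1.
predicted = ["Section-header", "Section-header", "Text", "Text", "Text", "Table"]
para_ids = [0, 0, 1, 1, 1, 1]
print(paragraph_labels(predicted, para_ids))
# {0: 'Section-header', 1: 'Text'}
```

A majority vote tolerates a few misclassified tokens inside a paragraph. Averaging the per-token class probabilities before taking the argmax is a common alternative.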

Use Cases

Document Processing
Financial Report Analysis
Automatically identifies different sections in financial reports (headings, body text, tables, etc.)
Accuracy 86.34%
Scientific Paper Processing
Classifies formulas, charts, and body content in scientific papers
Formula recognition accuracy 97.33%
Legal Document Processing
Legal Text Parsing
Identifies section headings and body content in legal documents