DocFusion

Developed by sc22mc
DocFusion is a unified framework for document parsing. It targets two problems in existing document parsing methods, system complexity and limited performance, and aims to provide a more efficient, unified solution.
Downloads 107
Release Time: 1/28/2025

Model Overview

DocFusion is a unified document parsing framework that handles layout element detection and element recognition simultaneously, while remaining lightweight and high-performing.

Model Features

Unified processing ability
Introduces the Gaussian-Kernel CrossEntropy Loss (GK-CEL), which enables the generative framework to handle layout element detection and recognition tasks simultaneously.
Lightweight model
As a unified document parsing model, DocFusion has only 0.28B parameters.
High-quality dataset
The DocLatex-1.6M dataset was built to provide high-quality training data for the model.
Excellent performance
Performs strongly on four core document parsing tasks, supporting the effectiveness of the unified approach.
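
The page does not define GK-CEL, so the following is only a minimal sketch of one plausible reading of the name: cross-entropy computed against a Gaussian-smoothed target over discretized coordinate bins instead of a one-hot target. The function names, the bin count, and the `sigma` parameter are all hypothetical and are not taken from the DocFusion code or paper.

```python
import math
from typing import List


def gaussian_soft_target(target_bin: int, num_bins: int, sigma: float = 1.0) -> List[float]:
    """Assumed form: a normalized Gaussian over bins, centered on the true bin."""
    weights = [
        math.exp(-((k - target_bin) ** 2) / (2.0 * sigma ** 2))
        for k in range(num_bins)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def gaussian_kernel_cross_entropy(logits: List[float], target_bin: int, sigma: float = 1.0) -> float:
    """Cross-entropy between the Gaussian-smoothed target and the predicted distribution."""
    soft = gaussian_soft_target(target_bin, len(logits), sigma)
    log_probs = log_softmax(logits)
    return -sum(q * lp for q, lp in zip(soft, log_probs))


if __name__ == "__main__":
    # Hypothetical example: 8 coordinate bins, true coordinate in bin 4.
    logits_near = [0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0]  # peak one bin away
    logits_far = [5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # peak four bins away
    print(gaussian_kernel_cross_entropy(logits_near, 4))
    print(gaussian_kernel_cross_entropy(logits_far, 4))
```

Under this assumed form, a prediction that lands one bin from the true coordinate is penalized less than one that lands four bins away. A plain one-hot cross-entropy would assign both predictions the same loss, since each puts the same probability on the true bin.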

Model Capabilities

Document layout element detection
Document element recognition
Unified document parsing

Use Cases

Document processing
Document layout analysis
Automatically detects layout elements in a document, such as titles, paragraphs, and tables.
Performs strongly on four core document parsing tasks.
Document element recognition
Recognizes the content of specific document elements, such as text, images, and formulas.
High-quality training data improves recognition accuracy.