PP-Chart2Table Open-source Multimodal Model - Efficiently Convert Chinese and English Charts to Data Tables for Free

PP Chart2Table

Developed by PaddlePaddle

PP-Chart2Table is a multimodal model developed by the PaddlePaddle team, focusing on Chinese and English chart parsing and capable of efficiently converting charts into data tables.

Image-to-Text | Supports Multiple Languages | Open Source License: Apache-2.0
#Chart parsing #Multimodal model #Data table conversion

Downloads 1,392

Release Time : 6/5/2025

Model Overview

PP-Chart2Table is an advanced multimodal model that significantly improves the efficiency of chart parsing through novel training tasks and a fine-grained token masking strategy. It supports Chinese and English chart conversion and has strong generalization ability.

Model Features

Efficient chart parsing

Significantly improve the efficiency of chart parsing through the Shuffled Chart Data Retrieval training task and token masking strategy.

Multimodal capabilities

Combine visual and language modalities to support Chinese and English chart parsing.

Strong generalization ability

Ensure the adaptability and generalization ability of the model on large-scale unlabeled and out-of-distribution data through a two-stage distillation process.

High-quality training data

Create a rich and diverse training set using high-quality seed data, RAG, and large language model role design.

Model Capabilities

Chart parsing

Table conversion

Chinese and English support

Multimodal processing

Use Cases

Document processing

Chart data extraction

Extract chart data from document images and convert it into a structured table.

Efficiently and accurately extract chart data, supporting multiple chart types.

Data analysis

Data visualization parsing

Parse data visualization charts and extract the original data for further analysis.

Provide accurate original data to support the data analysis process.

🚀 PP-Chart2Table

PP-Chart2Table is a state-of-the-art multimodal model developed by the PaddlePaddle team. It specializes in chart parsing for both Chinese and English. With a novel "Shuffled Chart Data Retrieval" training task and a refined token masking strategy, it can efficiently convert charts to data tables. An advanced data synthesis pipeline, using high-quality seed data, RAG, and LLM persona design, enriches the training set. A two-stage distillation process is implemented to handle large-scale unlabeled, out-of-distribution (OOD) data, ensuring robust adaptability and generalization on real-world data. In-house benchmarks show that PP-Chart2Table outperforms similar-scale models and matches the performance of 7-billion-parameter Vision Language Models (VLMs) in critical scenarios.

🚀 Quick Start

📦 Installation

1. PaddlePaddle

Install PaddlePaddle using pip with the following commands:

# for CUDA11.8
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/

# for CUDA12.6
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/

# for CPU
python -m pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/

For more details on PaddlePaddle installation, refer to the PaddlePaddle official website.

2. PaddleX

Install the latest version of the PaddleX inference package from PyPI:

python -m pip install paddlex && python -m pip install "paddlex[multimodal]"
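
Before moving on, it can help to confirm that both packages are visible to the Python interpreter you plan to use. The snippet below is a minimal sketch, not part of the official instructions: the helper name `installed_packages` is hypothetical, and it only checks that the import names `paddle` and `paddlex` can be located, not that the GPU build or the multimodal extras work.

```python
from importlib.util import find_spec


def installed_packages(names=("paddle", "paddlex")):
    """Return {import_name: bool} telling whether each top-level package can be found.

    Hypothetical helper for illustration; it does not import the packages,
    so it cannot detect a broken CUDA setup.
    """
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in installed_packages().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If either package is reported as missing, re-run the corresponding install command in the same environment.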

💻 Usage Examples

Basic Usage

Integrate PP-Chart2Table inference into your project. The example below passes the sample image by URL, which matches the output shown afterwards; to run on a local file, download the image first and pass its path instead:

from paddlex import create_model

# Load the chart-parsing model by name
model = create_model('PP-Chart2Table')

# Run inference on the sample chart image
results = model.predict(
    input={"image": "https://cdn-uploads.huggingface.co/production/uploads/684acf07de103b2d44c85531/OrlFuIXQUhO3Fg1G9_H1u.png"},
    batch_size=1
)
for res in results:
    res.print()  # Print the parsed table
    res.save_to_json("./output/res.json")  # Save the result in JSON format

After running, the obtained result is as follows:

{'res': {'image': 'https://cdn-uploads.huggingface.co/production/uploads/684acf07de103b2d44c85531/OrlFuIXQUhO3Fg1G9_H1u.png', 'result': 'Agency | Favorable | Not Sure | Unfavorable\nNational Park Service | 81% | 12% | 7%\nU.S. Postal Service | 77% | 3% | 20%\nNASA | 74% | 17% | 9%\nSocial Security Administration | 61% | 12% | 28%\nCDC | 56% | 6% | 38%\nVeterans Affairs | 56% | 16% | 28%\nEPA | 55% | 14% | 31%\nHealth and Human Services | 55% | 15% | 30%\nFBI | 52% | 12% | 36%\nDepartment of Transportation | 52% | 12% | 36%\nDepartment of Homeland Security | 51% | 18% | 35%\nDepartment of Justice | 49% | 10% | 41%\nCIA | 46% | 21% | 33%\nDepartment of Education | 45% | 8% | 47%\nFederal Reserve | 43% | 20% | 37%\nIRS | 42% | 7% | 51%'}}
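
The `result` field above is a single string: rows are separated by newlines and cells by `|`. The following is a minimal sketch of turning that string into rows and CSV text using only the standard library. The helper names `parse_chart_table` and `to_csv` are hypothetical, and the sketch assumes that no cell contains a literal `|` character.

```python
import csv
import io


def parse_chart_table(text):
    """Split a 'result' string into (header, rows), trimming whitespace around cells."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty table text")
    cells = [[cell.strip() for cell in line.split("|")] for line in lines]
    return cells[0], cells[1:]


def to_csv(header, rows):
    """Serialize the header and rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# First rows of the sample output shown above
sample = (
    "Agency | Favorable | Not Sure | Unfavorable\n"
    "National Park Service | 81% | 12% | 7%\n"
    "NASA | 74% | 17% | 9%"
)
header, rows = parse_chart_table(sample)
print(to_csv(header, rows))
```

Values such as `81%` are kept as strings here; convert them to numbers only once you know which columns hold percentages.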

For details on usage commands and parameter descriptions, refer to the documentation.

Advanced Usage - Pipeline

A single model has limited capabilities. A pipeline that combines several models can handle difficult real-world problems better.

PP-StructureV3

Layout analysis extracts structured information from document images. PP-StructureV3 includes seven modules:

Layout Detection Module
Chart Recognition Module (Optional)
General OCR Sub-pipeline
Document Image Preprocessing Sub-pipeline (Optional)
Table Recognition Sub-pipeline (Optional)
Seal Recognition Sub-pipeline (Optional)
Formula Recognition Sub-pipeline (Optional)

You can quickly experience the PP-StructureV3 pipeline with a single command:

paddleocr pp_structurev3 --chart_recognition_model_name PP-Chart2Table \
    --use_chart_recognition True \
    -i https://cdn-uploads.huggingface.co/production/uploads/684acf07de103b2d44c85531/Mk1PKgszCEEutZukT3FPB.png

You can also run pipeline inference with a few lines of code. Taking the PP-StructureV3 pipeline as an example:

from paddleocr import PPStructureV3

pipeline = PPStructureV3(chart_recognition_model_name="PP-Chart2Table", use_chart_recognition=True)
# pipeline = PPStructureV3(use_doc_orientation_classify=True)  # Use use_doc_orientation_classify to enable/disable the document orientation classification model
# pipeline = PPStructureV3(use_doc_unwarping=True)  # Use use_doc_unwarping to enable/disable the document unwarping module
# pipeline = PPStructureV3(use_textline_orientation=True)  # Use use_textline_orientation to enable/disable the textline orientation classification model
# pipeline = PPStructureV3(device="gpu")  # Use device to specify GPU for model inference
output = pipeline.predict("./Mk1PKgszCEEutZukT3FPB.png", use_chart_recognition=True)
for res in output:
    res.print()  # Print the structured prediction output
    res.save_to_json(save_path="output")  # Save the current image's structured result in JSON format
    res.save_to_markdown(save_path="output")  # Save the current image's result in Markdown format

The default chart recognition model in the pipeline is PP-Chart2Table, so you don't have to pass PP-Chart2Table as the chart_recognition_model_name argument. To load local model files, pass their directory via the chart_recognition_model_dir argument. For details on usage commands and parameter descriptions, refer to the documentation.

📄 License

This project is licensed under the Apache-2.0 license.
