SLANet_plus: Open-Source Table Structure Recognition Model for Converting Table Images to Editable HTML


Slanet Plus

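Developed by PaddlePaddle

SLANet_plus is a table structure recognition model that converts non-editable table images into an editable table format such as HTML. It is a core component of table recognition systems and helps improve their accuracy and efficiency.

Text recognition · Multilingual support · License: Apache-2.0 · #TableStructureRecognition #HTMLConversion #MultiModuleIntegration

Downloads: 1,121

Released: 6/6/2025

Model Overview

SLANet_plus is a deep learning model dedicated to table structure recognition. It locates the rows, columns, and cells of a table and converts a non-editable table image into editable HTML. It provides key support for table recognition systems and can be integrated into a variety of document-processing workflows.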

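Model Features

High-accuracy table structure recognition

Locates the rows, columns, and cells of a table and converts a non-editable table image into editable HTML.

Multi-module pipelines

Available through the General Table Recognition V2 pipeline and the PP-StructureV3 pipeline, which combine table classification, structure recognition, text detection, text recognition, and other modules.

Efficient inference

The model takes only 6.9 MB of storage and runs reasonably fast on both GPU and CPU; GPU inference takes about 140 ms.

End-to-end solution

Covers the full flow from image input to structured output, with HTML, Excel, and other output formats.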
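As a rough guide to what the GPU latency figure above means in practice, the following sketch converts a per-image latency into sequential throughput. It assumes one image is processed at a time with no batching and that the quoted ~140 ms covers the whole inference call; both are assumptions, not measurements from the source.

```python
def images_per_second(latency_ms: float) -> float:
    """Convert a per-image latency in milliseconds to sequential throughput."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


# ~140 ms per image is the GPU figure quoted in the feature list.
gpu_throughput = images_per_second(140)
print(f"{gpu_throughput:.2f} images/s")  # prints "7.14 images/s"
```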

Model Capabilities

Table structure recognition

Table image conversion

HTML output

Excel output

Multi-module processing

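Use Cases

Document processing

Financial statement recognition

Convert scanned financial statements into editable HTML or Excel.

Recognizes the table structure and preserves the relationships in the original data.

Expense claim processing

Automatically recognize the tables in expense claim forms and output them in structured form.

Reported recognition accuracy is 63.69%, which can reduce manual data entry.

Data digitization

Historical archive digitization

Convert tables in paper archives into an editable digital format.

Preserves the original table structure for later data analysis and processing.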

🚀 SLANet_plus

🚀 Quick Start

Install dependencies

1. Install PaddlePaddle

Use one of the following pip commands to install PaddlePaddle:

# For CUDA 11.8
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/

# For CUDA 12.6
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/

# For CPU
python -m pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/

For more installation details, see the official PaddlePaddle website.

2. Install PaddleOCR

Install the latest version of the PaddleOCR inference package from PyPI:

python -m pip install paddleocr
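After installing, it can help to confirm that both packages are visible to the interpreter you plan to use. The sketch below is one way to do that with the standard library only. The helper name `installed_version` is hypothetical; the distribution names come from the pip commands above, and `paddle` is assumed to be the import name of PaddlePaddle.

```python
import importlib.util
from importlib import metadata


def installed_version(module_name, dist_names):
    """Return the version of the first matching distribution.

    Returns None when the module cannot be found at all, and "unknown"
    when the module is found but none of the distribution names match.
    """
    if importlib.util.find_spec(module_name) is None:
        return None
    for dist in dist_names:
        try:
            return metadata.version(dist)
        except metadata.PackageNotFoundError:
            continue
    return "unknown"


# Import name -> candidate distribution names (GPU and CPU builds differ).
checks = {
    "paddle": ["paddlepaddle-gpu", "paddlepaddle"],
    "paddleocr": ["paddleocr"],
}
for module_name, dist_names in checks.items():
    version = installed_version(module_name, dist_names)
    print(module_name, "->", version or "not installed")
```

This only checks that the packages are present; it does not verify that the GPU build matches the local CUDA version.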

Model Usage

Try it with a single command

You can try the model quickly with a single command:

paddleocr table_structure_recognition \
    --model_name SLANet_plus \
    -i https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/6rfhb-CXOHowonjpBsaUJ.png

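Integrate into your project

You can also integrate inference with the table structure recognition module into your own project. Download the sample image to your local machine before running the code below.

from paddleocr import TableStructureRecognition
model = TableStructureRecognition(model_name="SLANet_plus")
output = model.predict(input="6rfhb-CXOHowonjpBsaUJ.png", batch_size=1)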
for res in output:
    res.print(json_format=False)
    res.save_to_json("./output/res.json")

Running it produces the following result:

{'res': {'input_path': '6rfhb-CXOHowonjpBsaUJ.png', 'page_index': None, 'bbox': [[1, 2, 64, 2, 64, 41, 1, 41], [52, 1, 199, 1, 198, 38, 51, 38], [182, 1, 253, 1, 254, 40, 184, 40], [248, 1, 323, 1, 324, 41, 249, 41], [314, 1, 384, 1, 385, 40, 315, 40], [389, 2, 493, 2, 493, 45, 388, 44], [2, 42, 50, 42, 50, 77, 2, 77], [65, 42, 176, 42, 175, 77, 64, 77], [187, 40, 251, 40, 249, 79, 185, 79], [252, 41, 319, 41, 319, 80, 251, 80], [318, 40, 379, 40, 380, 78, 318, 78], [385, 39, 497, 39, 497, 84, 384, 83], [2, 82, 50, 82, 50, 118, 2, 118], [63, 80, 182, 80, 181, 114, 62, 114], [189, 80, 250, 80, 249, 114, 187, 114], [253, 80, 319, 80, 319, 114, 252, 114], [320, 78, 378, 79, 378, 114, 320, 114], [395, 77, 496, 78, 496, 118, 394, 118], [2, 117, 49, 118, 50, 155, 2, 155], [65, 115, 180, 115, 179, 151, 64, 151], [191, 115, 249, 115, 248, 150, 189, 150], [254, 115, 318, 115, 318, 150, 253, 150], [321, 114, 377, 114, 378, 150, 321, 150], [396, 113, 495, 113, 495, 154, 394, 153], [1, 153, 56, 153, 57, 192, 1, 191], [68, 152, 175, 152, 175, 189, 67, 189], [189, 152, 249, 152, 249, 188, 188, 188], [252, 152, 317, 152, 318, 188, 252, 188], [320, 150, 377, 151, 378, 188, 321, 187], [393, 150, 494, 151, 494, 193, 391, 192]], 'structure': ['<html>', '<body>', '<table>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '<tr>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '<td></td>', '</tr>', '</table>', '</body>', '</html>'], 'structure_score': 0.99635947}}
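The result above contains a list of `structure` tokens and one 8-value `bbox` polygon per cell. The following sketch shows one way to post-process them: joining the tokens into an HTML skeleton, counting rows and cells, and reducing each polygon to an axis-aligned rectangle. The helper names are hypothetical, and the polygon layout (x1, y1, x2, y2, x3, y3, x4, y4) is inferred from the sample output rather than from a documented schema.

```python
def structure_to_html(tokens):
    """Join structure tokens into a single HTML string."""
    return "".join(tokens)


def count_rows_and_cells(tokens):
    """Count <tr> rows and <td> cells in a structure token list.

    Cells may appear as '<td></td>' or, for spanning cells, as a bare '<td'
    token followed by attribute tokens, so both forms are matched.
    """
    rows = sum(1 for t in tokens if t == "<tr>")
    cells = sum(1 for t in tokens if t.startswith("<td"))
    return rows, cells


def polygon_to_rect(poly):
    """Reduce an 8-value polygon to (x_min, y_min, x_max, y_max)."""
    xs, ys = poly[0::2], poly[1::2]
    return (min(xs), min(ys), max(xs), max(ys))


# A shortened sample in the same shape as the result above.
sample_tokens = ["<html>", "<body>", "<table>",
                 "<tr>", "<td></td>", "<td></td>", "</tr>",
                 "</table>", "</body>", "</html>"]
sample_bbox = [[1, 2, 64, 2, 64, 41, 1, 41], [52, 1, 199, 1, 198, 38, 51, 38]]

print(structure_to_html(sample_tokens))
print(count_rows_and_cells(sample_tokens))        # (1, 2)
print([polygon_to_rect(p) for p in sample_bbox])  # [(1, 2, 64, 41), (51, 1, 199, 38)]
```

Note that the `structure` output holds only the table skeleton; the cell text comes from the separate text detection and recognition modules used in the pipelines.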

For details on commands and parameters, see the documentation.

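Pipeline Usage

A single model has limited capability, but a pipeline built from several models is more capable and can handle harder real-world cases.

General Table Recognition V2 pipeline

The General Table Recognition V2 pipeline handles table recognition tasks: it extracts information from images and outputs it in HTML or Excel format. The pipeline consists of 8 modules:

Table classification module
Table structure recognition module
Table cell detection module
Text detection module
Text recognition module
Layout region detection module (optional)
Document image orientation classification module (optional)
Text image unwarping module (optional)

Run a single command to try the General Table Recognition V2 pipeline with its default configuration, which uses SLANeXt_wired and SLANeXt_wireless to predict the table structure: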

paddleocr table_recognition_v2 -i https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/mabagznApI1k9R8qFoTLc.png  \
    --use_doc_orientation_classify False  \
    --use_doc_unwarping False \
    --save_path ./output \
    --device gpu:0

The result is printed to the terminal:

{'res': {'input_path': 'mabagznApI1k9R8qFoTLc.png', 'page_index': None, 'model_settings': {'use_doc_preprocessor': False, 'use_layout_detection': True, 'use_ocr_model': True}, 'layout_det_res': {'input_path': None, 'page_index': None, 'boxes': [{'cls_id': 8, 'label': 'table', 'score': 0.86655592918396, 'coordinate': [0.0125130415, 0.41920784, 1281.3737, 585.3884]}]}, 'overall_ocr_res': {'input_path': None, 'page_index': None, 'model_settings': {'use_doc_preprocessor': False, 'use_textline_orientation': False}, 'dt_polys': array([[[   9,   21],
        ...,
        [   9,   59]],

       ...,

       [[1046,  536],
        ...,
        [1046,  573]]], dtype=int16), 'text_det_params': {'limit_side_len': 960, 'limit_type': 'max', 'thresh': 0.3, 'box_thresh': 0.6, 'unclip_ratio': 2.0}, 'text_type': 'general', 'textline_orientation_angles': array([-1, ..., -1]), 'text_rec_score_thresh': 0, 'rec_texts': ['部門', '報銷人', '報銷事由', '批准人：', '單據', '張', '合計金額', '元', '車費票', '其', '火車費票', '飛機票', '中', '旅住宿費', '其他', '補貼'], 'rec_scores': array([0.99958128, ..., 0.99317062]), 'rec_polys': array([[[   9,   21],
        ...,
        [   9,   59]],

       ...,

       [[1046,  536],
        ...,
        [1046,  573]]], dtype=int16), 'rec_boxes': array([[   9, ...,   59],
       ...,
       [1046, ...,  573]], dtype=int16)}, 'table_res_list': [{'cell_box_list': [array([ 0.13052222, ..., 73.08310249]), array([104.43082511, ...,  73.27777413]), array([319.39041221, ...,  73.30439308]), array([424.2436837 , ...,  73.44736794]), array([580.75836265, ...,  73.24003914]), array([723.04370201, ...,  73.22717598]), array([984.67315757, ...,  73.20420387]), array([1.25130415e-02, ..., 5.85419208e+02]), array([984.37072837, ..., 137.02281502]), array([984.26586998, ..., 201.22290352]), array([984.24017417, ..., 585.30775765]), array([1039.90606773, ...,  265.44664314]), array([1039.69549644, ...,  329.30540779]), array([1039.66546714, ...,  393.57319954]), array([1039.5122689 , ...,  457.74644783]), array([1039.55535972, ...,  521.73030403]), array([1039.58612144, ...,  585.09468392])], 'pred_html': '<html><body><table><tbody><tr><td>部門</td><td></td><td>報銷人</td><td></td><td>報銷事由</td><td></td><td colspan="2">批准人：</td></tr><tr><td colspan="6" rowspan="8"></td><td colspan="2">單據 張</td></tr><tr><td colspan="2">合計金額 元</td></tr><tr><td rowspan="6">其 中</td><td>車費票</td></tr><tr><td>火車費票</td></tr><tr><td>飛機票</td></tr><tr><td>旅住宿費</td></tr><tr><td>其他</td></tr><tr><td>補貼</td></tr></tbody></table></body></html>', 'table_ocr_pred': {'rec_polys': array([[[   9,   21],
        ...,
        [   9,   59]],

       ...,

       [[1046,  536],
        ...,
        [1046,  573]]], dtype=int16), 'rec_texts': ['部門', '報銷人', '報銷事由', '批准人：', '單據', '張', '合計金額', '元', '車費票', '其', '火車費票', '飛機票', '中', '旅住宿費', '其他', '補貼'], 'rec_scores': array([0.99958128, ..., 0.99317062]), 'rec_boxes': array([[   9, ...,   59],
       ...,
       [1046, ...,  573]], dtype=int16)}}]}}
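The `pred_html` string in the result above uses `colspan` and `rowspan`, so it cannot be read as a simple list of rows. The sketch below expands such a table into a rectangular grid using only the standard library, repeating the text of a spanning cell in every position it covers. The class and function names are hypothetical, and the example uses a small ASCII table rather than the real output.

```python
from html.parser import HTMLParser


class TableCellCollector(HTMLParser):
    """Collect text, rowspan and colspan for every <td>/<th>, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            a = dict(attrs)
            self._cell = {
                "text": "",
                "rowspan": int(a.get("rowspan") or 1),
                "colspan": int(a.get("colspan") or 1),
            }

    def handle_data(self, data):
        if self._cell is not None:
            self._cell["text"] += data

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            if not self.rows:  # tolerate a cell that appears before any <tr>
                self.rows.append([])
            self.rows[-1].append(self._cell)
            self._cell = None


def html_table_to_grid(html):
    """Expand an HTML table into a list of equal-length rows of cell text."""
    parser = TableCellCollector()
    parser.feed(html)
    grid = {}  # (row, col) -> text
    for r, row in enumerate(parser.rows):
        c = 0
        for cell in row:
            while (r, c) in grid:  # skip slots taken by earlier rowspans
                c += 1
            for dr in range(cell["rowspan"]):
                for dc in range(cell["colspan"]):
                    grid[(r + dr, c + dc)] = cell["text"]
            c += cell["colspan"]
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


sample = ('<table><tr><td rowspan="2">X</td><td>1</td></tr>'
          '<tr><td>2</td></tr></table>')
print(html_table_to_grid(sample))  # [['X', '1'], ['X', '2']]
```

Repeating the text across spanned positions is a design choice that keeps the grid rectangular; leaving the extra positions empty is equally valid if merged cells should appear only once.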

If save_path is specified, the visualization results are saved under save_path.

[Figure: visualization output of the pipeline]

The command line is meant for a quick trial. Integrating the pipeline into a project also takes only a few lines of code:

from paddleocr import TableRecognitionPipelineV2

pipeline = TableRecognitionPipelineV2(
    use_doc_orientation_classify=False,  # enable/disable the document orientation classification model
    use_doc_unwarping=False,  # enable/disable the document unwarping module
)
# pipeline = TableRecognitionPipelineV2(use_doc_orientation_classify=True)  # turn on document orientation classification
# pipeline = TableRecognitionPipelineV2(use_doc_unwarping=True)  # turn on text image unwarping
# pipeline = TableRecognitionPipelineV2(device="gpu")  # run model inference on the GPU
output = pipeline.predict("https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/mabagznApI1k9R8qFoTLc.png")
for res in output:
    res.print()  # print the structured prediction output
    res.save_to_img("./output/")
    res.save_to_xlsx("./output/")
    res.save_to_html("./output/")
    res.save_to_json("./output/")

To use the SLANet_plus model for table recognition, change the model name and enable end-to-end prediction mode, as shown below:

paddleocr table_recognition_v2 -i https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/mabagznApI1k9R8qFoTLc.png  \
    --use_doc_orientation_classify False  \
    --use_doc_unwarping False \
    --wired_table_structure_recognition_model_name SLANet_plus \
    --use_e2e_wired_table_rec_model True \
    --wireless_table_structure_recognition_model_name SLANet_plus \
    --use_e2e_wireless_table_rec_model True \
    --save_path ./output \
    --device gpu:0

from paddleocr import TableRecognitionPipelineV2

pipeline = TableRecognitionPipelineV2(
    use_doc_orientation_classify=False,
    use_doc_unwarping=False,
    wired_table_structure_recognition_model_name="SLANet_plus",  # for wired table recognition
    wireless_table_structure_recognition_model_name="SLANet_plus",  # for wireless table recognition
)
output = pipeline.predict(
    "https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/mabagznApI1k9R8qFoTLc.png",
    use_e2e_wired_table_rec_model=True,  # end-to-end mode for wired tables
    use_e2e_wireless_table_rec_model=True,  # end-to-end mode for wireless tables
)
for res in output:
    res.print()  # print the structured prediction output
    res.save_to_img("./output/")
    res.save_to_xlsx("./output/")
    res.save_to_html("./output/")
    res.save_to_json("./output/")

For details on commands and parameters, see the documentation.

PP-StructureV3

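Layout analysis is a technique for extracting structured information from document images. PP-StructureV3 includes the following six modules and sub-pipelines:

Layout detection module
General OCR pipeline
Document image preprocessing pipeline (optional)
Table recognition pipeline (optional)
Seal recognition pipeline (optional)
Formula recognition pipeline (optional)

Run a single command to try the PP-StructureV3 pipeline: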

paddleocr pp_structurev3 -i https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/mG4tnwfrvECoFMu-S9mxo.png \
    --use_doc_orientation_classify False \
    --use_doc_unwarping False \
    --wired_table_structure_recognition_model_name SLANet_plus \
    --use_e2e_wired_table_rec_model True \
    --wireless_table_structure_recognition_model_name SLANet_plus \
    --use_e2e_wireless_table_rec_model True \
    --use_textline_orientation False \
    --device gpu:0

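The result is printed to the terminal. If save_path is specified, the results are saved under save_path.

Pipeline inference takes only a few lines of code. Using the PP-StructureV3 pipeline as an example: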

from paddleocr import PPStructureV3

pipeline = PPStructureV3(
    wired_table_structure_recognition_model_name="SLANet_plus",  # for wired table recognition
    wireless_table_structure_recognition_model_name="SLANet_plus",  # for wireless table recognition
    use_doc_orientation_classify=False,  # enable/disable the document orientation classification model
    use_doc_unwarping=False,  # enable/disable the document unwarping module
    use_textline_orientation=False,  # enable/disable the text line orientation classification model
    device="gpu:0",  # run model inference on the GPU
)
output = pipeline.predict(
    "mG4tnwfrvECoFMu-S9mxo.png",
    use_e2e_wired_table_rec_model=True,  # end-to-end mode for wired tables
    use_e2e_wireless_table_rec_model=True,  # end-to-end mode for wireless tables
)
for res in output:
    res.print()  # print the structured prediction output
    res.save_to_json(save_path="output")  # save the structured result for the current image as JSON
    res.save_to_markdown(save_path="output")  # save the result for the current image as Markdown

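The pipelines use SLANeXt_wired and SLANeXt_wireless by default, so SLANet_plus must be specified explicitly through the parameters. For details on commands and parameters, see the documentation.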
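Both pipelines shown above switch to SLANet_plus through the same four settings: two model names passed to the constructor and two end-to-end flags passed to `predict`. A minimal sketch of keeping those settings in one place follows. The helper `slanet_plus_overrides` is hypothetical; the parameter names are taken from the examples above.

```python
def slanet_plus_overrides(model_name="SLANet_plus"):
    """Return (constructor kwargs, predict kwargs) that select the given model."""
    init_kwargs = {
        "wired_table_structure_recognition_model_name": model_name,
        "wireless_table_structure_recognition_model_name": model_name,
    }
    predict_kwargs = {
        "use_e2e_wired_table_rec_model": True,
        "use_e2e_wireless_table_rec_model": True,
    }
    return init_kwargs, predict_kwargs


init_kwargs, predict_kwargs = slanet_plus_overrides()
print(init_kwargs)
print(predict_kwargs)

# Intended use, assuming paddleocr is installed (not executed here):
# from paddleocr import PPStructureV3
# pipeline = PPStructureV3(**init_kwargs, device="gpu:0")
# output = pipeline.predict("mG4tnwfrvECoFMu-S9mxo.png", **predict_kwargs)
```

Note the model name is passed as a string; an unquoted `SLANet_plus` would raise a `NameError` in Python.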


