PP-LCNet_x1_0_doc_ori open-source model - Accurately distinguish and correct the orientation of document images, improving the accuracy of OCR


Developed by PaddlePaddle

Document image orientation classification module, used to distinguish the orientation of document images and correct them through post-processing to improve the accuracy of OCR processing.

Image Classification | Supports Multiple Languages | Open Source License: Apache-2.0 | #Document orientation classification #OCR preprocessing #High-precision classification

Downloads 9,506

Release Time: 6/6/2025

Model Overview

This model is mainly used to identify the orientation (0°, 90°, 180°, 270°) of document images and automatically correct the orientation in scenarios such as document scanning or ID card photo shooting to improve the accuracy of OCR processing.

Model Features

High accuracy

The average accuracy of the model in the document image orientation classification task reaches 99.06%.

Lightweight

The model storage size is only 7 MB, making it suitable for deployment in resource-constrained environments.

Easy to integrate

Integrates quickly into an existing OCR workflow through PaddleOCR, which provides both a command-line interface and a Python API.

Model Capabilities

Document image orientation classification

Image orientation correction

OCR preprocessing

Use Cases

Document processing

Document scanning orientation correction

Automatically identify and correct the image orientation during the document scanning process to ensure the accuracy of subsequent OCR processing.

The model's average orientation classification accuracy is 99.06%.

ID card photo orientation recognition

Automatically identify the orientation of ID card photos and correct them to facilitate subsequent information extraction.

Improve the accuracy of ID card OCR recognition.

🚀 PP-LCNet_x1_0_doc_ori

The Document Image Orientation Classification Module enhances OCR accuracy by pre-determining and adjusting document orientations.

🚀 Quick Start

📦 Installation

PaddlePaddle

Please refer to the following commands to install PaddlePaddle using pip:

# for CUDA11.8
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/

# for CUDA12.6
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/

# for CPU
python -m pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/

For details about PaddlePaddle installation, please refer to the PaddlePaddle official website.

PaddleOCR

Install the latest version of the PaddleOCR inference package from PyPI:

python -m pip install paddleocr

💻 Usage Examples

Basic Usage

You can quickly experience the functionality with a single command:

paddleocr doc_img_orientation_classification \
    --model_name PP-LCNet_x1_0_doc_ori \
    -i https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/4ifXaBJmFByG_mAnF86Vv.png

You can also integrate inference with the document image orientation classification module into your own project. Before running the following code, please download the sample image to your local machine.

from paddleocr import DocImgOrientationClassification
model = DocImgOrientationClassification(model_name="PP-LCNet_x1_0_doc_ori")
output = model.predict(input="4ifXaBJmFByG_mAnF86Vv.png", batch_size=1)
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/res.json")

After running, the obtained result is as follows:

{'res': {'input_path': '/root/.paddlex/predict_input/4ifXaBJmFByG_mAnF86Vv.png', 'page_index': None, 'class_ids': array([2], dtype=int32), 'scores': array([0.90971], dtype=float32), 'label_names': ['180']}}

The visualized image is as follows:

[Image: visualization of the orientation classification result]

For details about the usage commands and parameter descriptions, please refer to the PaddleOCR documentation.
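The printed result above carries the predicted label and its score, so downstream code can decide whether to trust the prediction. The following is a minimal sketch, assuming the result dictionary has the same shape as that printed output. The helper `pick_orientation` and its `min_score` threshold are hypothetical and not part of PaddleOCR; plain lists stand in for the NumPy arrays of the real output.

```python
def pick_orientation(result, min_score=0.5):
    """Return the predicted angle in degrees, or None if the score is too low.

    `result` is assumed to look like the printed output shown above.
    `min_score` is a hypothetical threshold, not a PaddleOCR parameter.
    """
    info = result["res"]
    label = info["label_names"][0]    # e.g. '180'
    score = float(info["scores"][0])  # works for lists and NumPy arrays
    if score < min_score:
        return None
    return int(label)


# Plain lists stand in for the NumPy arrays in the real output.
sample = {"res": {"class_ids": [2], "scores": [0.90971], "label_names": ["180"]}}
print(pick_orientation(sample))  # prints 180
```

Raising `min_score` makes the helper return `None` for uncertain predictions, which lets a caller leave such images unrotated or route them to manual review.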

Advanced Usage

The capability of a single model is limited, but a pipeline that combines several models can solve harder problems in real-world scenarios.

doc_preprocessor

The Document Image Preprocessing Pipeline integrates two key functions: document orientation classification and geometric distortion correction. The document orientation classification module automatically identifies the four possible orientations of a document (0°, 90°, 180°, 270°), ensuring that the document is processed in the correct direction. The text image unwarping model corrects geometric distortions that occur during document photography or scanning, restoring the document's original shape and proportions.

This pipeline is suitable for digital document management, OCR preprocessing, and any scenario requiring improved document image quality. By automating orientation correction and geometric distortion correction, it improves the accuracy and efficiency of document processing and provides a more reliable foundation for image analysis.

The pipeline also offers flexible service-oriented deployment options, supporting calls from various programming languages on multiple hardware platforms. It also supports secondary development: you can fine-tune the models on your own datasets and seamlessly integrate the trained models. The pipeline contains two modules:

Document Image Orientation Classification Module (Optional)
Text Image Unwarping Module (Optional)

Run a single command to quickly try the document preprocessing pipeline:

paddleocr doc_preprocessor -i https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/pY6sY6wLDuoHF1-cGUvDr.png \
    --use_doc_orientation_classify True \
    --use_doc_unwarping True \
    --doc_orientation_classify_model_name PP-LCNet_x1_0_doc_ori \
    --save_path ./output \
    --device gpu:0

Results are printed to the terminal:

{'res': {'input_path': '/root/.paddlex/predict_input/pY6sY6wLDuoHF1-cGUvDr.png', 'page_index': None, 'model_settings': {'use_doc_orientation_classify': True, 'use_doc_unwarping': True}, 'angle': 180}}

If save_path is specified, the visualization results will be saved under save_path. The visualization output is shown below:

[Image: visualization of the document preprocessing result]

The command-line method is meant for a quick trial. Integrating the pipeline into a project also takes only a few lines of code:

from paddleocr import DocPreprocessor

pipeline = DocPreprocessor(
    doc_orientation_classify_model_name="PP-LCNet_x1_0_doc_ori",
    use_doc_orientation_classify=True,  # enable/disable the document orientation classification module
    use_doc_unwarping=True,  # enable/disable the text image unwarping module
    device="gpu:0",  # device used for model inference
)
result = pipeline.predict("https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/pY6sY6wLDuoHF1-cGUvDr.png")
for res in result:
    res.print()  # print the result to the terminal
    res.save_to_img("output")  # save the visualization under ./output
    res.save_to_json("output")  # save the result as JSON under ./output
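The pipeline result reports an `angle` field (180 in the terminal output above), and the pipeline applies the correction itself. To illustrate what such a correction amounts to, here is a minimal sketch of rotating an image in multiples of 90°, using a nested list as a toy image. The helper `rotate_ccw` is hypothetical and is not how PaddleOCR implements the step. Whether a reported angle should be undone by rotating counter-clockwise by `angle` or by `360 - angle` is not stated in the source, so check the convention against a known sample; for 180° the two coincide.

```python
def rotate_ccw(grid, angle):
    """Rotate a 2-D list counter-clockwise by 0, 90, 180, or 270 degrees.

    Hypothetical helper for illustration; a real image would normally be a
    NumPy array or a PIL image rather than a nested list.
    """
    if angle % 90 != 0:
        raise ValueError("angle must be a multiple of 90 degrees")
    for _ in range((angle // 90) % 4):
        # One counter-clockwise quarter turn: transpose, then reverse the row order.
        grid = [list(row) for row in zip(*grid)][::-1]
    return [list(row) for row in grid]


page = [[1, 2],
        [3, 4]]
upright = rotate_ccw(page, 180)
print(upright)  # prints [[4, 3], [2, 1]]
```

Because the classifier only distinguishes four orientations, quarter-turn rotations like this are lossless: no interpolation or cropping is involved.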

📚 Documentation

Introduction

The Document Image Orientation Classification Module is primarily designed to distinguish the orientation of document images and correct them through post-processing. During processes such as document scanning or ID photo capture, the device might be rotated to obtain clearer images, resulting in images with various orientations. Standard OCR pipelines may not handle these images effectively. By leveraging image classification techniques, the orientation of documents or IDs containing text regions can be pre-determined and adjusted, thereby improving the accuracy of OCR processing. The key accuracy metrics are as follows:

Property	Details
Model Type	A document image classification model based on PP-LCNet_x1_0, with four categories: 0°, 90°, 180°, and 270°.
Recognition Avg Accuracy (%)	99.06
Model Storage Size (MB)	7

📄 License

This project is licensed under the Apache-2.0 license.
