japan_PP-OCRv3_mobile_rec: open-source Japanese text line recognition model with support for numeric characters

Japan PP OCRv3 Mobile Rec

Developed by PaddlePaddle

An ultra-lightweight Japanese text line recognition model developed by the PaddleOCR team, supporting the recognition of Japanese and numeric characters.

Text Recognition, Supports Multiple Languages

Open Source License: Apache-2.0

#Japanese OCR #Ultra-lightweight #Optimized for mobile devices

Downloads 155

Release Time: 6/6/2025

Model Overview

This model is trained specifically for Japanese recognition on the basis of PP-OCRv3_mobile_rec and is suited to Japanese text recognition tasks.

Model Features

Lightweight design

The model storage size is only 8.8 MB, suitable for deployment on mobile and embedded devices.

Japanese-specific

Specifically optimized and trained for Japanese text recognition, supporting the recognition of Japanese and numeric characters.

Strict evaluation criteria

The model is evaluated with a strict whole-line criterion: a line counts as correct only if every character in it is recognized correctly.

Model Capabilities

Japanese text recognition

Numeric character recognition

Text line recognition

Use Cases

Document processing

Japanese document OCR

Recognize the text content in scanned or photographed Japanese documents.

Accurately extract Japanese text information from documents.

Mobile applications

Instant translation of Japanese text

Recognize and translate Japanese text in real-time on mobile devices.

Implement the instant translation function of Japanese text.

🚀 japan_PP-OCRv3_mobile_rec

japan_PP-OCRv3_mobile_rec is a text line recognition model in the PP-OCRv3_rec series, developed by the PaddleOCR team. It is a Japanese-specific model based on PP-OCRv3_mobile_rec and supports recognition of Japanese and numeric characters.

🚀 Quick Start

✨ Features

Accuracy: The average recognition accuracy is 45.69%, measured with a strict criterion in which a line counts as correct only if every character matches.
Ultra-lightweight: With a model storage size of only 8.8 MB, it is easy to deploy and use.
Character support: Recognizes Japanese and numeric characters.

📦 Installation

PaddlePaddle

Install PaddlePaddle using pip with one of the following commands:

# for CUDA11.8
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/

# for CUDA12.6
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/

# for CPU
python -m pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/

For more details on PaddlePaddle installation, refer to the PaddlePaddle official website.

PaddleOCR

Install the latest version of the PaddleOCR inference package from PyPI:
```
python -m pip install paddleocr
```
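
After installation, it can help to confirm that both packages are importable before running a model. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical, and `paddle` and `paddleocr` are the import names of the two packages installed above.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "paddle" and "paddleocr" are the import names of the packages above.
for name in ("paddle", "paddleocr"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

PaddlePaddle also provides `paddle.utils.run_check()`, which runs a small self-test of the installation once the package is importable.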

💻 Usage Examples

Basic Usage

You can quickly test the model with a single command:

paddleocr text_recognition \
    --model_name japan_PP-OCRv3_mobile_rec \
    -i https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/NnQK6B5BHvbHgnx9EHBZa.png

You can also integrate the model into your project. First, download the sample image to your local machine. Then run the following code:

from paddleocr import TextRecognition
model = TextRecognition(model_name="japan_PP-OCRv3_mobile_rec")
output = model.predict(input="NnQK6B5BHvbHgnx9EHBZa.png", batch_size=1)
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/res.json")

After running, the result is as follows:

{'res': {'input_path': '/root/.paddlex/predict_input/NnQK6B5BHvbHgnx9EHBZa.png', 'page_index': None, 'rec_text': '学校が終わってから、友達と遊んじゃったの。それで、帰るのが少', 'rec_score': 0.999738335609436}}

The visualized image is written to ./output/ by save_to_img.

For more details on usage commands and parameter descriptions, refer to the PaddleOCR documentation.
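
In a project you will often want to act on the recognized text only when the score is high enough. The sketch below works on a plain dictionary with the same shape as the printed result above. The helper name `accept_text` and the 0.9 threshold are assumptions chosen for illustration, and how the result object exposes this dictionary in code is not covered here.

```python
# Result dictionary copied from the printed output above.
result = {
    "res": {
        "input_path": "/root/.paddlex/predict_input/NnQK6B5BHvbHgnx9EHBZa.png",
        "page_index": None,
        "rec_text": "学校が終わってから、友達と遊んじゃったの。それで、帰るのが少",
        "rec_score": 0.999738335609436,
    }
}


def accept_text(result: dict, min_score: float = 0.9):
    """Return the recognized text, or None if the score is below min_score."""
    res = result["res"]
    if res["rec_score"] >= min_score:
        return res["rec_text"]
    return None


print(accept_text(result))
```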

Advanced Usage

A single model has limited ability. A pipeline that combines several models can handle more complex real-world problems.

PP-OCRv3

The general OCR pipeline extracts text information from images and outputs it as strings. It has 5 modules:

Document Image Orientation Classification Module (Optional)
Text Image Unwarping Module (Optional)
Text Line Orientation Classification Module (Optional)
Text Detection Module
Text Recognition Module

Run the following command to quickly test the OCR pipeline:

paddleocr ocr -i https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/yoS5sCp5dVQUAPWFQlZX8.png \
    --text_recognition_model_name japan_PP-OCRv3_mobile_rec \
    --use_doc_orientation_classify False \
    --use_doc_unwarping False \
    --use_textline_orientation True \
    --save_path ./output \
    --device gpu:0

The results will be printed in the terminal:

{'res': {'input_path': '/root/.paddlex/predict_input/yoS5sCp5dVQUAPWFQlZX8.png', 'page_index': None, 'model_settings': {'use_doc_preprocessor': True, 'use_textline_orientation': True}, 'doc_preprocessor_res': {'input_path': None, 'page_index': None, 'model_settings': {'use_doc_orientation_classify': False, 'use_doc_unwarping': False}, 'angle': -1}, 'dt_polys': array([[[ 65,   9],
        ...,
        [ 65,  39]],

       ...,

       [[ 34, 211],
        ...,
        [ 34, 245]]], dtype=int16), 'text_det_params': {'limit_side_len': 64, 'limit_type': 'min', 'thresh': 0.3, 'max_side_limit': 4000, 'box_thresh': 0.6, 'unclip_ratio': 1.5}, 'text_type': 'general', 'textline_orientation_angles': array([0, ..., 0]), 'text_rec_score_thresh': 0.0, 'rec_texts': ['彼女のいいたいことが、正晴にもなんとなくわかった。その一時', '間というのには、重大な意味があるのだ。', 'Iもしそうしていたら」雪穂はいったん唇を噛んでから続け', 'た。「そうしていたら、たぶんおかあさんは死なずに済んだと思', 'う。それを思うと・'], 'rec_scores': array([0.98636633, ..., 0.94151145]), 'rec_polys': array([[[ 65,   9],
        ...,
        [ 65,  39]],

       ...,

       [[ 34, 211],
        ...,
        [ 34, 245]]], dtype=int16), 'rec_boxes': array([[ 65, ...,  39],
       ...,
       [ 34, ..., 245]], dtype=int16)}}

If save_path is specified, the visualization results are saved under save_path.
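
The pipeline result carries one entry in rec_texts and one in rec_scores per detected line. The printed settings show text_rec_score_thresh as 0.0, which suggests that no lines are dropped by score, so you may want to filter them yourself. A minimal sketch, assuming the two sequences have equal length; the helper name `keep_confident_lines`, the threshold, and the sample values are hypothetical and not taken from the output above.

```python
def keep_confident_lines(rec_texts, rec_scores, min_score=0.5):
    """Pair each recognized line with its score and drop low-scoring lines."""
    if len(rec_texts) != len(rec_scores):
        raise ValueError("rec_texts and rec_scores must have the same length")
    return [
        text
        for text, score in zip(rec_texts, rec_scores)
        if float(score) >= min_score
    ]


# Hypothetical values for illustration; not taken from the output above.
texts = ["一行目", "二行目", "三行目"]
scores = [0.98, 0.42, 0.91]

print("\n".join(keep_confident_lines(texts, scores, min_score=0.5)))
```

The `float(score)` conversion lets the same function accept either a Python list or a NumPy array of scores.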

For project integration, you can use the following code:

from paddleocr import PaddleOCR  

ocr = PaddleOCR(
    text_recognition_model_name="japan_PP-OCRv3_mobile_rec",
    use_doc_orientation_classify=False,  # enable/disable the document orientation classification model
    use_doc_unwarping=False,  # enable/disable the document unwarping module
    use_textline_orientation=True,  # enable/disable the text line orientation classification model
    device="gpu:0",  # device used for model inference
)
result = ocr.predict("https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/yoS5sCp5dVQUAPWFQlZX8.png")  
for res in result:  
    res.print()  
    res.save_to_img("output")  
    res.save_to_json("output")

The default recognition model in the pipeline is PP-OCRv5_server_rec, so you need to specify japan_PP-OCRv3_mobile_rec with the text_recognition_model_name argument. To use a local model file instead, pass the text_recognition_model_dir argument. For more details on usage commands and parameter descriptions, refer to the PaddleOCR documentation.
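
One way to keep these options in one place is to build the keyword arguments separately and pass them to PaddleOCR. This is a sketch under assumptions: the helper `build_ocr_kwargs` is hypothetical, it uses only the argument names mentioned above, and it adds text_recognition_model_dir only when the given directory exists.

```python
from pathlib import Path


def build_ocr_kwargs(model_dir=None, device="cpu"):
    """Build keyword arguments for PaddleOCR(...) from the options above."""
    kwargs = {
        "text_recognition_model_name": "japan_PP-OCRv3_mobile_rec",
        "use_doc_orientation_classify": False,
        "use_doc_unwarping": False,
        "use_textline_orientation": True,
        "device": device,
    }
    # Only pass a local model directory if it actually exists.
    if model_dir is not None and Path(model_dir).is_dir():
        kwargs["text_recognition_model_dir"] = str(model_dir)
    return kwargs


kwargs = build_ocr_kwargs(device="cpu")
print(kwargs)
# With PaddleOCR installed: ocr = PaddleOCR(**kwargs)
```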

📚 Documentation

The model's key accuracy metrics are as follows:

Property	Details
Model Type	japan_PP-OCRv3_mobile_rec
Recognition Avg Accuracy(%)	45.69
Model Storage Size	8.8 MB
Description	An ultra-lightweight Japanese recognition model trained based on the PP-OCRv3 recognition model, supporting Japanese and numeric character recognition.

⚠️ Important Note

If any character (including punctuation) in a line is incorrect, the entire line is marked as wrong. The reported accuracy is therefore a strict whole-line measure: a line with a single wrong character contributes nothing to it.
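
The criterion described in this note can be sketched as an exact-match comparison per line. This is not the evaluation script used for the reported figure, which is not shown here; the function name `whole_line_accuracy` and the sample data are hypothetical.

```python
def whole_line_accuracy(predictions, references):
    """Fraction of lines whose predicted text equals the reference exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # A line is correct only if every character, punctuation included, matches.
    correct = sum(pred == ref for pred, ref in zip(predictions, references))
    return correct / len(references)


# Hypothetical data: the second prediction has one wrong digit.
refs = ["今日は晴れです。", "電話番号は12345", "ありがとう"]
preds = ["今日は晴れです。", "電話番号は12845", "ありがとう"]

print(f"{whole_line_accuracy(preds, refs):.2%}")
```

In this sample, the second line differs by a single digit and is counted as wrong, so two of the three lines count as correct.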

📄 License

This project is licensed under the Apache-2.0 license.
