PP-OCRv4_mobile_det开源文本检测模型 - 适配移动设备，支持边缘设备部署

Home

PP OCRv4 Mobile Det

Developed by PaddlePaddle

PP-OCRv4_mobile_det 是由 PaddleOCR 团队开发的针对移动设备优化的高效文本检测模型，适合边缘设备部署。

文字识别 Supports Multiple LanguagesOpen Source License:Apache-2.0 #移动端文本检测 #多语言OCR #轻量化部署

Downloads 360

Release Time : 6/6/2025

Model Overview

PP-OCRv4_mobile_det 是 PP-OCRv4_det 系列模型之一，专注于文本检测任务，特别优化了在移动设备上的运行效率。

Model Features

移动设备优化

专为移动设备设计，具有高效的运行性能，适合边缘设备部署。

多语言支持

支持多种语言的文本检测，包括中文、英文、日文等。

高准确率

在多种文本类型（印刷、手写、艺术字等）上表现出色。

Model Capabilities

文本检测

多语言文本识别

移动设备部署

Use Cases

文档处理

印刷文档文本检测

用于检测印刷文档中的文本区域，支持多种语言。

高准确率的文本检测结果。

移动应用

移动端OCR应用

在移动设备上实现高效的文本检测功能。

快速、准确的文本检测，适合实时应用。

🚀 PP-OCRv4_mobile_det

PP-OCRv4_mobile_det 是 PP-OCRv4_det 系列模型之一，这是由 PaddleOCR 团队开发的一组文本检测模型。这款针对移动设备优化的文本检测模型效率更高，非常适合在边缘设备上部署。其主要的准确率指标如下：

手写中文	手写英文	印刷中文	印刷英文	繁体中文	古文	日文	通用场景	拼音	旋转文本	扭曲文本	艺术字	平均
0.583	0.369	0.872	0.773	0.663	0.231	0.634	0.710	0.430	0.299	0.715	0.549	0.624

🚀 快速开始

📦 安装指南

1. 安装 PaddlePaddle

请参考以下命令，使用 pip 安装 PaddlePaddle：

# 适用于 CUDA11.8
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/

# 适用于 CUDA12.6
python -m pip install paddlepaddle-gpu==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/

# 适用于 CPU
python -m pip install paddlepaddle==3.0.0 -i https://www.paddlepaddle.org.cn/packages/stable/cpu/

有关 PaddlePaddle 安装的详细信息，请参考 PaddlePaddle 官方网站。

2. 安装 PaddleOCR

从 PyPI 安装最新版本的 PaddleOCR 推理包：

python -m pip install paddleocr

💻 使用示例

基础用法

你可以使用单条命令快速体验该功能：

paddleocr text_detection \
    --model_name PP-OCRv4_mobile_det \
    -i https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/3ul2Rq4Sk5Cn-l69D695U.png

你也可以将文本检测模块的模型推理集成到你的项目中。在运行以下代码之前，请将示例图像下载到本地机器。

from paddleocr import TextDetection
model = TextDetection(model_name="PP-OCRv4_mobile_det")
output = model.predict(input="3ul2Rq4Sk5Cn-l69D695U.png", batch_size=1)
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/res.json")

运行后，得到的结果如下：

{'res': {'input_path': '/root/.paddlex/predict_input/3ul2Rq4Sk5Cn-l69D695U.png', 'page_index': None, 'dt_polys': array([[[ 637, 1432],
        ...,
        [ 637, 1454]],

       ...,

       [[ 356,  107],
        ...,
        [ 356,  130]]], dtype=int16), 'dt_scores': [0.8305358711080322, 0.6912752452425651, ..., 0.848925772091929]}}

可视化后的图像如下：

image/jpeg

有关使用命令和参数说明的详细信息，请参考文档。

高级用法

单个模型的能力是有限的。但由多个模型组成的管道可以提供更强的能力来解决现实场景中的难题。

PP-OCRv4

通用 OCR 管道用于通过从图像中提取文本信息并以文本形式输出，从而解决文本识别任务。该管道包含 5 个模块：

文档图像方向分类模块（可选）
文本图像矫正模块（可选）
文本行方向分类模块（可选）
文本检测模块
文本识别模块

运行单条命令快速体验 OCR 管道：

paddleocr ocr -i https://cdn-uploads.huggingface.co/production/uploads/681c1ecd9539bdde5ae1733c/3ul2Rq4Sk5Cn-l69D695U.png \
    --text_detection_model_name PP-OCRv4_mobile_det \
    --text_recognition_model_name PP-OCRv4_mobile_rec \
    --use_doc_orientation_classify False \
    --use_doc_unwarping False \
    --use_textline_orientation False \
    --save_path ./output \
    --device gpu:0

结果将打印到终端：

{'res': {'input_path': '/root/.paddlex/predict_input/3ul2Rq4Sk5Cn-l69D695U.png', 'page_index': None, 'model_settings': {'use_doc_preprocessor': True, 'use_textline_orientation': False}, 'doc_preprocessor_res': {'input_path': None, 'page_index': None, 'model_settings': {'use_doc_orientation_classify': False, 'use_doc_unwarping': False}, 'angle': -1}, 'dt_polys': array([[[ 356,  105],
        ...,
        [ 356,  129]],

       ...,

       [[ 630, 1432],
        ...,
        [ 630, 1451]]], dtype=int16), 'text_det_params': {'limit_side_len': 64, 'limit_type': 'min', 'thresh': 0.3, 'max_side_limit': 4000, 'box_thresh': 0.6, 'unclip_ratio': 1.5}, 'text_type': 'general', 'textline_orientation_angles': array([-1, ..., -1]), 'text_rec_score_thresh': 0.0, 'rec_texts': ['AlgorithmsfortheMarkovEntropyDecomposition', 'AndrewJ.FerrisandDavidPoulin', 'DepartementdePhysique,UniversitedeSherbrooke，Quebec,J1K2R1，Canada', '(Dated:October 31,2018)', 'TheMarkoventropydecomposition(MED)isarecently-proposed,cluster-basedsimulationmethodforfi-', 'nite temperature quantum systems with arbitrary geometry. In this paper, we detail numerical algorithms for', 'performingtherequiredsteps oftheMED,principallysolvingaminimizationproblemwithapreconditioned', '2107', "Newton's algorithm, as well as how to extract global susceptibilities and thermal responses. We demonstrate", 'thepowerof themethodwiththespin-1/2XXZmodelonthe2Dsquarelattice,includingtheextractionof', 'criticalpointsanddetailsofeachphase.Althoughthemethodsharessomequalitativesimilaritieswithexact-', 'diagonalization,we show the MED is both more accurate and significantly more fexible', '', 'PACS numbers: 05.10.a, 02.50.Ng, 03.67.a, 74.40.Kb', '6', '1', 'INTRODUCTION', 'This approximation becomes exactin the case of a1Dquan', 'tum (or classical)Markov chain[10],and leads to an expo', 'g', 'Althoughtheequationsgoverningquantummany-body', 'nentialreduction of costfor exact entropy calculationswhen', 'C', 'systemsares', 'simpletowritedown,findingsolutionsforthe', 'theglobaldensitymatrixisahigher-dimensionalMarkovnet-', 'H', 'majorityof systems remainsincrediblydifficult.Modern', 'work state[12,13].', 'physicsfinds itself inneedof new tools tocompute theemer-', 'Thesecond approximationused intheMEDapproach is', 'gent behavioroflarge,many-body systems.', 'relatedtotheN-representibilityproblem.Givenasetoflo', '', 'T', 'Therehasbeen a greatvariety of tools developed totackle', 'calbut overlappingreduceddensitymatrices{pi},itis avery', 'many-body problems,but in general, large 2D and 3D quan-', 'challengingproblemtodetermineifthereexistsaglobalden', '1', 'tumsystemsremainhardtodealwith.N', 'Mostsystemsare', 'sityoperatorwhichispositivesemi-definiteandwhosepartial', 'thoughttobenon-integrable,soexactanalyticsolutionsare', 'trace agreeswitheachpi.This problemis QMA-hard(the', 'notusuallyexpected.Directnumericaldiagonalizationcanbe', 'quantum analogue of NP)[14,15],and is hopelessly diffi', 'performedforrelativelysmallsystems', 'howevertheemer', 'cult toenforce.Thus,the second approximationemployed', 'gentbehaviorofasysteminthethermodynamiclimitmaybe', 'involves ignoringglobal consistency with apositive opera', 'difficulttoextract,especiallyins', 'systemswithlargecorrelation', 'tor,whilerequiringlocal consistency on any overlappingre', 'lengths.MonteCarloapproachesaretechnicallyexact(upto', 'gionsbetweenthep.Atthezero-temperaturelimit,theMED', 'samplingerror),butsufferfromtheso-calledsignproblem', 'approachbecomesanalogoustothevariationalnth-orderre-', 'forfermionic,frustrated,or dynamical problems.Thus we are', 'duceddensitymatrix', 'approach,wherepositivityisenforced', 'limited to search for clever approximations to solve the ma-', 'on allreduceddensitymatricesofsizen[16-18].', 'jorityofmany-bodyproblems', 'TheMEDapproachisanextremelyflexibleclustermethod', 'Over thepastcentury,hundredsof suchapproximations', 'applicabletobothtranslationallyinvariantsystemsofanydi', 'havebeenproposed,andwewillmentionjustafewnotable', 'mensioninthethermodynamiclimit,aswellasfinitesystems', 'examplesapplicabletoquantumlatticemodels.Mean-field', 'or systems without translationalinvariance(e.g.disordered', 'theoryiss', 'simplea', 'andfrequentlyarrivesatthecorrectquali', 'lattices,orharmonicallyt', 'trappeda', 'atomsinopticallattices)', 'tativedescription,butoftenfailswhencorrelationsareim', 'The free energy given by MED is guaranteed to lower bound', 'portant. Density-matrix renormalisation group (DMRG)[1]', 'the true free energy,which in turn lower-bounds the ground', 'is efficient and extremely accurate atsolving1Dproblems', 'stateenergy—t', 'thusprovidinganaturalcomplementtovaria', 'butthecomputationalcostgrowsexponentiallywithsystem', 'tional approacheswhichupper-bound thegroundstateenergy', 'sizeintwo-or higher-dimensions[2,3].F', 'Relatedtensor', 'Theabilitytoprovidearigorousground-stateenergywindow', 'networktechniquesdesignedfor2Dsystemsarestillinthein', 'is apowerfulvalidation tool,creating avery compellingrea-', 'infancy[4-6].Series-expansionmethods[7]canbesuccess-', 'son tousethis approach', 'ful,but may diverge or otherwise converge slowly,obscuring', 'Inthispaperwepaperwepresent apedagogicalintroduc', 'thestateincertainregimes.', 'Thereexistavarietyofcluster', 'tiontoMED,includingnumericalimplementationissuesand', 'basedtechniques,suchasdynamical-mean-fieldtheory[8]', 'applicationsto2Dquantumlatticemodelsinthethermody', 'anddensity-matrixembedding[9]', 'namiclimit.In Sec.I', 'II,wegiveabrief', 'derivationofthe', 'Herewe discuss theso-calledMarkoventropydecompo-', 'Markoventropydecomposition.SectionII outlines arobust', 'sition(MED),recentlyproposed byPoulin&Hastings [10]', 'numericalstrategyfor optimizingtheclusters thatmakeup', '(and analogoustoaslightlyearlier classical algorithm[11])', 'thedecomposition.InSec.IVweshowhowwecanextend', 'Thisisaself-consistentclustermethodforfinite temperature', 'thesealgorithmstoextractnon-trivialinformation,suchas', 'systems that takes advantage of an approximation of the(von', 'specificheat andsusceptibilities.Wepresentan application of', 'Neumann)entropy.In[1o],it was shown that the entropy', 'themethod to the spin-1/2XXZmodelon a 2Dsquarelattice', 'persitecanberigorouslyupperboundedusingonlylocalin-', 'inSec.V,describinghowtocharacterizethephasediagram', 'formation—alocal,reduced density matrix on Nsites,say.', '', 'and determine criticalpoints,before concluding inSec.VI.'], 'rec_scores': array([0.9952876 , ..., 0.95561302]), 'rec_polys': array([[[ 356,  105],
        ...,
        [ 356,  129]],

       ...,

       [[ 630, 1432],
        ...,
        [ 630, 1451]]], dtype=int16), 'rec_boxes': array([[ 356, ...,  130],
       ...,
       [ 630, ..., 1451]], dtype=int16)}}

如果指定了 save_path，可视化结果将保存到 save_path 目录下。可视化输出如下：

image/jpeg

命令行方法适用于快速体验。对于项目集成，也只需要几行代码：

from paddleocr import PaddleOCR  

ocr = PaddleOCR(
    text_detection_model_name="PP-OCRv4_mobile_det",
    text_recognition_model_name="PP-OCRv4_mobile_rec",
    use_doc_orientation_classify=False, # 通过此参数禁用文档方向分类模型
    use_doc_unwarping=False, # 通过此参数禁用文本图像矫正模型
    use_textline_orientation=False, # 通过此参数禁用文本行方向分类模型
)
result = ocr.predict("./3ul2Rq4Sk5Cn-l69D695U.png")  
for res in result:  
    res.print()  
    res.save_to_img("output")  
    res.save_to_json("output")