Model Card: Pix2Text-MFR
A Mathematical Formula Recognition (MFR) model from Pix2Text (P2T), capable of converting images of mathematical formulas into LaTeX text representation.
Features
- Utilizes the TrOCR architecture developed by Microsoft, retrained on a dataset of mathematical formula images.
- Can convert input images of mathematical formulas into LaTeX text representation.
- The Pix2Text V1.0 MFR open-source free model outperforms earlier versions of the paid model, and the V1.0 paid model is more precise still.
Installation
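Method 1: Using the model Directly
This method needs only the packages imported in its usage example: Pillow, transformers, and optimum with ONNX Runtime support.
$ pip install pillow transformers "optimum[onnxruntime]"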
Method 2: Using Pix2Text
$ pip install "pix2text>=1.1"
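To confirm that the installed package satisfies the `>=1.1` requirement, a check along the following lines may help. This is a minimal sketch using only the standard library; `parse_version` and `meets_minimum` are hypothetical helper names, not part of pix2text, and the parser only understands plain dotted numeric versions.

```python
from importlib import metadata


def parse_version(version: str) -> tuple:
    """Turn '1.1.0' into (1, 1, 0); parsing stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(package: str, minimum: str) -> bool:
    """Return True if `package` is installed at version `minimum` or newer."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment.
        return False
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    print("pix2text>=1.1 installed:", meets_minimum("pix2text", "1.1"))
```

Pre-release suffixes such as `rc1` are cut off by this simple parser, so such a version compares as older than its final release; a dedicated version-parsing library is the more robust choice when that matters.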
Usage Examples
Basic Usage
Method 1: Using the model Directly
This method does not require installing pix2text, but it can only recognize pure formula images, that is, images that contain nothing but a formula.
from PIL import Image
from transformers import TrOCRProcessor
from optimum.onnxruntime import ORTModelForVision2Seq

# Load the image processor and the ONNX model from the Hugging Face Hub
processor = TrOCRProcessor.from_pretrained('breezedeus/pix2text-mfr')
model = ORTModelForVision2Seq.from_pretrained('breezedeus/pix2text-mfr', use_cache=False)

# Paths to images that each contain a single formula
image_fps = [
    'examples/example.jpg',
    'examples/42.png',
    'examples/0000186.png',
]
images = [Image.open(fp).convert('RGB') for fp in image_fps]

# Preprocess the batch, generate token ids, and decode them into LaTeX strings
pixel_values = processor(images=images, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(f'generated_ids: {generated_ids}, \ngenerated text: {generated_text}')
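Because `batch_decode` returns one string per input image in the same order as the batch, the results can be paired back with their file paths. The sketch below shows one way to lay them out as display-math blocks for review; `to_markdown` is a hypothetical helper, not part of transformers or pix2text, and the `$$ ... $$` layout is an assumption about how you want to render the output.

```python
def to_markdown(image_paths, latex_strings):
    """Pair each image path with its recognized LaTeX as a display-math block."""
    if len(image_paths) != len(latex_strings):
        raise ValueError("expected one LaTeX string per image")
    sections = []
    for path, latex in zip(image_paths, latex_strings):
        # Strip surrounding whitespace so the delimiters sit on their own lines.
        sections.append(f"{path}\n\n$$\n{latex.strip()}\n$$")
    return "\n\n".join(sections)


if __name__ == "__main__":
    # Stand-in values; in practice pass image_fps and generated_text.
    print(to_markdown(["examples/42.png"], [r"\frac{a}{b}"]))
```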
Method 2: Using Pix2Text
This method requires pix2text to be installed and uses the Mathematical Formula Detection (MFD) model within Pix2Text. It can recognize not only pure formula images but also mixed images that contain both text and formulas.
from pix2text import Pix2Text

image_fps = [
    'examples/example.jpg',
    'examples/42.png',
    'examples/0000186.png',
]
p2t = Pix2Text.from_config()

# Recognize images that contain only a formula
outs = p2t.recognize_formula(image_fps)
print(outs)

# Recognize a mixed image that contains both text and formulas
outs2 = p2t.recognize('examples/mixed.jpg', file_type='text_formula', return_text=True, save_analysis_res='mixed-out.jpg')
print(outs2)
Method 3: Notebook
You can also try Pix2Text in this notebook: https://github.com/breezedeus/Pix2Text/blob/main/pix2text_v1_1.ipynb
Documentation
Technical Details
This MFR model uses the TrOCR architecture developed by Microsoft. It is initialized from the TrOCR weights and then retrained on a dataset of mathematical formula images. The resulting model converts images of mathematical formulas into LaTeX. More details can be found in the post "Pix2Text V1.0 New Release: The Best Open-Source Formula Recognition Model" on Breezedeus.com.
License
This project is licensed under the MIT license.
Examples
Printed Math Formula Images

Handwritten Math Formula Images

Performance
The test dataset is derived from real data uploaded by users of the Pix2Text Online Service. First, real user data from a specific period is selected. Then the Mathematical Formula Detection (MFD) model within Pix2Text detects the formulas in these images and crops the corresponding regions. Finally, a subset of the cropped formula images is randomly chosen and manually annotated to form the test dataset, which contains 485 images.
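The random selection step can be sketched as follows. The source does not say how the subset was drawn, so uniform sampling without replacement and the fixed seed are assumptions, and `sample_for_annotation` is a hypothetical helper name; only the subset size of 485 comes from the description above.

```python
import random


def sample_for_annotation(crop_paths, k=485, seed=0):
    """Pick k distinct crops uniformly at random; a fixed seed makes the draw repeatable."""
    crop_paths = list(crop_paths)
    if k > len(crop_paths):
        raise ValueError("cannot sample more crops than are available")
    rng = random.Random(seed)
    return rng.sample(crop_paths, k)


if __name__ == "__main__":
    # Stand-in file names for the cropped formula images.
    crops = [f"crop_{i:05d}.png" for i in range(2000)]
    subset = sample_for_annotation(crops)
    print(len(subset), "images selected for manual annotation")
```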

The following shows the Character Error Rate (CER; lower is better) of each model on this test dataset. The ground-truth annotations and each model's output are first normalized so that irrelevant factors such as spaces do not affect the results. For Texify's output, the leading and trailing $ or $$ delimiters of each formula are removed first.
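For readers who want to reproduce this kind of measurement, here is a minimal sketch of a per-formula CER. It assumes that normalization means removing all whitespace and that CER is the Levenshtein distance divided by the length of the normalized reference; the exact normalization rules and the way scores are aggregated over the test set are not given in the source, and all function names here are hypothetical.

```python
import re


def strip_math_delimiters(latex: str) -> str:
    """Remove one pair of leading/trailing $$ or $ delimiters, if present."""
    text = latex.strip()
    for delim in ("$$", "$"):
        if (len(text) >= 2 * len(delim)
                and text.startswith(delim) and text.endswith(delim)):
            return text[len(delim):-len(delim)].strip()
    return text


def normalize(latex: str) -> str:
    """Drop delimiters and all whitespace (assumed normalization rule)."""
    return re.sub(r"\s+", "", strip_math_delimiters(latex))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(reference: str, prediction: str) -> float:
    """Character error rate of `prediction` against `reference` after normalization."""
    ref, hyp = normalize(reference), normalize(prediction)
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    print(cer("a + b", "$a+b$"))
    print(cer("x^2 + 1", "$$x^{2}+1$$"))
```

With these assumptions, spacing and delimiter differences cost nothing, so the first call scores 0.0, while the extra braces in the second call count as two insertions against a five-character reference.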

As the figure above shows, the Pix2Text V1.0 MFR open-source free model significantly outperforms earlier versions of the paid model, and the Pix2Text V1.0 MFR paid model is more precise still.
Texify is better suited to images with standard formatting and performs poorly on images that contain single letters. This is the main reason Texify does worse than LaTeX-OCR on this test dataset.
Feedback
If you have any questions or comments about the model, feel free to contact the author, Breezedeus.