pix2text-mfr-quantized: Free Open-Source Model for Converting Math Formula Images to LaTeX Text

Pix2text Mfr Quantized

Developed by Brian314

Pix2Text's Mathematical Formula Recognition (MFR) model, built on the TrOCR architecture, converts images of mathematical formulas into LaTeX text.

Text Recognition

Transformers

Open Source License: MIT

Tags: Mathematical Formula Recognition, LaTeX Conversion, Printed and Handwritten Compatibility

Downloads 37

Release Time: 6/18/2024

Model Overview

This model specializes in mathematical formula recognition tasks, capable of processing both printed and handwritten mathematical formula images and converting them into LaTeX-formatted text representations.

Model Features

High-precision Formula Recognition

Outperforms similar open-source models on test datasets with a lower Character Error Rate (CER)

Supports Multiple Formula Types

Capable of recognizing various mathematical expressions, from simple formulas to complex matrices

Printed and Handwritten Compatibility

Can process both standard printed formulas and handwritten formula images

Model Capabilities

Convert mathematical formula images to LaTeX text

Printed formula recognition

Handwritten formula recognition

Complex mathematical expression processing

Use Cases

Education

Digitizing Math Homework

Convert students' handwritten math homework into editable LaTeX format

Facilitates teacher grading and student revisions

Online Learning Platforms

Provide formula recognition functionality for online education platforms

Enhances the platform's ability to handle mathematical content

Academic Research

Extracting Formulas from Papers

Extract mathematical formulas from academic papers

Facilitates literature retrieval and analysis

🚀 Model Card: Pix2Text-MFR

A Mathematical Formula Recognition (MFR) model from Pix2Text (P2T) that converts images of mathematical formulas into LaTeX text.

✨ Features

Uses the TrOCR architecture developed by Microsoft, retrained on a dataset of mathematical formula images.
Converts input images of mathematical formulas into LaTeX text.
The free, open-source Pix2Text V1.0 MFR model outperforms earlier versions of the paid model; the V1.0 paid model is more precise still.

📦 Installation

Method 1: Using the Model Directly

$ pip install "transformers>=4.37.0" pillow "optimum[onnxruntime]"

Method 2: Using Pix2Text

$ pip install "pix2text>=1.1"

💻 Usage Examples

Basic Usage

Method 1: Using the Model Directly

This method does not require installing pix2text, but it can only recognize images that contain nothing but a formula.

from PIL import Image
from transformers import TrOCRProcessor
from optimum.onnxruntime import ORTModelForVision2Seq

processor = TrOCRProcessor.from_pretrained('breezedeus/pix2text-mfr')
model = ORTModelForVision2Seq.from_pretrained('breezedeus/pix2text-mfr', use_cache=False)

image_fps = [
    'examples/example.jpg',
    'examples/42.png',
    'examples/0000186.png',
]
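# Convert to RGB so grayscale or RGBA files are handled uniformly.
images = [Image.open(fp).convert('RGB') for fp in image_fps]
# Preprocess the batch into a tensor of pixel values.
pixel_values = processor(images=images, return_tensors="pt").pixel_values
# Generate token IDs, then decode them into LaTeX strings.
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)
print(f'generated_ids: {generated_ids}, \ngenerated text: {generated_text}')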
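
To check the recognized formulas by eye, it can help to compile them. The following is a minimal sketch that wraps decoded strings in display-math delimiters inside a small LaTeX document. It assumes the decoded strings are bare LaTeX without surrounding `$` delimiters, which the model card does not state explicitly. `to_latex_document` is a hypothetical helper, not part of transformers, optimum, or Pix2Text.

```python
def to_latex_document(formulas):
    """Wrap bare LaTeX formula strings in \\[ ... \\] inside a minimal document.

    Assumption: each string is a formula body without math delimiters.
    """
    body = "\n".join(
        "\\[\n" + formula.strip() + "\n\\]" for formula in formulas
    )
    return (
        "\\documentclass{article}\n"
        "\\usepackage{amsmath}\n"
        "\\begin{document}\n"
        + body + "\n"
        "\\end{document}\n"
    )
```

With the snippet above, `to_latex_document(generated_text)` would produce a string that can be written to a `.tex` file and compiled.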

Method 2: Using Pix2Text

This method requires pix2text to be installed and uses the Mathematical Formula Detection (MFD) model within Pix2Text. It can recognize not only pure formula images but also mixed images that contain both text and formulas.

from pix2text import Pix2Text

image_fps = [
    'examples/example.jpg',
    'examples/42.png',
    'examples/0000186.png',
]
p2t = Pix2Text.from_config()
outs = p2t.recognize_formula(image_fps)  # recognize pure formula images
print(outs)

# Recognize a mixed image containing both text and formulas.
outs2 = p2t.recognize(
    'examples/mixed.jpg',
    file_type='text_formula',
    return_text=True,
    save_analysis_res='mixed-out.jpg',
)
print(outs2)

Method 3: Notebook

Try Pix2Text in this notebook: https://github.com/breezedeus/Pix2Text/blob/main/pix2text_v1_1.ipynb

📚 Documentation

Pix2Text V1.0 New Release: The Best Open-Source Formula Recognition Model | Breezedeus.com
Pix2Text (P2T) GitHub: breezedeus/pix2text
Pix2Text Online Free Service: p2t.breezedeus.com
Pix2Text Online Docs: Docs
Pix2Text More: breezedeus.com/pix2text
Pix2Text Discord: https://discord.gg/GgD87WM8Tf

🔧 Technical Details

This MFR model uses the TrOCR architecture developed by Microsoft. It starts from the TrOCR weights and is retrained on a dataset of mathematical formula images, so that it converts images of mathematical formulas into LaTeX text. More details are available in the post "Pix2Text V1.0 New Release: The Best Open-Source Formula Recognition Model" on Breezedeus.com.

📄 License

This project is licensed under the MIT license.

📈 Examples

Printed Math Formula Images

[Figure: printed formula examples]

Handwritten Math Formula Images

[Figure: handwritten formula examples]

📊 Performance

The test dataset is derived from real data uploaded by users on the Pix2Text Online Service. First, real user data from a specific period is selected. Then, the Mathematical Formula Detection model (MFD) within Pix2Text is used to detect the mathematical formulas in these images and crop the corresponding parts. A subset of these formula images is randomly chosen for manual annotation to create the test dataset, which includes 485 images.

[Figure: examples from the test data]

The following shows the Character Error Rate (CER; lower is better) of various models on this test dataset. The ground-truth annotations and the output of each model are first normalized so that irrelevant factors such as spaces do not affect the results. For Texify, the leading and trailing $ or $$ delimiters are removed from each recognized formula first.
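
As an illustration of the metric, here is a minimal sketch of a per-formula CER: the Levenshtein edit distance between prediction and reference, divided by the reference length. The normalization shown (dropping all whitespace, optionally stripping `$`/`$$` delimiters) is an assumption, since the exact normalization rules are not given here, and the source does not say whether scores are averaged per image or pooled over the whole test set. The function names are hypothetical.

```python
import re


def normalize(latex, strip_dollars=False):
    """Remove whitespace; optionally strip leading/trailing $ or $$ (assumed rules)."""
    s = latex.strip()
    if strip_dollars:
        s = re.sub(r'^\${1,2}|\${1,2}$', '', s)
    return re.sub(r'\s+', '', s)


def edit_distance(a, b):
    """Levenshtein distance using a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(prediction, reference, strip_dollars=False):
    """Character error rate of one prediction against one reference."""
    pred = normalize(prediction, strip_dollars)
    ref = normalize(reference)
    if not ref:
        raise ValueError("reference is empty after normalization")
    return edit_distance(pred, ref) / len(ref)
```

Under these assumptions, `cer("x ^ 2", "x^2")` is 0 because spacing is ignored, and passing `strip_dollars=True` mirrors the extra step described for Texify output.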

[Figure: CER comparison among different MFR models]

As the figure above shows, the free, open-source Pix2Text V1.0 MFR model significantly outperforms the previous versions of the paid model. The Pix2Text V1.0 MFR paid model is more precise again than the V1.0 free model.

Texify is better suited to images with standard formatting and performs poorly on images containing single letters. This is the main reason why Texify's performance on this test dataset is inferior to that of LaTeX-OCR.

💌 Feedback

Feel free to contact the author, Breezedeus, with any questions or comments about the model.
