🚀 MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning
MathCoder-VL is a series of open-source large multimodal models (LMMs) tailored for general math problem solving. Alongside it, we release FigCodifier-8B, an image-to-code model.
Repository Link
Paper Link
🚀 Quick Start
Model Information
| Attribute | Details |
|---|---|
| Model type | image-text-to-text |
| Evaluation metric | accuracy |
| Tags | mathematics, reasoning, multi-modal-qa, math-qa, figure-qa, geometry-qa, math-word-problem, textbook-qa, vqa, geometry-diagram, synthetic-scene, chart, plot, scientific-figure, table, function-plot, abstract-scene, puzzle-test, document-image, science |
| Library | transformers |
| Base model | OpenGVLab/InternVL2-8B |
| Dataset | MathLLMs/MM-MathInstruct |
| License | apache-2.0 |
Model Comparison
Usage Examples
For training and inference code, please refer to InternVL. A minimal inference sketch is shown below.
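The sketch below is a hedged example of running inference with 🤗 Transformers. It assumes the released checkpoint keeps the `.chat()` interface of its InternVL2-8B base (loaded with `trust_remote_code=True`); the checkpoint name `MathLLMs/MathCoder-VL-8B`, the input image path, and the single-tile 448×448 preprocessing are illustrative simplifications of the dynamic-tiling pipeline documented in the InternVL repository.

```python
import torch
import torchvision.transforms as T
from PIL import Image
from transformers import AutoModel, AutoTokenizer

# Hypothetical checkpoint name; substitute the actual released MathCoder-VL weights.
MODEL_PATH = "MathLLMs/MathCoder-VL-8B"

# MathCoder-VL is built on InternVL2-8B, so we assume the same remote-code
# model class and `.chat()` helper as the InternVL2 base.
model = AutoModel.from_pretrained(
    MODEL_PATH, torch_dtype=torch.bfloat16, trust_remote_code=True
).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True, use_fast=False)

# Simplified single-tile preprocessing (the InternVL repository uses dynamic tiling).
transform = T.Compose([
    T.Resize((448, 448)),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
image = Image.open("geometry_problem.png").convert("RGB")  # any local math figure
pixel_values = transform(image).unsqueeze(0).to(torch.bfloat16).cuda()

question = "<image>\nSolve the problem shown in the figure. Explain each step."
generation_config = dict(max_new_tokens=1024, do_sample=False)
response = model.chat(tokenizer, pixel_values, question, generation_config)
print(response)
```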
Basic Usage
```python
from io import BytesIO

from datasets import load_dataset
from PIL import Image

# Load MM-MathInstruct from the Hugging Face Hub.
mm_mathinstruct = load_dataset("MathLLMs/MM-MathInstruct")
print(mm_mathinstruct)

# Decode the raw image bytes of the last training sample and display it.
img = Image.open(BytesIO(mm_mathinstruct['train'][-1]['image']))
img.show()
```
Running the code above should produce the following output:
```
DatasetDict({
    train: Dataset({
        features: ['id', 'image', 'question', 'solution', 'image_path'],
        num_rows: 2871988
    })
})
```
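The text fields of a sample can be inspected in the same way; the field names below are taken directly from the `features` list printed above.

```python
# Look at the question and reference solution of the last training sample.
sample = mm_mathinstruct['train'][-1]
print(sample['question'])
print(sample['solution'])
```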
📚 Detailed Documentation
Motivation
Construction of FigCodifier
Construction of MathCoder-VL
Performance
📄 License
This project is licensed under the apache-2.0 license.
📖 Citation
If you use our data, models, or code, please cite the following papers:
```bibtex
@inproceedings{wang2025mathcodervl,
  title={MathCoder-{VL}: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning},
  author={Ke Wang and Junting Pan and Linda Wei and Aojun Zhou and Weikang Shi and Zimu Lu and Han Xiao and Yunqiao Yang and Houxing Ren and Mingjie Zhan and Hongsheng Li},
  booktitle={The 63rd Annual Meeting of the Association for Computational Linguistics},
  year={2025},
  url={https://openreview.net/forum?id=nuvtX1imAb}
}

@inproceedings{lu2025mathcoder2,
  title={MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code},
  author={Zimu Lu and Aojun Zhou and Ke Wang and Houxing Ren and Weikang Shi and Junting Pan and Mingjie Zhan and Hongsheng Li},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=1Iuw1jcIrf}
}

@inproceedings{wang2024mathcoder,
  title={MathCoder: Seamless Code Integration in {LLM}s for Enhanced Mathematical Reasoning},
  author={Ke Wang and Houxing Ren and Aojun Zhou and Zimu Lu and Sichun Luo and Weikang Shi and Renrui Zhang and Linqi Song and Mingjie Zhan and Hongsheng Li},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/forum?id=z8TW0ttBPp}
}
```