MMR1-Math-v0-7B: An Open-Source Multimodal Model for Math Tasks with State-of-the-Art Performance

MMR1 Math V0 7B

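Developed by MMR1

A large multimodal model specialized for math tasks that achieves state-of-the-art performance among open-source 7B multimodal models.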

Image-Text-to-Text

Transformers

Language: English · License: Apache-2.0 · #MultimodalMathReasoning #SampleEfficientTraining #7B-SOTA

Downloads: 75

Released: 3/11/2025

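Model Overview

MMR1-Math-v0-7B is a large multimodal model built on Qwen2.5-VL-7B-Instruct and focused on mathematical reasoning. It reaches SOTA performance among open-source 7B multimodal models after training on only 6k curated samples, and it performs strongly on several math reasoning benchmarks.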

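Model Features

SOTA Performance

Sets a new state of the art on math tasks among open-source 7B multimodal models

Efficient Training

Reaches top-tier performance with only 6k high-quality samples and 6 hours of RL training

Data Strategy

High-quality public data, sampled uniformly across task difficulty and mathematical reasoning diversity

GRPO Training

Efficient RL training on 64 H100 GPUs (15 epochs)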
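
GRPO (Group Relative Policy Optimization) scores each sampled response relative to the other responses drawn for the same prompt, instead of relying on a learned value function. The sketch below shows only that normalization step. It assumes scalar rewards (for example 1.0 for a correct final answer and 0.0 otherwise); the function name `group_relative_advantages`, the epsilon, and the use of the population standard deviation are illustrative choices, not details taken from the MMR1 training code.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses to the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group (an assumption)
    # eps keeps the division finite when every response gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical rewards for 4 sampled answers to one math problem
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Responses that beat the group average receive a positive advantage and the rest a negative one; a group whose responses all earn the same reward contributes no learning signal.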

Model Capabilities

Multimodal mathematical reasoning

Image-text understanding

Complex math problem solving

Logical reasoning

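Use Cases

Education

Math problem solving

Helps students understand and solve complex math problems

Scores 71.0 on the MathVista benchmark

Research

Multimodal reasoning research

Provides a baseline model for the multimodal reasoning field

Outperforms comparable models on multiple math reasoning benchmarks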

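🚀 MMR1: Advancing the Frontiers of Multimodal Reasoning

MMR1 is a project focused on multimodal reasoning. Its MMR1-Math-v0 model performs strongly on multimodal math tasks, reaching SOTA-level results among open-source 7B multimodal models with only 6k training samples.

🚀 Quick Start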

import torch  # needed for torch.bfloat16 below
from transformers import Qwen2_5_VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info
# default: Load the model on the available device(s)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "MMR1/MMR1-Math-v0-7B", 
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)
# default processor
processor = AutoProcessor.from_pretrained("MMR1/MMR1-Math-v0-7B")
# Example input
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "image": "path/to/image.jpeg",
            },
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
# Preparation for inference
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text],
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
)
inputs = inputs.to("cuda")
# Inference: Generation of the output
generated_ids = model.generate(**inputs, max_new_tokens=128)
generated_ids_trimmed = [
    out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)
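
The prompt above is a generic captioning request. For a math problem, only the `messages` payload needs to change. The sketch below builds such a payload; the helper name `build_math_messages`, the image path, and the question text are hypothetical, and the "step by step" instruction is an assumption rather than a prompt format documented for MMR1.

```python
def build_math_messages(image_path, question):
    """Build a single-turn chat payload pairing one image with a math question."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {"type": "text", "text": question},
            ],
        }
    ]

messages = build_math_messages(
    "path/to/geometry_problem.jpeg",  # hypothetical path
    "Find the area of the shaded region. Show your reasoning step by step.",
)
print(messages[0]["content"][1]["text"])
```

The result can be passed through the same `apply_chat_template`, `process_vision_info`, and `generate` steps as above. Step-by-step solutions may run longer than 128 tokens, so a larger `max_new_tokens` is likely worth trying.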

Batch Inference

# Sample messages for batch inference
messages1 = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "file:///path/to/image1.jpg"},
            {"type": "image", "image": "file:///path/to/image2.jpg"},
            {"type": "text", "text": "What are the common elements in these pictures?"},
        ],
    }
]
messages2 = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]
# Combine messages for batch processing
messages = [messages1, messages2]
# Preparation for batch inference
texts = [
    processor.apply_chat_template(msg, tokenize=False, add_generation_prompt=True)
    for msg in messages
]
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=texts,
    images=image_inputs,
    videos=video_inputs,
    padding=True,
    return_tensors="pt",
)
inputs = inputs.to("cuda")
# Batch Inference
generated_ids = model.generate(**inputs, max_new_tokens=128)
generated_ids_trimmed = [
    out_ids[len(in_ids) :] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_texts = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_texts)
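
Both examples slice `generated_ids` before decoding because `generate` returns the prompt tokens followed by the newly generated ones. The toy snippet below reproduces that trimming step with plain Python lists so the effect is easy to see; the token ids are made up and the helper name `trim_prompt_tokens` is illustrative.

```python
def trim_prompt_tokens(input_ids, generated_ids):
    """Drop the leading prompt tokens from each generated sequence."""
    return [out[len(inp):] for inp, out in zip(input_ids, generated_ids)]

# Made-up ids: each output row starts with its (padded) prompt row
input_ids = [[101, 7, 8], [101, 9, 0]]
generated_ids = [[101, 7, 8, 42, 43], [101, 9, 0, 55, 56]]

print(trim_prompt_tokens(input_ids, generated_ids))  # [[42, 43], [55, 56]]
```

In a padded batch every prompt row has the same length, so the slice removes the same number of leading positions from each output row.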

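✨ Key Features

Model Highlights

SOTA performance: sets a new state-of-the-art benchmark for math-related multimodal tasks among open-source 7B models.
Minimal training data: achieves top-tier performance using only 6k high-quality samples drawn from public training datasets.
Efficient GRPO training: 6 hours of reinforcement learning on 64 H100 GPUs, for 15 epochs.
Public, high-quality data: sourced from public datasets and rigorously filtered to balance difficulty and math problem types.
Balanced data strategy: uniform sampling based on task difficulty (overly easy problems are filtered out) and mathematical reasoning diversity.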
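
One way to picture the balanced data strategy: drop problems that are too easy, bucket the remainder by difficulty and problem type, then draw the same number of samples from each bucket. The sketch below is only an illustration of that idea. The field names `difficulty` and `topic`, the `min_difficulty` threshold, and the per-bucket quota are all hypothetical; the source does not describe the actual selection pipeline.

```python
import random
from collections import defaultdict

def balanced_sample(problems, per_bucket, min_difficulty=2, seed=0):
    """Uniformly sample across (difficulty, topic) buckets, skipping easy items."""
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for p in problems:
        if p["difficulty"] < min_difficulty:
            continue  # filter out overly easy problems
        buckets[(p["difficulty"], p["topic"])].append(p)
    sample = []
    for key in sorted(buckets):
        items = buckets[key]
        # take the quota, or the whole bucket if it is smaller
        sample.extend(rng.sample(items, min(per_bucket, len(items))))
    return sample

# Toy pool: 3 difficulty levels x 2 topics x 5 problems each
combos = [(d, t) for d in (1, 2, 3) for t in ("geometry", "algebra") for _ in range(5)]
pool = [{"id": i, "difficulty": d, "topic": t} for i, (d, t) in enumerate(combos)]

subset = balanced_sample(pool, per_bucket=2)
print(len(subset))  # 8: four remaining buckets, two problems from each
```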

📚 Documentation

📰 News

[2025.03.11] 🔥🔥 Released MMR1-Math-v0, which reaches SOTA with only 6k training samples!

Links

Code: https://github.com/LengSicong/MMR1
This model was presented in the paper LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL; the code is available at https://github.com/LengSicong/MMR1.

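Model Description

MMR1-Math-v0-7B is a large multimodal model specialized for math tasks. Notably, it achieves state-of-the-art performance among open-source 7B multimodal models and remains competitive even with proprietary models that have far more parameters, all after training on only 6k carefully curated data instances.

Evaluation Results

We evaluated our model with VLMEvalKit on four math reasoning benchmarks: MathVista_MINI, MathVision, LogicVista, and MathVerse_MINI.

We also report results on the MathVerse_MINI_Vision_Only_cot (MathVerse_V) subset, for consistency with the VLMEvalKit leaderboard. The table below compares our model with various open-source and proprietary models.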

Model	Size	MathVista	MathVision	LogicVista	MathVerse	MathVerse_V
Closed-source models
[GPT-4o 1120](https://openai.com/index/gpt-4o-system-card/)	-	60.0	31.2	52.8	40.6	-
Gemini-2.0-flash	-	70.4	43.6	52.3	47.8	-
[Claude3.7-Sonnet](https://www.anthropic.com/news/claude-3-7-sonnet)	-	66.8	41.9	58.2	46.7	-
R1-related models
[LLaVA-CoT](https://github.com/PKU-YuanGroup/LLaVA-CoT)	11B	52.5	19.9	39.6	22.6	-
[Open-R1-Multimodal](https://github.com/EvolvingLMMs-Lab/open-r1-multimodal)	7B	60.6	-	-	-	-
Mulberry	7B	63.1	-	-	-	-
LMM-R1	3B	63.2	26.4	-	-	41.6
[R1-Onevision](https://github.com/Fancy-MLLM/R1-Onevision?tab=readme-ov-file)	7B	-	26.2	-	-	44.1
[MM-Eureka](https://github.com/ModalMinds/MM-EUREKA)	8B	67.1	22.2	-	-	40.4
[MM-Eureka](https://github.com/ModalMinds/MM-EUREKA)	38B	64.2	26.6	-	-	48.9
Open-source models
[Ovis2-8b](https://github.com/AIDC-AI/Ovis)	8B	71.8	25.9	39.4	42.3	-
[MiniCPM-o-2.6](https://github.com/OpenBMB/MiniCPM-o)	8B	71.9	21.7	36.0	35.0	-
[Qwen2.5-VL](https://github.com/QwenLM/Qwen2.5-VL) (official)	7B	68.2	25.4	47.9	41.1	-
[Qwen2.5-VL](https://github.com/QwenLM/Qwen2.5-VL) (reproduced)	7B	67.5	25.6	46.8	42.5	46.9
Our model
MMR1-math-v0	7B	71.0	30.2	50.8	45.1	49.8
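
Because MMR1-Math-v0-7B is built on Qwen2.5-VL-7B-Instruct, one convenient way to read the table is as the per-benchmark difference against the reproduced Qwen2.5-VL 7B row. The snippet below recomputes those differences from the table values; it uses no numbers beyond the two rows shown.

```python
benchmarks = ["MathVista", "MathVision", "LogicVista", "MathVerse", "MathVerse_V"]
mmr1 = [71.0, 30.2, 50.8, 45.1, 49.8]             # MMR1-math-v0 row
qwen_reproduced = [67.5, 25.6, 46.8, 42.5, 46.9]  # Qwen2.5-VL (reproduced) row

# Round to one decimal to match the table's precision
gains = {
    name: round(ours - base, 1)
    for name, ours, base in zip(benchmarks, mmr1, qwen_reproduced)
}
print(gains)
# {'MathVista': 3.5, 'MathVision': 4.6, 'LogicVista': 4.0, 'MathVerse': 2.6, 'MathVerse_V': 2.9}
```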

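📄 License

This project is licensed under the Apache-2.0 license.

📚 Citation

If you find MMR1 useful for your research and applications, please cite it using the following BibTeX: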

@misc{MMR1-Math2025,
  title={MMR1: Advancing the Frontiers of Multimodal Reasoning},
  author={Sicong Leng* and Jing Wang* and Jiaxi Li* and Hao Zhang* and Zhiqiang Hu and Boqiang Zhang and Hang Zhang and Yuming Jiang and Xin Li and Fan Wang and Yu Rong and Aixin Sun† and Shijian Lu†},
  year={2025},
  howpublished={\url{https://github.com/LengSicong/MMR1}},
}