# MiniCPM4-MCP
MiniCPM4-MCP is an open-source on-device LLM agent model. Built on MiniCPM-4, it can solve a wide range of real-world tasks by interacting with various tools and data resources through MCP (Model Context Protocol).
## Quick Start
You can start exploring MiniCPM4-MCP via the [GitHub Repo](https://github.com/OpenBMB/MiniCPM). Also, check out the Technical Report for in-depth information.
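Before diving into the repo, the minimal sketch below shows one way to load the model with Hugging Face `transformers` and run a plain chat turn. This is only an illustration; the authoritative tool-calling workflow and prompt format are documented in the GitHub repo.

```python
# Minimal sketch: load MiniCPM4-MCP and run a single chat turn.
# The exact MCP tool-calling format is defined in the GitHub repo; this only
# demonstrates basic loading and generation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openbmb/MiniCPM4-MCP"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

messages = [{"role": "user", "content": "What kinds of tasks can you solve with tools?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```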
Join us on Discord and WeChat.
## Features
### What's New
- [2025.06.06] The MiniCPM4 series is released! The series achieves extreme efficiency improvements while maintaining optimal performance at the same scale, delivering over 5x generation acceleration on typical end-side chips. You can find the technical report here.
### MiniCPM4 Series
The MiniCPM4 series are highly efficient large language models (LLMs) designed explicitly for end-side devices, achieving this efficiency through systematic innovation in four key dimensions: model architecture, training data, training algorithms, and inference systems.
- [MiniCPM4-8B](https://huggingface.co/openbmb/MiniCPM4-8B): The flagship of MiniCPM4, with 8B parameters, trained on 8T tokens.
- [MiniCPM4-0.5B](https://huggingface.co/openbmb/MiniCPM4-0.5B): The small version of MiniCPM4, with 0.5B parameters, trained on 1T tokens.
- [MiniCPM4-8B-Eagle-FRSpec](https://huggingface.co/openbmb/MiniCPM4-8B-Eagle-FRSpec): Eagle head for FRSpec, accelerating speculative inference for MiniCPM4-8B.
- [MiniCPM4-8B-Eagle-FRSpec-QAT-cpmcu](https://huggingface.co/openbmb/MiniCPM4-8B-Eagle-FRSpec-QAT-cpmcu): Eagle head trained with QAT for FRSpec, which efficiently integrates speculation and quantization to achieve ultra acceleration for MiniCPM4-8B.
- [MiniCPM4-8B-Eagle-vLLM](https://huggingface.co/openbmb/MiniCPM4-8B-Eagle-vLLM): Eagle head in vLLM format, accelerating speculative inference for MiniCPM4-8B (a minimal vLLM serving sketch follows this list).
- [MiniCPM4-8B-marlin-Eagle-vLLM](https://huggingface.co/openbmb/MiniCPM4-8B-marlin-Eagle-vLLM): Quantized Eagle head in vLLM format, accelerating speculative inference for MiniCPM4-8B.
- [BitCPM4-0.5B](https://huggingface.co/openbmb/BitCPM4-0.5B): Extreme ternary quantization applied to MiniCPM4-0.5B compresses model parameters into ternary values, achieving a 90% reduction in bit width.
- [BitCPM4-1B](https://huggingface.co/openbmb/BitCPM4-1B): Extreme ternary quantization applied to MiniCPM3-1B compresses model parameters into ternary values, achieving a 90% reduction in bit width.
- [MiniCPM4-Survey](https://huggingface.co/openbmb/MiniCPM4-Survey): Based on MiniCPM4-8B; accepts users' queries as input and autonomously generates trustworthy, long-form survey papers.
- [MiniCPM4-MCP](https://huggingface.co/openbmb/MiniCPM4-MCP): Based on MiniCPM4-8B; accepts users' queries and available MCP tools as input and autonomously calls relevant MCP tools to satisfy users' requirements. (<-- you are here)
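As a rough illustration of the vLLM entries above, the sketch below serves the base MiniCPM4-8B model for plain (non-speculative) generation. Wiring up the Eagle/FRSpec draft heads is vLLM-version-dependent and documented in the GitHub repo, so it is omitted here.

```python
# Minimal sketch: plain vLLM generation with MiniCPM4-8B (no speculative decoding).
# Configuring the Eagle/FRSpec draft heads depends on the vLLM version; see the repo.
from vllm import LLM, SamplingParams

llm = LLM(model="openbmb/MiniCPM4-8B", trust_remote_code=True)
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=128)

outputs = llm.generate(["Explain speculative decoding in one sentence."], params)
print(outputs[0].outputs[0].text)
```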
## Introduction
MiniCPM4-MCP is an open-source on-device LLM agent model jointly developed by THUNLP, Renmin University of China, and ModelBest, built on [MiniCPM-4](https://huggingface.co/openbmb/MiniCPM4-8B) with 8 billion parameters. It is capable of solving a wide range of real-world tasks by interacting with various tools and data resources through MCP.
## Usage
As of now, MiniCPM4-MCP supports the following:

- Tool use across 16 MCP servers: these servers span various categories, including office, lifestyle, communication, information, and work management.
- Single-tool calling: it can perform single- or multi-step tool calls using a single tool that complies with MCP (see the sketch after this list).
- Cross-tool calling: it can perform single- or multi-step tool calls using different tools that comply with MCP.
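The sketch below illustrates single-tool calling by passing one MCP-style tool schema through the tokenizer's chat template (recent `transformers` versions accept a `tools=` argument). The tool name and fields are illustrative, not the official MiniCPM4-MCP schema; consult the GitHub repo for the exact format the model was trained on.

```python
# Hypothetical single-tool-calling sketch. The tool schema below is illustrative,
# not the official MiniCPM4-MCP format -- see the GitHub repo for the exact schema.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openbmb/MiniCPM4-MCP", trust_remote_code=True)

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_time",  # illustrative tool, e.g. a time MCP server
        "description": "Return the current time in a given IANA timezone.",
        "parameters": {
            "type": "object",
            "properties": {"timezone": {"type": "string"}},
            "required": ["timezone"],
        },
    },
}]
messages = [{"role": "user", "content": "What time is it in Berlin?"}]

# Recent transformers versions accept a `tools=` argument for chat templates;
# whether this model's template consumes it is an assumption to verify.
prompt = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, tokenize=False
)
print(prompt)  # inspect how the tool schema is serialized into the prompt
```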
## Evaluation
The detailed evaluation script can be found on the [GitHub](https://github.com/OpenBMB/MiniCPM/tree/minicpm-4/demo/minicpm4/MCP) page. The results are presented below; func, param, and value report the accuracy (%) with which each model predicts the correct function name, parameter names, and parameter values, respectively.
| MCP Server | gpt-4o (func / param / value) | qwen3 (func / param / value) | minicpm4 (func / param / value) |
|---|---|---|---|
| Airbnb | 89.3 / 67.9 / 53.6 | 92.8 / 60.7 / 50.0 | 96.4 / 67.9 / 50.0 |
| Amap-Maps | 79.8 / 77.5 / 50.0 | 74.4 / 72.0 / 41.0 | 89.3 / 85.7 / 39.9 |
| Arxiv-MCP-Server | 85.7 / 85.7 / 85.7 | 81.8 / 54.5 / 50.0 | 57.1 / 57.1 / 52.4 |
| Calculator | 100.0 / 100.0 / 20.0 | 80.0 / 80.0 / 13.3 | 100.0 / 100.0 / 6.67 |
| Computer-Control-MCP | 90.0 / 90.0 / 90.0 | 90.0 / 90.0 / 90.0 | 90.0 / 90.0 / 86.7 |
| Desktop-Commander | 100.0 / 100.0 / 100.0 | 100.0 / 100.0 / 100.0 | 100.0 / 100.0 / 100.0 |
| Filesystem | 63.5 / 63.5 / 31.3 | 69.7 / 69.7 / 26.0 | 83.3 / 83.3 / 42.7 |
| GitHub | 92.0 / 80.0 / 58.0 | 80.5 / 50.0 / 27.7 | 62.8 / 25.7 / 17.1 |
| Gaode | 71.1 / 55.6 / 17.8 | 68.8 / 46.6 / 24.4 | 68.9 / 46.7 / 15.6 |
| MCP-Code-Executor | 85.0 / 80.0 / 70.0 | 80.0 / 80.0 / 70.0 | 90.0 / 90.0 / 65.0 |
| MCP-Docx | 95.8 / 86.7 / 67.1 | 94.9 / 81.6 / 60.1 | 95.1 / 86.6 / 76.1 |
| PPT | 72.6 / 49.8 / 40.9 | 85.9 / 50.7 / 37.5 | 91.2 / 72.1 / 56.7 |
| PPTx | 64.2 / 53.7 / 13.4 | 91.0 / 68.6 / 20.9 | 91.0 / 58.2 / 26.9 |
| Simple-Time-Server | 90.0 / 70.0 / 70.0 | 90.0 / 90.0 / 90.0 | 90.0 / 60.0 / 60.0 |
| Slack | 100.0 / 90.0 / 70.0 | 100.0 / 100.0 / 65.0 | 100.0 / 100.0 / 100.0 |
| Whisper | 90.0 / 90.0 / 90.0 | 90.0 / 90.0 / 90.0 | 90.0 / 90.0 / 30.0 |
| **Average** | 80.2 / 70.2 / 49.1 | 83.5 / 67.7 / 43.8 | 88.3 / 76.1 / 51.2 |
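For reference, the snippet below shows one plausible reading of how the three metrics could be scored for a single predicted tool call. This is a hedged reconstruction, not the official evaluation logic; the authoritative scoring script is in the GitHub repo linked above, and the field names here are illustrative.

```python
# Illustrative scoring, assuming func/param/value mean: correct function name,
# correct set of parameter names, and fully correct parameter values.
# This is a plausible reconstruction, not the official evaluation script.
def score_call(pred: dict, gold: dict) -> tuple[bool, bool, bool]:
    func_ok = pred.get("name") == gold["name"]
    param_ok = func_ok and set(pred.get("arguments", {})) == set(gold["arguments"])
    value_ok = param_ok and pred.get("arguments") == gold["arguments"]
    return func_ok, param_ok, value_ok

pred = {"name": "get_current_time", "arguments": {"timezone": "Europe/Berlin"}}
gold = {"name": "get_current_time", "arguments": {"timezone": "Europe/Berlin"}}
print(score_call(pred, gold))  # (True, True, True)
```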
## Statement
- As a language model, MiniCPM generates content by learning from a vast amount of text; however, it cannot comprehend or express personal opinions or value judgments.
- Any content generated by MiniCPM does not represent the viewpoints or positions of the model developers.
- Therefore, when using content generated by MiniCPM, users should take full responsibility for evaluating and verifying it.
## License
- This repository and the MiniCPM models are released under the Apache-2.0 License.
## Citation
- Please cite our paper if you find our work valuable.

```bibtex
@article{minicpm4,
  title={{MiniCPM4}: Ultra-Efficient LLMs on End Devices},
  author={MiniCPM Team},
  year={2025}
}
```