MiniCPM3-4B Open-Source AI Model - Outperforms Phi-3.5 and Others, Comparable to 7B Models

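MiniCPM3-4B

Developed by openbmb

MiniCPM3-4B is the third generation of the MiniCPM series. Its overall performance surpasses Phi-3.5-mini-Instruct and GPT-3.5-Turbo-0125 and is on par with several recent models in the 7B to 9B range.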

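Large Language Model

Transformers

Supports multiple languages | License: Apache-2.0 | #Chinese-English bilingual optimization #Function calling support #Long-text processing

Downloads: 15.94k

Release date: 9/3/2024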

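Model Overview

MiniCPM3-4B offers stronger and more versatile capabilities for general-purpose use. It supports function calling and a code interpreter and has a 32k context window. Combined with LLMxMapReduce, it can in theory process text of unlimited length without requiring large amounts of memory.

Model Features

High performance

Overall performance surpasses Phi-3.5-mini-Instruct and GPT-3.5-Turbo-0125 and is on par with several recent models in the 7B to 9B range.

Versatile functionality

Supports function calling and a code interpreter, with stronger and more versatile capabilities for general-purpose use.

Long-text processing

Has a 32k context window. Combined with LLMxMapReduce, it can in theory process text of unlimited length without requiring large amounts of memory.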

Model Capabilities

Text generation

Function calling

Code interpretation

Long-text processing

Multilingual support

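Use Cases

General Q&A

Travel recommendations

Recommends tourist attractions to users

Can generate a list of attractions that matches the user's needs

Programming assistance

Code generation

Generates code snippets from stated requirements

Scores 68.3 on HumanEval+

Math problem solving

Mathematical computation

Solves a variety of math problems

Scores 81.1 on GSM8K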

🚀 MiniCPM3-4B

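MiniCPM3-4B is the third generation of the MiniCPM series. Its overall performance surpasses Phi-3.5-mini-Instruct and GPT-3.5-Turbo-0125 and is comparable to many recently released 7B to 9B models.

MiniCPM Repo | MiniCPM Paper | MiniCPM-V Repo | Join us on Discord and WeChat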

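🚀 Quick Start

Model Introduction

MiniCPM3-4B is the third generation of the MiniCPM series. Compared with MiniCPM1.0 and MiniCPM2.0, MiniCPM3-4B has stronger and more versatile capabilities and supports function calling and a code interpreter. It has a 32k context window. With LLMxMapReduce, it can in theory handle unlimited context without requiring large amounts of memory.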
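The idea of handling input longer than the context window by working on one chunk at a time can be sketched as follows. This is only a generic map-reduce illustration, not the LLMxMapReduce implementation: `split_into_chunks`, `map_reduce`, and `toy_summarize` are hypothetical names, and the toy function stands in for what would be a model call in a real setup.

```python
from typing import Callable, List


def split_into_chunks(tokens: List[str], chunk_size: int) -> List[List[str]]:
    """Split a token list into consecutive chunks of at most chunk_size tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def map_reduce(
    tokens: List[str],
    chunk_size: int,
    process: Callable[[List[str]], List[str]],
) -> List[str]:
    """Apply process to each chunk (map), then merge the partial results (reduce).

    While the merged result is still longer than one chunk, the loop repeats on
    it, so no single call to process ever sees more than chunk_size tokens.
    """
    current = tokens
    while len(current) > chunk_size:
        partials = [process(chunk) for chunk in split_into_chunks(current, chunk_size)]
        merged = [tok for part in partials for tok in part]
        if len(merged) >= len(current):
            raise RuntimeError("process must shrink its input for the loop to end")
        current = merged
    return process(current)


def toy_summarize(chunk: List[str]) -> List[str]:
    """Hypothetical stand-in for a model call: keep the first two tokens."""
    return chunk[:2]


document = [f"w{i}" for i in range(20)]
summary = map_reduce(document, chunk_size=4, process=toy_summarize)
print(summary)
```

Because each call only receives one chunk, peak memory in this sketch depends on `chunk_size` rather than on the total document length.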

Inference Examples

Inference with Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

path = "openbmb/MiniCPM3-4B"
device = "cuda"

# trust_remote_code=True lets Transformers run the model code shipped in the repository.
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16, device_map=device, trust_remote_code=True)

messages = [
    {"role": "user", "content": "Recommend 5 tourist attractions in Beijing."},
]
# Apply the chat template and get the prompt back as a tensor of token IDs.
model_inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(device)

model_outputs = model.generate(
    model_inputs,
    max_new_tokens=1024,
    do_sample=True,  # make sampling explicit so top_p and temperature take effect
    top_p=0.7,
    temperature=0.7
)

# Drop the prompt tokens so that only the newly generated tokens are decoded.
output_token_ids = [
    model_outputs[i][len(model_inputs[i]):] for i in range(len(model_inputs))
]

response = tokenizer.batch_decode(output_token_ids, skip_special_tokens=True)[0]
print(response)
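To make the `temperature=0.7` and `top_p=0.7` settings more concrete, the following is a simplified sketch of temperature scaling followed by nucleus (top-p) filtering on a toy set of logits. It illustrates the general technique only; it is not the Transformers implementation, which works on tensors and differs in detail. `top_p_filter`, `sample`, and the toy logits are hypothetical.

```python
import math
import random
from typing import Dict


def top_p_filter(logits: Dict[str, float], temperature: float, top_p: float) -> Dict[str, float]:
    """Return the renormalized nucleus distribution over tokens."""
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = {tok: value / temperature for tok, value in logits.items()}
    # Numerically stable softmax.
    peak = max(scaled.values())
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}
    # Keep the most probable tokens until their cumulative mass reaches top_p.
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    kept_total = sum(kept.values())
    return {tok: p / kept_total for tok, p in kept.items()}


def sample(dist: Dict[str, float], rng: random.Random) -> str:
    """Draw one token from the filtered distribution."""
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


toy_logits = {"Beijing": 2.0, "Shanghai": 1.0, "Paris": 0.0, "Mars": -3.0}
nucleus = top_p_filter(toy_logits, temperature=0.7, top_p=0.7)
print(nucleus)
print(sample(nucleus, random.Random(0)))
```

With these toy logits the top token alone already covers more than 70% of the probability mass, so the nucleus contains a single token; raising `top_p` to 0.9 admits the second token as well.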

Inference with vLLM

First, install our forked version of vLLM:

pip install git+https://github.com/OpenBMB/vllm.git@minicpm3

from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "openbmb/MiniCPM3-4B"
prompt = [{"role": "user", "content": "Recommend 5 tourist attractions in Beijing."}]

# Render the chat messages into a single prompt string using the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
input_text = tokenizer.apply_chat_template(prompt, tokenize=False, add_generation_prompt=True)

llm = LLM(
    model=model_name,
    trust_remote_code=True,
    tensor_parallel_size=1  # number of GPUs to shard the model across
)
sampling_params = SamplingParams(top_p=0.7, temperature=0.7, max_tokens=1024, repetition_penalty=1.02)

outputs = llm.generate(prompts=input_text, sampling_params=sampling_params)

# One prompt was submitted, so print the first completion of the first result.
print(outputs[0].outputs[0].text)
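The vLLM example sets `repetition_penalty=1.02`. A common formulation of this penalty divides the positive logits of tokens that have already appeared by the penalty and multiplies the negative ones, which makes repeats slightly less likely. The sketch below shows that formulation on a toy dictionary of logits; whether the forked vLLM build matches it exactly is an assumption, and `apply_repetition_penalty` is a hypothetical helper.

```python
from typing import Dict, Iterable


def apply_repetition_penalty(
    logits: Dict[str, float],
    seen_tokens: Iterable[str],
    penalty: float,
) -> Dict[str, float]:
    """Return a copy of logits with already-seen tokens made less likely."""
    adjusted = dict(logits)
    for tok in set(seen_tokens):
        if tok not in adjusted:
            continue
        value = adjusted[tok]
        # Dividing a positive logit or multiplying a negative one both lower it.
        adjusted[tok] = value / penalty if value > 0 else value * penalty
    return adjusted


toy_logits = {"the": 2.04, "cat": -1.0, "sat": 0.5}
penalized = apply_repetition_penalty(toy_logits, seen_tokens=["the", "cat"], penalty=1.02)
print(penalized)
```

A penalty of 1.0 leaves the logits unchanged, so a value of 1.02 is a mild nudge away from repetition.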

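✨ Key Features

Strong performance: overall performance surpasses Phi-3.5-mini-Instruct and GPT-3.5-Turbo-0125 and is comparable to many recent 7B to 9B models.
Rich functionality: supports function calling and a code interpreter, with stronger and more versatile capabilities.
Long-context handling: has a 32k context window and, with LLMxMapReduce, can in theory handle unlimited context without requiring large amounts of memory.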
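On the application side, function calling generally means parsing a structured call emitted by the model and running the matching function. The sketch below shows that dispatch step under loud assumptions: the JSON shape with `name` and `arguments`, the `tool` role, and the `get_weather` tool are all hypothetical and are not taken from the MiniCPM3-4B documentation.

```python
import json
from typing import Any, Callable, Dict


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would query a weather service."""
    return f"Sunny in {city}"


# Registry mapping tool names to Python callables (illustrative only).
TOOLS: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def dispatch_tool_call(raw_call: str) -> Dict[str, Any]:
    """Parse a JSON tool call and run the matching Python function.

    Returns a message dict that could be appended to the chat history.
    """
    try:
        call = json.loads(raw_call)
        name = call["name"]
        arguments = call.get("arguments", {})
    except (json.JSONDecodeError, KeyError, TypeError, AttributeError) as exc:
        return {"role": "tool", "content": f"error: malformed call ({exc})"}
    if not isinstance(name, str) or name not in TOOLS:
        return {"role": "tool", "content": f"error: unknown tool {name!r}"}
    try:
        result = TOOLS[name](**arguments)
    except TypeError as exc:
        # Raised when the arguments do not match the function signature.
        return {"role": "tool", "content": f"error: bad arguments ({exc})"}
    return {"role": "tool", "content": str(result)}


reply = dispatch_tool_call('{"name": "get_weather", "arguments": {"city": "Beijing"}}')
print(reply)
```

Returning errors as messages instead of raising lets the calling loop pass the failure back to the model, which is one possible design and not a requirement.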

📚 Detailed Documentation

Evaluation Results

Benchmark	Qwen2-7B-Instruct	GLM-4-9B-Chat	Gemma2-9B-it	Llama3.1-8B-Instruct	GPT-3.5-Turbo-0125	Phi-3.5-mini-Instruct(3.8B)	MiniCPM3-4B
English
MMLU	70.5	72.4	72.6	69.4	69.2	68.4	67.2
BBH	64.9	76.3	65.2	67.8	70.3	68.6	70.2
MT-Bench	8.41	8.35	7.88	8.28	8.17	8.60	8.41
IFEVAL (Prompt Strict-Acc.)	51.0	64.5	71.9	71.5	58.8	49.4	68.4
Chinese
CMMLU	80.9	71.5	59.5	55.8	54.5	46.9	73.3
CEVAL	77.2	75.6	56.7	55.2	52.8	46.1	73.6
AlignBench v1.1	7.10	6.61	7.10	5.68	5.82	5.73	6.74
FollowBench-zh (SSR)	63.0	56.4	57.0	50.6	64.6	58.1	66.8
Math
MATH	49.6	50.6	46.0	51.9	41.8	46.4	46.6
GSM8K	82.3	79.6	79.7	84.5	76.4	82.7	81.1
MathBench	63.4	59.4	45.8	54.3	48.9	54.9	65.6
Code
HumanEval+	70.1	67.1	61.6	62.8	66.5	68.9	68.3
MBPP+	57.1	62.2	64.3	55.3	71.4	55.8	63.2
LiveCodeBench v3	22.2	20.2	19.2	20.4	24.0	19.6	22.6
Function Calling
BFCL v2	71.6	70.1	19.2	73.3	75.4	48.4	76.0
Overall
Average	65.3	65.0	57.9	60.8	61.0	57.2	66.3
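The table does not state how the Average row is computed. One reading that reproduces the reported 66.3 for MiniCPM3-4B is to multiply the two 10-point scores (MT-Bench and AlignBench v1.1) by 10 and then take the plain mean over all 15 benchmarks. This is an inference from the numbers, not a documented procedure, and can be checked as follows:

```python
# MiniCPM3-4B column of the evaluation table, in row order.
scores = {
    "MMLU": 67.2, "BBH": 70.2, "MT-Bench": 8.41, "IFEVAL": 68.4,
    "CMMLU": 73.3, "CEVAL": 73.6, "AlignBench v1.1": 6.74, "FollowBench-zh": 66.8,
    "MATH": 46.6, "GSM8K": 81.1, "MathBench": 65.6,
    "HumanEval+": 68.3, "MBPP+": 63.2, "LiveCodeBench v3": 22.6,
    "BFCL v2": 76.0,
}

# Assumption: benchmarks scored out of 10 are rescaled to a 100-point scale.
TEN_POINT_BENCHMARKS = {"MT-Bench", "AlignBench v1.1"}

normalized = [
    value * 10 if name in TEN_POINT_BENCHMARKS else value
    for name, value in scores.items()
]
average = sum(normalized) / len(normalized)
print(round(average, 1))
```

The same rescaling also reproduces the 65.3 reported for Qwen2-7B-Instruct, which supports this reading without confirming it.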

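Statement

As a language model, MiniCPM3-4B generates content by learning from a large amount of text.
However, it is not capable of understanding or expressing personal opinions or value judgments.
Any content generated by MiniCPM3-4B does not represent the views or positions of the model developers.
Therefore, when using content generated by MiniCPM3-4B, users bear full responsibility for evaluating and verifying it.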

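📄 License

This repository is released under the Apache-2.0 License.
Use of the MiniCPM3-4B model weights must strictly follow the MiniCPM Model License.
The MiniCPM3-4B model and weights are completely free for academic research. After registering by filling out a "questionnaire", they are also free for commercial use.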

📚 Citation

@article{hu2024minicpm,
  title={MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies},
  author={Hu, Shengding and Tu, Yuge and Han, Xu and He, Chaoqun and Cui, Ganqu and Long, Xiang and Zheng, Zhi and Fang, Yewei and Huang, Yuxiang and Zhao, Weilin and others},
  journal={arXiv preprint arXiv:2404.06395},
  year={2024}
}