---
license: apache-2.0
language:
- zh
- en
pipeline_tag: text-generation
library_name: transformers
---
[MiniCPM Repo](https://github.com/OpenBMB/MiniCPM) |
[MiniCPM Paper](https://arxiv.org/abs/2404.06395) |
[MiniCPM-V Repo](https://github.com/OpenBMB/MiniCPM-V) |
Join us in Discord and WeChat
## Introduction
MiniCPM3-4B is the third generation of the MiniCPM series. Its overall performance surpasses Phi-3.5-mini-Instruct and GPT-3.5-Turbo-0125, and it is comparable with many recent 7B~9B models.
Compared to MiniCPM1.0/MiniCPM2.0, MiniCPM3-4B has a more powerful and versatile skill set that enables more general usage. MiniCPM3-4B supports function calling and a code interpreter; please refer to Advanced Features for usage guidelines. A hedged sketch of function calling follows below.
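As an illustrative sketch of function calling, the example below passes a tool through the `tools` argument of `apply_chat_template`, which is supported in recent Transformers releases. The `get_weather` function is a hypothetical stub, and whether the shipped chat template consumes tool schemas exactly this way should be verified against Advanced Features:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

def get_weather(city: str):
    """Get the current weather for a city.

    Args:
        city: Name of the city to query.
    """
    return f"Sunny in {city}"  # hypothetical stub for illustration

path = "openbmb/MiniCPM3-4B"
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.bfloat16, device_map="cuda", trust_remote_code=True
)

messages = [{"role": "user", "content": "What is the weather in Beijing?"}]
# Transformers derives a JSON schema for each tool from its signature and docstring.
inputs = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

out = model.generate(inputs, max_new_tokens=256)
# The model is expected to emit a tool call for the application to execute.
print(tokenizer.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
```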
MiniCPM3-4B has a 32k context window. Equipped with LLMxMapReduce, MiniCPM3-4B can theoretically handle infinite context without requiring a huge amount of memory; a conceptual sketch of the idea follows.
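The sketch below only illustrates the general map-reduce idea (split a long input into windows that fit the context, query each window, then merge the partial answers). It is not the actual LLMxMapReduce implementation, and `chat` stands for a hypothetical single-call wrapper around the model:

```python
def map_reduce_answer(chat, document: str, question: str, window: int = 24000) -> str:
    """Answer a question over a document longer than the context window.

    `chat` is a hypothetical callable that sends one prompt to the model and
    returns its reply; this is a conceptual sketch, not LLMxMapReduce itself.
    """
    chunks = [document[i:i + window] for i in range(0, len(document), window)]
    # Map: answer the question against each window independently.
    partials = [chat(f"Context:\n{c}\n\nQuestion: {question}") for c in chunks]
    # Reduce: merge the partial answers into a single final answer.
    merged = "\n".join(f"- {p}" for p in partials)
    return chat(f"Given these partial answers:\n{merged}\n\nAnswer the question: {question}")
```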
## Usage

### Inference with Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

path = "openbmb/MiniCPM3-4B"
device = "cuda"

tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.bfloat16, device_map=device, trust_remote_code=True
)

messages = [
    {"role": "user", "content": "推荐5个北京的景点。"},  # "Recommend 5 attractions in Beijing."
]
model_inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(device)

model_outputs = model.generate(
    model_inputs,
    max_new_tokens=1024,
    top_p=0.7,
    temperature=0.7
)

# Strip the prompt tokens so only the newly generated tokens are decoded.
output_token_ids = [
    model_outputs[i][len(model_inputs[i]):] for i in range(len(model_inputs))
]

responses = tokenizer.batch_decode(output_token_ids, skip_special_tokens=True)[0]
print(responses)
```
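To print tokens as they are produced instead of waiting for the full completion, Transformers' `TextStreamer` can be attached to `generate`. This continues the example above, reusing `tokenizer`, `model`, and `model_inputs`:

```python
from transformers import TextStreamer

# Stream decoded tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    model_inputs,
    max_new_tokens=1024,
    top_p=0.7,
    temperature=0.7,
    streamer=streamer,
)
```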
### Inference with vLLM

For now, you need to install our forked version of vLLM:

```bash
pip install git+https://github.com/OpenBMB/vllm.git@minicpm3
```
```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "openbmb/MiniCPM3-4B"
prompt = [{"role": "user", "content": "推荐5个北京的景点。"}]  # "Recommend 5 attractions in Beijing."

# Render the chat template to plain text, since vLLM takes a string prompt.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
input_text = tokenizer.apply_chat_template(prompt, tokenize=False, add_generation_prompt=True)

llm = LLM(
    model=model_name,
    trust_remote_code=True,
    tensor_parallel_size=1
)
sampling_params = SamplingParams(top_p=0.7, temperature=0.7, max_tokens=1024, repetition_penalty=1.02)

outputs = llm.generate(prompts=input_text, sampling_params=sampling_params)
print(outputs[0].outputs[0].text)
```
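Since the fork tracks upstream vLLM, it should also provide the standard OpenAI-compatible server. The command below assumes the fork keeps the upstream entrypoint unchanged:

```bash
python -m vllm.entrypoints.openai.api_server \
    --model openbmb/MiniCPM3-4B \
    --trust-remote-code
```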
## Evaluation Results
| Benchmark | Qwen2-7B-Instruct | GLM-4-9B-Chat | Gemma2-9B-it | Llama3.1-8B-Instruct | GPT-3.5-Turbo-0125 | Phi-3.5-mini-Instruct(3.8B) | MiniCPM3-4B |
|---|---|---|---|---|---|---|---|
| **English** | | | | | | | |
| MMLU | 70.5 | 72.4 | 72.6 | 69.4 | 69.2 | 68.4 | 67.2 |
| BBH | 64.9 | 76.3 | 65.2 | 67.8 | 70.3 | 68.6 | 70.2 |
| MT-Bench | 8.41 | 8.35 | 7.88 | 8.28 | 8.17 | 8.60 | 8.41 |
| IFEVAL (Prompt Strict-Acc.) | 51.0 | 64.5 | 71.9 | 71.5 | 58.8 | 49.4 | 68.4 |
| **Chinese** | | | | | | | |
| CMMLU | 80.9 | 71.5 | 59.5 | 55.8 | 54.5 | 46.9 | 73.3 |
| CEVAL | 77.2 | 75.6 | 56.7 | 55.2 | 52.8 | 46.1 | 73.6 |
| AlignBench v1.1 | 7.10 | 6.61 | 7.10 | 5.68 | 5.82 | 5.73 | 6.74 |
| FollowBench-zh (SSR) | 63.0 | 56.4 | 57.0 | 50.6 | 64.6 | 58.1 | 66.8 |
| **Math** | | | | | | | |
| MATH | 49.6 | 50.6 | 46.0 | 51.9 | 41.8 | 46.4 | 46.6 |
| GSM8K | 82.3 | 79.6 | 79.7 | 84.5 | 76.4 | 82.7 | 81.1 |
| MathBench | 63.4 | 59.4 | 45.8 | 54.3 | 48.9 | 54.9 | 65.6 |
| **Code** | | | | | | | |
| HumanEval+ | 70.1 | 67.1 | 61.6 | 62.8 | 66.5 | 68.9 | 68.3 |
| MBPP+ | 57.1 | 62.2 | 64.3 | 55.3 | 71.4 | 55.8 | 63.2 |
| LiveCodeBench v3 | 22.2 | 20.2 | 19.2 | 20.4 | 24.0 | 19.6 | 22.6 |
| **Function Call** | | | | | | | |
| BFCL v2 | 71.6 | 70.1 | 19.2 | 73.3 | 75.4 | 48.4 | 76.0 |
| **Overall** | | | | | | | |
| Average | 65.3 | 65.0 | 57.9 | 60.8 | 61.0 | 57.2 | 66.3 |
## Statement
- As a language model, MiniCPM3-4B generates content by learning from a vast amount of text; however, it does not possess the ability to comprehend or express personal opinions or value judgments.
- Any content generated by MiniCPM3-4B does not represent the viewpoints or positions of the model developers.
- Therefore, when using content generated by MiniCPM3-4B, users should take full responsibility for evaluating and verifying it on their own.
## LICENSE
- This repository is released under the Apache-2.0 License.
- The usage of MiniCPM3-4B model weights must strictly follow MiniCPM Model License.md.
- The models and weights of MiniCPM3-4B are completely free for academic research. After filling out a "questionnaire" for registration, they are also available for free commercial use.
## Citation
```bibtex
@article{hu2024minicpm,
  title={MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies},
  author={Hu, Shengding and Tu, Yuge and Han, Xu and He, Chaoqun and Cui, Ganqu and Long, Xiang and Zheng, Zhi and Fang, Yewei and Huang, Yuxiang and Zhao, Weilin and others},
  journal={arXiv preprint arXiv:2404.06395},
  year={2024}
}
```