
MiniCPM-S-1B-sft

Developed by openbmb
MiniCPM-S-1B-sft is a 1B-parameter language model optimized with activation-sparsity techniques. Trained with the ProSparse method, it reaches high activation sparsity that enables inference acceleration while keeping performance comparable to the original model.

Model Overview

This model is trained with the ProSparse method, which replaces the FFN activation function with ReLU and applies progressive sparse regularization, ultimately reaching up to 87.89% activation sparsity. It is suited to scenarios that require efficient inference.
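
As a quick-start illustration, the snippet below loads the checkpoint with Hugging Face Transformers. It is a minimal sketch, assuming the repo id is openbmb/MiniCPM-S-1B-sft and that the checkpoint ships custom modeling code (hence trust_remote_code=True); verify both against the official model card.

```python
# Minimal sketch: loading MiniCPM-S-1B-sft with Hugging Face Transformers.
# Assumptions (check the model card): the repo id below is correct and the
# checkpoint ships custom modeling code, hence trust_remote_code=True.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "openbmb/MiniCPM-S-1B-sft"  # assumed Hugging Face repo id
tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True).eval()

inputs = tok("Explain activation sparsity in one sentence.",
             return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
```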

Model Features

High Activation Sparsity: Reaches 87.89% sparsity via the ProSparse method, notably higher than other ReLU-activated models.
Efficient Inference Acceleration: High sparsity, combined with specialized sparse GPU operators, yields significant inference speedups under the PowerInfer framework.
Performance Retention: Matches the performance of the original Swish-activated model despite the added sparsity.
Progressive Sparse Training: Uses a three-phase strategy of activation function replacement, progressive sparse regularization, and activation threshold shifting (a toy sketch follows this list).
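
The three phases can be sketched in miniature, as referenced above. The toy below illustrates the idea only and is not the authors' recipe: the FFN uses ReLU (phase 1), an L1 penalty on activations is ramped up stepwise (phase 2), and a small positive activation threshold is applied afterwards (phase 3). Layer sizes, the penalty schedule, and the threshold value are all made up.

```python
# Toy sketch of the three ProSparse phases on a single FFN layer.
# Illustration only: sizes, the lambda ramp, and the threshold are made up.
import torch
import torch.nn as nn

class ReLUFFN(nn.Module):
    def __init__(self, d_model=64, d_ff=256):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff)
        self.down = nn.Linear(d_ff, d_model)
        self.threshold = 0.0  # phase 3 raises this to zero out small activations

    def forward(self, x):
        a = torch.relu(self.up(x))                    # phase 1: ReLU activations
        a = torch.where(a > self.threshold, a, torch.zeros_like(a))
        self.last_act = a                             # kept for the sparsity penalty
        return self.down(a)

ffn = ReLUFFN()
opt = torch.optim.Adam(ffn.parameters(), lr=1e-3)

# Phase 2: progressively increase the L1 coefficient on activations
# while training on a toy reconstruction task.
for lam in [0.0, 1e-5, 1e-4, 1e-3]:                   # made-up ramp
    for _ in range(100):
        x = torch.randn(32, 64)
        loss = (ffn(x) - x).pow(2).mean() + lam * ffn.last_act.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

# Phase 3: shift the activation threshold to push sparsity higher.
ffn.threshold = 0.01                                  # made-up value
with torch.no_grad():
    a = torch.relu(ffn.up(torch.randn(32, 64)))
    sparsity = (a <= ffn.threshold).float().mean().item()
    print(f"post-shift activation sparsity: {sparsity:.2%}")
```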

Model Capabilities

Text generation
Commonsense reasoning
Code generation
Reading comprehension
Mathematical problem solving
Knowledge QA

Use Cases

Efficient Inference Applications
Edge Device Deployment: Leverages high activation sparsity for efficient inference on resource-constrained devices, with significant acceleration under the PowerInfer framework (see the sparsity-aware FFN sketch at the end of this section).
Real-time Dialogue Systems: Suited to chatbot scenarios that require low-latency responses.
Educational Applications
Programming Learning Assistance: Helps students understand and generate code (HumanEval: 42.04, MBPP: 41.38).
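
To see why high activation sparsity translates into the speedups cited above, the numpy sketch below skips the rows of the down-projection matrix whose input activations are zero. The shapes and the shift used to fake roughly 88% sparsity are made up; real acceleration requires sparse GPU kernels such as those in PowerInfer, while this version only shows the arithmetic being skipped.

```python
# Sketch of sparsity-aware FFN down-projection: with most ReLU outputs at
# zero, only the rows of W_down belonging to active neurons contribute.
# Shapes and the sparsity level are made up for illustration.
import numpy as np

d_ff, d_model = 4096, 1024
rng = np.random.default_rng(0)
W_down = rng.standard_normal((d_ff, d_model)).astype(np.float32)

# Fake a ReLU activation vector with ~88% zeros via a negative shift.
a = np.maximum(rng.standard_normal(d_ff).astype(np.float32) - 1.2, 0.0)

dense = a @ W_down                      # touches every row of W_down
active = np.nonzero(a)[0]               # indices of active neurons
sparse = a[active] @ W_down[active]     # touches only the active rows

print("active fraction:", len(active) / d_ff)
print("max abs diff   :", np.abs(dense - sparse).max())  # ~0: same result
```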