
ProSparse LLaMA-2-7B

Developed by SparseLLM
A large language model based on LLaMA-2-7B with activation sparsification: the ProSparse method achieves high activation sparsity (89.32%) while maintaining performance comparable to the original model
Downloads 152
Release Time: 2/19/2024

Model Overview

A ReLU-activated LLaMA-2 variant trained with progressive sparse regularization, which substantially improves inference efficiency; suitable for text generation and comprehension tasks
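A minimal loading and generation sketch with Hugging Face Transformers is shown below. The repository ID SparseLLM/prosparse-llama-2-7b and the use of trust_remote_code=True (in case the sparsified architecture ships custom modeling code) are assumptions for illustration, not details confirmed by this listing.

```python
# Minimal sketch: load the ProSparse LLaMA-2-7B checkpoint and generate text.
# Assumptions: the model is hosted as "SparseLLM/prosparse-llama-2-7b" and may
# require trust_remote_code=True for its ReLU-activated architecture.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SparseLLM/prosparse-llama-2-7b"  # assumed repository ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision to fit a single large GPU
    device_map="auto",
    trust_remote_code=True,      # assumed; custom ReLU variant code may be needed
)

prompt = "Explain activation sparsity in one sentence:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```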

Model Features

High Activation Sparsity
Achieves 89.32% activation sparsity through the ProSparse method, significantly higher than comparable ReLU-converted models (e.g., ReluLLaMA-7B's 66.98%); a measurement sketch is given after this feature list
Performance Preservation
Maintains task performance comparable to the original Swish-activated LLaMA-2 while achieving sparsification
Inference Acceleration
High sparsity enables acceleration with the PowerInfer framework and custom GPU operators, achieving a 1.27-2.17x speedup in tests
Progressive Training
Three-phase training process: activation function replacement → progressive sparse regularization → activation threshold shifting, effectively balancing sparsity and task performance; an illustrative regularization schedule is sketched below
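The 89.32% figure refers to the fraction of zero (or near-zero) entries in the ReLU-gated MLP intermediate activations. The sketch below shows one way such a number can be estimated with forward hooks; the module filter ".mlp.act_fn" matches LLaMA-style implementations in Transformers and is an assumption for illustration, not the project's official evaluation script.

```python
# Illustrative sketch: estimate activation sparsity of ReLU-gated MLP layers
# via forward hooks. The ".mlp.act_fn" module filter is an assumption based on
# LLaMA-style Transformers code, not the official ProSparse evaluation.
import torch

def measure_activation_sparsity(model, tokenizer, texts, threshold=0.0):
    zero, total = 0, 0

    def hook(_module, _inputs, output):
        nonlocal zero, total
        zero += (output <= threshold).sum().item()   # count inactive entries
        total += output.numel()

    handles = [
        m.register_forward_hook(hook)
        for name, m in model.named_modules()
        if name.endswith(".mlp.act_fn")               # gate activation of each MLP block
    ]
    try:
        for text in texts:
            batch = tokenizer(text, return_tensors="pt").to(model.device)
            with torch.no_grad():
                model(**batch)
    finally:
        for h in handles:
            h.remove()
    return zero / max(total, 1)

# Example: sparsity = measure_activation_sparsity(model, tokenizer, ["Hello world."])
```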
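The progressive regularization phase applies an L1 penalty on the ReLU activations whose coefficient grows over training so that sparsity rises gradually without a sudden performance drop. The sketch below only illustrates the general shape of such a stage-wise schedule; the stage boundaries and coefficient values are hypothetical, not the settings used to train this model.

```python
# Illustrative sketch of a progressive sparse-regularization schedule:
# the L1 coefficient on ReLU activations ramps up in stages.
# Stage boundaries and values are examples, not ProSparse's actual settings.
def l1_coefficient(step, stages=((5_000, 0.0), (10_000, 1e-5), (15_000, 5e-5))):
    """Return the L1 coefficient at a given training step.

    Within each stage the coefficient ramps linearly from the previous
    stage's target value to the current stage's target value.
    """
    prev_end, prev_lam = 0, 0.0
    for end, lam in stages:
        if step < end:
            frac = (step - prev_end) / (end - prev_end)
            return prev_lam + frac * (lam - prev_lam)
        prev_end, prev_lam = end, lam
    return stages[-1][1]

# During training, the total loss would then take a form like:
#   loss = lm_loss + l1_coefficient(step) * relu_activations.abs().mean()
```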

Model Capabilities

Text Generation
Code Generation
Common-sense Reasoning
Reading Comprehension
Mathematical Reasoning

Use Cases

Efficient Inference
Edge Device Deployment
Leverages high sparsity for efficient inference on resource-constrained devices
Achieves 218.3 tokens/s on a single A100 GPU with the PowerInfer framework
Academic Research
Sparsification Method Validation
Serves as a benchmark model for activation sparsification research
Currently the open-source LLaMA-family model of this size with the highest activation sparsity