
BitNet b1.58 3B

Developed by 1bitLLM
BitNet b1.58 is a 1.58-bit large language model that achieves efficient inference by quantizing weights to the ternary values {-1, 0, 1}. This release reproduces the original paper's results and was trained on 100 billion tokens from the RedPajama dataset.
Downloads 1,109
Release Time: 3/29/2024

Model Overview

BitNet b1.58 is an efficient large language model built on 1.58-bit quantization: weights take only the ternary values {-1, 0, 1}, which sharply reduces compute and storage requirements while keeping performance close to that of full-precision models.

Model Features

1.58-bit quantization
Weights are represented using only ternary values {-1, 0, 1}, significantly reducing model storage and computational requirements
Efficient inference
Quantization design enables higher computational efficiency during model inference
Performance close to FP16
Despite heavy quantization, the model maintains performance close to full-precision (FP16) versions
Two-phase training
Adopts the paper's suggested two-phase learning rate and weight decay strategy to optimize the training process
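The ternary quantization described above follows the absmean scheme from the BitNet b1.58 paper: each weight tensor is scaled by its mean absolute value, then rounded and clipped into {-1, 0, 1}. A minimal NumPy sketch (function name and epsilon value are illustrative, not taken from the model's code):

```python
import numpy as np

def absmean_ternary_quantize(w, eps=1e-5):
    """Quantize a weight tensor to {-1, 0, 1} via absmean scaling.

    Implements RoundClip(W / (gamma + eps), -1, 1), where gamma is
    the mean absolute weight, as suggested in the BitNet b1.58 paper.
    """
    gamma = np.abs(w).mean()
    q = np.clip(np.round(w / (gamma + eps)), -1, 1)
    return q, gamma  # gamma is kept to rescale outputs at inference

# Small weights collapse to 0; large ones saturate to +/-1.
w = np.array([0.9, -0.8, 0.01, 0.5])
q, gamma = absmean_ternary_quantize(w)
# q -> [ 1., -1.,  0.,  1.]
```

Because every weight becomes one of three values, matrix multiplications reduce largely to additions and subtractions, which is the source of the inference-efficiency claims above.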

Model Capabilities

Text generation
Language understanding
Zero-shot task processing

Use Cases

Efficient inference scenarios
Edge device deployment
Deploy large language models on resource-constrained devices using low-bit quantization features
Reduces computational and storage requirements while maintaining reasonable performance
Large-scale services
Provide efficient language model services in high-concurrency scenarios
Reduces server resource consumption
Research applications
Model quantization research
Serves as a benchmark reference for low-bit quantized large language models
Provides reproducible quantized model implementations
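The edge-deployment savings claimed above can be estimated with back-of-envelope arithmetic: a ternary weight needs log2(3) ≈ 1.58 bits versus 16 bits in FP16, so the weights of a 3B-parameter model shrink by roughly 10x. These are illustrative numbers only; real deployments add overhead for embeddings, activations, and bit-packing:

```python
import math

def weight_storage_gb(n_params, bits_per_weight):
    # bits -> bytes -> decimal gigabytes
    return n_params * bits_per_weight / 8 / 1e9

PARAMS = 3e9  # a 3B-parameter model
fp16_gb = weight_storage_gb(PARAMS, 16)               # 6.0 GB
ternary_gb = weight_storage_gb(PARAMS, math.log2(3))  # ~0.59 GB
print(f"FP16: {fp16_gb:.2f} GB, ternary: {ternary_gb:.2f} GB, "
      f"ratio: {fp16_gb / ternary_gb:.1f}x")
```

This roughly 10x reduction in weight storage is what makes deployment on resource-constrained devices plausible.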