
BitNet b1.58 2B4T (bf16)

Developed by Microsoft Research
An open-source native 1-bit large language model from Microsoft Research, with 2 billion parameters trained on a 4-trillion-token corpus, offering significantly improved computational efficiency.
Downloads 2,968
Release date: 4/15/2025

Model Overview

The first open-source native 1-bit large language model with 2 billion parameters, demonstrating that native 1-bit LLMs can achieve comparable performance to full-precision counterparts while significantly improving computational efficiency (memory, energy, latency).

Model Features

Native 1.58-bit quantization
Weights are quantized to ternary values {-1, 0, +1} using absmean quantization during the forward pass, while activations are quantized to 8-bit integers using per-token absmax quantization (see the sketch after this list).
Efficient computation
Significantly improves computational efficiency across memory, energy, and latency: roughly 0.4 GB memory footprint, 29 ms CPU decoding latency, and an estimated 0.028 J energy consumption.
Large-scale training
Trained on a 4 trillion token corpus, proving the feasibility of native 1-bit large language models.
Optimized architecture
Features BitLinear layers, rotary position embeddings (RoPE), squared ReLU (ReLU²) activation, and subln normalization, with no bias terms in linear or normalization layers.
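
To make the quantization scheme concrete, here is a minimal PyTorch sketch of a BitLinear-style layer under the absmean/absmax rules described above. It is an illustration under stated assumptions, not the model's actual implementation: the function and class names are ours, and architectural details such as subln normalization and the ReLU² feed-forward are omitted.

```python
import torch
import torch.nn.functional as F

def weight_quant_absmean(w: torch.Tensor):
    """Absmean weight quantization: ternary {-1, 0, +1} values plus one scale."""
    scale = w.abs().mean().clamp(min=1e-5)
    w_q = (w / scale).round().clamp_(-1, 1)
    return w_q, scale

def act_quant_absmax(x: torch.Tensor):
    """Per-token absmax activation quantization to the 8-bit integer range."""
    scale = 127.0 / x.abs().max(dim=-1, keepdim=True).values.clamp(min=1e-5)
    x_q = (x * scale).round().clamp_(-128, 127)
    return x_q, scale

class BitLinear(torch.nn.Module):
    """Illustrative BitLinear layer: quantized forward pass, no bias term."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = torch.nn.Parameter(
            torch.randn(out_features, in_features) * 0.02
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w_q, w_scale = weight_quant_absmean(self.weight)
        x_q, x_scale = act_quant_absmax(x)
        # Straight-through estimator: the forward pass sees quantized
        # (dequantized-back) values, while gradients flow unchanged to the
        # latent full-precision master weights.
        w = self.weight + (w_q * w_scale - self.weight).detach()
        a = x + (x_q / x_scale - x).detach()
        return F.linear(a, w)

# Quick check: ternary weights, int8-range activations.
layer = BitLinear(16, 8)
y = layer(torch.randn(2, 16))
print(y.shape)  # torch.Size([2, 8])
```

The straight-through estimator in the forward pass is what makes native 1-bit training possible: the matrix multiply sees ternary weights and 8-bit activations, while gradients still update the latent full-precision (here, bf16) master weights.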

Model Capabilities

Text generation
Chat (see the usage sketch after this list)
Instruction following
Mathematical reasoning
Common-sense QA
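
For the chat and reasoning capabilities above, a minimal inference sketch with Hugging Face transformers might look like the following. It assumes the bf16 checkpoint is published under the repo id microsoft/bitnet-b1.58-2B-4T-bf16 and that the installed transformers version supports the architecture; the prompt is an illustrative GSM8K-style question.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T-bf16"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "A store sells pens at $2 each. How much do 17 pens cost?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Note that the efficiency figures quoted above (0.4 GB, 29 ms, 0.028 J) depend on dedicated 1-bit inference kernels such as bitnet.cpp; loading the bf16 master weights through transformers runs the model without realizing those savings.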

Use Cases

Dialogue systems
AI assistant
Engages in natural language conversations as a helpful AI assistant.
HumanEval+ score of 38.40
Education
Math problem solving
Solves math problems, such as those in the GSM8K dataset.
GSM8K score of 58.38
Knowledge QA
Common-sense QA
CommonsenseQA benchmark score of 71.58
Domain-specific QA
MMLU score of 53.17