
BitNet b1.58 2B4T

Developed by Microsoft Research
The first open-source, native 1-bit large language model at the 2-billion-parameter scale, trained on 4 trillion tokens. It demonstrates that native 1-bit LLMs can substantially improve computational efficiency while delivering performance comparable to full-precision open-source models of the same size.
Downloads: 35.87k
Released: 4/15/2025

Model Overview

BitNet b1.58 2B4T is a native 1.58-bit large language model: each weight takes one of the three values {-1, 0, +1} (log₂ 3 ≈ 1.58 bits per weight) and activations are quantized to 8 bits, a design aimed at efficient computation. It delivers performance comparable to full-precision models of the same scale while substantially reducing memory usage, energy consumption, and latency.
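
To make the scheme concrete, here is a minimal sketch (not Microsoft's implementation) of the absmean ternary weight quantization and per-token absmax 8-bit activation quantization described in the BitNet b1.58 papers; the function names and the per-tensor/per-token granularity are illustrative assumptions.

import torch

def absmean_ternary_quantize(w: torch.Tensor, eps: float = 1e-5):
    # Per-tensor absmean scale, as described in the BitNet b1.58 papers;
    # names and granularity here are illustrative assumptions.
    gamma = w.abs().mean()
    w_q = (w / (gamma + eps)).round().clamp_(-1, 1)  # ternary {-1, 0, +1}
    return w_q, gamma  # dequantize as w_q * gamma

def absmax_int8_quantize(x: torch.Tensor, eps: float = 1e-5):
    # Per-token absmax scaling into the signed 8-bit range.
    scale = 127.0 / x.abs().amax(dim=-1, keepdim=True).clamp_(min=eps)
    x_q = (x * scale).round().clamp_(-128, 127)
    return x_q, scale  # dequantize as x_q / scale

# A 1.58-bit linear layer then computes y ≈ (x_q @ w_q.T) * gamma / scale.
w, x = torch.randn(128, 256), torch.randn(4, 256)
w_q, gamma = absmean_ternary_quantize(w)
x_q, scale = absmax_int8_quantize(x)
y = (x_q @ w_q.T) * gamma / scale
print((y - x @ w.T).norm() / (x @ w.T).norm())  # relative quantization error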

Model Features

Native 1.58-bit quantization
The model is trained from scratch with 1.58-bit (ternary) weights and 8-bit activations, rather than being converted by post-training quantization.
Efficient computation
Significantly reduces memory usage, energy consumption, and latency compared to full-precision models of the same scale.
Large-scale training
Trained on 4 trillion tokens of diverse corpus, including text, code, and mathematical data.
Optimized architecture
Employs rotary position embeddings (RoPE), squared ReLU (ReLU²) activations, and subLN normalization (sketched after this list).
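
As a concrete illustration of the squared-ReLU and subLN choices above, here is a hedged sketch of a feed-forward sublayer; the dimensions, the use of LayerNorm, and the class names ReLU2 and SubLNFFN are hypothetical, not the model's actual code.

import torch
import torch.nn as nn

class ReLU2(nn.Module):
    # Squared ReLU: relu(x)**2.
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(x).square()

class SubLNFFN(nn.Module):
    # Feed-forward sublayer with a squared-ReLU activation and an extra
    # normalization before the output projection (the subLN placement).
    # Sizes and the LayerNorm variant are illustrative assumptions.
    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff)
        self.act = ReLU2()
        self.subln = nn.LayerNorm(d_ff)  # normalization inside the sublayer
        self.down = nn.Linear(d_ff, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down(self.subln(self.act(self.up(x))))

ffn = SubLNFFN(d_model=64, d_ff=256)
print(ffn(torch.randn(2, 8, 64)).shape)  # torch.Size([2, 8, 64])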

Model Capabilities

Text generation
Dialogue systems
Instruction following
Code generation
Mathematical reasoning

Use Cases

Dialogue systems
AI assistant
Build high-performance dialogue assistants with low resource consumption
Code generation
Code assistant
Generate and complete code
Scored 38.40 on HumanEval+
Mathematical reasoning
Math problem solving
Solve GSM8K grade-school math problems
Achieved 58.38% accuracy
Commonsense reasoning
Commonsense QA
Answer commonsense questions
Achieved 71.58% accuracy on CommonsenseQA
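
For readers who want to try the model, a minimal usage sketch with the Hugging Face transformers API follows, assuming the public repo id microsoft/bitnet-b1.58-2B-4T and a transformers version with BitNet support. Note that the efficiency benefits require the dedicated bitnet.cpp inference framework; transformers runs the model but without the 1-bit speedups.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id; requires a transformers release with BitNet support.
model_id = "microsoft/bitnet-b1.58-2B-4T"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

messages = [{"role": "user", "content": "Briefly, what is a 1.58-bit LLM?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))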