BitNet b1.58 is a large language model whose weights are quantized to the ternary values {-1, 0, 1}, about 1.58 bits per weight (log2 3 ≈ 1.58), enabling efficient inference. This model reproduces the original paper's results and was trained on 100 billion tokens from the RedPajama dataset.
Tags: Large Language Model, Transformers
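The paper quantizes each weight matrix with an absmean scheme: scale by the mean absolute value of the matrix, then round and clip to {-1, 0, 1}. A minimal NumPy sketch of that step (function name and epsilon value are illustrative, not from this repository):

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray, eps: float = 1e-5):
    """Quantize a weight matrix to ternary values {-1, 0, 1}.

    Absmean scheme: divide by the mean absolute value of the
    tensor, then round to the nearest integer and clip to [-1, 1].
    """
    gamma = np.abs(w).mean()  # per-tensor absmean scale
    w_q = np.clip(np.round(w / (gamma + eps)), -1, 1)
    return w_q, gamma  # dequantize approximately as w_q * gamma

w = np.array([[2.0, -0.01], [0.5, -3.0]])
w_q, gamma = absmean_ternary_quantize(w)
# w_q contains only values from {-1, 0, 1}
```

At inference time the ternary weights turn matrix multiplication into additions and subtractions (the zero entries are skipped), which is the source of the efficiency gain.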