BitNet b1.58 Large
BitNet b1.58 is a 1-bit large language model with 3 billion parameters, trained on the RedPajama dataset for 100 billion tokens.
Downloads 10.17k
Release Time: 3/29/2024
Model Overview
This model is a 1-bit quantized large language model designed to provide efficient inference performance while maintaining accuracy comparable to traditional floating-point models.
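As a minimal inference sketch with the Hugging Face transformers library (the Hub id "1bitLLM/bitnet_b1_58-large" is an assumption; substitute the actual repository id of the checkpoint you use):

```python
# Minimal inference sketch; the repository id below is an assumption,
# not confirmed by this page.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "1bitLLM/bitnet_b1_58-large"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```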
Model Features
1-bit quantization
Model weights are constrained to the ternary values {-1, 0, +1} (about 1.58 bits per weight) and activations are quantized to 8 bits, significantly reducing memory usage and computational cost.
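The BitNet b1.58 paper quantizes weights with an absmean scheme: each weight matrix is scaled by the mean of its absolute values, then each entry is rounded to the nearest value in {-1, 0, +1}. A minimal PyTorch sketch of that function (the function name is illustrative):

```python
import torch

def absmean_quantize(w: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # Scale by the mean absolute value of the weight matrix (gamma),
    # then round and clip each entry to the ternary set {-1, 0, +1}.
    gamma = w.abs().mean()
    return (w / (gamma + eps)).round().clamp_(-1, 1)

# Example: a random floating-point matrix collapses to a ternary matrix.
w = torch.randn(4, 4)
print(absmean_quantize(w))
```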
Efficient inference
With ternary weights, most floating-point multiplications reduce to additions, so inference is significantly faster and cheaper in latency, memory traffic, and energy than with comparable FP16 models.
Performance retention
Maintains perplexity and downstream-task performance close to full-precision models of the same size despite the aggressive quantization.
Two-phase training
Trained with the two-phase learning-rate and weight-decay schedule suggested in the accompanying paper.
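Per the BitNet b1.58 training notes, the first phase uses a higher learning rate with weight decay enabled, and the second phase switches to a lower learning rate with weight decay disabled. A hedged sketch of such a schedule (the stage boundary and all constants below are illustrative placeholders, not the paper's exact settings):

```python
# Illustrative two-phase hyperparameter schedule; boundary, learning
# rates, and weight-decay values are placeholders, not the paper's values.
def two_phase_hyperparams(step: int, total_steps: int) -> dict:
    if step < total_steps // 2:
        # Phase 1: higher learning rate, weight decay on.
        return {"lr": 1.5e-3, "weight_decay": 0.1}
    # Phase 2: lower learning rate, weight decay off.
    return {"lr": 1e-4, "weight_decay": 0.0}
```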
Model Capabilities
Text generation
Language understanding
Zero-shot learning
Use Cases
Natural Language Processing
Question-answering systems
Can be used to build efficient question-answering systems.
Performs well on zero-shot benchmarks such as ARC.
Text generation
Can be used for a variety of text generation tasks.
Perplexity is close to that of full-precision models.
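As a hedged illustration of how perplexity could be measured for comparison against a full-precision baseline (the Hub id and sample text are placeholders):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "1bitLLM/bitnet_b1_58-large"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

text = "BitNet b1.58 represents each weight with a ternary value."
enc = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    # With labels supplied, the model returns the mean token-level
    # cross-entropy loss; perplexity is its exponential.
    loss = model(**enc, labels=enc["input_ids"]).loss
print(f"perplexity: {torch.exp(loss).item():.2f}")
```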