🚀 Falcon-E Model
Falcon-E is a series of powerful, universal, and fine-tunable 1.58-bit language models developed by TII (tiiuae), delivering strong performance across a range of NLP tasks at several model scales.
🚀 Quick Start
To use Falcon-E, you can rely on either the Hugging Face transformers library or the BitNet library; which one to pick depends on your target usage.
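As a minimal sketch of the transformers path (assuming a recent transformers release and a CUDA device; the prompt is purely illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon-E-1B-Base"

# Load the default (BitNet) checkpoint and its tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16).to("cuda")

# Generate a short continuation for an illustrative prompt
inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```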
✨ Features
- Model Type: Causal decoder-only / Base version
- Architecture: Pure transformer, 1.58-bit version
- Language(s) (NLP): English
- License: Falcon-LLM License
📦 Installation
You can use this model with either the Hugging Face transformers library or the BitNet library. To install BitNet:

```bash
git clone https://github.com/microsoft/BitNet && cd BitNet
pip install -r requirements.txt
```
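For the transformers path, only the libraries themselves are needed. The exact package set is an assumption here (onebitllms is only required for fine-tuning), but a typical install would be:

```bash
pip install --upgrade transformers
pip install onebitllms  # only needed for fine-tuning
```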
💻 Usage Examples
Basic Usage
🤗 transformers
To run inference on the default BitNet checkpoint:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon-E-1B-Base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
).to("cuda")
```
To use the classic bfloat16 version instead, load the bfloat16 revision:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon-E-1B-Base"
revision = "bfloat16"

tokenizer = AutoTokenizer.from_pretrained(model_id, revision=revision)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    revision=revision,
).to("cuda")
```
BitNet

```bash
python setup_env.py --hf-repo tiiuae/Falcon-E-1B-Base -q i2_s
python run_inference.py -m models/Falcon-E-1B-Base/ggml-model-i2_s.gguf -p "You are a helpful assistant" -cnv
```

Here -p sets the prompt and -cnv runs the model in conversational (chat) mode.
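For one-shot generation instead of chat, the same script can be run without -cnv; this invocation is a sketch assuming the -n (number of tokens to predict) flag of the BitNet repository's run_inference.py, with an illustrative prompt:

```bash
python run_inference.py -m models/Falcon-E-1B-Base/ggml-model-i2_s.gguf -p "The capital of France is" -n 64
```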
Advanced Usage
Fine-tuning
For fine-tuning, load the prequantized revision of the model and use the onebitllms Python package:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTTrainer
from onebitllms import replace_linear_with_bitnet_linear, quantize_to_1bit

model_id = "tiiuae/Falcon-E-1B-Base"

# Load the prequantized revision (higher-precision weights meant for fine-tuning)
tokenizer = AutoTokenizer.from_pretrained(model_id, revision="prequantized")
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    revision="prequantized"
)

# Swap nn.Linear layers for BitNet linear layers before training
model = replace_linear_with_bitnet_linear(model)

trainer = SFTTrainer(
    model,
    ...
)

trainer.train()

# After training, quantize the fine-tuned checkpoint back to 1.58-bit
quantize_to_1bit(output_directory)
```
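For reference, a fuller end-to-end sketch is shown below. The dataset, hyperparameters, and output path are illustrative assumptions (not an official recipe), and the SFTTrainer argument names follow recent trl releases:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer
from onebitllms import replace_linear_with_bitnet_linear, quantize_to_1bit

model_id = "tiiuae/Falcon-E-1B-Base"
output_directory = "falcon-e-1b-sft"  # illustrative output path

# Load the prequantized revision and swap in BitNet linear layers
tokenizer = AutoTokenizer.from_pretrained(model_id, revision="prequantized")
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, revision="prequantized"
)
model = replace_linear_with_bitnet_linear(model)

# Illustrative dataset and hyperparameters
dataset = load_dataset("trl-lib/Capybara", split="train")
trainer = SFTTrainer(
    model=model,
    args=SFTConfig(output_dir=output_directory, per_device_train_batch_size=4, bf16=True),
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()

# Quantize the fine-tuned bfloat16 checkpoint back to 1.58-bit
quantize_to_1bit(output_directory)
```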
📚 Documentation
Model Details
| Property | Details |
|---|---|
| Developed by | https://www.tii.ae |
| Model Type | Causal decoder-only / Base version |
| Architecture | Pure transformer, 1.58-bit version |
| Language(s) (NLP) | English |
| License | Falcon-LLM License |
Training Details
For more details about the training protocol of this model, please refer to the Falcon-E technical blogpost.
Evaluation
The following tables report our internal pipeline benchmarks. Note that evaluation results are normalized scores from the former Hugging Face leaderboard v2 tasks.
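The Avg. column appears to be the plain arithmetic mean of the six task scores; as a quick sanity check against the Falcon-E-1B-Base row below (this small helper is ours, not part of the evaluation pipeline):

```python
# Scores for Falcon-E-1B-Base, copied from the table below
scores = {
    "IFEVAL": 32.9,
    "Math-Hard": 10.97,
    "GPQA": 2.8,
    "MuSR": 3.65,
    "BBH": 12.28,
    "MMLU-Pro": 17.82,
}

# Arithmetic mean over the six tasks
avg = sum(scores.values()) / len(scores)
print(f"{avg:.2f}")  # 13.40, matching the reported Avg.
```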
For 1B scale models and below
| Model | Nb Params | Mem Footprint | IFEVAL | Math-Hard | GPQA | MuSR | BBH | MMLU-Pro | Avg. |
|---|---|---|---|---|---|---|---|---|---|
| Qwen-2.5-0.5B | 0.5B | 1GB | 16.27 | 3.93 | 0.0 | 2.08 | 6.95 | 10.06 | 6.55 |
| SmolLM2-360M | 0.36B | 720MB | 21.15 | 1.21 | 0.0 | 7.73 | 5.54 | 1.88 | 6.25 |
| Qwen-2.5-1.5B | 1.5B | 3.1GB | 26.74 | 9.14 | 16.66 | 5.27 | 20.61 | 4.7 | 13.85 |
| Llama-3.2-1B | 1.24B | 2.47GB | 14.78 | 1.21 | 4.37 | 2.56 | 2.26 | 0 | 4.2 |
| SmolLM2-1.7B | 1.7B | 3.4GB | 24.4 | 2.64 | 9.3 | 4.6 | 12.64 | 3.91 | 9.58 |
| Falcon-3-1B-Base | 1.5B | 3GB | 24.28 | 3.32 | 11.34 | 9.71 | 6.76 | 3.91 | 9.89 |
| Hymba-1.5B-Base | 1.5B | 3GB | 22.95 | 1.36 | 7.69 | 5.18 | 10.25 | 0.78 | 8.04 |
| Falcon-E-1B-Base | 1.8B | 635MB | 32.9 | 10.97 | 2.8 | 3.65 | 12.28 | 17.82 | 13.40 |
For 3B scale models
| Model | Nb Params | Mem Footprint | IFEVAL | Math-Hard | GPQA | MuSR | BBH | MMLU-Pro | Avg. |
|---|---|---|---|---|---|---|---|---|---|
| Falcon-3-3B-Base | 3B | 6.46GB | 15.74 | 11.78 | 21.58 | 6.27 | 18.09 | 6.26 | 15.74 |
| Qwen2.5-3B | 3B | 6.17GB | 26.9 | 14.8 | 24.3 | 11.76 | 24.48 | 6.38 | 18.1 |
| Falcon-E-3B-Base | 3B | 999MB | 36.67 | 13.45 | 8.67 | 4.14 | 19.83 | 27.16 | 18.32 |
Instruction fine-tuned models - For 1B scale models and below
| Model | Nb Params | Mem Footprint | IFEVAL | Math-Hard | GPQA | MuSR | BBH | MMLU-Pro | Avg. |
|---|---|---|---|---|---|---|---|---|---|
| Qwen-2.5-0.5B-Instruct | 500M | 1GB | 30.71 | 0 | 8.43 | 0.94 | 7.75 | 0 | 6.59 |
| SmolLM2-360M-Instruct | 360M | 720MB | 38.42 | 1.51 | 4.17 | 2.77 | 1.3 | 0.67 | 8.14 |
| Qwen-2.5-1.5B-Instruct | 1.5B | 3.1GB | 44.76 | 22.05 | 19.81 | 3.19 | 19.99 | 0.78 | 18.43 |
| SmolLM2-1.7B | 1.7B | 3.4GB | 53.68 | 5.82 | 10.92 | 4.1 | 11.71 | 0 | 15.02 |
| Falcon-3-1B-Instruct | 1.5B | 3GB | 55.57 | 6.34 | 12.96 | 10.56 | 9.32 | 2.24 | 16.16 |
| Hymba-1.5B-Instruct | 1.5B | 3GB | 60.09 | 2.72 | 4.59 | 1.05 | 11.56 | 5.515 | 14.19 |
| Falcon-E-1B-Instruct | 1.8B | 635MB | 54.35 | 9.12 | 16.5 | 2.51 | 19.42 | 9.64 | 18.59 |
Instruction fine-tuned models - For 3B scale models
| Model | Nb Params | Mem Footprint | IFEVAL | Math-Hard | GPQA | MuSR | BBH | MMLU-Pro | Avg. |
|---|---|---|---|---|---|---|---|---|---|
| Falcon-3-3B-Instruct | 3B | 6.46GB | 69.77 | 25 | 26.29 | 11.13 | 22.28 | 5.15 | 26.6 |
| Qwen2.5-3B-Instruct | 3B | 6.17GB | 64.75 | 36.78 | 25.8 | 7.57 | 25.05 | 3.02 | 27.16 |
| Falcon-E-3B-Instruct | 3B | 999MB | 60.97 | 15.3 | 23.59 | 2.12 | 26.45 | 7.45 | 22.65 |
Useful links
- Falcon-E technical blogpost: https://falcon-lm.github.io/blog/falcon-edge
- BitNet library: https://github.com/microsoft/BitNet
📄 License
This model is made available under the Falcon-LLM License.
🔧 Technical Details
For more technical details, please refer to the Falcon-E technical blogpost.
📝 Citation
If the Falcon-E family of models was helpful to your work, please consider citing us:
```bibtex
@misc{tiionebitllms,
    title = {Falcon-E, a series of powerful, universal and fine-tunable 1.58bit language models.},
    author = {Falcon-LLM Team},
    month = {April},
    year = {2025},
    url = {https://falcon-lm.github.io/blog/falcon-edge}
}
```