Falcon-H1-3B-Instruct Open-Source Language Model - Free Support for English and Multilingual Task Processing

Falcon H1 3B Instruct

Developed by tiiuae

Falcon-H1 is a causal decoder-only language model developed by TII with a hybrid Transformers+Mamba architecture, supporting English and multilingual tasks.

Large Language Model

Transformers

Open Source License:Other #Hybrid Architecture #Multilingual Reasoning #Efficient Mathematical Computation

Downloads 380

Release Time : 5/1/2025

Model Overview

The Falcon-H1 series models adopt an innovative hybrid architecture, combining the strengths of Transformers and Mamba to deliver exceptional performance while maintaining efficient inference.

Model Features

Hybrid Architecture Innovation

Combines the advantages of Transformers and Mamba architectures to achieve efficient inference and outstanding performance

Multilingual Support

Supports English and multilingual task processing

Efficient Inference

Optimized architecture provides fast inference speeds

Model Capabilities

Text generation

Logical reasoning

Mathematical computation

Code generation

Instruction following

Use Cases

Education

Math Problem Solving

Solves various math problems, including benchmarks like GSM8k

Achieved 84.76% accuracy on the GSM8k benchmark

Programming

Code Generation

Generates code based on natural language descriptions

Achieved 76.83% accuracy on the HumanEval benchmark

General AI Assistant

Instruction Following

Understands and executes complex instructions

Achieved 85.05% accuracy on the IFEval benchmark

🚀 Transformers Library for Falcon-H1

This library provides access to the Falcon-H1 series of models, offering high performance across various natural language processing tasks. It supports multiple inference methods and has been well-evaluated on a range of benchmarks.

🚀 Quick Start

To quickly get started with the Falcon-H1 models, you can choose from different inference methods as described in the Usage section.

✨ Features

Hybrid Architecture: Combines Transformers and Mamba architecture for enhanced performance.
Multilingual Support: Supports English and other languages.
Multiple Inference Options: Can be used with Hugging Face transformers, vLLM, or a custom fork of llama.cpp.

📦 Installation

To use this model, you need to install the necessary libraries. Here are the installation commands for different inference methods:

Install `transformers` from source

pip install git+https://github.com/huggingface/transformers.git

Install `vLLM`

pip install vllm

Install custom fork of `llama.cpp`

You can install the custom fork of llama.cpp from here. Follow the same installation guidelines as llama.cpp.

💻 Usage Examples

Basic Usage

Using `transformers`

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon-H1-1B-Base"

model = AutoModelForCausalLM.from_pretrained(
  model_id,
  torch_dtype=torch.bfloat16,
  device_map="auto"
)

# Perform text generation

Using `vLLM`

# pip install vllm
vllm serve tiiuae/Falcon-H1-1B-Instruct --tensor-parallel-size 2 --data-parallel-size 1

Advanced Usage

While we are working on integrating our architecture directly into llama.cpp library, you can use our custom fork for now. Refer to https://github.com/tiiuae/llama.cpp-Falcon-H1 for installation and usage details.

📚 Documentation

Model Details

Property	Details
Model Type	Causal decoder-only
Architecture	Hybrid Transformers + Mamba architecture
Language(s) (NLP)	English, Multilingual
License	Falcon-LLM License
Developed by	https://www.tii.ae

Training Details

For more details about the training protocol of this model, please refer to the Falcon-H1 technical blogpost.

Evaluation

Falcon-H1 series perform very well on a variety of tasks, including reasoning tasks.

Tasks	Falcon-H1-3B	Qwen3-4B	Qwen2.5-3B	Gemma3-4B	Llama3.2-3B	Falcon3-3B
General
BBH	53.69	51.07	46.55	50.01	41.47	45.02
ARC-C	49.57	37.71	43.77	44.88	44.88	48.21
TruthfulQA	53.19	51.75	58.11	51.68	50.27	50.06
HellaSwag	69.85	55.31	64.21	47.68	63.74	64.24
MMLU	68.3	67.01	65.09	59.53	61.74	56.76
Math
GSM8k	84.76	80.44	57.54	77.41	77.26	74.68
MATH-500	74.2	85.0	64.2	76.4	41.2	54.2
AMC-23	55.63	66.88	39.84	48.12	22.66	29.69
AIME-24	11.88	22.29	6.25	6.67	11.67	3.96
AIME-25	13.33	18.96	3.96	13.33	0.21	2.29
Science
GPQA	33.89	28.02	28.69	29.19	28.94	28.69
GPQA_Diamond	38.72	40.74	35.69	28.62	29.97	29.29
MMLU-Pro	43.69	29.75	32.76	29.71	27.44	29.71
MMLU-stem	69.93	67.46	59.78	52.17	51.92	56.11
Code
HumanEval	76.83	84.15	73.78	67.07	54.27	52.44
HumanEval+	70.73	76.83	68.29	61.59	50.0	45.73
MBPP	79.63	68.78	72.75	77.78	62.17	61.9
MBPP+	67.46	59.79	60.85	66.93	50.53	55.29
LiveCodeBench	26.81	39.92	11.74	21.14	2.74	3.13
CRUXEval	56.25	69.63	43.26	52.13	17.75	44.38
Instruction Following
IFEval	85.05	84.01	64.26	77.01	74.0	69.1
Alpaca-Eval	31.09	36.51	17.37	39.64	19.69	14.82
MTBench	8.72	8.45	7.79	8.24	7.96	7.79
LiveBench	36.86	51.34	27.32	36.7	26.37	26.01

You can check more in detail on our release blogpost, detailed benchmarks.

📄 License

This model is licensed under the Falcon-LLM License.

📚 Citation

If the Falcon-H1 family of models were helpful to your work, feel free to cite us using the following BibTeX entry:

@misc{tiifalconh1,
    title = {Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance},
    url = {https://falcon-lm.github.io/blog/falcon-h1},
    author = {Falcon-LLM Team},
    month = {May},
    year = {2025}
}

🔗 Useful links

View our release blogpost.
Feel free to join our discord server if you have any questions or to interact with our researchers and developers.

Featured Recommended AI Models

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご