# Falcon-H1 Model

Falcon-H1 is a powerful hybrid-head language model developed by the Technology Innovation Institute (TII), offering high efficiency and strong performance across a wide range of NLP tasks.
## 🚀 Quick Start

Currently, you can use this model with Hugging Face transformers, vLLM, or our custom fork of the llama.cpp library.
### Installation

Make sure to install the latest version of transformers or vLLM. You can install these packages from source:

```bash
pip install git+https://github.com/huggingface/transformers.git
```

Refer to the official vLLM documentation for more details on building vLLM from source.
### Inference

#### 🤗 transformers

Refer to the snippet below to run Falcon-H1 models using 🤗 transformers:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon-H1-1B-Base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Generate a short completion
inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
#### vLLM

For vLLM, simply start a server by executing the command below:

```bash
# pip install vllm
vllm serve tiiuae/Falcon-H1-1B-Instruct --tensor-parallel-size 2 --data-parallel-size 1
```
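Once the server is up, it exposes vLLM's OpenAI-compatible REST API. The sketch below shows one way to query it from Python using only the standard library; the `http://localhost:8000` address is vLLM's default and is an assumption here — adjust it if you serve on a different host or port.

```python
# Minimal sketch of querying the vLLM server through its
# OpenAI-compatible chat completions endpoint. Assumes the server
# started above is listening on the default http://localhost:8000.
import json
from urllib import request

payload = {
    "model": "tiiuae/Falcon-H1-1B-Instruct",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "max_tokens": 64,
}

def query(url="http://localhost:8000/v1/chat/completions"):
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# print(query())  # requires the server above to be running
```

You can equally use the official `openai` Python client pointed at the same base URL.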
#### llama.cpp

While we are working on integrating our architecture directly into the llama.cpp library, you can install our fork and use it directly: https://github.com/tiiuae/llama.cpp-Falcon-H1. Follow the same installation guidelines as for upstream llama.cpp.
## ✨ Features

- Model Type: Causal decoder-only
- Architecture: Hybrid Transformer + Mamba architecture
- Language(s) (NLP): English
- License: Falcon-LLM License
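To give an intuition for the hybrid design, the sketch below combines an attention path and a simple linear-recurrence path (a stand-in for a Mamba-style state-space scan) on the same input. This is a toy illustration under simplified assumptions — random weights, a single head, a fixed scalar decay — not the actual Falcon-H1 implementation:

```python
# Toy sketch of a hybrid block: attention and an SSM-style recurrence
# run in parallel on the same input and their outputs are summed.
# Illustrative only; NOT the Falcon-H1 code.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_head(x, d):
    # Single-head self-attention with random toy projection weights.
    rng = np.random.default_rng(0)
    Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = softmax(q @ k.T / np.sqrt(d))
    return scores @ v

def ssm_head(x, d, decay=0.9):
    # Minimal linear recurrence h_t = decay * h_{t-1} + x_t,
    # a placeholder for a Mamba-style selective state-space scan.
    h = np.zeros(d)
    out = np.empty_like(x)
    for t, xt in enumerate(x):
        h = decay * h + xt
        out[t] = h
    return out

def hybrid_block(x):
    d = x.shape[-1]
    return attention_head(x, d) + ssm_head(x, d)

x = np.random.default_rng(1).standard_normal((4, 8))  # (seq_len, dim)
y = hybrid_block(x)
print(y.shape)  # (4, 8)
```

The attention path gives content-based global mixing while the recurrent path carries sequential state in constant memory per step; the real model interleaves and parameterizes these far more carefully.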
## 📦 Installation

The installation steps are included in the "Quick Start" section above.

## 💻 Usage Examples

### Basic Usage

Basic usage examples for each supported library are provided in the "Quick Start" section.
## 📚 Documentation

### Model Details

- Developed by: https://www.tii.ae
- Model type: Causal decoder-only
- Architecture: Hybrid Transformer + Mamba architecture
- Language(s) (NLP): English
- License: Falcon-LLM License
### Training Details

For more details about the training protocol of this model, please refer to the Falcon-H1 technical blogpost.

### Evaluation

The Falcon-H1 series performs very well on a variety of tasks, including reasoning.
| Tasks | Falcon-H1-0.5B | Qwen3-0.6B | Qwen2.5-0.5B | Gemma3-1B | Llama3.2-1B | Falcon3-1B |
| --- | --- | --- | --- | --- | --- | --- |
| **General** | | | | | | |
| BBH | 42.91 | 32.95 | 33.26 | 35.86 | 33.21 | 34.47 |
| ARC-C | 37.8 | 31.06 | 33.28 | 34.13 | 34.64 | 43.09 |
| TruthfulQA | 44.12 | 51.65 | 46.19 | 42.17 | 42.08 | 42.31 |
| HellaSwag | 51.93 | 42.17 | 52.38 | 42.24 | 55.3 | 58.53 |
| MMLU | 53.4 | 42.98 | 46.07 | 40.87 | 45.93 | 46.1 |
| **Math** | | | | | | |
| GSM8k | 68.39 | 42.61 | 38.51 | 42.38 | 44.28 | 44.05 |
| MATH-500 | 58.4 | 46.0 | 27.8 | 45.4 | 13.2 | 19.8 |
| AMC-23 | 33.13 | 27.97 | 12.5 | 19.22 | 7.19 | 6.87 |
| AIME-24 | 3.75 | 2.71 | 0.62 | 0.42 | 1.46 | 0.41 |
| AIME-25 | 4.38 | 1.67 | 0.21 | 1.25 | 0.0 | 0.21 |
| **Science** | | | | | | |
| GPQA | 29.95 | 26.09 | 26.85 | 28.19 | 26.59 | 26.76 |
| GPQA_Diamond | 27.95 | 25.08 | 24.24 | 21.55 | 25.08 | 31.31 |
| MMLU-Pro | 31.03 | 16.95 | 18.73 | 14.46 | 16.2 | 18.49 |
| MMLU-stem | 54.55 | 39.3 | 39.83 | 35.39 | 39.16 | 39.64 |
| **Code** | | | | | | |
| HumanEval | 51.83 | 41.46 | 36.59 | 40.85 | 34.15 | 22.56 |
| HumanEval+ | 45.12 | 37.19 | 32.32 | 37.2 | 29.88 | 20.73 |
| MBPP | 42.59 | 56.08 | 46.83 | 57.67 | 33.6 | 20.63 |
| MBPP+ | 33.07 | 47.08 | 39.68 | 50.0 | 29.37 | 17.2 |
| LiveCodeBench | 7.05 | 9.78 | 2.94 | 5.09 | 2.35 | 0.78 |
| CRUXEval | 25.75 | 23.63 | 14.88 | 12.7 | 0.06 | 15.58 |
| **Instruction Following** | | | | | | |
| IFEval | 72.07 | 62.16 | 32.11 | 61.48 | 55.34 | 54.26 |
| Alpaca-Eval | 10.79 | 9.59 | 3.26 | 17.87 | 9.38 | 6.98 |
| MTBench | 7.06 | 5.75 | 4.71 | 7.03 | 6.37 | 6.03 |
| LiveBench | 20.8 | 27.78 | 14.27 | 18.79 | 14.97 | 14.1 |
You can find more detailed benchmarks in our release blogpost.

### Useful Links
## 📄 License

This model is released under the Falcon-LLM License. You can find more details at https://falconllm.tii.ae/falcon-terms-and-conditions.html.

## 📖 Citation

If the Falcon-H1 family of models was helpful to your work, please consider citing us:
```bibtex
@misc{tiifalconh1,
    title = {Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance},
    url = {https://falcon-lm.github.io/blog/falcon-h1},
    author = {Falcon-LLM Team},
    month = {May},
    year = {2025}
}
```