---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
tags:
- language
- granite-4.0
base_model:
- ibm-granite/granite-4.0-tiny-base-preview
---
# Granite-4.0-Tiny-Preview
**Model Summary:**
Granite-4.0-Tiny-Preview is a 7B-parameter fine-grained hybrid mixture-of-experts (MoE) instruct model finetuned from Granite-4.0-Tiny-Base-Preview using a combination of permissively licensed open-source instruction datasets and internally collected synthetic datasets tailored for solving long-context problems. The model is developed using a diverse set of techniques with a structured chat format, including supervised finetuning and model alignment using reinforcement learning.
**Supported Languages:**
English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. However, users may finetune this Granite model for languages beyond these 12 languages.
**Intended Use:**
This model is designed to handle general instruction-following tasks and can be integrated into AI assistants across various domains, including business applications.
**Capabilities**
- Thinking
- Summarization
- Text classification
- Text extraction
- Question-answering
- Retrieval Augmented Generation (RAG)
- Code related tasks
- Function-calling tasks (see the sketch after this list)
- Multilingual dialog use cases
- Long-context tasks including long document/meeting summarization, long document QA, etc.
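
As one illustration of the function-calling capability, here is a minimal sketch that passes a tool schema through the standard `tools` argument of `transformers`' `apply_chat_template`. The `get_current_weather` function and its schema are hypothetical examples, and the exact format of the emitted tool call is determined by the model's chat template.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_path = "ibm-granite/granite-4.0-tiny-preview"
device = "cuda"

model = AutoModelForCausalLM.from_pretrained(
    model_path, device_map=device, torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Hypothetical tool schema for illustration; any JSON-schema style
# tool definition accepted by the chat template works here.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"],
            },
        },
    }
]

conv = [{"role": "user", "content": "What is the weather like in Boston right now?"}]

# `tools` is the standard transformers chat-template argument for function calling.
input_ids = tokenizer.apply_chat_template(
    conv, tools=tools, return_tensors="pt", return_dict=True, add_generation_prompt=True
).to(device)

output = model.generate(**input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0, input_ids["input_ids"].shape[1]:], skip_special_tokens=True))
```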
**Installation:**
You need to install transformers from source to use this checkpoint.
HuggingFace PR: https://github.com/huggingface/transformers/pull/37658
Install transformers from source: https://huggingface.co/docs/transformers/en/installation#install-from-source
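
Following the installation guide linked above, an install from source typically looks like this:

```shell
pip install git+https://github.com/huggingface/transformers.git
```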
**Generation:**
After installation, copy the code snippet below to run the example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed
import torch

model_path = "ibm-granite/granite-4.0-tiny-preview"
device = "cuda"

# Load the model in bfloat16 and place it on the GPU.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map=device,
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

conv = [{"role": "user", "content": "You have 10 liters of a 30% acid solution. How many liters of a 70% acid solution must be added to achieve a 50% acid mixture?"}]

# `thinking=True` enables the model's extended reasoning mode via the chat template.
input_ids = tokenizer.apply_chat_template(
    conv, return_tensors="pt", thinking=True, return_dict=True, add_generation_prompt=True
).to(device)

set_seed(42)
output = model.generate(
    **input_ids,
    max_new_tokens=8192,
)

# Decode only the newly generated tokens, skipping the prompt.
prediction = tokenizer.decode(output[0, input_ids["input_ids"].shape[1]:], skip_special_tokens=True)
print(prediction)
```
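
For tasks that do not need the extended reasoning trace, the same call can presumably be made with `thinking=False`. This is an assumption extrapolated from the `thinking=True` flag in the example above, not something documented separately here:

```python
# Assumption: thinking=False disables the reasoning trace; only the True
# setting of this flag appears in the card's own example.
input_ids = tokenizer.apply_chat_template(
    conv, return_tensors="pt", thinking=False, return_dict=True, add_generation_prompt=True
).to(device)
output = model.generate(**input_ids, max_new_tokens=512)
print(tokenizer.decode(output[0, input_ids["input_ids"].shape[1]:], skip_special_tokens=True))
```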
**Evaluation Results:**
Comparison with previous Granite models. Scores of AlpacaEval-2.0 and Arena-Hard are calculated with thinking=True.
| Models | Arena-Hard | AlpacaEval-2.0 | MMLU | PopQA | TruthfulQA | BigBenchHard | DROP | GSM8K | HumanEval | HumanEval+ | IFEval | AttaQ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Granite-3.3-2B-Instruct | 28.86 | 43.45 | 55.88 | 18.4 | 58.97 | 52.51 | 35.98 | 72.48 | 80.51 | 75.68 | 65.8 | 87.47 |
| Granite-3.3-8B-Instruct | 57.56 | 62.68 | 65.54 | 26.17 | 66.86 | 59.01 | 41.53 | 80.89 | 89.73 | 86.09 | 74.82 | 88.5 |
| Granite-4.0-Tiny-Preview | 26.70 | 35.16 | 60.40 | 22.93 | 58.07 | 55.71 | 46.22 | 70.05 | 82.41 | 78.33 | 63.03 | 86.10 |
**Training Data:**
Overall, our training data largely comprises two key sources: (1) publicly available datasets with permissive licenses, and (2) internal synthetically generated data targeted to enhance reasoning capabilities.
**Infrastructure:**
We train Granite-4.0-Tiny-Preview using IBM's Blue Vela supercomputing cluster, which is outfitted with NVIDIA H100 GPUs. This cluster provides a scalable and efficient infrastructure for training our models over thousands of GPUs.
**Ethical Considerations and Limitations:**
Granite-4.0-Tiny-Preview leverages both permissively licensed open-source and select proprietary data for enhanced performance. Since it inherits its foundation from Granite-4.0-Tiny-Base-Preview, all ethical considerations and limitations applicable to that base model remain relevant.
**Resources**
- ⭐️ Learn about the latest updates with Granite: https://www.ibm.com/granite
- 📄 Get started with tutorials, best practices, and prompt engineering advice: https://www.ibm.com/granite/docs/
- 💡 Learn about the latest Granite learning resources: https://ibm.biz/granite-learning-resources