# Granite-3B-Code-Instruct-128K
Granite-3B-Code-Instruct-128K is a 3B-parameter long-context instruct model. It is fine-tuned from Granite-3B-Code-Base-128K on a combination of permissively licensed data used in training the original Granite code instruct models, along with synthetically generated code instruction datasets tailored for long-context problem solving. The goal is to enhance long-context capability without sacrificing code generation performance on short inputs.

## Features
- Long-Context Capability: Handles coding instructions with long-context input up to 128K tokens.
- Code Generation: Suitable for building coding assistants.
## Installation
Install the required Python libraries, `torch` and `transformers`, using `pip`:

```bash
pip install torch transformers
```
## Usage Examples
### Basic Usage
This is a simple example of how to use the Granite-3B-Code-Instruct-128K model:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"
model_path = "ibm-granite/granite-3b-code-instruct-128k"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# Load the model onto the selected device
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()

# Build a chat conversation and apply the model's chat template
chat = [
    {"role": "user", "content": "Write a code to find the maximum value in a list of numbers."},
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# Tokenize the prompt and move the tensors to the device
input_tokens = tokenizer(chat, return_tensors="pt")
for i in input_tokens:
    input_tokens[i] = input_tokens[i].to(device)

# Generate and decode the output
output = model.generate(**input_tokens, max_new_tokens=100)
output = tokenizer.batch_decode(output)
for i in output:
    print(i)
```
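Because the model accepts inputs up to 128K tokens, entire source files can be placed directly in the prompt. The snippet below is a minimal sketch of that pattern, reusing the `tokenizer`, `model`, and `device` objects from the example above; the file path and the question are hypothetical placeholders.

```python
# Hypothetical long-context example: ask a question about a large source file.
# "my_project/train.py" is a placeholder path; substitute your own file.
with open("my_project/train.py", "r", encoding="utf-8") as f:
    source_code = f.read()

chat = [
    {
        "role": "user",
        "content": "Here is a Python source file:\n\n" + source_code
                   + "\n\nExplain what the main training loop does.",
    },
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
input_tokens = tokenizer(prompt, return_tensors="pt").to(device)
output = model.generate(**input_tokens, max_new_tokens=300)

# Decode only the newly generated tokens, skipping the prompt
new_tokens = output[0][input_tokens["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```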
## Documentation
### Intended Use
The model is designed to respond to coding-related instructions over long-context input up to 128K tokens and can be used to build coding assistants.
### Model Information
#### Training Data
The Granite Code Instruct models are trained on a mix of short- and long-context data:
- Short-Context Instruction Data: CommitPackFT, BigCode-SC2-Instruct, [MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct), [MetaMathQA](https://huggingface.co/datasets/meta-math/MetaMathQA), [Glaive-Code-Assistant-v3](https://huggingface.co/datasets/glaiveai/glaive-code-assistant-v3), [Glaive-Function-Calling-v2](https://huggingface.co/datasets/glaiveai/glaive-function-calling-v2), [NL2SQL11](https://huggingface.co/datasets/bugdaryan/sql-create-context-instruction), HelpSteer, [OpenPlatypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus). It also includes a synthetically generated dataset for API calling and multi-turn code interactions with execution feedback, and a collection of hardcoded prompts.
- Long-Context Instruction Data: A synthetically generated dataset produced by bootstrapping repository-level, file-packed documents through Granite-8b-Code-Instruct to improve the long-context capability of the model.
#### Infrastructure
The Granite Code models are trained on two of IBM's supercomputing clusters, Vela and Blue Vela, which are equipped with NVIDIA A100 and H100 GPUs respectively. These clusters provide a scalable and efficient infrastructure for training the models over thousands of GPUs.
### Model Performance
The model has the following performance metrics:
| Task | Dataset | Metric | Value |
|------|---------|--------|-------|
| Text Generation | bigcode/humanevalpack (HumanEvalSynthesis (Python)) | pass@1 | 53.7 |
| Text Generation | bigcode/humanevalpack (HumanEvalSynthesis (Average)) | pass@1 | 41.4 |
| Text Generation | bigcode/humanevalpack (HumanEvalExplain (Average)) | pass@1 | 25.1 |
| Text Generation | bigcode/humanevalpack (HumanEvalFix (Average)) | pass@1 | 26.2 |
| Text Generation | repoqa (RepoQA (Python@16K)) | pass@1 (thresh=0.5) | 48.0 |
| Text Generation | repoqa (RepoQA (C++@16K)) | pass@1 (thresh=0.5) | 36.0 |
| Text Generation | repoqa (RepoQA (Java@16K)) | pass@1 (thresh=0.5) | 38.0 |
| Text Generation | repoqa (RepoQA (TypeScript@16K)) | pass@1 (thresh=0.5) | 39.0 |
| Text Generation | repoqa (RepoQA (Rust@16K)) | pass@1 (thresh=0.5) | 29.0 |
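For reference, pass@1 reports the fraction of problems for which a sampled completion passes the benchmark's unit tests. The snippet below is a minimal sketch of the standard unbiased pass@k estimator popularized with HumanEval-style evaluations; it is illustrative only and is not the exact harness used to produce the numbers above.

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: total completions sampled, c: completions that pass the tests,
    k: attempt budget. Returns 1 - C(n-c, k) / C(n, k).
    """
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: 2 of 20 samples pass -> pass@1 of 0.10 for this problem;
# the benchmark score averages this quantity over all problems.
print(round(pass_at_k(n=20, c=2, k=1), 2))
```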
## Technical Details
The model is fine-tuned from Granite-3B-Code-Base-128K. By exposing it to both short- and long-context data, the developers aim to enhance its long-context capability without sacrificing code generation performance on short inputs.
## License
The model is released under the Apache 2.0 license.
## Important Note
- The model is primarily fine-tuned on instruction-response pairs across a specific set of programming languages. Its performance may be limited with out-of-domain programming languages; in such cases, providing few-shot examples can help steer the model's output (see the sketch after this list).
- Developers should perform safety testing and target-specific tuning before deploying these models in critical applications. The model also inherits ethical considerations and limitations from its base model. For more information, refer to the [Granite-3B-Code-Base-128K](https://huggingface.co/ibm-granite/granite-3b-code-base-128k) model card.
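The following is a minimal sketch of how few-shot steering might look with the chat template. The example task (generating commented Fortran) and the demonstration turns are hypothetical, and the snippet reuses the `tokenizer`, `model`, and `device` objects from the usage example above.

```python
# Hypothetical few-shot prompt: earlier user/assistant turns demonstrate the
# desired target language and style before the actual request is made.
chat = [
    {"role": "user", "content": "Write a Fortran subroutine that swaps two integers."},
    {"role": "assistant", "content": (
        "subroutine swap(a, b)\n"
        "  integer, intent(inout) :: a, b\n"
        "  integer :: tmp\n"
        "  tmp = a\n"
        "  a = b\n"
        "  b = tmp\n"
        "end subroutine swap"
    )},
    {"role": "user", "content": "Now write a Fortran function that returns the sum of an integer array."},
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
input_tokens = tokenizer(prompt, return_tensors="pt").to(device)
output = model.generate(**input_tokens, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```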