Granite-3B-Code-Instruct-2K
Granite-3B-Code-Instruct-2K is a 3B parameter model fine-tuned from Granite-3B-Code-Base-2K on a combination of permissively licensed instruction data. The fine-tuning enhances its instruction-following capabilities, including logical reasoning and problem-solving skills.

Quick Start
Intended use
The model is designed to respond to coding-related instructions and can be used to build coding assistants.
Generation
This is a simple example of how to use the Granite-3B-Code-Instruct-2K model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # use "cpu" if no GPU is available
model_path = "ibm-granite/granite-3b-code-instruct-2k"

# Load the tokenizer and model, placing the model on the target device
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()

# Build the prompt from a chat message using the model's chat template
chat = [
    {"role": "user", "content": "Write a code to find the maximum value in a list of numbers."},
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# Tokenize the prompt and move the input tensors to the device
input_tokens = tokenizer(chat, return_tensors="pt")
for i in input_tokens:
    input_tokens[i] = input_tokens[i].to(device)

# Generate a completion and decode it back to text
output = model.generate(**input_tokens, max_new_tokens=100)
output = tokenizer.batch_decode(output)
for i in output:
    print(i)
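The decoded output above includes the prompt and any special tokens. Continuing from the same example, the hedged variant below (not part of the original card) keeps only the newly generated tokens and strips special tokens when decoding:

```python
# Hedged variant: slice off the prompt tokens and skip special tokens when decoding.
raw_output = model.generate(**input_tokens, max_new_tokens=100)
new_tokens = raw_output[:, input_tokens["input_ids"].shape[1]:]
print(tokenizer.batch_decode(new_tokens, skip_special_tokens=True)[0])
```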
Features
- Enhanced Instruction Following: Fine-tuned on permissively licensed instruction data to improve logical reasoning and problem-solving skills.
- Coding-Oriented: Designed to handle coding-related instructions, suitable for building coding assistants.
Documentation
Model Summary
Training Data
Granite Code Instruct models are trained on the following types of data:
- Code Commits Datasets: Code commit data is sourced from the CommitPackFT dataset, a filtered version of the full CommitPack dataset. Only data for 92 programming languages is considered; the inclusion criterion is that the language is common to CommitPackFT and the 116 languages used to pretrain the code base model (Granite-3B-Code-Base). A loading sketch follows this list.
- Math Datasets: Two high-quality math datasets, MathInstruct and MetaMathQA, are used. Due to license issues, GSM8K-RFT and Camel-Math are filtered out of the MathInstruct dataset.
- Code Instruction Datasets: Glaive-Code-Assistant-v3, Glaive-Function-Calling-v2, NL2SQL11, and a small collection of synthetic API calling datasets are used.
- Language Instruction Datasets: High-quality datasets such as HelpSteer and an open-license-filtered version of Platypus are included, along with a collection of hardcoded prompts to ensure the model generates correct outputs for inquiries about its name or developers.
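As a quick way to inspect one of these sources, the sketch below loads a slice of CommitPackFT with the Hugging Face datasets library. The "python" config name and the streaming-based peek at a single record are illustrative assumptions, not steps from the original card.

```python
from datasets import load_dataset

# Minimal sketch (assumption: a "python" language config exists in bigcode/commitpackft).
# Streaming avoids downloading the full split just to look at one record.
commits = load_dataset("bigcode/commitpackft", "python", split="train", streaming=True)
print(next(iter(commits)))  # print a single commit record to see its fields
```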
Infrastructure
The Granite Code models are trained using two of IBM's supercomputing clusters, Vela and Blue Vela, which are equipped with NVIDIA A100 and H100 GPUs respectively. These clusters provide a scalable and efficient infrastructure for training the models over thousands of GPUs.
Ethical Considerations and Limitations
Granite Code Instruct models are primarily fine-tuned on instruction-response pairs across a specific set of programming languages, so their performance may be limited with out-of-domain programming languages. In such situations, it is beneficial to provide few-shot examples to steer the model's output, as sketched below. Moreover, developers should perform safety testing and target-specific tuning before deploying these models in critical applications. The model also inherits ethical considerations and limitations from its base model. For more information, please refer to the Granite-3B-Code-Base-2K model card.
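The sketch below illustrates the few-shot suggestion: worked examples are prepended to the chat before applying the chat template. It reuses the tokenizer loaded in the Quick Start example, and the example turns are invented placeholders rather than content from the model card.

```python
# Hedged few-shot sketch: prepend illustrative question/answer turns to steer the model.
few_shot_chat = [
    {"role": "user", "content": "Write a Python function that reverses a string."},
    {"role": "assistant", "content": "def reverse_string(s):\n    return s[::-1]"},
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."},
]
prompt = tokenizer.apply_chat_template(few_shot_chat, tokenize=False, add_generation_prompt=True)
# `prompt` can then be tokenized and passed to model.generate() as in the Quick Start example.
```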
License
The model is released under the Apache 2.0 license.
Model Information

| Property | Details |
|----------|---------|
| Pipeline Tag | text-generation |
| Base Model | ibm-granite/granite-3b-code-base-2k |
| Inference | false |
| License | apache-2.0 |
| Datasets | bigcode/commitpackft, TIGER-Lab/MathInstruct, meta-math/MetaMathQA, glaiveai/glaive-code-assistant-v3, glaive-function-calling-v2, bugdaryan/sql-create-context-instruction, garage-bAInd/Open-Platypus, nvidia/HelpSteer |
| Metrics | code_eval |
| Library Name | transformers |
| Tags | code, granite |
| Model Index | Name: granite-3b-code-instruct; results cover text-generation tasks on multiple datasets with pass@1 metrics for several programming languages. |
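The pass@1 numbers referenced in the model index are computed with the code_eval metric. The sketch below shows how pass@k can be reproduced locally with the evaluate library on a toy problem; the candidate solution and test case are invented for illustration, and code_eval only runs after explicitly opting in via the HF_ALLOW_CODE_EVAL environment variable because it executes untrusted generated code.

```python
import os
from evaluate import load

os.environ["HF_ALLOW_CODE_EVAL"] = "1"  # opt in: code_eval executes generated code

code_eval = load("code_eval")
# Toy example: one problem, one generated candidate, checked by an assert-based test.
candidates = [["def max_value(nums):\n    return max(nums)"]]
tests = ["assert max_value([1, 5, 3]) == 5"]
pass_at_k, _ = code_eval.compute(references=tests, predictions=candidates, k=[1])
print(pass_at_k)  # e.g. {'pass@1': 1.0}
```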