Granite-8B-Code-Instruct-4K: Open-Source Code Instruction Model with Enhanced Logical Reasoning and Problem-Solving Abilities

Developed by ibm-granite

Granite-8B-Code-Instruct-4K is an 8-billion-parameter code instruction model fine-tuned from Granite-8B-Code-Base-4K on a combination of permissively licensed instruction data. The fine-tuning improves its ability to follow instructions, including logical reasoning and problem solving.

Large Language Model

Transformers

Open Source License: Apache-2.0

Tags: Multilingual Code Generation, Instruction Fine-Tuning Optimization, Programming Problem Solving

Downloads 1,481

Release Time : 4/26/2024

Model Overview

This model is designed to respond to coding-related instructions and can be used to build coding assistants.

Model Features

Multilingual Code Generation

Supports code generation in multiple programming languages, including Python, JavaScript, Java, Go, C++, and Rust.

Logical Reasoning and Problem Solving

Enhanced logical reasoning and problem-solving capabilities through fine-tuning, enabling the handling of complex programming tasks.

High-Quality Training Data

Trained on a mix of datasets, including code commits, mathematical problems, and code instruction data.

Model Capabilities

Code Generation

Code Explanation

Code Repair

Logical Reasoning

Problem Solving

Use Cases

Programming Assistance

Code Generation

Generates code snippets for specific functionalities based on user instructions.

Achieved a pass@1 score of 57.9 for Python code generation in the HumanEvalSynthesis task.

Code Explanation

Explains the functionality and logic of given code.

Achieved a pass@1 score of 53.0 for Python code explanation in the HumanEvalExplain task.

Code Repair

Fixes errors in given code.

Achieved a pass@1 score of 39.6 for Python code repair in the HumanEvalFix task.
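
The pass@1 figures above are functional-correctness scores: a problem counts as solved when a generated sample passes its unit tests. The source does not say how many samples were drawn per problem, so the following is only a sketch of the unbiased pass@k estimator commonly used with HumanEval-style benchmarks. The function names and the sample counts n and c are illustrative assumptions, not details from the model card.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of pass@k for a single problem.

    n: samples generated for the problem (assumed, not from the model card)
    c: samples that passed all unit tests
    k: number of attempts the metric allows
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # 1 - P(all k drawn samples fail) = 1 - C(n - c, k) / C(n, k)
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over (n, c) pairs, reported as a percentage."""
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 the estimator reduces to c / n, so pass@1 is simply the fraction of samples that pass, averaged over all problems in the benchmark.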

🚀 Granite-8B-Code-Instruct-4K

Granite-8B-Code-Instruct-4K is an 8B parameter model fine-tuned from Granite-8B-Code-Base-4K. It's optimized on a combination of permissively licensed instruction data to enhance instruction-following capabilities, including logical reasoning and problem-solving skills.

🚀 Quick Start

The model is designed to respond to coding-related instructions and can be used to build coding assistants. Here is a simple example of how to use the Granite-8B-Code-Instruct-4K model:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
device = "cuda" # or "cpu"
model_path = "ibm-granite/granite-8b-code-instruct-4k"
tokenizer = AutoTokenizer.from_pretrained(model_path)
# drop device_map if running on CPU
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()
# change input text as desired
chat = [
    { "role": "user", "content": "Write a code to find the maximum value in a list of numbers." },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# tokenize the text
input_tokens = tokenizer(chat, return_tensors="pt")
# transfer tokenized inputs to the device
for i in input_tokens:
    input_tokens[i] = input_tokens[i].to(device)
# generate output tokens
output = model.generate(**input_tokens, max_new_tokens=100)
# decode output tokens into text
output = tokenizer.batch_decode(output)
# loop over the batch to print, in this example the batch size is 1
for i in output:
    print(i)
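
For a causal language model, the sequences returned by generate contain the prompt tokens followed by the newly generated ones, so the printed text repeats the prompt. A minimal sketch of keeping only the completion is shown below. It uses plain Python lists and made-up token ids so that it runs without downloading the model; the helper name strip_prompt is hypothetical.

```python
def strip_prompt(sequences, prompt_length):
    """Drop the first prompt_length token ids from every sequence in a batch."""
    if prompt_length < 0:
        raise ValueError("prompt_length must be non-negative")
    return [list(seq[prompt_length:]) for seq in sequences]


# Toy token ids standing in for the output of model.generate
# (illustrative values, not real Granite tokens).
prompt_ids = [101, 7, 8, 9]
generated = [prompt_ids + [42, 43, 2]]

new_tokens = strip_prompt(generated, len(prompt_ids))
print(new_tokens)  # [[42, 43, 2]]
```

With the tensors from the example above, the same idea would be a slice such as output[:, input_tokens["input_ids"].shape[1]:] applied before batch_decode. Passing skip_special_tokens=True to batch_decode additionally removes special tokens from the decoded text.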

✨ Features

Enhanced Instruction Following: Fine-tuned to better follow coding instructions, improving logical reasoning and problem-solving abilities.
Multilingual Support: Trained on data from 92 programming languages, offering broad language coverage.

📦 Installation

The original README gives no installation steps. The Quick Start example imports torch and transformers, so both packages need to be installed first.

📚 Documentation

Model Summary

Developers: IBM Research
GitHub Repository: ibm-granite/granite-code-models
Paper: Granite Code Models: A Family of Open Foundation Models for Code Intelligence
Release Date: May 6th, 2024
License: Apache 2.0

Training Data

Granite Code Instruct models are trained on the following types of data:

Code Commits Datasets: Sourced from the CommitPackFT dataset, considering data for 92 programming languages.
Math Datasets: Using MathInstruct and MetaMathQA, with some filtering due to license issues.
Code Instruction Datasets: Including Glaive-Code-Assistant-v3, Glaive-Function-Calling-v2, NL2SQL11 and synthetic API calling datasets.
Language Instruction Datasets: Incorporating HelpSteer, an open license-filtered version of Platypus, and hardcoded prompts.

Infrastructure

The Granite Code models are trained using two of IBM's supercomputing clusters, Vela and Blue Vela, equipped with NVIDIA A100 and H100 GPUs respectively, providing a scalable and efficient training infrastructure.

Ethical Considerations and Limitations

Granite Code Instruct models are fine-tuned mainly on instruction-response pairs covering a specific set of programming languages, so performance may be limited on out-of-domain languages. In that situation, providing few-shot examples can help steer the model's output. Developers should perform safety testing and target-specific tuning before deploying the model in critical applications. The model also inherits the ethical considerations and limitations of its base model. For more information, refer to the Granite-8B-Code-Base-4K model card.
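
One way to supply few-shot examples is to place worked instruction and answer pairs ahead of the real request in the chat list passed to apply_chat_template. The sketch below assumes the tokenizer's chat template accepts alternating user and assistant turns. The helper name build_few_shot_chat and the Nim snippets are hypothetical, chosen purely for illustration; the source does not say which languages count as out of domain.

```python
def build_few_shot_chat(examples, instruction):
    """Return chat messages: worked examples first, then the real request."""
    chat = []
    for user_text, assistant_text in examples:
        chat.append({"role": "user", "content": user_text})
        chat.append({"role": "assistant", "content": assistant_text})
    chat.append({"role": "user", "content": instruction})
    return chat


# Hypothetical worked examples (illustrative only).
examples = [
    ("Write a Nim procedure that doubles an integer.",
     "proc double(x: int): int =\n  x * 2"),
    ("Write a Nim procedure that checks whether an integer is even.",
     "proc isEven(x: int): bool =\n  x mod 2 == 0"),
]

chat = build_few_shot_chat(
    examples, "Write a Nim procedure that returns the larger of two integers."
)
for message in chat:
    print(message["role"], "->", message["content"].splitlines()[0])
```

The resulting list can replace the single-message chat in the Quick Start example; the rest of that script stays the same.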

🔧 Technical Details

Model Index

Task Type	Dataset Type	Dataset Name	Metric Name	Metric Type	Value	Verified
text-generation	bigcode/humanevalpack	HumanEvalSynthesis(Python)	pass@1	pass@1	57.9	false
text-generation	bigcode/humanevalpack	HumanEvalSynthesis(JavaScript)	pass@1	pass@1	52.4	false
text-generation	bigcode/humanevalpack	HumanEvalSynthesis(Java)	pass@1	pass@1	58.5	false
text-generation	bigcode/humanevalpack	HumanEvalSynthesis(Go)	pass@1	pass@1	43.3	false
text-generation	bigcode/humanevalpack	HumanEvalSynthesis(C++)	pass@1	pass@1	48.2	false
text-generation	bigcode/humanevalpack	HumanEvalSynthesis(Rust)	pass@1	pass@1	37.2	false
text-generation	bigcode/humanevalpack	HumanEvalExplain(Python)	pass@1	pass@1	53.0	false
text-generation	bigcode/humanevalpack	HumanEvalExplain(JavaScript)	pass@1	pass@1	42.7	false
text-generation	bigcode/humanevalpack	HumanEvalExplain(Java)	pass@1	pass@1	52.4	false
text-generation	bigcode/humanevalpack	HumanEvalExplain(Go)	pass@1	pass@1	36.6	false
text-generation	bigcode/humanevalpack	HumanEvalExplain(C++)	pass@1	pass@1	43.9	false
text-generation	bigcode/humanevalpack	HumanEvalExplain(Rust)	pass@1	pass@1	16.5	false
text-generation	bigcode/humanevalpack	HumanEvalFix(Python)	pass@1	pass@1	39.6	false
text-generation	bigcode/humanevalpack	HumanEvalFix(JavaScript)	pass@1	pass@1	40.9	false
text-generation	bigcode/humanevalpack	HumanEvalFix(Java)	pass@1	pass@1	48.2	false
text-generation	bigcode/humanevalpack	HumanEvalFix(Go)	pass@1	pass@1	41.5	false
text-generation	bigcode/humanevalpack	HumanEvalFix(C++)	pass@1	pass@1	39.0	false
text-generation	bigcode/humanevalpack	HumanEvalFix(Rust)	pass@1	pass@1	32.9	false
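
To compare the three task families at a glance, the rows above can be averaged per task. The sketch below copies the pass@1 values from the table and computes an unweighted mean over the six languages. These means are derived here for convenience and are not reported in the source; the names SCORES and mean_by_task are illustrative.

```python
from collections import defaultdict

# (task, language, pass@1) copied from the Model Index table above.
SCORES = [
    ("HumanEvalSynthesis", "Python", 57.9),
    ("HumanEvalSynthesis", "JavaScript", 52.4),
    ("HumanEvalSynthesis", "Java", 58.5),
    ("HumanEvalSynthesis", "Go", 43.3),
    ("HumanEvalSynthesis", "C++", 48.2),
    ("HumanEvalSynthesis", "Rust", 37.2),
    ("HumanEvalExplain", "Python", 53.0),
    ("HumanEvalExplain", "JavaScript", 42.7),
    ("HumanEvalExplain", "Java", 52.4),
    ("HumanEvalExplain", "Go", 36.6),
    ("HumanEvalExplain", "C++", 43.9),
    ("HumanEvalExplain", "Rust", 16.5),
    ("HumanEvalFix", "Python", 39.6),
    ("HumanEvalFix", "JavaScript", 40.9),
    ("HumanEvalFix", "Java", 48.2),
    ("HumanEvalFix", "Go", 41.5),
    ("HumanEvalFix", "C++", 39.0),
    ("HumanEvalFix", "Rust", 32.9),
]


def mean_by_task(rows):
    """Unweighted mean pass@1 per task, rounded to two decimals."""
    grouped = defaultdict(list)
    for task, _language, score in rows:
        grouped[task].append(score)
    return {task: round(sum(v) / len(v), 2) for task, v in grouped.items()}


for task, mean in mean_by_task(SCORES).items():
    print(f"{task}: {mean}")
```

On these figures the means come out to about 49.58 for HumanEvalSynthesis, 40.85 for HumanEvalExplain, and 40.35 for HumanEvalFix, with Rust the weakest language in every task.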

📄 License

The model is licensed under the Apache 2.0 license.
