# CodeParrot 🦜

CodeParrot 🦜 is a GPT-2 model with 1.5B parameters, trained specifically to generate Python code. After the initial training and the release of v1.0, we trained the model further and released v1.1. Details are provided below.
## Quick Start

### Features
- CodeParrot is designed to generate Python code, leveraging the GPT-2 architecture.
- It has been trained in multiple steps, with an updated version (v1.1) showing improved performance on code-generation benchmarks.
### Installation
Installation mainly amounts to installing the `transformers` library, which is used to load the model and tokenizer. There are two common ways to use CodeParrot:
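Either way, the `transformers` library (and a backend such as PyTorch) needs to be installed first. A typical setup, assuming a pip-based environment:

```shell
# Install the model-loading library and a PyTorch backend
pip install transformers torch
```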
### Usage Examples

#### Basic Usage
You can load the CodeParrot model and tokenizer directly with `transformers`:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot")
model = AutoModelForCausalLM.from_pretrained("codeparrot/codeparrot")

inputs = tokenizer("def hello_world():", return_tensors="pt")
# Generate a completion (a bare forward pass only returns logits)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0]))
```

Note that `AutoModelWithLMHead` is deprecated; `AutoModelForCausalLM` is the current class for causal language models like this one.
#### Advanced Usage
Or you can use a `pipeline`:
```python
from transformers import pipeline

pipe = pipeline("text-generation", model="codeparrot/codeparrot")
outputs = pipe("def hello_world():")
```
## Technical Details

### Training
The model was trained on the cleaned CodeParrot 🦜 dataset in two steps. After the initial training (v1.0), the model was trained for a further 30k steps, resulting in v1.1. The training settings are shown in the following table:
| Property | v1.0 | v1.1 |
|---|---|---|
| Batch size | 512 | 512 |
| Context size | 1024 | 1024 |
| Training steps | 50,000 | 30,000 |
| Gradient accumulation | 16 | 16 |
| Gradient checkpointing | True | True |
| Learning rate | 2e-4 | 5e-5 |
| Weight decay | 0.1 | 0.1 |
| Warmup steps | 750 | 750 |
| Schedule | Cosine | Cosine |
Training was executed on 16 x A100 (40GB) GPUs; in total the model saw roughly 26 billion tokens during v1.0 training and a further 15 billion during v1.1.
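These token counts can be sanity-checked from the table above: each optimizer step processes batch size × context size tokens. A quick back-of-envelope calculation, assuming the batch size of 512 is the effective (post-accumulation) batch size:

```python
batch_size = 512       # effective batch size (v1.0 and v1.1)
context_size = 1024    # tokens per sequence

tokens_per_step = batch_size * context_size   # 524,288 tokens per step
v1_0_tokens = tokens_per_step * 50_000        # v1.0: 50k steps
v1_1_tokens = tokens_per_step * 30_000        # v1.1: 30k additional steps

print(f"v1.0: {v1_0_tokens / 1e9:.1f}B tokens")  # ~26.2B
print(f"v1.1: {v1_1_tokens / 1e9:.1f}B tokens")  # ~15.7B
```

This matches the roughly 26 + 15 billion tokens stated above.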
### Performance
We evaluated the model on OpenAI's HumanEval benchmark, which consists of programming challenges. The performance metrics are as follows:
| Metric | v1.0 | v1.1 |
|---|---|---|
| pass@1 | 3.58% | 3.99% |
| pass@10 | 8.03% | 8.69% |
| pass@100 | 14.96% | 17.88% |
The pass@k metric indicates the probability that at least one out of k generations passes the tests.
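For reference, pass@k is usually computed with the unbiased estimator from the HumanEval paper: given n generations per problem of which c pass the tests, the estimate is 1 - C(n-c, k) / C(n, k). A minimal sketch (the function name and argument names are illustrative):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples, drawn without replacement from n generations of which c
    are correct, passes the tests."""
    if n - c < k:
        return 1.0  # every possible k-subset contains a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 2 generations of which 1 is correct, `pass_at_k(2, 1, 1)` gives 0.5, as expected. Per-problem estimates are then averaged over the benchmark.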
## Documentation
Here are some useful resources related to CodeParrot: