Refact-1.6B
This is a text-generation model. After fine-tuning on generated data, it outperforms models such as Replit 3b and StableCode 3b, and nearly matches StarCoder, which is ten times its size. Its combination of quality and speed makes it likely the best model for practical code completion in your IDE.
Features
- Capable of text generation, especially useful for code-related tasks.
- Outperforms several models in terms of HumanEval pass@1 and pass@10 metrics despite its relatively small size.
Documentation
Model Information
| Property | Details |
|----------|---------|
| Pipeline Tag | text-generation |
| Inference | true |
| Library Name | transformers |
| Tags | code |
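Since the card lists transformers as the library and text-generation as the pipeline tag, a minimal completion sketch might look like the following. The repo id, the `trust_remote_code` flag, and the prompt are assumptions for illustration, not details taken from this card.

```python
# Minimal sketch of greedy code completion via transformers.
# The repo id below is an assumption and may differ from the actual checkpoint name.
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "smallcloudai/Refact-1_6B-fim"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)

prompt = "def fibonacci(n):\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```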
Datasets
Pretrain Datasets
- books
- arxiv
- c4
- falcon-refinedweb
- wiki
- github-issues
- stack_markdown
- self-made dataset of permissive GitHub code
Training Datasets
- bigcode/the-stack-dedup
- rombodawg/2XUNCENSORED_MegaCodeTraining188k
- bigcode/commitpackft
Metrics
The model is evaluated using the `code_eval` metric.
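As a hedged sketch, pass@k scores of this kind can be reproduced with the `code_eval` metric from the Hugging Face evaluate library; the reference test and candidate completions below are toy placeholders, not outputs of this model.

```python
# Sketch of computing pass@k with the evaluate library's code_eval metric.
import os
import evaluate

# code_eval executes untrusted generated code, so opting in is required.
os.environ["HF_ALLOW_CODE_EVAL"] = "1"

code_eval = evaluate.load("code_eval")

test_cases = ["assert add(2, 3) == 5"]                       # one problem's unit test
candidates = [[                                              # two toy candidate completions
    "def add(a, b):\n    return a + b",
    "def add(a, b):\n    return a * b",
]]

pass_at_k, results = code_eval.compute(references=test_cases, predictions=candidates, k=[1, 2])
print(pass_at_k)  # e.g. {'pass@1': 0.5, 'pass@2': 1.0}
```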
Model Index
The model Refact-1.6B has the following evaluation results on different tasks and datasets:
HumanEval
| Task | Dataset | Metric | Value | Verified |
|------|---------|--------|-------|----------|
| text-generation | openai_humaneval (HumanEval) | pass@1 (T=0.01) | 32.0 | false |
| text-generation | openai_humaneval (HumanEval) | pass@1 (T=0.2) | 31.5 | false |
| text-generation | openai_humaneval (HumanEval) | pass@10 (T=0.8) | 53.0 | false |
| text-generation | openai_humaneval (HumanEval) | pass@100 (T=0.8) | 76.9 | false |
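For context, pass@k here presumably refers to the standard unbiased estimator used with HumanEval: for each problem, n samples are drawn at the stated temperature T, c of them pass the unit tests, and

```latex
\text{pass@}k \;=\; \mathbb{E}_{\text{problems}}\!\left[\, 1 - \frac{\binom{n-c}{k}}{\binom{n}{k}} \,\right]
```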
HumanEvalSynthesize
| Task | Dataset | Metric | Value | Verified |
|------|---------|--------|-------|----------|
| text-generation | bigcode/humanevalpack (HumanEvalSynthesize Python) | pass@1 (T=0.2) | 35.8 | false |
| text-generation | bigcode/humanevalpack (HumanEvalSynthesize JavaScript) | pass@1 (T=0.2) | 31.6 | false |
| text-generation | bigcode/humanevalpack (HumanEvalSynthesize Java) | pass@1 (T=0.2) | 29.1 | false |
| text-generation | bigcode/humanevalpack (HumanEvalSynthesize Go) | pass@1 (T=0.2) | -1 | false |
| text-generation | bigcode/humanevalpack (HumanEvalSynthesize C++) | pass@1 (T=0.2) | 26.3 | false |
| text-generation | bigcode/humanevalpack (HumanEvalSynthesize Rust) | pass@1 (T=0.2) | -1 | false |
| text-generation | bigcode/humanevalpack (HumanEvalSynthesize Average) | pass@1 (T=0.2) | -1 | false |
HumanEvalFixTests
| Task | Dataset | Metric | Value | Verified |
|------|---------|--------|-------|----------|
| text-generation | bigcode/humanevalpack (HumanEvalFixTests Python) | pass@1 (T=0.2) | 18.38 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixTests JavaScript) | pass@1 (T=0.2) | 12.28 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixTests Java) | pass@1 (T=0.2) | 15.12 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixTests Go) | pass@1 (T=0.2) | -1 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixTests C++) | pass@1 (T=0.2) | 13.17 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixTests Rust) | pass@1 (T=0.2) | 2.8 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixTests Average) | pass@1 (T=0.2) | -1 | false |
HumanEvalFixDocs
| Task | Dataset | Metric | Value | Verified |
|------|---------|--------|-------|----------|
| text-generation | bigcode/humanevalpack (HumanEvalFixDocs Python) | pass@1 (T=0.2) | 26.92 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixDocs JavaScript) | pass@1 (T=0.2) | 26.85 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixDocs Java) | pass@1 (T=0.2) | 30.76 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixDocs Go) | pass@1 (T=0.2) | -1 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixDocs C++) | pass@1 (T=0.2) | 25.94 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixDocs Rust) | pass@1 (T=0.2) | 8.44 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixDocs Average) | pass@1 (T=0.2) | -1 | false |
HumanEvalExplain
| Task | Dataset | Metric | Value | Verified |
|------|---------|--------|-------|----------|
| text-generation | bigcode/humanevalpack (HumanEvalExplain Python) | pass@1 (T=0.2) | 26.46 | false |
| text-generation | bigcode/humanevalpack (HumanEvalExplain JavaScript) | pass@1 (T=0.2) | 17.86 | false |
| text-generation | bigcode/humanevalpack (HumanEvalExplain Java) | pass@1 (T=0.2) | 20.94 | false |
| text-generation | bigcode/humanevalpack (HumanEvalExplain Go) | pass@1 (T=0.2) | -1 | false |
| text-generation | bigcode/humanevalpack (HumanEvalExplain C++) | pass@1 (T=0.2) | 18.78 | false |
| text-generation | bigcode/humanevalpack (HumanEvalExplain Rust) | pass@1 (T=0.2) | -1 | false |
| text-generation | bigcode/humanevalpack (HumanEvalExplain Average) | pass@1 (T=0.2) | -1 | false |
Other Datasets
| Task | Dataset | Metric | Value | Verified |
|------|---------|--------|-------|----------|
| text-generation | mbpp (MBPP) | pass@1 (T=0.01) | 31.15 | false |
| text-generation | ds1000 (DS-1000 (Overall Completion)) | pass@1 (T=0.2) | 10.1 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (C++)) | pass@1 (T=0.2) | 21.61 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (C#)) | pass@1 (T=0.2) | 13.91 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (D)) | pass@1 (T=0.2) | 9.5 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Go)) | pass@1 (T=0.2) | 53.57 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Java)) | pass@1 (T=0.2) | 21.58 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Julia)) | pass@1 (T=0.2) | 13.75 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (JavaScript)) | pass@1 (T=0.2) | 26.88 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Lua)) | pass@1 (T=0.2) | 15.26 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (PHP)) | pass@1 (T=0.2) | 23.04 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Perl)) | pass@1 (T=0.2) | 12.1 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Python)) | pass@1 (T=0.2) | 29.6 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (R)) | pass@1 (T=0.2) | 13.77 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Ruby)) | pass@1 (T=0.2) | 12.68 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Racket)) | pass@1 (T=0.2) | 4.29 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Rust)) | pass@1 (T=0.2) | 19.54 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Scala)) | pass@1 (T=0.2) | 18.33 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Bash)) | pass@1 (T=0.2) | 5.7 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Swift)) | pass@1 (T=0.2) | 17.68 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (TypeScript)) | pass@1 (T=0.2) | 25 | false |
Model Comparison
| Model | Size | HumanEval pass@1 | HumanEval pass@10 |
|-------|------|------------------|-------------------|
| DeciCoder-1b | 1b | 19.1% | |
| Refact-1.6-fim | 1.6b | 32.0% | 53.0% |
| StableCode | 3b | 20.2% | 33.8% |
| ReplitCode v1 | 3b | 21.9% | |
| CodeGen2.5-multi | 7b | 28.4% | 47.5% |
| CodeLlama | 7b | 33.5% | 59.6% |
| StarCoder | 15b | 33.6% | |
License
The model is licensed under the bigscience-openrail-m license.

Finally, the model whose training we described in our blog post is ready.
After fine-tuning on generated data, it beats Replit 3b, StableCode 3b and many other models. It almost matches StarCoder, which is ten times its size!
It's likely the best model for practical code completion in your IDE because it's smart and fast. You can start using it right now.