Refact-1.6B
This is a text-generation model. After fine-tuning on generated data, it outperforms models such as Replit 3b and StableCode 3b, and nearly matches StarCoder, which is ten times its size. Its combination of quality and speed makes it likely the best model for practical code completion in your IDE.
Features
- Capable of text generation, especially useful for code-related tasks.
- Outperforms several models in terms of HumanEval pass@1 and pass@10 metrics despite its relatively small size.
Documentation
Model Information
| Property | Details |
|----------|---------|
| Pipeline Tag | text-generation |
| Inference | true |
| Library Name | transformers |
| Tags | code |
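Since the card lists transformers as the library and text-generation as the pipeline tag, a minimal completion sketch might look like the following. The repo id, the `trust_remote_code` flag, and the prompt are assumptions for illustration, not details taken from this card.

```python
# Minimal sketch of greedy code completion via transformers.
# The repo id below is an assumption and may differ from the actual checkpoint name.
from transformers import AutoTokenizer, AutoModelForCausalLM

checkpoint = "smallcloudai/Refact-1_6B-fim"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)

prompt = "def fibonacci(n):\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```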
Datasets
Pretrain Datasets
- books
- arxiv
- c4
- falcon-refinedweb
- wiki
- github-issues
- stack_markdown
- self-made dataset of permissive GitHub code
Training Datasets
- bigcode/the-stack-dedup
- rombodawg/2XUNCENSORED_MegaCodeTraining188k
- bigcode/commitpackft
Metrics
The model is evaluated using the `code_eval` metric.
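As a hedged sketch, pass@k scores of this kind can be reproduced with the `code_eval` metric from the Hugging Face evaluate library; the reference test and candidate completions below are toy placeholders, not outputs of this model.

```python
# Sketch of computing pass@k with the evaluate library's code_eval metric.
import os
import evaluate

# code_eval executes untrusted generated code, so opting in is required.
os.environ["HF_ALLOW_CODE_EVAL"] = "1"

code_eval = evaluate.load("code_eval")

test_cases = ["assert add(2, 3) == 5"]                       # one problem's unit test
candidates = [[                                              # two toy candidate completions
    "def add(a, b):\n    return a + b",
    "def add(a, b):\n    return a * b",
]]

pass_at_k, results = code_eval.compute(references=test_cases, predictions=candidates, k=[1, 2])
print(pass_at_k)  # e.g. {'pass@1': 0.5, 'pass@2': 1.0}
```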
Model Index
The model Refact-1.6B has the following evaluation results on different tasks and datasets:
HumanEval
| Task | Dataset | Metric | Value | Verified |
|------|---------|--------|-------|----------|
| text-generation | openai_humaneval (HumanEval) | pass@1 (T=0.01) | 32.0 | false |
| text-generation | openai_humaneval (HumanEval) | pass@1 (T=0.2) | 31.5 | false |
| text-generation | openai_humaneval (HumanEval) | pass@10 (T=0.8) | 53.0 | false |
| text-generation | openai_humaneval (HumanEval) | pass@100 (T=0.8) | 76.9 | false |
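For context, pass@k here presumably refers to the standard unbiased estimator used with HumanEval: for each problem, n samples are drawn at the stated temperature T, c of them pass the unit tests, and

```latex
\text{pass@}k \;=\; \mathbb{E}_{\text{problems}}\!\left[\, 1 - \frac{\binom{n-c}{k}}{\binom{n}{k}} \,\right]
```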
HumanEvalSynthesize
| Task | Dataset | Metric | Value | Verified |
|------|---------|--------|-------|----------|
| text-generation | bigcode/humanevalpack (HumanEvalSynthesize Python) | pass@1 (T=0.2) | 35.8 | false |
| text-generation | bigcode/humanevalpack (HumanEvalSynthesize JavaScript) | pass@1 (T=0.2) | 31.6 | false |
| text-generation | bigcode/humanevalpack (HumanEvalSynthesize Java) | pass@1 (T=0.2) | 29.1 | false |
| text-generation | bigcode/humanevalpack (HumanEvalSynthesize Go) | pass@1 (T=0.2) | -1 | false |
| text-generation | bigcode/humanevalpack (HumanEvalSynthesize C++) | pass@1 (T=0.2) | 26.3 | false |
| text-generation | bigcode/humanevalpack (HumanEvalSynthesize Rust) | pass@1 (T=0.2) | -1 | false |
| text-generation | bigcode/humanevalpack (HumanEvalSynthesize Average) | pass@1 (T=0.2) | -1 | false |
HumanEvalFixTests
| Task | Dataset | Metric | Value | Verified |
|------|---------|--------|-------|----------|
| text-generation | bigcode/humanevalpack (HumanEvalFixTests Python) | pass@1 (T=0.2) | 18.38 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixTests JavaScript) | pass@1 (T=0.2) | 12.28 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixTests Java) | pass@1 (T=0.2) | 15.12 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixTests Go) | pass@1 (T=0.2) | -1 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixTests C++) | pass@1 (T=0.2) | 13.17 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixTests Rust) | pass@1 (T=0.2) | 2.8 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixTests Average) | pass@1 (T=0.2) | -1 | false |
HumanEvalFixDocs
| Task | Dataset | Metric | Value | Verified |
|------|---------|--------|-------|----------|
| text-generation | bigcode/humanevalpack (HumanEvalFixDocs Python) | pass@1 (T=0.2) | 26.92 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixDocs JavaScript) | pass@1 (T=0.2) | 26.85 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixDocs Java) | pass@1 (T=0.2) | 30.76 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixDocs Go) | pass@1 (T=0.2) | -1 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixDocs C++) | pass@1 (T=0.2) | 25.94 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixDocs Rust) | pass@1 (T=0.2) | 8.44 | false |
| text-generation | bigcode/humanevalpack (HumanEvalFixDocs Average) | pass@1 (T=0.2) | -1 | false |
HumanEvalExplain
| Task | Dataset | Metric | Value | Verified |
|------|---------|--------|-------|----------|
| text-generation | bigcode/humanevalpack (HumanEvalExplain Python) | pass@1 (T=0.2) | 26.46 | false |
| text-generation | bigcode/humanevalpack (HumanEvalExplain JavaScript) | pass@1 (T=0.2) | 17.86 | false |
| text-generation | bigcode/humanevalpack (HumanEvalExplain Java) | pass@1 (T=0.2) | 20.94 | false |
| text-generation | bigcode/humanevalpack (HumanEvalExplain Go) | pass@1 (T=0.2) | -1 | false |
| text-generation | bigcode/humanevalpack (HumanEvalExplain C++) | pass@1 (T=0.2) | 18.78 | false |
| text-generation | bigcode/humanevalpack (HumanEvalExplain Rust) | pass@1 (T=0.2) | -1 | false |
| text-generation | bigcode/humanevalpack (HumanEvalExplain Average) | pass@1 (T=0.2) | -1 | false |
Other Datasets
| Task | Dataset | Metric | Value | Verified |
|------|---------|--------|-------|----------|
| text-generation | mbpp (MBPP) | pass@1 (T=0.01) | 31.15 | false |
| text-generation | ds1000 (DS-1000 (Overall Completion)) | pass@1 (T=0.2) | 10.1 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (C++)) | pass@1 (T=0.2) | 21.61 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (C#)) | pass@1 (T=0.2) | 13.91 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (D)) | pass@1 (T=0.2) | 9.5 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Go)) | pass@1 (T=0.2) | 53.57 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Java)) | pass@1 (T=0.2) | 21.58 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Julia)) | pass@1 (T=0.2) | 13.75 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (JavaScript)) | pass@1 (T=0.2) | 26.88 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Lua)) | pass@1 (T=0.2) | 15.26 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (PHP)) | pass@1 (T=0.2) | 23.04 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Perl)) | pass@1 (T=0.2) | 12.1 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Python)) | pass@1 (T=0.2) | 29.6 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (R)) | pass@1 (T=0.2) | 13.77 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Ruby)) | pass@1 (T=0.2) | 12.68 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Racket)) | pass@1 (T=0.2) | 4.29 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Rust)) | pass@1 (T=0.2) | 19.54 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Scala)) | pass@1 (T=0.2) | 18.33 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Bash)) | pass@1 (T=0.2) | 5.7 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (Swift)) | pass@1 (T=0.2) | 17.68 | false |
| text-generation | nuprl/MultiPL-E (MultiPL-HumanEval (TypeScript)) | pass@1 (T=0.2) | 25 | false |
Model Comparison
| Model | Size | HumanEval pass@1 | HumanEval pass@10 |
|-------|------|------------------|-------------------|
| DeciCoder-1b | 1b | 19.1% | |
| Refact-1.6-fim | 1.6b | 32.0% | 53.0% |
| StableCode | 3b | 20.2% | 33.8% |
| ReplitCode v1 | 3b | 21.9% | |
| CodeGen2.5-multi | 7b | 28.4% | 47.5% |
| CodeLlama | 7b | 33.5% | 59.6% |
| StarCoder | 15b | 33.6% | |
License
The model is licensed under the bigscience-openrail-m license.

Finally, the model whose training we described in our blog post is ready.
After fine-tuning on generated data, it beats Replit 3b, StableCode 3b and many other models. It almost matches StarCoder, which is ten times its size!
It's likely the best model for practical code completion in your IDE because it's smart and fast. You can start using it right now.