🚀 aiXcoder-7B Code Large Language Model
aiXcoder-7B is a code large language model capable of understanding and generating code in multiple programming languages, offering high-performance solutions for code-related tasks.
🚀 Quick Start
✨ Features
As the capabilities of large code models are gradually being uncovered, aiXcoder has consistently considered how to make these models more useful in real development scenarios. The open-sourced aiXcoder 7B Base has undergone extensive training on 1.2T unique tokens, with its pre-training tasks and contextual information specifically designed for real-world code generation scenarios.
- Code Completion Excellence: Among all models of similar parameter sizes, aiXcoder 7B Base stands out as the most effective model in code completion scenarios.
- Multilingual nl2code Benchmark Performance: It surpasses mainstream models such as CodeLlama 34B and StarCoder2 15B in average performance on the multilingual nl2code benchmark.
- Foundational Focus: The current version is a foundational model that focuses on improving the efficiency and accuracy of code completion and code generation tasks.
📦 Installation
Environment Requirements
Option 1: Build Env
To run the model inference code, you'll need the following environment setup:
- Python 3.8 or higher
- PyTorch 2.1.0 or higher
- sentencepiece 0.2.0 or higher
- transformers 4.34.1 or higher (if running inference with the transformers library)
Please ensure all dependencies are installed using the following commands:
conda create -n aixcoder-7b python=3.11
conda activate aixcoder-7b
git clone git@github.com:aixcoder-plugin/aiXcoder-7b.git
cd aiXcoder-7b
pip install -r requirements.txt
requirements.txt lists all the necessary libraries and their versions.
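If you would like to confirm that the installed packages meet these minimum versions, a quick check like the following can help (a minimal sketch; it only assumes the packages listed above are importable):
# quick sanity check of the environment (sketch)
import sys
import torch
import sentencepiece
import transformers

print("Python:", sys.version.split()[0])            # expect 3.8 or higher
print("PyTorch:", torch.__version__)                # expect 2.1.0 or higher
print("sentencepiece:", sentencepiece.__version__)  # expect 0.2.0 or higher
print("transformers:", transformers.__version__)    # expect 4.34.1 or higher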
To achieve faster inference speeds, especially for large models, we recommend installing flash attention. Flash attention is an optimized attention mechanism that significantly reduces computation time for transformer-based models without sacrificing accuracy. Before proceeding, ensure your environment meets the CUDA requirements, as flash attention leverages GPU acceleration. Follow these steps to install flash attention:
git clone git@github.com:Dao-AILab/flash-attention.git
cd flash-attention
MAX_JOBS=8 python setup.py install
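Once the build finishes, you can check that the package imports cleanly (a minimal sketch, assuming the default flash_attn package name installed by the steps above):
# verify the flash attention build (sketch)
import flash_attn

print("flash-attn version:", flash_attn.__version__)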
Option 2: Docker
For a consistent and isolated environment, we recommend running the model inference code using Docker. Here's how to set up and use Docker for our model:
- Install Docker: If you haven't already, install Docker on your machine.
- Pull the Docker Image: Pull the Docker image from Docker Hub.
docker pull pytorch/pytorch:2.1.0-cuda11.8-cudnn8-devel
- Run the Container: Once the image is pulled, you can run the model inside a Docker container.
docker run --gpus all -it -v /dev/shm:/dev/shm --name aix_instance pytorch/pytorch:2.1.0-cuda11.8-cudnn8-devel /bin/bash
pip install sentencepiece
git clone git@github.com:aixcoder-plugin/aiXcoder-7b.git
cd aiXcoder-7b
This command starts a container named aix_instance from the pytorch/pytorch image. You can interact with the model inside this container.
To achieve faster inference speeds, especially for large models, we recommend installing flash attention inside the container as well:
git clone git@github.com:Dao-AILab/flash-attention.git
cd flash-attention
MAX_JOBS=8 python setup.py install
- Model Inference: Within the Docker container, you can run the model inference code as described in the Inference Example section.
Using Docker provides a clean, controlled environment that minimizes issues related to software versions and dependencies.
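Before running inference inside the container, it is worth confirming that PyTorch can actually see the GPU passed through by --gpus all (a quick sanity check):
# run inside the container to confirm GPU access (sketch)
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))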
Model Weights
You can download the model weights from the following links:
- aiXcoder Base Download
- aiXcoder Instruct Download (Coming soon...)
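If you prefer to fetch the base weights programmatically, the snippet below is a hedged sketch using the huggingface_hub library and the aiXcoder/aixcoder-7b-base repository id that appears in the transformers example further down (install huggingface_hub first if it is not already present):
# download the base model weights from the Hugging Face Hub (sketch)
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="aiXcoder/aixcoder-7b-base",      # repo id taken from the transformers example below
    local_dir="./aixcoder-7b-base-weights",   # hypothetical local target directory
)
print("Weights downloaded to:", local_dir)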
💻 Usage Examples
Basic Usage
Command Line Execution
For a quick start, you can run the model inference directly from the command line:
torchrun --nproc_per_node 1 sess_megatron.py --model_dir "path/to/model_weights_dir"
Replace "path/to/model_weights_dir" with the actual path to your downloaded model weights.
Or run inference with Hugging Face's transformers library:
python sess_huggingface.py
Python Script Execution
Alternatively, you can invoke the model programmatically within your Python scripts. This method provides more flexibility for integrating the model into your applications or workflows. Here's a simple example of how to do it:
from sess_megatron import TestInference

infer = TestInference()
res = infer.run_infer(
    # for FIM style input, code_string stands for prefix context
    code_string="""# 快速排序算法""",  # prompt comment meaning "# quick sort algorithm"
    # for FIM style input, later_code stands for suffix context
    later_code="\n",
    # file_path should be a path from project to file
    file_path="test.py",
    # max num for generated tokens
    max_new_tokens=256,
)
print(res)

"""output:

def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[0]
    less = [i for i in arr[1:] if i <= pivot]
    greater = [i for i in arr[1:] if i > pivot]
    return quick_sort(less) + [pivot] + quick_sort(greater)


# 测试
arr = [3, 2, 1, 4, 5]
print(quick_sort(arr))  # [1, 2, 3, 4, 5]
"""
import torch
import sys
from hf_mini.utils import input_wrapper
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

tokenizer = AutoTokenizer.from_pretrained("aiXcoder/aixcoder-7b-base")
model = AutoModelForCausalLM.from_pretrained("aiXcoder/aixcoder-7b-base", torch_dtype=torch.bfloat16)

text = input_wrapper(
    # for FIM style input, code_string stands for prefix context
    code_string="# 快速排序算法",  # prompt comment meaning "# quick sort algorithm"
    # for FIM style input, later_code stands for suffix context
    later_code="\n# 测试\narr = [3, 2, 1, 4, 5]\nprint(quick_sort(arr)) # [1, 2, 3, 4, 5]",  # suffix begins with a "# test" block
    # file_path should be a path from project to file
    path="test.py"
)

if len(text) == 0:
    sys.exit()

inputs = tokenizer(text, return_tensors="pt", return_token_type_ids=False)
inputs = inputs.to(device)
model.to(device)

outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=False))
"""output:
def quick_sort(arr):
# 如果数组长度小于等于1,直接返回
if len(arr) <= 1:
return arr
# 选择数组的第一个元素作为基准
pivot = arr[0]
# 初始化左右指针
left, right = 1, len(arr) - 1
# 循环直到左指针小于右指针
while left < right:
# 从右到左找到第一个小于基准的元素,与左指针元素交换
if arr[right] < pivot:
arr[left], arr[right] = arr[right], arr[left]
left += 1
# 从左到右找到第一个大于等于基准的元素,与右指针元素交换
if arr[left] >= pivot:
right -= 1
# 将基准元素与左指针元素交换
arr[left], arr[0] = arr[0], arr[left]
# 对左半部分进行递归排序
quick_sort(arr[:left])
# 对右半部分进行递归排序
quick_sort(arr[left + 1:])
return arr</s>
"""
📄 License
The model weights are licensed under the Model License for academic research use; for commercial use, please apply by sending an email to support@aiXcoder.com.
🔗 Acknowledgments
We would like to thank all contributors to the open-source projects and datasets that made this work possible.
Thank you for your interest in our Code Large Language Model. We look forward to your contributions and feedback!

