Codestral-22B-v0.1: Open-Weight Code Generation Model - Code Generation, Explanation, and Refactoring in 80+ Languages


Codestral 22B V0.1 Imat GGUF

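By qwp4w3hyb

Codestral-22B-v0.1 is a large code generation model developed by Mistral AI. It supports more than 80 programming languages and is suited to code generation, explanation, and refactoring tasks.

Large Language Model · Other · License: Other · #multilingual-code-generation #fill-in-the-middle #coding-assistant

Downloads: 362

Released: 5/30/2024

Model Overview

The model is trained on a diverse dataset of programming languages. It can be queried as an instruct model or as a fill-in-the-middle (FIM) model, and is particularly well suited to code generation and explanation.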

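Model Features

Multi-language code support

Supports more than 80 programming languages, including mainstream ones such as Python, Java, C/C++, and JavaScript

Two modes of use

Can be used as an instruct model (Q&A and generation) or as a fill-in-the-middle (FIM) model

Optimized quantization

Quantized with an importance matrix to reduce accuracy loss, with quantization types ranging from Q8_0 down to IQ1_S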

Model Capabilities

Code generation

Code explanation

Code refactoring

Fill-in-the-middle code prediction

Multi-language code support

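Use Cases

Development assistance

Code generation

Generates code snippets from natural-language descriptions

Example: generated a Rust function that computes Fibonacci numbers

Code completion

Completes code snippets automatically inside an IDE

Example: correctly filled in the middle of a Python function

Education

Code explanation

Explains what a complex code snippet does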

🚀 Codestral-22B-v0.1-hf-iMat-GGUF-iMat-GGUF

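Codestral-22B-v0.1 is trained on a diverse dataset covering more than 80 programming languages. It handles code-related question answering and generation, and also supports FIM mode, which makes it useful for tasks such as software development tooling.

🚀 Quick Start

mistralai/Codestral-22B-v0.1 can be used with mistral-inference; the steps below walk through it.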

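✨ Key Features

Includes the fix for the tokenizer, which was broken in the initial release.
Quantized with an importance matrix to reduce quantization loss.
GGUFs and the importance matrix are generated from bf16 for "optimal" accuracy loss.
Wide coverage of GGUF quantization types, from Q8_0 down to IQ1_S.
Quantized with a specific llama.cpp commit (master as of May 30, 2024).
Importance matrix generated with bartowski's multi-purpose dataset.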
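
To make the idea behind importance-matrix quantization concrete, here is a toy sketch in plain Python. It is not llama.cpp's implementation: the function names `weighted_error` and `quantize_block`, the candidate-scale search, and the sample numbers are all assumptions chosen for illustration. It only shows the general principle that weights marked as more important contribute more to the error being minimized when a block's scale is chosen.

```python
def weighted_error(weights, quants, scale, importance):
    """Importance-weighted squared error between weights and their dequantized values."""
    return sum(
        imp * (w - q * scale) ** 2
        for w, q, imp in zip(weights, quants, importance)
    )


def quantize_block(weights, importance, bits=4, steps=200):
    """Pick the scale for one block that minimizes the importance-weighted error.

    Hypothetical helper: tries a range of candidate scales around the naive
    max-abs scale and keeps the one with the lowest weighted error.
    """
    qmax = 2 ** (bits - 1) - 1  # 7 for signed 4-bit values
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return 0.0, [0] * len(weights)

    best = None
    for i in range(1, steps + 1):
        # Candidate scales between roughly 0.5x and 1.5x the naive scale.
        scale = (max_abs / qmax) * (0.5 + i / steps)
        quants = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
        err = weighted_error(weights, quants, scale, importance)
        if best is None or err < best[0]:
            best = (err, scale, quants)
    return best[1], best[2]


# Made-up block: the first and fourth weights are marked as most important.
block = [0.9, -0.05, 0.31, -0.62, 0.02, 0.47]
imp = [10.0, 0.1, 0.1, 5.0, 0.1, 0.1]

scale_imp, q_imp = quantize_block(block, imp)
scale_uni, q_uni = quantize_block(block, [1.0] * len(block))

print("importance-aware:", weighted_error(block, q_imp, scale_imp, imp))
print("uniform weights: ", weighted_error(block, q_uni, scale_uni, imp))
```

Measured with the importance weights, the importance-aware choice can never be worse than the uniform one here, because both search the same set of candidate scales.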

📦 Installation

We recommend using mistralai/Codestral-22B-v0.1 with mistral-inference, which can be installed with:

pip install mistral_inference

💻 Usage Examples

Basic usage

Download the model

from huggingface_hub import snapshot_download
from pathlib import Path

mistral_models_path = Path.home().joinpath('mistral_models', 'Codestral-22B-v0.1')
mistral_models_path.mkdir(parents=True, exist_ok=True)

snapshot_download(repo_id="mistralai/Codestral-22B-v0.1", allow_patterns=["params.json", "consolidated.safetensors", "tokenizer.model.v3"], local_dir=mistral_models_path)
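
After the download finishes, it can help to confirm that the three files named in `allow_patterns` actually landed in the target folder before loading the model. The check below is a minimal sketch; `missing_model_files` and `EXPECTED_FILES` are hypothetical names introduced here, not part of huggingface_hub or mistral-inference.

```python
from pathlib import Path

# File names taken from the allow_patterns list in the download step.
EXPECTED_FILES = ["params.json", "consolidated.safetensors", "tokenizer.model.v3"]


def missing_model_files(folder, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in expected if not (folder / name).is_file()]


if __name__ == "__main__":
    model_dir = Path.home().joinpath("mistral_models", "Codestral-22B-v0.1")
    missing = missing_model_files(model_dir)
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present in", model_dir)
```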

Chat mode

After installing mistral_inference, a mistral-chat CLI command is available in your environment.

mistral-chat $HOME/mistral_models/Codestral-22B-v0.1 --instruct --max_tokens 256

For the prompt "Write me a function that computes fibonacci in Rust", it should produce an answer along the following lines:

Sure, here's a simple implementation of a function that computes the Fibonacci sequence in Rust. This function takes an integer `n` as an argument and returns the `n`th Fibonacci number.

fn fibonacci(n: u32) -> u32 {
    match n {
        0 => 0,
        1 => 1,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

fn main() {
    let n = 10;
    println!("The {}th Fibonacci number is: {}", n, fibonacci(n));
}

This function uses recursion to calculate the Fibonacci number. However, it's not the most efficient solution because it performs a lot of redundant calculations. A more efficient solution would use a loop to iteratively calculate the Fibonacci numbers.
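
The closing remark in the sample answer, that a loop avoids the redundant work of the recursive version, can be sketched as follows. This is an illustrative Python version written for this page, not model output.

```python
def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number (fibonacci(0) == 0) using a loop."""
    if n < 0:
        raise ValueError("n must be non-negative")
    a, b = 0, 1
    for _ in range(n):
        # Slide the pair (F(k), F(k+1)) forward one step.
        a, b = b, a + b
    return a


print(fibonacci(10))  # prints 55
```

Each step does constant work, so the loop runs in time linear in `n`, whereas the recursive version recomputes the same subproblems repeatedly.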

Advanced usage

Fill-in-the-middle (FIM)

After installing mistral_inference, run pip install --upgrade mistral_common to make sure you have mistral_common>=1.2 installed:

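from pathlib import Path

from mistral_inference.model import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.tokens.instruct.request import FIMRequest

# Load the v3 tokenizer and the weights from the folder used in the download step.
# Python does not expand "~" in a path string automatically, so build the path explicitly.
tokenizer = MistralTokenizer.v3()
model = Transformer.from_folder(Path.home().joinpath('mistral_models', 'Codestral-22B-v0.1'))

# The model is asked to generate the code that belongs between prefix and suffix.
prefix = """def add("""
suffix = """    return sum"""

request = FIMRequest(prompt=prefix, suffix=suffix)

tokens = tokenizer.encode_fim(request).tokens

# temperature=0.0 gives greedy decoding.
out_tokens, _ = generate([tokens], model, max_tokens=256, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.decode(out_tokens[0])

# Keep only the text generated before the suffix reappears.
middle = result.split(suffix)[0].strip()
print(middle)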

This should produce output along the following lines:

num1, num2):

    # Add two numbers
    sum = num1 + num2

    # return the sum
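
To see how the generated middle fits back between the prefix and the suffix, the sketch below reassembles the three pieces from the example above and runs the result. `assemble_fim` is a hypothetical helper, and reinserting a newline before the suffix is an assumption that holds for this example because the middle was stripped of trailing whitespace.

```python
def assemble_fim(prefix: str, middle: str, suffix: str) -> str:
    """Join the prefix, the generated middle, and the suffix into one source string."""
    # Assumption: the stripped middle ends mid-block, so the suffix starts on a new line.
    return f"{prefix}{middle}\n{suffix}\n"


prefix = "def add("
suffix = "    return sum"
# The example output shown above, copied as a string.
middle = (
    "num1, num2):\n"
    "\n"
    "    # Add two numbers\n"
    "    sum = num1 + num2\n"
    "\n"
    "    # return the sum"
)

source = assemble_fim(prefix, middle, suffix)
print(source)

# Execute the reassembled function in an isolated namespace to confirm it is valid Python.
namespace = {}
exec(source, namespace)
print(namespace["add"](2, 3))  # prints 5
```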

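📚 Documentation

Limitations

Codestral-22B-v0.1 does not have any moderation mechanisms. We look forward to working with the community on ways to make the model respect guardrails more closely, so that it can be deployed in environments that require moderated outputs.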

License

Codestral-22B-v0.1 is released under the MNPL-0.1 license (Mistral AI Non-Production License).

Development Team

Albert Jiang, Alexandre Sablayrolles, Alexis Tacnet, Antoine Roux, Arthur Mensch, Audrey Herblin-Stoop, Baptiste Bout, Baudouin de Monicault, Blanche Savary, Bam4d, Caroline Feldman, Devendra Singh Chaplot, Diego de las Casas, Eleonore Arcelin, Emma Bou Hanna, Etienne Metzger, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Harizo Rajaona, Henri Roussez, Jean-Malo Delignon, Jia Li, Justus Murke, Kartik Khandelwal, Lawrence Stewart, Louis Martin, Louis Ternon, Lucile Saulnier, Lélio Renard Lavaud, Margaret Jennings, Marie Pellat, Marie Torelli, Marie-Anne Lachaux, Marjorie Janiewicz, Mickael Seznec, Nicolas Schuhl, Patrick von Platen, Romain Sauvestre, Pierre Stock, Sandeep Subramanian, Saurabh Garg, Sophia Yang, Szymon Antoniak, Teven Le Scao, Thibaut Lavril, Thibault Schueller, Timothée Lacroix, Théophile Gervet, Thomas Wang, Valera Nemychnikova, Wendy Shang, William El Sayed, William Marshall