Codestral-22B-v0.1: a code generation model supporting generation, explanation, and refactoring in 80+ languages

Codestral 22B V0.1 Imat GGUF

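Developed by qwp4w3hyb

Codestral-22B-v0.1 is a large code generation model developed by Mistral AI. It supports 80+ programming languages and is suited to code generation, explanation, and refactoring tasks.

Large language model | License: other | Tags: #multilingual-code-generation #fill-in-the-middle #coding-assistant

Downloads: 362

Released: 5/30/2024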

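Model overview

The model is trained on a diverse dataset of programming languages and can be used either as an instruct model or as a fill-in-the-middle (FIM) model, which makes it well suited to code generation and explanation.

Model features

Multilingual code support

Supports 80+ programming languages, including mainstream ones such as Python, Java, C/C++, and JavaScript

Two modes of use

Can be used as an instruct model (question answering and generation) or as a fill-in-the-middle (FIM) model

Optimized quantization

Quantized with an importance matrix to reduce accuracy loss; available in quantization types from Q8_0 down to IQ1_S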

Model capabilities

Code generation

Code explanation

Code refactoring

Fill-in-the-middle code prediction

Multilingual code support

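Use cases

Development assistance

Code generation

Generate code snippets from natural-language descriptions

Example: generated a Rust function that computes Fibonacci numbers

Code completion

Complete code snippets inside an IDE

Example: filled in the middle part of a Python function

Education

Code explanation

Explain what complex code snippets do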

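🚀 Codestral-22B-v0.1-hf-iMat-GGUF

Codestral-22B-v0.1 is trained on a diverse dataset of 80+ programming languages. It handles code-related question answering and generation, and it also supports FIM mode, which makes it useful in software development workflows.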

🚀 Quick start

The steps below run mistralai/Codestral-22B-v0.1 with mistral-inference.

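✨ Key features

Includes a fix for the tokenizer, which was broken in the initial release.
Quantized with an importance matrix to reduce quantization loss.
GGUFs and the importance matrix are generated from bf16 for "optimal" accuracy loss.
Wide coverage of GGUF quantization types, from Q8_0 down to IQ1_S.
Quantized with a specific llama.cpp commit (master as of 2024-05-30).
The importance matrix is generated with bartowski's multi-purpose dataset.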
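To make the idea of importance-weighted quantization concrete, here is a minimal pure-Python sketch. It is not llama.cpp's algorithm: the function names, the 4-bit range (`qmax=7`), and the scale search grid are all assumptions chosen for illustration. It only shows the general principle that a per-weight importance value can steer which rounding errors matter most when picking a block scale.

```python
from typing import List, Sequence


def quantize(block: Sequence[float], scale: float, qmax: int) -> List[int]:
    """Round each weight to an integer in [-qmax, qmax] using one shared scale."""
    return [max(-qmax, min(qmax, round(x / scale))) for x in block]


def weighted_error(
    block: Sequence[float],
    quantized: Sequence[int],
    scale: float,
    importance: Sequence[float],
) -> float:
    """Squared reconstruction error, with each weight's error scaled by its importance."""
    return sum(w * (x - scale * q) ** 2 for x, q, w in zip(block, quantized, importance))


def pick_scale(block: Sequence[float], importance: Sequence[float], qmax: int = 7) -> float:
    """Try a few candidate scales and keep the one with the lowest weighted error.

    The candidate grid (70% to 130% of the max-abs scale) is a hypothetical choice.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 1.0  # all-zero block: any scale reconstructs it exactly
    base = max_abs / qmax
    candidates = [base * (i / 20) for i in range(14, 27)]
    return min(
        candidates,
        key=lambda s: weighted_error(block, quantize(block, s, qmax), s, importance),
    )


# Toy block of weights and made-up importance values (illustrative data only).
block = [0.9, -0.05, 0.02, 0.4, -0.7, 0.01, 0.3, -0.2]
importance = [1.0, 0.1, 0.1, 5.0, 1.0, 0.1, 3.0, 0.5]

scale = pick_scale(block, importance)
restored = [scale * q for q in quantize(block, scale, 7)]
print("chosen scale:", scale)
print("restored weights:", restored)
```

Because the plain max-abs scale is one of the candidates, the chosen scale never has a higher weighted error than that baseline under this sketch's assumptions.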

📦 Installation

The recommended way to run mistralai/Codestral-22B-v0.1 is with mistral-inference, which can be installed with:

pip install mistral_inference

💻 Usage examples

Basic usage

Download the model

from huggingface_hub import snapshot_download
from pathlib import Path

mistral_models_path = Path.home().joinpath('mistral_models', 'Codestral-22B-v0.1')
mistral_models_path.mkdir(parents=True, exist_ok=True)

snapshot_download(repo_id="mistralai/Codestral-22B-v0.1", allow_patterns=["params.json", "consolidated.safetensors", "tokenizer.model.v3"], local_dir=mistral_models_path)
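After the download finishes, it can help to confirm that the three files requested through `allow_patterns` actually landed in the target folder before starting the CLI. The helper below is a small sketch for that check; `missing_files` and `REQUIRED_FILES` are hypothetical names introduced here, not part of huggingface_hub or mistral_inference.

```python
from pathlib import Path
from typing import Iterable, List

# File names taken from the allow_patterns list in the download step.
REQUIRED_FILES = ("params.json", "consolidated.safetensors", "tokenizer.model.v3")


def missing_files(model_dir: Path, required: Iterable[str] = REQUIRED_FILES) -> List[str]:
    """Return the required file names that are not present in model_dir."""
    return [name for name in required if not (model_dir / name).is_file()]


if __name__ == "__main__":
    # Same folder the download step writes to.
    model_dir = Path.home() / "mistral_models" / "Codestral-22B-v0.1"
    missing = missing_files(model_dir)
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```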

Chat mode

After installing mistral_inference, a mistral-chat CLI command is available in your environment.

mistral-chat $HOME/mistral_models/Codestral-22B-v0.1 --instruct --max_tokens 256

This starts a chat session. Entering the prompt "Write me a function that computes fibonacci in Rust" produces a response like the following:

Sure, here's a simple implementation of a function that computes the Fibonacci sequence in Rust. This function takes an integer `n` as an argument and returns the `n`th Fibonacci number.

fn fibonacci(n: u32) -> u32 {
    match n {
        0 => 0,
        1 => 1,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

fn main() {
    let n = 10;
    println!("The {}th Fibonacci number is: {}", n, fibonacci(n));
}

This function uses recursion to calculate the Fibonacci number. However, it's not the most efficient solution because it performs a lot of redundant calculations. A more efficient solution would use a loop to iteratively calculate the Fibonacci numbers.
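The sample response above mentions that a loop avoids the redundant work of naive recursion. As an illustration only (this is not model output, and it is written in Python rather than Rust), an iterative version might look like the sketch below. It also makes it easy to see that the 48th Fibonacci number no longer fits in the `u32` return type used by the Rust sample.

```python
def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number (fibonacci(0) == 0) using a simple loop."""
    if n < 0:
        raise ValueError("n must be non-negative")
    a, b = 0, 1
    for _ in range(n):
        # Slide the window forward: (F(k), F(k+1)) -> (F(k+1), F(k+2))
        a, b = b, a + b
    return a


U32_MAX = 2**32 - 1

print(fibonacci(10))            # 55
print(fibonacci(47) <= U32_MAX)  # True
print(fibonacci(48) <= U32_MAX)  # False
```

The loop runs in linear time with constant extra memory, whereas the recursive sample recomputes the same subproblems many times.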

Advanced usage

Fill-in-the-middle (FIM)

After installing mistral_inference, run pip install --upgrade mistral_common to make sure mistral_common>=1.2 is installed, then:

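from pathlib import Path

from mistral_inference.model import Transformer
from mistral_inference.generate import generate
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.tokens.instruct.request import FIMRequest

tokenizer = MistralTokenizer.v3()
# Load the weights from the folder used in the download step; a literal "~" is not expanded automatically
model = Transformer.from_folder(str(Path.home() / "mistral_models" / "Codestral-22B-v0.1"))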

prefix = """def add("""
suffix = """    return sum"""

request = FIMRequest(prompt=prefix, suffix=suffix)

tokens = tokenizer.encode_fim(request).tokens

out_tokens, _ = generate([tokens], model, max_tokens=256, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.decode(out_tokens[0])

middle = result.split(suffix)[0].strip()
print(middle)

Example output:

num1, num2):

    # Add two numbers
    sum = num1 + num2

    # return the sum
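To see how the three pieces fit together, the sketch below stitches the prefix, the generated middle, and the suffix back into one function and runs it. `assemble_fim` is a hypothetical helper, and the newline it inserts before the suffix is an assumption that matches this example, where the middle was stripped and the suffix begins a new line.

```python
def assemble_fim(prefix: str, middle: str, suffix: str) -> str:
    """Join prefix, generated middle, and suffix into a single source string."""
    # Assumption: the middle was stripped, so restore the line break before the suffix.
    separator = "" if middle.endswith("\n") or not suffix else "\n"
    return prefix + middle + separator + suffix


prefix = "def add("
suffix = "    return sum"
# The middle text shown in the example output above.
middle = (
    "num1, num2):\n"
    "\n"
    "    # Add two numbers\n"
    "    sum = num1 + num2\n"
    "\n"
    "    # return the sum"
)

source = assemble_fim(prefix, middle, suffix)
print(source)

# Only exec code you trust; this is the known example from this section.
namespace = {}
exec(source, namespace)
print(namespace["add"](2, 3))  # 5
```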

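📚 Documentation

Limitations

Codestral-22B-v0.1 does not have any moderation mechanisms. We look forward to engaging with the community on ways to make the model respect guardrails, so that it can be deployed in environments that require moderated outputs.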

License

Codestral-22B-v0.1 is released under the MNPL-0.1 license (Mistral AI Non-Production License).

Development team

Albert Jiang, Alexandre Sablayrolles, Alexis Tacnet, Antoine Roux, Arthur Mensch, Audrey Herblin-Stoop, Baptiste Bout, Baudouin de Monicault, Blanche Savary, Bam4d, Caroline Feldman, Devendra Singh Chaplot, Diego de las Casas, Eleonore Arcelin, Emma Bou Hanna, Etienne Metzger, Gianna Lengyel, Guillaume Bour, Guillaume Lample, Harizo Rajaona, Henri Roussez, Jean-Malo Delignon, Jia Li, Justus Murke, Kartik Khandelwal, Lawrence Stewart, Louis Martin, Louis Ternon, Lucile Saulnier, Lélio Renard Lavaud, Margaret Jennings, Marie Pellat, Marie Torelli, Marie-Anne Lachaux, Marjorie Janiewicz, Mickael Seznec, Nicolas Schuhl, Patrick von Platen, Romain Sauvestre, Pierre Stock, Sandeep Subramanian, Saurabh Garg, Sophia Yang, Szymon Antoniak, Teven Le Scao, Thibaut Lavril, Thibault Schueller, Timothée Lacroix, Théophile Gervet, Thomas Wang, Valera Nemychnikova, Wendy Shang, William El Sayed, William Marshall