Devstral-Small-2507-GGUF: an open-source large language model for software engineering, with tool calling and multi-file editing

Devstral Small 2507 GGUF

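GGUF release by unsloth

Devstral 1.1 is a large language model designed for software engineering tasks. It supports tool calling and vision, and is suited to exploring codebases and editing multiple files.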

Large language model · Multilingual · License: Apache-2.0 · #AgenticCoding #MultilingualSoftwareEngineering #128kContext

Downloads: 16.16k

Released: 7/10/2025

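Model overview

Devstral Small 1.1 is a lightweight large language model designed for agentic coding tasks. It supports multiple languages and tool calling, and is suited to local deployment and on-device use.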

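Model features

Agentic coding

Designed for agentic coding tasks, making it a good fit for software engineering agents.

Lightweight

With only 24 billion parameters, it runs on a single RTX 4090 or a Mac with 32GB of RAM, which suits local deployment and on-device use.

Open license

Released under the Apache 2.0 license, which permits use and modification for both commercial and non-commercial purposes.

Long context window

Supports a 128k-token context window, suited to long inputs and complex tasks.

Tool calling support

Supports tool calling to explore codebases and edit multiple files efficiently.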

Model capabilities

Text generation

Code generation

Code editing

Multilingual support

Tool calling

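Use cases

Software development

Codebase analysis

Analyze a codebase's test coverage and generate visualizations.

Produces a coverage distribution chart, a pie chart, and a summary chart.

Game development

Build a web video game that blends Space Invaders and Pong.

Creates a game with two-player controls and an invader shooting mechanic.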

🚀 Devstral Small 1.1

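Devstral 1.1 is a large language model built for software engineering tasks. It supports tool calling, with optional vision support. It uses tools to explore codebases and edit multiple files, and is intended to power software engineering agents.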

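Supported languages

English, French, German, Spanish, Portuguese, Italian, Japanese, Korean, Russian, Chinese, Arabic, Persian, Indonesian, Malay, Nepali, Polish, Romanian, Serbian, Swedish, Turkish, Ukrainian, Vietnamese, Hindi, Bengali

License

This project is released under the Apache-2.0 license.

Important note

In llama.cpp, you must pass --jinja to enable the system prompt.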

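Free fine-tuning and more resources

Free fine-tuning: fine-tune Mistral v0.3 (7B) for free with our Google Colab notebook.
Blog post: details on Devstral 1.1 support.
Other notebooks: more are available.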

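✨ Key features

Agentic coding: designed for agentic coding tasks, making it a good fit for software engineering agents.
Lightweight: with only 24 billion parameters, it runs on a single RTX 4090 or a Mac with 32GB of RAM, which suits local deployment and on-device use.
Open license: released under the Apache 2.0 license, which permits use and modification for both commercial and non-commercial purposes.
Context window: supports a 128k-token context window.
Tokenizer: uses the Tekken tokenizer with a 131k vocabulary.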
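
As a rough check on the hardware claim above, the weight footprint can be estimated as parameters × bits per weight / 8. The sketch below is back-of-the-envelope arithmetic only: the bits-per-weight figures for the quantized formats are approximations assumed for illustration, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
# Rough weight-only memory estimate for a 24B-parameter model.
# Assumption: the bits-per-weight values below are approximate and
# illustrative; real GGUF files vary by quantization recipe.
PARAMS = 24e9  # 24 billion parameters, as stated above

APPROX_BITS_PER_WEIGHT = {
    "bf16": 16.0,
    "q8_0 (approx.)": 8.5,
    "q4_k_m (approx.)": 4.8,
}


def weights_gb(params: float, bits_per_weight: float) -> float:
    """Return the weight footprint in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


for name, bits in APPROX_BITS_PER_WEIGHT.items():
    print(f"{name:>18}: {weights_gb(PARAMS, bits):5.1f} GB")
```

Under these assumptions, 16-bit weights alone need about 48 GB, so it is the quantized builds that fit within the 24 GB of an RTX 4090.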

📦 Installation guide

API usage

Create a Mistral account and get an API key: follow the instructions.
Start the OpenHands Docker container:

export MISTRAL_API_KEY=<MY_KEY>

mkdir -p ~/.openhands && echo '{"language":"en","agent":"CodeActAgent","max_iterations":null,"security_analyzer":null,"confirmation_mode":false,"llm_model":"mistral/devstral-small-2507","llm_api_key":"'$MISTRAL_API_KEY'","remote_runtime_resource_factor":null,"github_token":null,"enable_default_condenser":true}' > ~/.openhands/settings.json

docker pull docker.all-hands.dev/all-hands-ai/runtime:0.48-nikolaik

docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.48-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands:/.openhands \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.48
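
The settings step above builds JSON by splicing the API key into a shell string, which is easy to get wrong when quoting. A minimal Python sketch of the same step follows. The field names and values are copied from the command above, while the helper names `build_settings` and `write_settings` are hypothetical. It assumes the settings file belongs in `~/.openhands`, the directory mounted into the container by `-v ~/.openhands:/.openhands`.

```python
import json
from pathlib import Path


def build_settings(api_key: str) -> dict:
    """Return the OpenHands settings shown in the shell command above."""
    return {
        "language": "en",
        "agent": "CodeActAgent",
        "max_iterations": None,
        "security_analyzer": None,
        "confirmation_mode": False,
        "llm_model": "mistral/devstral-small-2507",
        "llm_api_key": api_key,
        "remote_runtime_resource_factor": None,
        "github_token": None,
        "enable_default_condenser": True,
    }


def write_settings(api_key: str, directory: Path) -> Path:
    """Write settings.json into `directory`, creating it if needed."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / "settings.json"
    path.write_text(json.dumps(build_settings(api_key)), encoding="utf-8")
    return path


# Preview the JSON with a placeholder key; nothing is written here.
print(json.dumps(build_settings("<MY_KEY>"), indent=2))
```

To write the file for real, one would call `write_settings(os.environ["MISTRAL_API_KEY"], Path.home() / ".openhands")` after importing `os`.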

Local inference

The model can be deployed with the following libraries:

vLLM (recommended)
mistral-inference
transformers
LM Studio
llama.cpp
ollama

vLLM (recommended)

We recommend vLLM for production-grade inference pipelines.

Installation: make sure you have vLLM >= 0.9.1 and mistral_common >= 1.7.0 installed:

pip install vllm --upgrade
pip install mistral-common --upgrade

Check the installation:

python -c "import mistral_common; print(mistral_common.__version__)"

You can also use the Docker image, or the image on Docker Hub.

Launch the server: we recommend using Devstral in a server/client setup.

Start the server:

vllm serve mistralai/Devstral-Small-2507 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --tensor-parallel-size 2

Query the server from a Python client:

import requests
import json
from huggingface_hub import hf_hub_download

url = "http://<your-server-url>:8000/v1/chat/completions"
headers = {"Content-Type": "application/json", "Authorization": "Bearer token"}

model = "mistralai/Devstral-Small-2507"

def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    return system_prompt

SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "<your-command>",
            },
        ],
    },
]

data = {"model": model, "messages": messages, "temperature": 0.15}

# Devstral Small 1.1 supports tool calling. To use tools, proceed as follows:
# tools = [ # Define tools for vLLM
#     {
#         "type": "function",
#         "function": {
#             "name": "git_clone",
#             "description": "Clone a git repository",
#             "parameters": {
#                 "type": "object",
#                 "properties": {
#                     "url": {
#                         "type": "string",
#                         "description": "The URL of the git repository",
#                     },
#                 },
#                 "required": ["url"],
#             },
#         },
#     }
# ]
# data = {"model": model, "messages": messages, "temperature": 0.15, "tools": tools} # Pass the tools in the payload

response = requests.post(url, headers=headers, data=json.dumps(data))
print(response.json()["choices"][0]["message"]["content"])
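
When `tools` is passed in the payload, the reply may carry tool calls instead of text, in which case `content` can be empty and the final `print` above shows nothing useful. The following sketch reads both cases from a chat completion response. It assumes the server returns `tool_calls` in the usual OpenAI-compatible shape; the helper name `summarize_reply` and the sample response are made up for illustration.

```python
import json


def summarize_reply(response_json: dict) -> list[str]:
    """Turn a chat completion response into printable lines.

    Assumes the OpenAI-compatible shape: choices[0].message holds either
    text in `content` or a list in `tool_calls`, where each call carries
    function.name and function.arguments (a JSON-encoded string).
    """
    message = response_json["choices"][0]["message"]
    lines = []
    if message.get("content"):
        lines.append(f"text: {message['content']}")
    for call in message.get("tool_calls") or []:
        function = call["function"]
        arguments = json.loads(function["arguments"])
        lines.append(f"tool call: {function['name']}({arguments})")
    return lines


# Hand-written sample response, not real model output.
sample = {
    "choices": [{
        "message": {
            "content": None,
            "tool_calls": [{
                "type": "function",
                "function": {
                    "name": "git_clone",
                    "arguments": '{"url": "https://example.com/repo.git"}',
                },
            }],
        }
    }]
}

for line in summarize_reply(sample):
    print(line)
```

With the request code above, this would be called as `summarize_reply(response.json())`.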

Mistral-inference

We recommend mistral-inference for quickly trying out Devstral.

Installation: make sure you have mistral_inference >= 1.6.0 installed:

pip install mistral_inference --upgrade

Download the model:

from huggingface_hub import snapshot_download
from pathlib import Path

mistral_models_path = Path.home().joinpath('mistral_models', 'Devstral')
mistral_models_path.mkdir(parents=True, exist_ok=True)

snapshot_download(repo_id="mistralai/Devstral-Small-2507", allow_patterns=["params.json", "consolidated.safetensors", "tekken.json"], local_dir=mistral_models_path)

Start a chat:

mistral-chat $HOME/mistral_models/Devstral --instruct --max_tokens 300

Transformers

To make the most of the model with the transformers library, install mistral-common >= 1.7.0 to use our tokenizer:

pip install mistral-common --upgrade

Load the tokenizer and model, then generate text:

import torch

from mistral_common.protocol.instruct.messages import (
    SystemMessage, UserMessage
)
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from huggingface_hub import hf_hub_download
from transformers import AutoModelForCausalLM

def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    return system_prompt

model_id = "mistralai/Devstral-Small-2507"
SYSTEM_PROMPT = load_system_prompt(model_id, "SYSTEM_PROMPT.txt")

tokenizer = MistralTokenizer.from_hf_hub(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

tokenized = tokenizer.encode_chat_completion(
    ChatCompletionRequest(
        messages=[
            SystemMessage(content=SYSTEM_PROMPT),
            UserMessage(content="<your-command>"),
        ],
    )
)

output = model.generate(
    input_ids=torch.tensor([tokenized.tokens]),
    max_new_tokens=1000,
)[0]

decoded_output = tokenizer.decode(output[len(tokenized.tokens):])
print(decoded_output)

LM Studio

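Download the model weights from either of:

LM Studio GGUF repository (recommended): lmstudio-community/Devstral-Small-2507-GGUF
Our GGUF repository: mistralai/Devstral-Small-2507_gguf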

pip install -U "huggingface_hub[cli]"
# Alternatively, use mistralai/Devstral-Small-2507_gguf as the repository
huggingface-cli download \
"lmstudio-community/Devstral-Small-2507-GGUF" \
--include "Devstral-Small-2507-Q4_K_M.gguf" \
--local-dir "Devstral-Small-2507_gguf/"

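Serve the model locally with LM Studio:

Download and install LM Studio.
Install the lms CLI: ~/.lmstudio/bin/lms bootstrap.
In a bash terminal, change to the directory where you downloaded the model weights and run lms import Devstral-Small-2507-Q4_K_M.gguf.
Open the LM Studio application and click the terminal icon to enter developer mode. Choose to load the model Devstral Small 2507, toggle the status button to start the model, and in the settings turn on "Serve on Local Network".
In the right-hand tab, note the API identifier (devstral-small-2507) and the API address, for use in OpenHands or Cline.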

llama.cpp

Download the model weights from Hugging Face:

pip install -U "huggingface_hub[cli]"
huggingface-cli download \
"mistralai/Devstral-Small-2507_gguf" \
--include "Devstral-Small-2507-Q4_K_M.gguf" \
--local-dir "mistralai/Devstral-Small-2507_gguf/"

Run Devstral with the llama.cpp server:

./llama-server -m mistralai/Devstral-Small-2507_gguf/Devstral-Small-2507-Q4_K_M.gguf -c 0 # -c sets the context size; 0 uses the model's default (128k here).
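
Because the system prompt only takes effect in llama.cpp when `--jinja` is passed (see the important note earlier), it may help to assemble the command programmatically. The sketch below builds the `llama-server` argument list with `--jinja` added, plus a matching chat request body. It assumes `llama-server` sits in the current directory and serves its OpenAI-compatible API on its default port 8080. The helper names are hypothetical, and nothing is launched or sent.

```python
import json
import shlex

MODEL_PATH = "mistralai/Devstral-Small-2507_gguf/Devstral-Small-2507-Q4_K_M.gguf"


def server_command(model_path: str, context_size: int = 0) -> list[str]:
    """Build the llama-server argv; --jinja enables the chat template."""
    return ["./llama-server", "-m", model_path, "-c", str(context_size), "--jinja"]


def chat_request(system_prompt: str, user_text: str) -> dict:
    """Build a body for POST http://localhost:8080/v1/chat/completions."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
        "temperature": 0.15,  # same value as the vLLM client example
    }


print(shlex.join(server_command(MODEL_PATH)))
print(json.dumps(chat_request("<system-prompt>", "<your-command>"), indent=2))
```

The argument list can be handed to `subprocess.Popen`, and the request body can be posted with `requests.post` in the same way as the vLLM client example.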

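OpenHands (recommended)

Launch a server to deploy Devstral Small 1.1

Make sure you have launched an OpenAI-compatible server (such as vLLM or Ollama) as described above. You can then use OpenHands to interact with Devstral Small 1.1.

For example, to launch a vLLM server: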

vllm serve mistralai/Devstral-Small-2507 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --tensor-parallel-size 2

The server address has the form: http://<your-server-url>:8000/v1

Launch OpenHands

Install OpenHands by following the installation guide.

Launch OpenHands with the Docker image:

docker pull docker.all-hands.dev/all-hands-ai/runtime:0.48-nikolaik

docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.48-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands:/.openhands \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.48

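Access the OpenHands UI at http://localhost:3000.

Connect to the server

When you open the OpenHands UI, you are prompted to connect to a server. Use advanced mode to connect to the server you launched earlier, and fill in the following fields:

Custom Model: openai/mistralai/Devstral-Small-2507
Base URL: http://<your-server-url>:8000/v1
API Key: token (or whichever token you used when launching the server)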
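
Before filling in these fields, it can be worth confirming that the server answers at the base URL. The sketch below is one way to do that with the standard library: it prepares a request to the `/models` endpoint that OpenAI-compatible servers such as vLLM expose, and derives the Custom Model value. The helper names are hypothetical, and the `openai/` prefix is taken from the field value above rather than from OpenHands documentation.

```python
import json
import urllib.request


def models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request for <base_url>/models with a bearer token."""
    url = base_url.rstrip("/") + "/models"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def custom_model_name(served_model: str) -> str:
    """Prefix the served model id the way the Custom Model field shows it."""
    return "openai/" + served_model


def list_served_models(base_url: str, api_key: str, timeout: float = 5.0) -> list[str]:
    """Return the model ids reported by the server (performs a network call)."""
    request = models_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return [entry["id"] for entry in payload["data"]]


print(custom_model_name("mistralai/Devstral-Small-2507"))
```

Calling `list_served_models("http://<your-server-url>:8000/v1", "token")` against a running server should list the id passed to `vllm serve`. A connection error at this point suggests the Base URL is wrong.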

See settings

![OpenHands settings](assets/open_hands_config.png)

Cline

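Launch a server to deploy Devstral Small 1.1

Make sure you have launched an OpenAI-compatible server (such as vLLM or Ollama) as described above. You can then use Cline to interact with Devstral Small 1.1.

For example, to launch a vLLM server: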

vllm serve mistralai/Devstral-Small-2507 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --tensor-parallel-size 2

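The server address has the form: http://<your-server-url>:8000/v1

Launch Cline

Install Cline by following the installation guide, then configure the server address in the settings.

See settings

![Cline settings](assets/cline_config.png)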

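💻 Usage examples

OpenHands: analyzing the test coverage of Mistral Common

Launch the OpenHands scaffold and link it to a repository to analyze test coverage and find poorly covered files. Taking the public mistral-common repository as an example: once the repository is mounted in the workspace, give the following instruction:

Check the test coverage of the repository, then create visualizations of the test coverage. Try plotting a few different types of charts and save them as png files.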

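The agent first browses the codebase to inspect the test configuration and structure.

[Figure: mistral common coverage - prompt]

It then sets up the test dependencies and launches the coverage run.

[Figure: mistral common coverage - dependencies]

Finally, the agent writes code to visualize the coverage, exports the results, and saves the charts as png files.

[Figure: mistral common coverage - visualization]

At the end of the run, the following charts are produced.

[Figure: mistral common coverage - coverage distribution]

The model can also explain the results.

[Figure: mistral common coverage - navigate]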

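Cline: building a video game

Initialize Cline in VSCode and connect it to the server you launched earlier, then give the following instruction to build a video game:

Create a web video game that blends Space Invaders and Pong.

Follow these rules:
- There are two players, one at the top of the screen and one at the bottom, each controlling a bar to bounce the ball.
- The first player uses the "a" and "d" keys; the second player uses the left and right arrow keys.
- The invaders sit in the center of the screen and look like the ones in Space Invaders. They shoot at the players at random and cannot be destroyed by the ball.
- The players' goal is to dodge the invaders' shots and send the ball to the opponent's edge.
- The ball bounces off the left and right edges.
- When the ball touches a player's edge, that player loses.
- When a player has been hit by 3 or more shots, that player loses.
- The last player standing wins.
- Display on the UI the number of times each player has hit the ball and their remaining health.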

[Figure: space invaders pong - prompt]

The agent first creates the game.

[Figure: space invaders pong - structure]

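📚 Detailed documentation

Model introduction

Devstral is a large language model developed in collaboration between Mistral AI and All Hands AI, and it performs well on software engineering tasks. It is fine-tuned from Mistral-Small-3.1 and has a long context window of 128k tokens. As a coding agent, Devstral is text-only; the vision encoder was removed before fine-tuning.

For enterprises with specialized needs (such as increased context or domain-specific knowledge), we will release commercial models beyond what Mistral AI contributes to the community.

What's new

Compared with Devstral Small 1.0, Devstral Small 1.1 brings the following updates:

Improved performance: see the benchmark results.
Better generalization: it remains strong when paired with OpenHands, and the new version generalizes better to other prompts and coding environments.
Function calling: supports the Mistral function calling format.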

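🔧 Technical details

Benchmark results

SWE-Bench

Devstral Small 1.1 scores 53.6% on SWE-Bench Verified, 6.8 percentage points above Devstral Small 1.0 and 11.4 percentage points above the next-best model listed.

Model	Agentic scaffold	SWE-Bench Verified (%)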
Devstral Small 1.1	OpenHands Scaffold	53.6
Devstral Small 1.0	OpenHands Scaffold	46.8
GPT-4.1-mini	OpenAI Scaffold	23.6
Claude 3.5 Haiku	Anthropic Scaffold	40.6
SWE-smith-LM 32B	SWE-agent Scaffold	40.2
Skywork SWE	OpenHands Scaffold	38.0
DeepSWE	R2E-Gym Scaffold	42.2
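
The two margins quoted above can be re-derived from the table. The short check below copies the scores from the table and computes the gaps in percentage points; nothing beyond the table values is assumed.

```python
# SWE-Bench Verified scores (%) copied from the table above.
scores = {
    "Devstral Small 1.1": 53.6,
    "Devstral Small 1.0": 46.8,
    "GPT-4.1-mini": 23.6,
    "Claude 3.5 Haiku": 40.6,
    "SWE-smith-LM 32B": 40.2,
    "Skywork SWE": 38.0,
    "DeepSWE": 42.2,
}

new = scores["Devstral Small 1.1"]
gain_over_previous = round(new - scores["Devstral Small 1.0"], 1)

# Best score among the models outside the Devstral family.
others = {k: v for k, v in scores.items() if not k.startswith("Devstral")}
runner_up = max(others, key=others.get)
gain_over_runner_up = round(new - others[runner_up], 1)

print(f"vs Devstral Small 1.0: +{gain_over_previous} points")
print(f"vs {runner_up}: +{gain_over_runner_up} points")
```

This reproduces the 6.8 and 11.4 point margins, with DeepSWE as the strongest non-Devstral entry.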