Devstral-Small-2505-GGUF open-source LLM: built for software engineering, with codebase exploration and multi-file editing

Devstral Small 2505 GGUF

GGUF release by unsloth

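Devstral is an agentic LLM designed for software engineering tasks, built through a collaboration between Mistral AI and All Hands AI. It excels at exploring codebases, editing multiple files, and powering software engineering agents.

Large language model · Multilingual · License: Apache-2.0 · #agentic-coding #128k-context #multi-language-programming

Downloads: 72.26k

Released: 5/21/2025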

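Model overview

Devstral is an agentic coding model fine-tuned from Mistral-Small-3.1. It has a context window of 128k tokens, focuses on software engineering tasks, and performs strongly on SWE-bench.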

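Model features

Agentic coding

Designed for agentic coding tasks, which makes it a good fit for software engineering agents

Lightweight

At 24 billion parameters, it is compact enough to run on a single RTX 4090 or a Mac with 32GB of RAM

Long context window

Supports a context window of 128k tokens

Multilingual support

Supports 24 languages, including major programming languages and natural languages

Open-source license

Released under the Apache 2.0 license, which permits both commercial and non-commercial use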

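Model capabilities

Codebase exploration

Multi-file editing

Software engineering agents

Multilingual text generation

Long-context processing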

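Use cases

Software development

Codebase exploration

Helps developers understand and navigate large codebases

Makes code comprehension and maintenance more efficient

Multi-file code editing

Edits several related code files at once

Keeps the code consistent and speeds up development

Automated software engineering

Acts as a software engineering agent that carries out development tasks automatically

Reduces repetitive work and speeds up the development process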

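🚀 Devstral - an enhanced large language model for code

Devstral is a large language model built for software engineering tasks, developed jointly by Mistral AI and All Hands AI. It handles code-related work well: it can explore codebases, edit multiple files, and power software engineering agents. The model scores strongly on the SWE-bench benchmark and ranks among the best open-source models.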

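🚀 Quick start

Free fine-tuning

You can fine-tune Mistral v0.3 (7B) for free using our Google Colab notebook.

Learn more

Read our blog post on Devstral support: docs.unsloth.ai/basics/devstral
Browse our other notebooks in our documentation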

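✨ Key features

Agentic coding: designed for agentic coding tasks, which makes it a good fit for software engineering agents.
Lightweight: with only 24 billion parameters, it is small enough to run on a single RTX 4090 or a Mac with 32GB of RAM, which suits local deployment and on-device use.
Open-source license: released under the Apache 2.0 license, which permits use and modification for both commercial and non-commercial purposes.
Long context window: a context window of up to 128k tokens.
Tokenizer: uses the Tekken tokenizer with a 131k vocabulary.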
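
The "single RTX 4090 or 32GB Mac" claim can be sanity-checked with back-of-the-envelope arithmetic. The sketch below estimates weight storage only. It assumes exactly 24 billion parameters and a rough average of 4.5 bits per weight for a 4-bit quantization; both figures are illustrative assumptions, real GGUF files vary, and the KV cache and activations need additional memory.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 2**30


NUM_PARAMS = 24e9  # assumption: "24 billion parameters" taken literally

for label, bits in [
    ("16-bit", 16),
    ("8-bit", 8),
    ("~4.5-bit (assumed average for a 4-bit quantization)", 4.5),
]:
    print(f"{label}: {weight_memory_gib(NUM_PARAMS, bits):.1f} GiB")
```

Under these assumptions the 16-bit weights alone exceed the 24 GB of an RTX 4090, while a 4-bit quantization leaves headroom for the context, which is consistent with the page pointing to quantized builds for local use.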

📊 Benchmark results

SWE-Bench

Devstral scores 46.8% on SWE-Bench Verified, about 6 percentage points above the previous best open-source model.

Model	Scaffold	SWE-Bench Verified (%)
Devstral	OpenHands Scaffold	46.8
GPT-4.1-mini	OpenAI Scaffold	23.6
Claude 3.5 Haiku	Anthropic Scaffold	40.6
SWE-smith-LM 32B	SWE-agent Scaffold	40.2
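
To make the size of the gap concrete, the scores in the table can be compared directly. This is only arithmetic on the numbers listed above; note that each row was obtained with a different scaffold, so the differences are indicative rather than a strict like-for-like comparison.

```python
# Scores copied from the table above (SWE-Bench Verified, %).
scores = {
    "Devstral": 46.8,
    "GPT-4.1-mini": 23.6,
    "Claude 3.5 Haiku": 40.6,
    "SWE-smith-LM 32B": 40.2,
}


def margins_over(all_scores: dict, model: str) -> dict:
    """Return the lead of `model` over every other entry, in percentage points."""
    own = all_scores[model]
    return {
        name: round(own - score, 1)
        for name, score in all_scores.items()
        if name != model
    }


for name, gap in margins_over(scores, "Devstral").items():
    print(f"Devstral leads {name} by {gap} points")
```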

Under the same test scaffold (OpenHands, provided by All Hands AI), Devstral far outperforms much larger models such as Deepseek-V3-0324 and Qwen3 235B-A22B.

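💻 Usage examples

Basic usage

We recommend using Devstral with the OpenHands scaffold. You can use the model either through the API or by running it locally.

API usage

Follow the instructions to create a Mistral account and get an API key.
Run the following commands to start the OpenHands Docker container: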

export MISTRAL_API_KEY=<MY_KEY>

docker pull docker.all-hands.dev/all-hands-ai/runtime:0.39-nikolaik

mkdir -p ~/.openhands-state && echo '{"language":"en","agent":"CodeActAgent","max_iterations":null,"security_analyzer":null,"confirmation_mode":false,"llm_model":"mistral/devstral-small-2505","llm_api_key":"'$MISTRAL_API_KEY'","remote_runtime_resource_factor":null,"github_token":null,"enable_default_condenser":true}' > ~/.openhands-state/settings.json

docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.39-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands-state:/.openhands-state \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.39
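
The settings file that the echo command above writes can also be produced from a short script, which avoids shell-quoting mistakes around the API key. This is a minimal sketch: the keys and values mirror the JSON in the command, while the function names are hypothetical helpers introduced here for illustration.

```python
import json
from pathlib import Path


def build_settings(api_key: str) -> dict:
    """Mirror the JSON written by the echo command above."""
    return {
        "language": "en",
        "agent": "CodeActAgent",
        "max_iterations": None,
        "security_analyzer": None,
        "confirmation_mode": False,
        "llm_model": "mistral/devstral-small-2505",
        "llm_api_key": api_key,
        "remote_runtime_resource_factor": None,
        "github_token": None,
        "enable_default_condenser": True,
    }


def write_settings(state_dir: Path, api_key: str) -> Path:
    """Create the state directory if needed and write settings.json into it."""
    state_dir.mkdir(parents=True, exist_ok=True)
    target = state_dir / "settings.json"
    target.write_text(json.dumps(build_settings(api_key)), encoding="utf-8")
    return target


# Example (not executed here):
# write_settings(Path.home() / ".openhands-state", os.environ["MISTRAL_API_KEY"])
```

The state directory in the commented example is the one mounted into the container by the docker run command.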

Local inference

You can also run the model locally, using LMStudio or one of the other libraries listed below.

Launch OpenHands

docker pull docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik
docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands-state:/.openhands-state \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.38

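The server starts at http://0.0.0.0:3000. Open that address in your browser and you will see an "AI Provider Configuration" tab. You can then start a new conversation with the agent by clicking the plus sign in the left sidebar.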
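
The container can take a moment to start listening. As a convenience, a script can wait for port 3000 before opening the browser. The sketch below uses only the standard library; the helper names and the default retry counts are assumptions chosen for illustration.

```python
import socket
import time


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host: str, port: int, attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll until the port accepts connections or the attempts run out."""
    for _ in range(attempts):
        if is_port_open(host, port):
            return True
        time.sleep(delay)
    return False


# Example (not executed here): wait_for_port("localhost", 3000)
```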

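Advanced usage

OpenHands (recommended)

Launch a server to deploy Devstral-Small-2505

Make sure you have launched an OpenAI-compatible server such as vLLM or Ollama, as described in the corresponding sections of this page. You can then use OpenHands to interact with Devstral-Small-2505.

In this tutorial, we start a vLLM server: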

vllm serve mistralai/Devstral-Small-2505 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --tensor-parallel-size 2

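The server address should have the following format: http://<your-server-url>:8000/v1

Launch OpenHands

You can install OpenHands by following its installation instructions.

The easiest way to launch it is with the Docker image: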

docker pull docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik

docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands-state:/.openhands-state \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.38

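You can then access the OpenHands UI at http://localhost:3000.

Connect to the server

When you access the OpenHands UI, you will be prompted to connect to a server. Use advanced mode to connect to the server you launched earlier.

Fill in the following fields:

Custom Model: openai/mistralai/Devstral-Small-2505
Base URL: http://<your-server-url>:8000/v1
API Key: token (or whatever other token you used when launching the server)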
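
Before entering these values in the UI, it can help to confirm that the server answers at the base URL. The sketch below queries the model-listing endpoint that OpenAI-compatible servers expose under the base URL. The function names are hypothetical, the default key "token" is taken from the field list above, and the call only succeeds while a server is actually running.

```python
import json
import urllib.request


def models_url(base_url: str) -> str:
    """Build the model-listing URL from an OpenAI-style base URL ending in /v1."""
    return base_url.rstrip("/") + "/models"


def list_models(base_url: str, api_key: str = "token", timeout: float = 10.0) -> list:
    """Return the model ids the server reports (requires a running server)."""
    request = urllib.request.Request(
        models_url(base_url),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return [entry["id"] for entry in payload.get("data", [])]


# Example (not executed here): list_models("http://<your-server-url>:8000/v1")
```

If the vLLM server from this tutorial is up, the returned list would be expected to contain mistralai/Devstral-Small-2505, the same name that is entered in the UI with the openai/ prefix.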

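Use OpenHands powered by Devstral

You can now use Devstral Small inside OpenHands by starting a new conversation. Let's build a to-do list app.

To-do list app

Let's ask Devstral to generate the app with the following prompt: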

Build a To-Do list app with the following requirements:
- Built using FastAPI and React.
- Make it a one page app that:
   - Allows to add a task.
   - Allows to delete a task.
   - Allows to mark a task as done.
   - Displays the list of tasks.
- Store the tasks in a SQLite database.

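[Image: agent prompting]

See the result

You should see the agent build the app, and you can inspect the code it generates.

If it does not do so automatically, you can ask Devstral to deploy the app or deploy it manually, then open the front-end URL to see the app.

[Images: agent at work; app UI]

Iterate

Now that you have a first result, you can iterate by asking the agent to improve it. For example, in the generated app a task is marked as done by clicking on it, but a checkbox would make for a better user experience. You can also ask it to add the ability to edit a task, or to filter tasks by status.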
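
For orientation, the storage layer that such a prompt asks for could look roughly like the following. This is a hypothetical sketch built on Python's standard sqlite3 module, not the code Devstral actually generates; the table name, column names, and function names are all assumptions. It covers adding, deleting, marking as done, and listing tasks, with the status filter mentioned above.

```python
import sqlite3
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) tasks table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tasks ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "title TEXT NOT NULL, "
        "done INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def add_task(conn: sqlite3.Connection, title: str) -> int:
    """Insert a task and return its id."""
    cur = conn.execute("INSERT INTO tasks (title) VALUES (?)", (title,))
    conn.commit()
    return cur.lastrowid


def mark_done(conn: sqlite3.Connection, task_id: int, done: bool = True) -> None:
    """Set or clear the done flag of one task."""
    conn.execute("UPDATE tasks SET done = ? WHERE id = ?", (int(done), task_id))
    conn.commit()


def delete_task(conn: sqlite3.Connection, task_id: int) -> None:
    """Remove one task by id."""
    conn.execute("DELETE FROM tasks WHERE id = ?", (task_id,))
    conn.commit()


def list_tasks(conn: sqlite3.Connection, done: Optional[bool] = None) -> list:
    """List tasks as (id, title, done) tuples, optionally filtered by status."""
    if done is None:
        rows = conn.execute("SELECT id, title, done FROM tasks ORDER BY id")
    else:
        rows = conn.execute(
            "SELECT id, title, done FROM tasks WHERE done = ? ORDER BY id",
            (int(done),),
        )
    return [(task_id, title, bool(flag)) for task_id, title, flag in rows]
```

In a FastAPI backend, each of these functions would typically sit behind one route, with the React page calling those routes.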

Enjoy building with Devstral Small and OpenHands!

LMStudio (recommended for quantized models)

Download the weights from Hugging Face:

pip install -U "huggingface_hub[cli]"
huggingface-cli download \
"mistralai/Devstral-Small-2505_gguf" \
--include "devstralQ4_K_M.gguf" \
--local-dir "mistralai/Devstral-Small-2505_gguf/"

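You can serve the model locally with LMStudio:

Download and install LM Studio
Install the lms CLI: ~/.lmstudio/bin/lms bootstrap
In a bash terminal, run lms import devstralQ4_K_M.gguf in the directory where you downloaded the model checkpoint (for example mistralai/Devstral-Small-2505_gguf)
Open the LMStudio application and click the terminal icon to open the developer tab. Click "Select a model to load" and choose Devstral Q4 K M. Toggle the status button to start the model, and in the settings switch "Serve on Local Network" to on.
In the right-hand tab you will see an API identifier (it should be devstralq4_k_m) and an API address. Note this address; it is used in the next step.

Launch OpenHands: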

docker pull docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik
docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.38-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands-state:/.openhands-state \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.38

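Click "see advanced settings" on the second line. In the new tab, toggle advanced mode on. Set the custom model to mistral/devstralq4_k_m and the Base URL to the API address obtained in the last LM Studio step. Set the API Key to dummy. Click "Save changes".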

vLLM (recommended)

We recommend using the vLLM library to build production-ready inference pipelines.

Installation

Make sure you install vLLM >= 0.8.5:

pip install vllm --upgrade

Doing so should automatically install mistral_common >= 1.5.4.

To check the installation:

python -c "import mistral_common; print(mistral_common.__version__)"
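
The one-liner above only prints the version. To compare it against the stated minimums, a small script along the following lines can be used. This is a simplified sketch: it compares only the leading numeric components and ignores pre-release ordering (the packaging library handles those cases more thoroughly), and the helper names are hypothetical.

```python
import re
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '0.8.5.post1' into (0, 8, 5) by keeping the leading numeric parts."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two version strings numerically, component by component."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_package(name: str, minimum: str) -> str:
    """Describe whether an installed distribution satisfies the minimum version."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return f"{name}: not installed"
    status = "ok" if meets_minimum(installed, minimum) else f"older than {minimum}"
    return f"{name} {installed}: {status}"


# Minimum versions as stated in this section.
for package, minimum in [("vllm", "0.8.5"), ("mistral_common", "1.5.4")]:
    print(check_package(package, minimum))
```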

You can also use a ready-made Docker image, or one from Docker Hub.

Server deployment

We recommend using Devstral in a server/client setup.

Start the server:

vllm serve mistralai/Devstral-Small-2505 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --tensor-parallel-size 2

You can test the client with the following simple Python snippet:

import requests
import json
from huggingface_hub import hf_hub_download


url = "http://<your-server-url>:8000/v1/chat/completions"
headers = {"Content-Type": "application/json", "Authorization": "Bearer token"}

model = "mistralai/Devstral-Small-2505"

def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    return system_prompt

SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Write a function that computes fibonacci in Python.",
            },
        ],
    },
]

data = {"model": model, "messages": messages, "temperature": 0.15}

response = requests.post(url, headers=headers, data=json.dumps(data))
print(response.json()["choices"][0]["message"]["content"])

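Output

Certainly! The Fibonacci sequence is a series of numbers in which each number is the sum of the two preceding ones, usually starting with 0 and 1. Here is a simple Python function to compute the Fibonacci sequence:

Iterative approach

This approach uses a loop to compute the Fibonacci number iteratively.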

def fibonacci(n):
    if n <= 0:
        return "Input should be a positive integer."
    elif n == 1:
        return 0
    elif n == 2:
        return 1

    a, b = 0, 1
    for _ in range(2, n):
        a, b = b, a + b
    return b

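# Example usage:
print(fibonacci(10))  # Output: 34

Recursive approach

This approach uses recursion to compute the Fibonacci number. Note that it is inefficient for large n because of repeated calculations.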

def fibonacci_recursive(n):
    if n <= 0:
        return "Input should be a positive integer."
    elif n == 1:
        return 0
    elif n == 2:
        return 1
    else:
        return fibonacci_recursive(n - 1) + fibonacci_recursive(n - 2)

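# Example usage:
print(fibonacci_recursive(10))  # Output: 34

Memoization approach

This approach uses memoization to store previously computed Fibonacci numbers, which makes it more efficient than the naive recursive approach.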

def fibonacci_memo(n, memo={}):
    if n <= 0:
        return "Input should be a positive integer."
    elif n == 1:
        return 0
    elif n == 2:
        return 1
    elif n in memo:
        return memo[n]

    memo[n] = fibonacci_memo(n - 1, memo) + fibonacci_memo(n - 2, memo)
    return memo[n]

# Example usage:
print(fibonacci_memo(10))  # Output: 34

Dynamic programming approach

This approach uses an array to store the Fibonacci numbers up to n.

def fibonacci_dp(n):
    if n <= 0:
        return "Input should be a positive integer."
    elif n == 1:
        return 0
    elif n == 2:
        return 1

    fib = [0, 1] + [0] * (n - 2)
    for i in range(2, n):
        fib[i] = fib[i - 1] + fib[i - 2]
    return fib[n - 1]

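# Example usage:
print(fibonacci_dp(10))  # Output: 34

You can choose any of these approaches depending on your needs. For large values of n, the iterative and dynamic programming approaches are generally more efficient.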
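
The vLLM server in this section is started with --enable-auto-tool-choice and --tool-call-parser mistral, which allows requests to carry tool definitions. The sketch below builds such a request body in the OpenAI chat-completions tool format. The read_file tool, its description, and its parameters are hypothetical and exist only for illustration; nothing here contacts a server.

```python
import json


def build_tool_request(model: str, user_prompt: str) -> dict:
    """Build a chat-completions payload that offers one tool to the model."""
    read_file_tool = {
        "type": "function",
        "function": {
            "name": "read_file",  # hypothetical tool name
            "description": "Return the text content of a file in the repository.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {
                        "type": "string",
                        "description": "Path relative to the repository root.",
                    }
                },
                "required": ["path"],
            },
        },
    }
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": [read_file_tool],
        "tool_choice": "auto",
        "temperature": 0.15,
    }


payload = build_tool_request("mistralai/Devstral-Small-2505", "What does setup.py import?")
print(json.dumps(payload, indent=2))
```

Posting this payload to the same /v1/chat/completions URL used by the client snippet should, if the model decides to call the tool, return a message whose tool_calls field carries the function name and JSON arguments. Executing the tool and sending its result back in a follow-up message is then up to the client, which is the role a scaffold such as OpenHands plays.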

Mistral-inference

We recommend using mistral-inference to quickly try out Devstral.

Installation

Make sure mistral_inference >= 1.6.0 is installed:

pip install mistral_inference --upgrade

Download

from huggingface_hub import snapshot_download
from pathlib import Path

mistral_models_path = Path.home().joinpath('mistral_models', 'Devstral')
mistral_models_path.mkdir(parents=True, exist_ok=True)

snapshot_download(repo_id="mistralai/Devstral-Small-2505", allow_patterns=["params.json", "consolidated.safetensors", "tekken.json"], local_dir=mistral_models_path)

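Run

You can run the model with the following command:

mistral-chat $HOME/mistral_models/Devstral --instruct --max_tokens 300

If you enter the prompt "Write me a unique and efficient function that computes fibonacci in Python", the model should generate something like the following:

Output

Certainly! A common and efficient way to compute Fibonacci numbers is to use memoization to store previously computed values. This avoids redundant calculations and significantly improves performance. Here is a Python function that computes Fibonacci numbers efficiently using memoization: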

def fibonacci(n, memo=None):
    if memo is None:
        memo = {}

    if n in memo:
        return memo[n]

    if n <= 1:
        return n

    memo[n] = fibonacci(n - 1, memo) + fibonacci(n - 2, memo)
    return memo[n]

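# Example usage:
n = 10
print(f"Fibonacci number at position {n} is {fibonacci(n)}")

Explanation:

Base case: if n is 0 or 1, the function returns n, because the Fibonacci sequence starts with 0 and 1.
Memoization: the function uses the dictionary memo to store previously computed Fibonacci numbers.
Recursive case: for any other n, the function recursively computes fibonacci(n - 1) and fibonacci(n - 2) and adds the results.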
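
The same memoization idea can be expressed with the standard library's functools.lru_cache, which keeps the cache outside the function signature. This is an alternative sketch for comparison, not part of the model output above; it uses the same convention (F(0) = 0, F(1) = 1) and additionally rejects negative input.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fibonacci_cached(n: int) -> int:
    """Fibonacci with F(0) = 0 and F(1) = 1, memoized by lru_cache."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n <= 1:
        return n
    return fibonacci_cached(n - 1) + fibonacci_cached(n - 2)


print(fibonacci_cached(10))  # 55
```

Like the dictionary version, this is still recursive, so very large n can hit Python's recursion limit; an iterative loop avoids that.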

Ollama

You can run Devstral with the Ollama CLI:

ollama run devstral

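Transformers

To get the most out of our model with the transformers library, make sure mistral-common >= 1.5.5 is installed so you can use our tokenizer:

pip install mistral-common --upgrade

Then load our tokenizer and the model and generate text: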

import torch

from mistral_common.protocol.instruct.messages import (
    SystemMessage, UserMessage
)
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.tokens.tokenizers.tekken import SpecialTokenPolicy
from huggingface_hub import hf_hub_download
from transformers import AutoModelForCausalLM

def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    return system_prompt

model_id = "mistralai/Devstral-Small-2505"
tekken_file = hf_hub_download(repo_id=model_id, filename="tekken.json")
SYSTEM_PROMPT = load_system_prompt(model_id, "SYSTEM_PROMPT.txt")

tokenizer = MistralTokenizer.from_file(tekken_file)

model = AutoModelForCausalLM.from_pretrained(model_id)

tokenized = tokenizer.encode_chat_completion(
    ChatCompletionRequest(
        messages=[
            SystemMessage(content=SYSTEM_PROMPT),
            UserMessage(content="Write me a function that computes fibonacci in Python."),
        ],
    )
)

output = model.generate(
    input_ids=torch.tensor([tokenized.tokens]),
    max_new_tokens=1000,
)[0]

decoded_output = tokenizer.decode(output[len(tokenized.tokens):])
print(decoded_output)