Devstral-Small-2507-GGUF: an open-source large language model for software engineering, with support for tool calling and multi-file editing

Devstral Small 2507 GGUF

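GGUF quantization published by unsloth

Devstral 1.1 is a large language model designed for software engineering tasks. It supports tool calling and optional vision input, and is suited to exploring codebases and editing multiple files.

Large language model | Multilingual | Open-source license: Apache-2.0 | #AgenticCoding #MultilingualSoftwareEngineering #128kContext

Downloads: 16.16k

Release date: 7/10/2025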

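Model overview

Devstral Small 1.1 is a lightweight large language model designed for agentic coding tasks. It supports multiple languages and tool calling, and is suited to local deployment and on-device use.

Model features

Agentic coding

Designed for agentic coding tasks, which makes it a strong choice for software engineering agents.

Lightweight

With only 24 billion parameters, it runs on a single RTX 4090 or a Mac with 32GB of RAM, which makes it suitable for local deployment and on-device use.

Open-source license

Released under the Apache 2.0 license, which permits use and modification for both commercial and non-commercial purposes.

Long context window

Supports a 128k-token context window, suited to long inputs and complex tasks.

Tool calling support

Supports tool calling, which enables efficient codebase exploration and multi-file editing.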

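Model capabilities

Text generation

Code generation

Code editing

Multilingual support

Tool calling

Use cases

Software development

Codebase analysis

Analyzes the test coverage of a codebase and generates visualization plots.

Produces coverage distribution plots, pie charts, and summary plots.

Game development

Develops a web video game that combines Space Invaders and Pong.

Creates a game with two-player controls and an invader shooting mechanic.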

🚀 Devstral 1.1

Devstral 1.1 is a model with tool calling and optional vision support. It is optimized for software engineering tasks and handles work such as codebase exploration and file editing efficiently.

🚀 Quick start

⚠️ Important

To enable the system prompt in llama.cpp, you must pass --jinja.

Learn how to run Devstral correctly - read the guide.
Unsloth Dynamic 2.0 - Unsloth Dynamic 2.0 achieves superior accuracy and outperforms other leading quantization methods.

Fine-tune Mistral v0.3 (7B) for free - use the Google Colab notebook.
Read the blog post on Devstral 1.1 support - docs.unsloth.ai/basics/devstral
See other notebooks - refer to the documentation.

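✨ Key features

Agentic coding: Devstral is designed to excel at agentic coding tasks, which makes it a strong choice for software engineering agents.
Lightweight: with only 24 billion parameters, it can run on a single RTX 4090 or a Mac with 32GB of RAM, which makes it suitable for local deployment and on-device use.
Apache 2.0 license: an open license that permits use and modification for both commercial and non-commercial purposes.
Context window: up to 128k tokens.
Tokenizer: uses the Tekken tokenizer with a vocabulary size of 131k.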
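
As a rough check on the "lightweight" claim, the weight footprint can be estimated from the parameter count and the number of bits stored per weight. The sketch below is back-of-the-envelope arithmetic only. The uniform bit widths are illustrative assumptions; it ignores the KV cache and activations, and real GGUF quantizations mix bit widths and add per-block overhead, so actual file sizes will differ.

```python
# Rough weight-memory estimate for a 24B-parameter model.
# Assumption: every weight is stored at one uniform bit width.

PARAMS = 24_000_000_000  # "24 billion parameters" from the feature list


def weight_gigabytes(num_params: int, bits_per_weight: int) -> float:
    """Return the weight footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit weights: ~{weight_gigabytes(PARAMS, bits):.0f} GB")
```

Under these assumptions, 16-bit weights come to about 48 GB, which is more than the 24 GB of VRAM on an RTX 4090, while 4-bit weights come to about 12 GB. That suggests the hardware claim is most plausible for quantized variants such as the GGUF files discussed here.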

📦 Installation

API

Follow the instructions to create a Mistral account and obtain an API key.
Then run the following commands to start the OpenHands Docker container.

export MISTRAL_API_KEY=<MY_KEY>

mkdir -p ~/.openhands && echo '{"language":"en","agent":"CodeActAgent","max_iterations":null,"security_analyzer":null,"confirmation_mode":false,"llm_model":"mistral/devstral-small-2507","llm_api_key":"'$MISTRAL_API_KEY'","remote_runtime_resource_factor":null,"github_token":null,"enable_default_condenser":true}' > ~/.openhands/settings.json

docker pull docker.all-hands.dev/all-hands-ai/runtime:0.48-nikolaik

docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.48-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands:/.openhands \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.48
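
The one-line echo command above is easy to mistype, so it may help to see the same settings object built explicitly. The following is a minimal sketch, not part of OpenHands: the helper names build_openhands_settings and write_settings are hypothetical, and the field values are copied from the shell command. The demo writes to a temporary directory so that it does not touch a real configuration.

```python
import json
import tempfile
from pathlib import Path


def build_openhands_settings(api_key: str) -> dict:
    """Mirror the JSON object written by the shell command above."""
    return {
        "language": "en",
        "agent": "CodeActAgent",
        "max_iterations": None,
        "security_analyzer": None,
        "confirmation_mode": False,
        "llm_model": "mistral/devstral-small-2507",
        "llm_api_key": api_key,
        "remote_runtime_resource_factor": None,
        "github_token": None,
        "enable_default_condenser": True,
    }


def write_settings(directory: Path, api_key: str) -> Path:
    """Create the directory if needed and write settings.json into it."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / "settings.json"
    path.write_text(json.dumps(build_openhands_settings(api_key)))
    return path


if __name__ == "__main__":
    # Demo against a temporary directory (hypothetical usage).
    with tempfile.TemporaryDirectory() as tmp:
        written = write_settings(Path(tmp) / ".openhands", "<MY_KEY>")
        print(written.read_text())
```

To target the directory that the docker run command mounts, one would pass Path.home() / ".openhands" as the directory.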

Local inference

The model can be deployed with the following libraries.

vLLM (recommended)

We recommend using this model with the vLLM library to implement production-ready inference pipelines.

Installation

Install vLLM >= 0.9.1.

pip install vllm --upgrade

Also install mistral_common >= 1.7.0.

pip install mistral-common --upgrade

To check the installed version, run:

python -c "import mistral_common; print(mistral_common.__version__)"
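
The two minimum versions above can also be checked in one pass. This is a small sketch that uses only the standard library. The helper names are hypothetical, the distribution names are taken from the pip commands in this section, and the version parser is deliberately simplistic: it compares numeric components only and treats a pre-release such as 1.7.0rc1 as older than 1.7.0.

```python
from importlib import metadata


def parse_version(text: str) -> tuple:
    """Turn '1.7.0' into (1, 7, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare dotted numeric versions component by component."""
    return parse_version(installed) >= parse_version(minimum)


# Minimum versions stated in this section.
REQUIREMENTS = {"vllm": "0.9.1", "mistral-common": "1.7.0"}

if __name__ == "__main__":
    for package, minimum in REQUIREMENTS.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            print(f"{package}: not installed (need >= {minimum})")
            continue
        status = "ok" if meets_minimum(installed, minimum) else "too old"
        print(f"{package}: {installed} ({status}, need >= {minimum})")
```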

You can also use the Docker image, or the image on Docker Hub.

Launching the server

We recommend using Devstral in a server/client setting.

Start the server:

vllm serve mistralai/Devstral-Small-2507 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --tensor-parallel-size 2

To query the server from a client, you can use the following Python code.

import requests
import json
from huggingface_hub import hf_hub_download


url = "http://<your-server-url>:8000/v1/chat/completions"
headers = {"Content-Type": "application/json", "Authorization": "Bearer token"}

model = "mistralai/Devstral-Small-2507"

def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    return system_prompt

SYSTEM_PROMPT = load_system_prompt(model, "SYSTEM_PROMPT.txt")

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "<your-command>",
            },
        ],
    },
]

data = {"model": model, "messages": messages, "temperature": 0.15}

# Devstral Small 1.1 supports tool calling. If you want to use tools, follow this:
# tools = [ # Define tools for vLLM
#     {
#         "type": "function",
#         "function": {
#             "name": "git_clone",
#             "description": "Clone a git repository",
#             "parameters": {
#                 "type": "object",
#                 "properties": {
#                     "url": {
#                         "type": "string",
#                         "description": "The url of the git repository",
#                     },
#                 },
#                 "required": ["url"],
#             },
#         },
#     }
# ] 
# data = {"model": model, "messages": messages, "temperature": 0.15, "tools": tools} # Pass tools to payload.

response = requests.post(url, headers=headers, data=json.dumps(data))
print(response.json()["choices"][0]["message"]["content"])
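
When the commented-out tools payload above is enabled, the reply may contain tool calls instead of text, in which case printing the message content alone is not enough. The sketch below assumes that the server returns tool calls in the OpenAI chat-completions shape (a tool_calls list whose function.arguments field is a JSON-encoded string). The helper name extract_tool_calls is hypothetical, and the sample response is hand-written for illustration, not real model output.

```python
import json


def extract_tool_calls(response_json: dict) -> list:
    """Return (function_name, arguments_dict) pairs from a chat completion."""
    message = response_json["choices"][0]["message"]
    calls = []
    for call in message.get("tool_calls") or []:
        function = call["function"]
        # In the OpenAI format, arguments arrive as a JSON-encoded string.
        calls.append((function["name"], json.loads(function["arguments"])))
    return calls


if __name__ == "__main__":
    # Hand-written sample shaped like an OpenAI-style response.
    sample = {
        "choices": [
            {
                "message": {
                    "role": "assistant",
                    "content": None,
                    "tool_calls": [
                        {
                            "id": "call_0",
                            "type": "function",
                            "function": {
                                "name": "git_clone",
                                "arguments": '{"url": "https://example.com/repo.git"}',
                            },
                        }
                    ],
                }
            }
        ]
    }
    for name, arguments in extract_tool_calls(sample):
        print(name, arguments)
```

A caller would typically dispatch on the function name, run the matching tool, and send the result back to the model as a follow-up message.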

Mistral-inference

We recommend mistral-inference for quickly trying out Devstral.

Installation

Install mistral_inference >= 1.6.0.

pip install mistral_inference --upgrade

Download

from huggingface_hub import snapshot_download
from pathlib import Path

mistral_models_path = Path.home().joinpath('mistral_models', 'Devstral')
mistral_models_path.mkdir(parents=True, exist_ok=True)

snapshot_download(repo_id="mistralai/Devstral-Small-2507", allow_patterns=["params.json", "consolidated.safetensors", "tekken.json"], local_dir=mistral_models_path)

Chat

Run the model with the following command:

mistral-chat $HOME/mistral_models/Devstral --instruct --max_tokens 300

You can then enter any prompt.

Transformers

To make the best use of this model with transformers, install mistral-common >= 1.7.0 and use its tokenizer.

pip install mistral-common --upgrade

Then load the tokenizer and the model and run generation.

import torch

from mistral_common.protocol.instruct.messages import (
    SystemMessage, UserMessage
)
from mistral_common.protocol.instruct.request import ChatCompletionRequest
from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from huggingface_hub import hf_hub_download
from transformers import AutoModelForCausalLM

def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    return system_prompt

model_id = "mistralai/Devstral-Small-2507"
SYSTEM_PROMPT = load_system_prompt(model_id, "SYSTEM_PROMPT.txt")


tokenizer = MistralTokenizer.from_hf_hub(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

tokenized = tokenizer.encode_chat_completion(
    ChatCompletionRequest(
        messages=[
            SystemMessage(content=SYSTEM_PROMPT),
            UserMessage(content="<your-command>"),
        ],
    )
)

output = model.generate(
    input_ids=torch.tensor([tokenized.tokens]),
    max_new_tokens=1000,
)[0]

decoded_output = tokenizer.decode(output[len(tokenized.tokens):])
print(decoded_output)

LM Studio

Download the weights from one of the following:

LM Studio GGUF repository (recommended): https://huggingface.co/lmstudio-community/Devstral-Small-2507-GGUF
Mistral AI's GGUF repository: https://huggingface.co/mistralai/Devstral-Small-2507_gguf

pip install -U "huggingface_hub[cli]"
# Alternatively, download from mistralai/Devstral-Small-2507_gguf.
huggingface-cli download \
"lmstudio-community/Devstral-Small-2507-GGUF" \
--include "Devstral-Small-2507-Q4_K_M.gguf" \
--local-dir "Devstral-Small-2507_gguf/"

You can serve the model locally with LM Studio.

Download and install LM Studio.
Install the lms CLI: ~/.lmstudio/bin/lms bootstrap
In a bash terminal, run lms import Devstral-Small-2507-Q4_K_M.gguf in the directory where you downloaded the model checkpoint (for example, Devstral-Small-2507_gguf).
Open the LM Studio application and click the terminal icon to go to the Developer tab. Click "Select a model to load" and choose Devstral Small 2507. Toggle the status button to start the model, and in the settings turn on "Serve on Local Network".
The right-hand tab shows the API identifier (devstral-small-2507) and the API address. Note this address; you will use it in OpenHands or Cline.

llama.cpp

Download the weights from Hugging Face.

pip install -U "huggingface_hub[cli]"
huggingface-cli download \
"mistralai/Devstral-Small-2507_gguf" \
--include "Devstral-Small-2507-Q4_K_M.gguf" \
--local-dir "mistralai/Devstral-Small-2507_gguf/"

Then run Devstral with the llama.cpp server. Pass --jinja as well if you need the system prompt to take effect.

./llama-server -m mistralai/Devstral-Small-2507_gguf/Devstral-Small-2507-Q4_K_M.gguf -c 0 # -c sets the context size; 0 uses the model's default (128k here).

OpenHands (recommended)

Launching a server to deploy Devstral Small 1.1

Start an OpenAI-compatible server such as vLLM or Ollama, as described above. You can then use OpenHands to interact with Devstral Small 1.1.

For this tutorial, start a vLLM server with the following command:

vllm serve mistralai/Devstral-Small-2507 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --tensor-parallel-size 2

The server address has the form http://<your-server-url>:8000/v1.
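
Before pointing an agent at the server, it can be useful to confirm that the address responds and serves the expected model. The sketch below assumes the server exposes the OpenAI-style model listing at the /models path under the base URL, as OpenAI-compatible servers such as vLLM do. The helper names are hypothetical, and the demo runs offline against a hand-written listing rather than real server output.

```python
import json
import urllib.request


def models_url(base_url: str) -> str:
    """Append the model-listing path to an OpenAI-style base URL."""
    return base_url.rstrip("/") + "/models"


def model_ids(listing: dict) -> list:
    """Extract model identifiers from a model-listing response body."""
    return [entry["id"] for entry in listing.get("data", [])]


def fetch_model_ids(base_url: str, token: str = "token") -> list:
    """Query the server; requires it to be running and reachable."""
    request = urllib.request.Request(
        models_url(base_url), headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return model_ids(json.load(response))


if __name__ == "__main__":
    # Offline demo with a hand-written listing.
    sample = {"object": "list", "data": [{"id": "mistralai/Devstral-Small-2507"}]}
    print(models_url("http://<your-server-url>:8000/v1"))
    print(model_ids(sample))
```

With a live server, fetch_model_ids("http://<your-server-url>:8000/v1") should include the model name passed to vllm serve.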

Launching OpenHands

Follow the OpenHands installation instructions.

The easiest way to launch OpenHands is to use the Docker image:

docker pull docker.all-hands.dev/all-hands-ai/runtime:0.48-nikolaik

docker run -it --rm --pull=always \
    -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.48-nikolaik \
    -e LOG_ALL_EVENTS=true \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v ~/.openhands:/.openhands \
    -p 3000:3000 \
    --add-host host.docker.internal:host-gateway \
    --name openhands-app \
    docker.all-hands.dev/all-hands-ai/openhands:0.48

You can then access the OpenHands UI at http://localhost:3000.

Connecting to the server

When you open the OpenHands UI, you are prompted to connect to a server. Use advanced mode to connect to the server you started earlier.

Fill in the following fields:

Custom Model: openai/mistralai/Devstral-Small-2507
Base URL: http://<your-server-url>:8000/v1
API Key: token (or the token you used when launching the server, if any)

(Figure: OpenHands Settings)

Cline

Launching a server to deploy Devstral Small 1.1

For this tutorial, start a vLLM server with the following command:

vllm serve mistralai/Devstral-Small-2507 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --tensor-parallel-size 2

The server address has the form http://<your-server-url>:8000/v1.

Launching Cline

Follow the Cline installation instructions and configure the server address in the settings.

(Figure: Cline Settings)

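💻 Usage examples

OpenHands: understanding the test coverage of Mistral Common

You can launch the OpenHands scaffold and link it to a repository to analyze test coverage and identify poorly covered files. Here we start with the public mistral-common repository.

Once the repository is mounted in the workspace, give the following instruction: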

Check the test coverage of the repo and then create a visualization of test coverage. Try plotting a few different types of graphs and save them to a png.

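The agent first browses the codebase to check the test configuration and structure.

(Figure: mistral common coverage - prompt)

It then sets up the test dependencies and launches the coverage tests.

(Figure: mistral common coverage - dependencies)

Finally, the agent writes the code needed to visualize the coverage, exports the results, and saves the plots to png files.

(Figure: mistral common coverage - visualization)

At the end of the run, the following plots are produced.

(Figure: mistral common coverage - coverage distribution)

The model is also able to explain the results.

(Figure: mistral common coverage - navigate)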

Cline: building a video game

First, initialize Cline inside VSCode and connect it to the server you started earlier.

Give the following instruction to build the video game:

Create a video game that mixes Space Invaders and Pong for the web.

Follow these instructions:
- There are two players one at the top and one at the bottom. The players are controling a bar to bounce a ball.
- The first player plays with the keys "a" and "d", the second with the right and left arrows.
- The invaders are located at the center of the screen. They shoud look like the ones in Space Invaders. Their goal is to shoot on the players randomly. They cannot be destroyed by the ball that pass through them. This means that invaders never die.
- The players goal is to avoid shootings from the space invaders and send the ball to the edge of the over player.
- The ball bounces on the left and right edges.
- Once the ball touch one of the player's edge, the player loses.
- Once a player is touched 3 times or more by a shooting, the player loses.
- The player winning is the last one standing.
- Display on the UI, the number of times a player touched the ball, and the remaining health.

(Figure: space invaders pong - prompt)

The agent first creates the game.

![space invaders pong - structure](assets/space_invaders_pong/base_st

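📚 Detailed documentation

For more information about Devstral, see the blog post.

🔧 Technical details

Devstral is an agentic LLM for software engineering tasks, built through a collaboration between Mistral AI and All Hands AI. It excels at exploring codebases, editing multiple files, and powering software engineering agents.

The model is fine-tuned from Mistral-Small-3.1 and has a long context window of up to 128k tokens. As a coding agent, Devstral is text-only: the vision encoder was removed before fine-tuning from Mistral-Small-3.1.

Benchmark results

SWE-Bench

Devstral Small 1.1 achieves a score of 53.6% on SWE-Bench Verified, outperforming Devstral Small 1.0 by 6.8 percentage points and the second-best state-of-the-art model by 11.4 percentage points.

Model	Agentic scaffold	SWE-Bench Verified (%)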
Devstral Small 1.1	OpenHands Scaffold	53.6
Devstral Small 1.0	OpenHands Scaffold	46.8
GPT-4.1-mini	OpenAI Scaffold	23.6
Claude 3.5 Haiku	Anthropic Scaffold	40.6
SWE-smith-LM 32B	SWE-agent Scaffold	40.2
Skywork SWE	OpenHands Scaffold	38.0
DeepSWE	R2E-Gym Scaffold	42.2
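
The two margins quoted in the benchmark summary can be reproduced directly from the table. The short sketch below copies the scores from the rows above; the helper names margin and best_other are hypothetical, and the differences are in percentage points.

```python
# Scores copied from the SWE-Bench Verified table above.
SCORES = {
    "Devstral Small 1.1": 53.6,
    "Devstral Small 1.0": 46.8,
    "GPT-4.1-mini": 23.6,
    "Claude 3.5 Haiku": 40.6,
    "SWE-smith-LM 32B": 40.2,
    "Skywork SWE": 38.0,
    "DeepSWE": 42.2,
}


def margin(model: str, baseline: str) -> float:
    """Difference in percentage points, rounded to one decimal place."""
    return round(SCORES[model] - SCORES[baseline], 1)


def best_other(excluded_prefix: str) -> str:
    """Highest-scoring model whose name does not start with the prefix."""
    candidates = {k: v for k, v in SCORES.items() if not k.startswith(excluded_prefix)}
    return max(candidates, key=candidates.get)


if __name__ == "__main__":
    runner_up = best_other("Devstral")
    print("vs Devstral Small 1.0:", margin("Devstral Small 1.1", "Devstral Small 1.0"))
    print(f"vs {runner_up}:", margin("Devstral Small 1.1", runner_up))
```

The second comparison identifies DeepSWE (42.2) as the strongest non-Devstral entry in the table, which accounts for the 11.4-point figure.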