Mistral-Small-3.2-24B-Instruct-2506-GGUF open-source model: image-text-to-text with fewer repetition errors

Mistral Small 3.2 24B Instruct 2506 GGUF

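By unsloth

Mistral-Small-3.2-24B-Instruct-2506 is an image-text-to-text model. It brings notable improvements in instruction following, repetition errors, and function calling, and these GGUF files use Unsloth Dynamic 2.0 quantization.

Image-to-text | Multilingual | License: Apache-2.0 | #MultimodalInstructionUnderstanding #LowRepetitionGeneration #FunctionCallingOptimization

Downloads: 8,640

Released: 6/20/2025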

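Model overview

A multilingual image-text-to-text model with strong instruction following, distributed here in quantized form and suited to a range of tasks.

Model features

Instruction following

Small-3.2 is better at following precise instructions.

Fewer repetition errors

Small-3.2 produces fewer infinite generations and repeated answers.

Function calling

Small-3.2's function-calling template is more robust.

Quantization

Unsloth Dynamic 2.0 achieves SOTA performance in model quantization.

Model capabilities

Image-text-to-text

Multilingual support

Instruction following

Function calling

Fewer repetition errors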

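Use cases

Visual reasoning

Image analysis

Analyzes image content and generates a text description or recommendations.

Provides detailed image analysis and recommendations.

Text processing

Text rewriting

Rewrites the input text into a more concise or clearer version.

Produces more concise text.

Function calling

Dynamic function calling

Calls functions dynamically to complete tasks based on the user's request.

Calls the function successfully and returns the result.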

🚀 Mistral-Small-3.2-24B-Instruct-2506

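Mistral-Small-3.2-24B-Instruct-2506 is an image-text-to-text model. It brings notable improvements in instruction following, repetition errors, and function calling, and these GGUF files use Unsloth Dynamic 2.0 quantization.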

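🚀 Quick start

Supported languages

Supports multiple languages, including English, French, German, and Spanish.

License

Licensed under Apache-2.0.

Running the model

Run in llama.cpp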

./llama.cpp/llama-cli -hf unsloth/Mistral-Small-3.2-24B-Instruct-2506-GGUF:UD-Q4_K_XL --jinja --temp 0.15 --top-k -1 --top-p 1.00 -ngl 99

Run in Ollama

ollama run hf.co/unsloth/Mistral-Small-3.2-24B-Instruct-2506-GGUF:UD-Q4_K_XL

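Notes

⚠️ Important

This model includes GGUF chat template fixes, and tool calling works as well. If you use llama.cpp, pass --jinja to enable the system prompt.

💡 Usage tips

Use a relatively low temperature, such as temperature=0.15. Also make sure to give the model a system prompt tailored to your needs. To use the model as a general-purpose assistant, use the prompt provided in the SYSTEM_PROMPT.txt file.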
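The two tips above (low temperature, system prompt first) can be sketched as a small payload builder. This is a minimal illustration only: `build_chat_request` is a hypothetical helper, the system prompt text is a placeholder rather than the contents of SYSTEM_PROMPT.txt, and the model name is simply the tag from the Ollama command above. The payload follows the OpenAI-style chat-completions shape used by the examples later on this page.

```python
import json

# Temperature value taken from the usage tip; everything else here is illustrative.
RECOMMENDED_TEMPERATURE = 0.15


def build_chat_request(model: str, system_prompt: str, user_text: str,
                       temperature: float = RECOMMENDED_TEMPERATURE) -> dict:
    """Return an OpenAI-style chat payload with the system prompt as the first message."""
    if not system_prompt.strip():
        raise ValueError("expected a non-empty system prompt")
    return {
        "model": model,
        "temperature": temperature,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_text},
        ],
    }


payload = build_chat_request(
    model="hf.co/unsloth/Mistral-Small-3.2-24B-Instruct-2506-GGUF:UD-Q4_K_XL",
    system_prompt="You are a concise assistant.",  # placeholder, not the official prompt
    user_text="Summarize what a GGUF file is in one sentence.",
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be passed as keyword arguments to an OpenAI-compatible client, for example `client.chat.completions.create(**payload)`.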

Unsloth Dynamic 2.0 achieves SOTA performance in model quantization.

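✨ Key features

Small-3.2 shares the core features of Mistral-Small-3.1-24B-Instruct-2503 and improves on it in the following areas:
Instruction following: Small-3.2 is better at following precise instructions.
Repetition errors: Small-3.2 produces fewer infinite generations and repeated answers.
Function calling: Small-3.2's function-calling template is more robust.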

📦 Installation

vLLM (recommended)

Make sure vLLM >= 0.9.1 is installed:

pip install vllm --upgrade

This should also install mistral_common >= 1.6.2 automatically. To check:

python -c "import mistral_common; print(mistral_common.__version__)"
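The version check can also be scripted for both packages at once. The sketch below uses only the standard library; `parse_version`, `meets_minimum`, and `check_installed` are hypothetical helpers, and the comparison is deliberately simplified (it looks only at the leading numeric components, so a pre-release such as `1.6.2rc1` is treated like `1.6.2`).

```python
from importlib import metadata

# Minimum versions stated in the installation notes above.
MINIMUMS = {"vllm": "0.9.1", "mistral_common": "1.6.2"}


def parse_version(text: str) -> tuple:
    """Turn '1.6.2' or '0.9.1.post1' into a tuple of its leading integer components."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'post1'
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Simplified comparison on numeric components only."""
    return parse_version(installed) >= parse_version(minimum)


def check_installed(minimums: dict = MINIMUMS) -> dict:
    """Report, per package, whether it is installed and recent enough."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        ok = meets_minimum(installed, minimum)
        report[name] = "ok" if ok else f"too old ({installed} < {minimum})"
    return report


print(check_installed())
```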

You can also use a Docker image, available on Docker Hub.

💻 Usage examples

Basic usage

Start the server

vllm serve mistralai/Mistral-Small-3.2-24B-Instruct-2506 --tokenizer_mode mistral --config_format mistral --load_format mistral --tool-call-parser mistral --enable-auto-tool-choice --limit_mm_per_prompt 'image=10' --tensor-parallel-size 2

Note: running Mistral-Small-3.2-24B-Instruct-2506 on GPU requires about 55 GB of GPU RAM in bf16 or fp16.
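As a rough sanity check on the 55 GB figure, the weights alone can be estimated from the parameter count. The sketch below assumes exactly 24 billion parameters (inferred from "24B" in the model name, not an official count) and 2 bytes per parameter for bf16/fp16; whatever remains of the 55 GB would go to the KV cache, activations, and runtime overhead.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate weight memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


NUM_PARAMS = 24e9        # assumption: taken from "24B" in the model name
BYTES_BF16 = 2           # bf16 and fp16 both use 2 bytes per parameter
STATED_TOTAL_GB = 55     # figure quoted in the note above

weights_gb = weight_memory_gb(NUM_PARAMS, BYTES_BF16)
headroom_gb = STATED_TOTAL_GB - weights_gb

print(f"weights: ~{weights_gb:.0f} GB")
print(f"left for KV cache, activations, overhead: ~{headroom_gb:.0f} GB")
print(f"per GPU with --tensor-parallel-size 2: ~{STATED_TOTAL_GB / 2:.1f} GB")
```

The per-GPU line is only an even split of the stated total; actual usage per device depends on the vLLM configuration.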

Advanced usage

Visual reasoning

from datetime import datetime, timedelta

from openai import OpenAI
from huggingface_hub import hf_hub_download

# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

TEMP = 0.15
MAX_TOK = 131072

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

models = client.models.list()
model = models.data[0].id


def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    today = datetime.today().strftime("%Y-%m-%d")
    yesterday = (datetime.today() - timedelta(days=1)).strftime("%Y-%m-%d")
    model_name = repo_id.split("/")[-1]
    return system_prompt.format(name=model_name, today=today, yesterday=yesterday)


model_id = "mistralai/Mistral-Small-3.2-24B-Instruct-2506"
SYSTEM_PROMPT = load_system_prompt(model_id, "SYSTEM_PROMPT.txt")
image_url = "https://static.wikia.nocookie.net/essentialsdocs/images/7/70/Battle.png/revision/latest?cb=20220523172438"

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What action do you think I should take in this situation? List all the possible actions and explain why you think they are good or bad.",
            },
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    },
]


response = client.chat.completions.create(
    model=model,
    messages=messages,
    temperature=TEMP,
    max_tokens=MAX_TOK,
)

print(response.choices[0].message.content)
# In this situation, you are playing a Pokémon game where your Pikachu (Level 42) is facing a wild Pidgey (Level 17). Here are the possible actions you can take and an analysis of each:

# 1. **FIGHT**:
#    - **Pros**: Pikachu is significantly higher level than the wild Pidgey, which suggests that it should be able to defeat Pidgey easily. This could be a good opportunity to gain experience points and possibly items or money.
#    - **Cons**: There is always a small risk of Pikachu fainting, especially if Pidgey has a powerful move or a status effect that could hinder Pikachu. However, given the large level difference, this risk is minimal.

# 2. **BAG**:
#    - **Pros**: You might have items in your bag that could help in this battle, such as Potions, Poké Balls, or Berries. Using an item could help you capture the Pidgey or heal your Pikachu if needed.
#    - **Cons**: Using items might not be necessary given the level difference. It could be more efficient to just fight and defeat the Pidgey quickly.

# 3. **POKÉMON**:
#    - **Pros**: You might have another Pokémon in your party that is better suited for this battle or that you want to gain experience. Switching Pokémon could also be a strategic move if you want to train a lower-level Pokémon.
#    - **Cons**: Switching Pokémon might not be necessary since Pikachu is at a significant advantage. It could also waste time and potentially give Pidgey a turn to attack.

# 4. **RUN**:
#    - **Pros**: Running away could save time and conserve your Pokémon's health and resources. If you are in a hurry or do not need the experience or items, running away is a safe option.
#    - **Cons**: Running away means you miss out on the experience points and potential items or money that you could gain from defeating the Pidgey. It also means you do not get the chance to capture the Pidgey if you wanted to.

# ### Recommendation:
# Given the significant level advantage, the best action is likely to **FIGHT**. This will allow you to quickly defeat the Pidgey, gain experience points, and potentially earn items or money. If you are concerned about Pikachu's health, you could use an item from your **BAG** to heal it before or during the battle. Running away or switching Pokémon does not seem necessary in this situation.
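The example above passes a remote image URL. For a local image, OpenAI-style `image_url` parts are commonly given a base64 data URL instead. The sketch below shows that encoding step; `image_bytes_to_data_url` and `image_file_to_content_part` are hypothetical helpers, and whether a given server accepts data URLs should be checked against its documentation.

```python
import base64
import mimetypes


def image_bytes_to_data_url(data: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URL usable in an image_url content part."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_file_to_content_part(path: str) -> dict:
    """Read a local image and wrap it in the same shape as the image_url part above."""
    mime_type, _ = mimetypes.guess_type(path)
    with open(path, "rb") as f:
        url = image_bytes_to_data_url(f.read(), mime_type or "application/octet-stream")
    return {"type": "image_url", "image_url": {"url": url}}


# Demo on four in-memory bytes (the start of the PNG signature), so no file is needed.
demo_url = image_bytes_to_data_url(b"\x89PNG")
print(demo_url)
```

The dictionary returned by `image_file_to_content_part` can replace the `{"type": "image_url", ...}` entry in the `messages` list.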

Function calling

from openai import OpenAI
from huggingface_hub import hf_hub_download

# Modify OpenAI's API key and API base to use vLLM's API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

TEMP = 0.15
MAX_TOK = 131072

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

models = client.models.list()
model = models.data[0].id

def load_system_prompt(repo_id: str, filename: str) -> str:
    file_path = hf_hub_download(repo_id=repo_id, filename=filename)
    with open(file_path, "r") as file:
        system_prompt = file.read()
    return system_prompt

model_id = "mistralai/Mistral-Small-3.2-24B-Instruct-2506"
SYSTEM_PROMPT = load_system_prompt(model_id, "SYSTEM_PROMPT.txt")

image_url = "https://huggingface.co/datasets/patrickvonplaten/random_img/resolve/main/europe.png"

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_population",
            "description": "Get the up-to-date population of a given country.",
            "parameters": {
                "type": "object",
                "properties": {
                    "country": {
                        "type": "string",
                        "description": "The country to find the population of.",
                    },
                    "unit": {
                        "type": "string",
                        "description": "The unit for the population.",
                        "enum": ["millions", "thousands"],
                    },
                },
                "required": ["country", "unit"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "rewrite",
            "description": "Rewrite a given text for improved clarity",
            "parameters": {
                "type": "object",
                "properties": {
                    "text": {
                        "type": "string",
                        "description": "The input text to rewrite",
                    }
                },
            },
        },
    },
]

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {
        "role": "user",
        "content": "Could you please make the below article more concise?\n\nOpenAI is an artificial intelligence research laboratory consisting of the non-profit OpenAI Incorporated and its for-profit subsidiary corporation OpenAI Limited Partnership.",
    },
    {
        "role": "assistant",
        "content": "",
        "tool_calls": [
            {
                "id": "bbc5b7ede",
                "type": "function",
                "function": {
                    "name": "rewrite",
                    "arguments": '{"text": "OpenAI is an artificial intelligence research laboratory consisting of the non-profit OpenAI Incorporated and its for-profit subsidiary corporation OpenAI Limited Partnership."}',
                },
            }
        ],
    },
    {
        "role": "tool",
        "content": '{"action":"rewrite","outcome":"OpenAI is a FOR-profit company."}',
        "tool_call_id": "bbc5b7ede",
        "name": "rewrite",
    },
    {
        "role": "assistant",
        "content": "---\n\nOpenAI is a FOR-profit company.",
    },
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Can you tell me what is the biggest country depicted on the map?",
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": image_url,
                },
            },
        ],
    }
]

response = client.chat.completions.create(
    model=model,
    messages=messages,
    temperature=TEMP,
    max_tokens=MAX_TOK,
    tools=tools,
    tool_choice="auto",
)

assistant_message = response.choices[0].message.content
print(assistant_message)
# The biggest country depicted on the map is Russia.

messages.extend([
    {"role": "assistant", "content": assistant_message},
    {"role": "user", "content": "What is the population of that country in millions?"},
])

response = client.chat.completions.create(
    model=model,
    messages=messages,
    temperature=TEMP,
    max_tokens=MAX_TOK,
    tools=tools,
    tool_choice="auto",
)

print(response.choices[0].message.tool_calls)
# [ChatCompletionMessageToolCall(id='3e92V6Vfo', function=Function(arguments='{"country": "Russia", "unit": "millions"}', name='get_current_population'), type='function')]
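The example stops once the model has emitted a tool call. A typical next step is to run the requested function and send its result back as a `tool` message. The sketch below illustrates that under stated assumptions: `get_current_population` is a stub that returns no real data, `TOOL_REGISTRY` and `run_tool_call` are hypothetical names, and a `SimpleNamespace` stands in for the `ChatCompletionMessageToolCall` object printed above.

```python
import json
from types import SimpleNamespace


def get_current_population(country: str, unit: str) -> dict:
    """Hypothetical stub: a real implementation would query an actual data source."""
    return {"country": country, "unit": unit, "population": None, "note": "stub result"}


# Maps tool names declared in `tools` to local Python callables.
TOOL_REGISTRY = {"get_current_population": get_current_population}


def run_tool_call(tool_call) -> dict:
    """Execute one tool call and build the 'tool' message to append to the conversation."""
    name = tool_call.function.name
    if name not in TOOL_REGISTRY:
        raise KeyError(f"model requested an unknown tool: {name}")
    arguments = json.loads(tool_call.function.arguments)  # arguments arrive as a JSON string
    result = TOOL_REGISTRY[name](**arguments)
    return {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "name": name,
        "content": json.dumps(result),
    }


# Stand-in shaped like the tool call printed in the example output.
fake_call = SimpleNamespace(
    id="3e92V6Vfo",
    type="function",
    function=SimpleNamespace(
        name="get_current_population",
        arguments='{"country": "Russia", "unit": "millions"}',
    ),
)

tool_message = run_tool_call(fake_call)
print(tool_message)
```

In a real conversation, the assistant message containing `tool_calls` is appended to `messages` first, followed by this `tool` message, mirroring the `rewrite` exchange in the message list above; a further `client.chat.completions.create` call then lets the model phrase the final answer.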

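📚 Detailed documentation

Benchmark results

Mistral-Small-3.2-24B is compared with Mistral-Small-3.1-24B-Instruct-2503 below. For more comparisons against other models of similar size, see the Mistral-Small-3.1 benchmarks.

Text

Model	Wildbench v2	Arena Hard v2	IF (internal; accuracy)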
Small 3.1 24B Instruct	55.6%	19.56%	82.75%
Small 3.2 24B Instruct	65.33%	43.1%	84.78%

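Infinite generations

Small 3.2 reduces infinite generations by 2x on challenging, long, and repetitive prompts.

Model	Infinite generations (internal; lower is better)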
Small 3.1 24B Instruct	2.11%
Small 3.2 24B Instruct	1.29%
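For reference, the change between the two rates in the table can be worked out directly. The short calculation below uses only the two percentages shown; the source does not say how its headline reduction figure was derived, so this is just the arithmetic on the tabulated values.

```python
# Infinite-generation rates from the table above, in percent.
rate_small_31 = 2.11
rate_small_32 = 1.29

# Relative reduction: share of the old rate that was eliminated.
relative_reduction = (rate_small_31 - rate_small_32) / rate_small_31

# Ratio: how many times higher the old rate is than the new one.
ratio = rate_small_31 / rate_small_32

print(f"relative reduction: {relative_reduction:.1%}")
print(f"old rate / new rate: {ratio:.2f}x")
```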

STEM

Model	MMLU	MMLU Pro (5-shot CoT)	MATH	GPQA Main (5-shot CoT)	GPQA Diamond (5-shot CoT)	MBPP Plus - Pass@5	HumanEval Plus - Pass@5	SimpleQA (total accuracy)
Small 3.1 24B Instruct	80.62%	66.76%	69.30%	44.42%	45.96%	74.63%	88.99%	10.43%
Small 3.2 24B Instruct	80.50%	69.06%	69.42%	44.22%	46.13%	78.33%	92.90%	12.10%
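The STEM table is easier to read as per-benchmark differences. The sketch below copies the scores from the table and computes the change in percentage points from Small 3.1 to Small 3.2; the variable names are illustrative.

```python
# Scores in percent, copied from the STEM table: (Small 3.1, Small 3.2).
STEM_SCORES = {
    "MMLU": (80.62, 80.50),
    "MMLU Pro": (66.76, 69.06),
    "MATH": (69.30, 69.42),
    "GPQA Main": (44.42, 44.22),
    "GPQA Diamond": (45.96, 46.13),
    "MBPP Plus": (74.63, 78.33),
    "HumanEval Plus": (88.99, 92.90),
    "SimpleQA": (10.43, 12.10),
}

# Change in percentage points, rounded to two decimals to hide float noise.
deltas = {name: round(new - old, 2) for name, (old, new) in STEM_SCORES.items()}
regressions = [name for name, delta in deltas.items() if delta < 0]

for name, delta in deltas.items():
    print(f"{name:15s} {delta:+.2f} pp")
print("lower on Small 3.2:", regressions)
```

On these numbers, the largest gains are on the two coding benchmarks, while MMLU and GPQA Main are marginally lower.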