Arch-Agent-1.5B-GGUF Open-Source Large Language Model: Efficient Handling of Complex Multi-Step Tasks and Agent Applications

Arch Agent 1.5B GGUF

Developed by Mungert

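The Arch-Agent-1.5B GGUF models are a collection of state-of-the-art large language models designed for advanced function calling and agent-based applications. They are built to handle complex, multi-step tasks.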

Large Language Model

Transformers

Language: English · License: Other · #Multi-turn function calling #Agentic decision-making #64K long context

Downloads: 533

Released: 6/25/2025

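Model Overview

The model is designed for advanced function calling and agent applications. It supports multi-turn and multi-step function calling and has agentic capabilities suited to complex tasks.

Model Features

Multi-turn function calling

Maintains context across multiple conversation turns, supporting natural, ongoing dialogue in which tools can be nested or used dynamically.

Multi-step function calling

Plans and executes a sequence of function calls to complete a complex task, adapting to intermediate results and breaking goals down into subtasks.

Agentic capabilities

Provides advanced decision-making and workflow management for complex agentic tasks, including coordination across tools and recovery from errors.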
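
The multi-step behavior described above depends on application code that runs each requested call and feeds the result back into the conversation. The following is a minimal sketch of that dispatch step, not the authors' implementation. The `get_weather` stub, the `TOOL_REGISTRY` dictionary, the `run_tool_calls` helper, and the use of a `"tool"` role with a `"name"` key are all assumptions made for illustration.

```python
import json
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def get_weather(location: str, unit: str = "celsius") -> Dict[str, Any]:
    """Hypothetical tool: returns canned data instead of calling a real API."""
    return {"location": location, "temperature": 18, "unit": unit}


# Hypothetical registry mapping tool names to Python callables.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def run_tool_calls(
    messages: List[Message], tool_calls: List[Dict[str, Any]]
) -> List[Message]:
    """Execute each requested call and append the results to the conversation."""
    # Record what the assistant asked for so later turns keep the full context.
    messages.append({"role": "assistant", "content": json.dumps(tool_calls)})
    for call in tool_calls:
        func = TOOL_REGISTRY.get(call["name"])
        if func is None:
            result = {"error": f"unknown tool: {call['name']}"}
        else:
            try:
                result = func(**call.get("arguments", {}))
            except Exception as exc:
                # Report the failure as data so the model can try to recover.
                result = {"error": str(exc)}
        # The "tool" role and "name" key are assumptions about the chat format.
        messages.append(
            {"role": "tool", "name": call["name"], "content": json.dumps(result)}
        )
    return messages


history: List[Message] = [
    {"role": "user", "content": "What is the weather in Seattle?"}
]
# First step: a call to a known tool.
history = run_tool_calls(
    history, [{"name": "get_weather", "arguments": {"location": "Seattle"}}]
)
# Second step: a call to an unknown tool, which yields an error message.
history = run_tool_calls(history, [{"name": "get_forecast", "arguments": {}}])
print(json.dumps(history, indent=2))
```

Returning errors as tool messages, rather than raising, keeps the loop alive so the model can see the failure and choose a different call on the next turn.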

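Model Capabilities

Advanced function calling

Multi-turn conversation handling

Complex task decomposition

Agentic decision-making

Workflow management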

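Use Cases

Network monitoring

Automated Nmap security scans

Use the model to automate Nmap security scanning tasks.

Quantum-readiness checks

Check whether a server uses quantum-safe encryption for its communication.

Security auditing

Server security audit

Run a comprehensive security audit.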

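🚀 Arch-Agent-1.5B GGUF Model

🚀 Quick Start

Requirements

The code for Arch-Agent-1.5B is integrated into the Hugging Face transformers library. We recommend installing the latest version:

pip install "transformers>=4.51.0"

Usage Example

The following example shows how to use the model for a function-calling task. The model works best with the prompt format provided here, which lets you extract JSON output similar to OpenAI function calling.

Basic usage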

import json
from typing import Any, Dict, List
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "katanemo/Arch-Agent-1.5B"

model = AutoModelForCausalLM.from_pretrained(
    model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

TASK_PROMPT = (
    "You are a helpful assistant designed to assist with the user query by making one or more function calls if needed."
    "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\n"
    "You are provided with function signatures within <tools></tools> XML tags:\n<tools>\n{tool_text}"
    "\n</tools>\n\nFor each function call, return a json object with function name and arguments within "
    """<tool_call></tool_call> XML tags:\n<tool_call>\n{{"name": <function-name>, """
    """"arguments": <args-json-object>}}\n</tool_call>"""
)

# Define available tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "str",
                        "description": "The city and state, e.g. San Francisco, New York",
                    },
                    "unit": {
                        "type": "str",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The unit of temperature to return",
                    },
                },
                "required": ["location"],
            },
        },
    }
]

# Helper function to create the system prompt for our model
def format_prompt(tools: List[Dict[str, Any]]):
    tool_text = "\n".join(
        [json.dumps(tool["function"], ensure_ascii=False) for tool in tools]
    )
    return TASK_PROMPT.format(tool_text=tool_text)

system_prompt = format_prompt(tools)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What is the weather in Seattle?"},
]

model_inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)

generated_ids = model.generate(**model_inputs, max_new_tokens=32768)
generated_ids = [
    output_ids[len(input_ids) :]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
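
The prompt above asks the model to wrap each call in `<tool_call></tool_call>` tags, so the printed response still has to be parsed before anything can be executed. The following is a minimal sketch of that parsing step. The `extract_tool_calls` helper and the `sample_response` string are hypothetical and stand in for real model output.

```python
import json
import re
from typing import Any, Dict, List

# Matches the body between <tool_call> and </tool_call>, across newlines.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> List[Dict[str, Any]]:
    """Return every well-formed {"name": ..., "arguments": ...} object in text."""
    calls: List[Dict[str, Any]] = []
    for body in TOOL_CALL_RE.findall(text):
        try:
            call = json.loads(body)
        except json.JSONDecodeError:
            continue  # skip malformed JSON instead of crashing the caller
        if isinstance(call, dict) and "name" in call:
            call.setdefault("arguments", {})  # tolerate calls without arguments
            calls.append(call)
    return calls


# Hypothetical model output, used here only to exercise the parser.
sample_response = (
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"location": "Seattle"}}\n'
    "</tool_call>"
)
parsed_calls = extract_tool_calls(sample_response)
print(parsed_calls)
```

A response with no `<tool_call>` tags yields an empty list, which the caller can treat as a plain-text answer.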

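📚 Documentation

For more details on fine-tuning, inference, and deployment, see GitHub.

🔧 Technical Details

Model generation

This model was generated using llama.cpp at commit 0142961a.

Performance benchmarks

The Katanemo Arch-Agent series was evaluated on the Berkeley Function-Calling Leaderboard (BFCL) and compared with commonly used models. Results are reported as of June 14, 2025.

⚠️ Important note

For the multi-turn evaluation, the models were deployed with YaRN scaling, and all Arch-Agent models used a context length of 64K.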
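
YaRN extends the context window by a scaling factor equal to the target length divided by the model's native length. The sketch below works through that arithmetic for a 64K target. The native context of 32K is an assumption (the source does not state it), and the dictionary keys follow the `rope_scaling` convention seen in some Hugging Face model configs, which may not match this model's configuration.

```python
from typing import Any, Dict


def yarn_rope_scaling(target_context: int, native_context: int) -> Dict[str, Any]:
    """Build a rope_scaling-style dict for extending context with YaRN."""
    if native_context <= 0 or target_context < native_context:
        raise ValueError("target_context must be >= native_context > 0")
    return {
        "type": "yarn",
        # Scaling factor: how many times longer the target window is.
        "factor": target_context / native_context,
        "original_max_position_embeddings": native_context,
    }


# Assumption: a 32K native window. Check the model's config before relying on it.
ASSUMED_NATIVE_CONTEXT = 32 * 1024
TARGET_CONTEXT = 64 * 1024  # the 64K context length stated above

rope_scaling = yarn_rope_scaling(TARGET_CONTEXT, ASSUMED_NATIVE_CONTEXT)
print(rope_scaling)
```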

📄 License

The Arch-Agent models are distributed under the Katanemo license.

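💬 Testing Notes

What is being tested

We are testing the limits of small open-source models for AI network monitoring, specifically:

- Function calling against live network services.
- How small a model can be while still handling:
  - Automated Nmap security scans.
  - Quantum-readiness checks.
  - Network monitoring tasks.

Test models

TestLLM (current experimental model; runs llama.cpp on 2 CPU threads in a Hugging Face Docker Space):
- ✅ Zero-configuration setup.
- ⏳ 30-second load time (inference is slow, but there are no API costs). Because costs are low, there is no token limit.
- 🔧 Help wanted! If you are interested in edge-device AI, you are welcome to collaborate.
TurboLLM (uses gpt-4.1-mini):
- Performs very well, but OpenAI charges per token, so token usage is limited.
- Can create custom command processors that run .NET code on the Quantum Network Monitor Agent.
- Real-time network diagnostics and monitoring.
- Security audits.
- Penetration testing (Nmap/Metasploit).
HugLLM (latest open-source models):
- 🌐 Runs on the Hugging Face Inference API using the latest models hosted on Novita, and performs well.

Example test commands

"Give me info on my website's SSL certificate"
"Check if my server is using quantum safe encryption for communication"
"Run a comprehensive security audit on my server"
"Create a cmd processor to .. (whatever you want)"

Note: you need to install the Quantum Network Monitor Agent to run .NET code. This is a very flexible and powerful feature, so use it with caution!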