xLAM-7b-r開源大型動作模型 - 將用戶意圖轉操作，強化決策能力

首頁

Xlam 7b R

由Salesforce開發

xLAM-7b-r是Salesforce發佈的大型動作模型（LAMs）系列之一，專注於將用戶意圖轉化為可執行操作，強化決策能力。

大型語言模型

Transformers

英語#智能體決策 #函數調用優化 #多輪交互支持

下載量 1,645

發布時間 : 8/28/2024

模型概述

xLAM模型系列旨在通過自主規劃和執行任務來達成特定目標，是AI智能體的核心，具有自動化各領域工作流程的潛力。

模型特點

函數調用優化

針對函數調用任務進行了專門優化，支持高效的API請求生成和響應解析。

多輪交互支持

支持多輪對話和任務執行，能夠根據上下文動態調整行為。

高效部署

提供GGUF量化文件，支持在資源有限的設備上高效運行。

模型能力

函數調用

任務規劃

多輪對話

自動化工作流程

使用案例

智能助手

天氣查詢

根據用戶請求生成天氣API調用，並返回結果。

生成準確的API請求，如查詢紐約的天氣（華氏度）。

信息搜索

根據用戶輸入生成搜索請求，獲取互聯網信息。

生成搜索API請求，如查詢“可愛小狗”的圖片。

災害信息查詢

野火信息查詢

根據用戶輸入生成野火信息API請求。

生成API請求，如查詢加利福尼亞州的野火信息。

地震信息查詢

根據用戶輸入生成地震信息API請求。

生成API請求，如查詢加利福尼亞州的地震信息。

🚀 xLAM大動作模型系列

xLAM模型系列屬於大型動作模型（LAMs），旨在強化決策能力，將用戶意圖轉化為與現實世界交互的可執行操作。這些模型能夠自主規劃並執行任務以達成特定目標，是AI智能體的核心，具有自動化各領域工作流程的潛力，在眾多應用場景中價值巨大。

本次模型發佈僅用於研究目的。全新且更強大的xLAM版本即將僅面向我們平臺的客戶推出。

xLAM

[主頁] | [論文] | [Github] | [Discord] | [博客] | [社區演示]

📦 模型系列

我們提供了不同規模的xLAM模型系列，以滿足各種應用需求，包括針對函數調用和通用智能體應用優化的模型：

模型名稱	總參數數量	上下文長度	發佈日期	類別	模型下載鏈接	GGUF文件下載鏈接
xLAM-7b-r	72.4億	32k	2024年9月5日	通用、函數調用	🤗 鏈接	--
xLAM-8x7b-r	467億	32k	2024年9月5日	通用、函數調用	🤗 鏈接	--
xLAM-8x22b-r	1410億	64k	2024年9月5日	通用、函數調用	🤗 鏈接	--
xLAM-1b-fc-r	13.5億	16k	2024年7月17日	函數調用	🤗 鏈接	🤗 鏈接
xLAM-7b-fc-r	69.1億	4k	2024年7月17日	函數調用	🤗 鏈接	🤗 鏈接
xLAM-v0.1-r	467億	32k	2024年3月18日	通用、函數調用	🤗 鏈接	--

對於我們的函數調用系列模型（更多細節見此處），我們還提供了量化後的GGUF文件，以實現高效部署和執行。GGUF是一種專門設計用於高效存儲和加載大語言模型的文件格式，非常適合在資源有限的本地設備上運行AI模型，支持離線功能並增強隱私保護。

更多詳細信息，請查看我們的GitHub和論文。

🔍 查看與xLAM交互的最新示例

這裡有與xLAM模型交互的最新示例和分詞器。

📂 倉庫概述

本倉庫主要涉及通用工具使用系列。如需更專業的函數調用模型，請查看我們的fc系列此處。

本指南將指導您完成模型系列在HuggingFace上的設置、使用和集成。

框架版本

Transformers 4.41.0
Pytorch 2.3.0+cu121
Datasets 2.19.1
Tokenizers 0.19.1

💻 使用方法

使用Huggingface的基本用法

要從Huggingface使用該模型，請先安裝transformers庫：

pip install transformers>=4.41.0

請注意，我們的模型在使用我們提供的提示格式時效果最佳。這種格式使我們能夠提取類似於ChatGPT函數調用模式的JSON輸出。

我們使用以下示例來說明如何使用我們的模型進行1) 單輪使用場景和2) 多輪使用場景。

1. 單輪使用場景

import json
import torch 
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.random.manual_seed(0) 

model_name = "Salesforce/xLAM-7b-r"
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name) 

# Please use our provided instruction prompt for best performance
task_instruction = """
Based on the previous context and API request history, generate an API request or a response as an AI assistant.""".strip()

format_instruction = """
The output should be of the JSON format, which specifies a list of generated function calls. The example format is as follows, please make sure the parameter type is correct. If no function call is needed, please make 
tool_calls an empty list "[]".

{"thought": "the thought process, or an empty string", "tool_calls": [{"name": "api_name1", "arguments": {"argument1": "value1", "argument2": "value2"}}]}

""".strip()

# Define the input query and available tools
query = "What's the weather like in New York in fahrenheit?"

get_weather_api = {
    "name": "get_weather",
    "description": "Get the current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, New York"
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "The unit of temperature to return"
            }
        },
        "required": ["location"]
    }
}

search_api = {
    "name": "search",
    "description": "Search for information on the internet",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query, e.g. 'latest news on AI'"
            }
        },
        "required": ["query"]
    }
}

openai_format_tools = [get_weather_api, search_api]

# Helper function to convert openai format tools to our more concise xLAM format
def convert_to_xlam_tool(tools):
    ''''''
    if isinstance(tools, dict):
        return {
            "name": tools["name"],
            "description": tools["description"],
            "parameters": {k: v for k, v in tools["parameters"].get("properties", {}).items()}
        }
    elif isinstance(tools, list):
        return [convert_to_xlam_tool(tool) for tool in tools]
    else:
        return tools

def build_conversation_history_prompt(conversation_history: str):
    parsed_history = []
    for step_data in conversation_history:
        parsed_history.append({
            "step_id": step_data["step_id"],
            "thought": step_data["thought"],
            "tool_calls": step_data["tool_calls"],
            "next_observation": step_data["next_observation"],
            "user_input": step_data['user_input']
        })
        
    history_string = json.dumps(parsed_history)
    return f"
[BEGIN OF HISTORY STEPS]
{history_string}
[END OF HISTORY STEPS]
"
    
    
# Helper function to build the input prompt for our model
def build_prompt(task_instruction: str, format_instruction: str, tools: list, query: str):
    prompt = f"[BEGIN OF TASK INSTRUCTION]\n{task_instruction}\n[END OF TASK INSTRUCTION]\n\n"
    prompt += f"[BEGIN OF AVAILABLE TOOLS]\n{json.dumps(xlam_format_tools)}\n[END OF AVAILABLE TOOLS]\n\n"
    prompt += f"[BEGIN OF FORMAT INSTRUCTION]\n{format_instruction}\n[END OF FORMAT INSTRUCTION]\n\n"
    prompt += f"[BEGIN OF QUERY]\n{query}\n[END OF QUERY]\n\n"
    return prompt

    
# Build the input and start the inference
xlam_format_tools = convert_to_xlam_tool(openai_format_tools)

conversation_history = []
content = build_prompt(task_instruction, format_instruction, xlam_format_tools, query, conversation_history)

messages=[
    { 'role': 'user', 'content': content}
]

inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

# tokenizer.eos_token_id is the id of <|EOT|> token
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
agent_action = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)

然後，您應該能夠看到以下JSON格式的輸出字符串：

{"thought": "I need to get the current weather for New York in fahrenheit.", "tool_calls": [{"name": "get_weather", "arguments": {"location": "New York", "unit": "fahrenheit"}}]}

2. 多輪使用場景

我們的模型系列也支持多輪交互。以下是上述示例的下一輪交互示例：

def parse_agent_action(agent_action: str):
    """
    Given an agent's action, parse it to add to conversation history
    """
    try: parsed_agent_action_json = json.loads(agent_action)
    except: return "", []
    
    if "thought" not in parsed_agent_action_json.keys(): thought = ""
    else: thought = parsed_agent_action_json["thought"]
    
    if "tool_calls" not in parsed_agent_action_json.keys(): tool_calls = []
    else: tool_calls = parsed_agent_action_json["tool_calls"]
    
    return thought, tool_calls

def update_conversation_history(conversation_history: list, agent_action: str, environment_response: str, user_input: str):
    """
    Update the conversation history list based on the new agent_action, environment_response, and/or user_input
    """
    thought, tool_calls = parse_agent_action(agent_action)
    new_step_data = {
        "step_id": len(conversation_history) + 1,
        "thought": thought,
        "tool_calls": tool_calls,
        "step_id": len(conversation_history),
        "next_observation": environment_response,
        "user_input": user_input,
    }
    
    conversation_history.append(new_step_data)

def get_environment_response(agent_action: str):
    """
    Get the environment response for the agent_action
    """
    # TODO: add custom implementation here
    error_message, response_message = "", ""
    return {"error": error_message, "response": response_message}

# ------------- before here are the steps to get agent_response from the example above ----------

# 1. get the next state after agent's response:
#   The next 2 lines are examples of getting environment response and user_input.
#   It is depended on particular usage, we can have either one or both of those.
environment_response = get_environment_response(agent_action)
user_input = "Now, search on the Internet for cute puppies"

# 2. after we got environment_response and (or) user_input, we want to add to our conversation history
update_conversation_history(conversation_history, agent_action, environment_response, user_input)

# 3. we now can build the prompt
content = build_prompt(task_instruction, format_instruction, xlam_format_tools, query, conversation_history)

# 4. Now, we just retrieve the inputs for the LLM
messages=[
    { 'role': 'user', 'content': content}
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

# 5. Generate the outputs & decode
#   tokenizer.eos_token_id is the id of <|EOT|> token
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
agent_action = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)

相應的輸出如下：

{"thought": "I need to get the current weather for New York in fahrenheit.", "tool_calls": [{"name": "get_weather", "arguments": {"location": "New York", "unit": "fahrenheit"}}]}

我們強烈建議使用我們提供的提示格式和輔助函數，以獲得我們模型最佳的函數調用性能。

多輪提示和輸出示例

提示：

[BEGIN OF TASK INSTRUCTION]
Based on the previous context and API request history, generate an API request or a response as an AI assistant. 
[END OF TASK INSTRUCTION]

[BEGIN OF AVAILABLE TOOLS]
[
    {
        "name": "get_fire_info",
        "description": "Query the latest wildfire information",
        "parameters": {
            "location": {
                "type": "string",
                "description": "Location of the wildfire, for example: 'California'",
                "required": true,
                "format": "free"
            },
            "radius": {
                "type": "number",
                "description": "The radius (in miles) around the location where the wildfire is occurring, for example: 10",
                "required": false,
                "format": "free"
            }
        }
    },
    {
        "name": "get_hurricane_info",
        "description": "Query the latest hurricane information",
        "parameters": {
            "name": {
                "type": "string",
                "description": "Name of the hurricane, for example: 'Irma'",
                "required": true,
                "format": "free"
            }
        }
    },
    {
        "name": "get_earthquake_info",
        "description": "Query the latest earthquake information",
        "parameters": {
            "magnitude": {
                "type": "number",
                "description": "The minimum magnitude of the earthquake that needs to be queried.",
                "required": false,
                "format": "free"
            },
            "location": {
                "type": "string",
                "description": "Location of the earthquake, for example: 'California'",
                "required": false,
                "format": "free"
            }
        }
    }
]
[END OF AVAILABLE TOOLS]

[BEGIN OF FORMAT INSTRUCTION]
Your output should be in the JSON format, which specifies a list of function calls. The example format is as follows. Please make sure the parameter type is correct. If no function call is needed, please make tool_calls an empty list '[]'.
```{"thought": "the thought process, or an empty string", "tool_calls": [{"name": "api_name1", "arguments": {"argument1": "value1", "argument2": "value2"}}]}```
[END OF FORMAT INSTRUCTION]

[BEGIN OF QUERY]
User: Can you give me the latest information on the wildfires occurring in California?
[END OF QUERY]

[BEGIN OF HISTORY STEPS]
[
    {
        "thought": "Sure, what is the radius (in miles) around the location of the wildfire?",
        "tool_calls": [],
        "step_id": 1,
        "next_observation": "",
        "user_input": "User: Let me think... 50 miles."
    },
    {
        "thought": "",
        "tool_calls": [
            {
                "name": "get_fire_info",
                "arguments": {
                    "location": "California",
                    "radius": 50
                }
            }
        ],
        "step_id": 2,
        "next_observation": [
            {
                "location": "Los Angeles",
                "acres_burned": 1500,
                "status": "contained"
            },
            {
                "location": "San Diego",
                "acres_burned": 12000,
                "status": "active"
            }
        ]
    },
    {
        "thought": "Based on the latest information, there are wildfires in Los Angeles and San Diego. The wildfire in Los Angeles has burned 1,500 acres and is contained, while the wildfire in San Diego has burned 12,000 acres and is still active.",
        "tool_calls": [],
        "step_id": 3,
        "next_observation": "",
        "user_input": "User: Can you tell me about the latest earthquake?"
    }
]

[END OF HISTORY STEPS]

輸出：

{"thought": "", "tool_calls": [{"name": "get_earthquake_info", "arguments": {"location": "California"}}]}

📊 基準測試結果

注意：加粗和下劃線結果分別表示成功率的最佳結果和第二佳結果。

伯克利函數調用排行榜（BFCL）

xlam-bfcl 表1：BFCL-v2排行榜上的性能比較（截止日期2024年9月3日）。排名基於整體準確率，即不同評估類別的加權平均值。“FC”表示函數調用模式，與使用自定義“提示”提取函數調用形成對比。

Webshop和ToolQuery

xlam-webshop_toolquery 表2：Webshop和ToolQuery上的測試結果。加粗和下劃線結果分別表示成功率的最佳結果和第二佳結果。

統一ToolQuery

xlam-unified_toolquery 表3：ToolQuery-Unified上的測試結果。加粗和下劃線結果分別表示成功率的最佳結果和第二佳結果。括號中的值表示在ToolQuery上的相應性能。

ToolBench

xlam-toolbench 表4：ToolBench在三種不同場景下的通過率。加粗和下劃線結果分別表示每個設置的最佳結果和第二佳結果。由於ToolBench服務器在2024年7月28日至我們的評估截止日期2024年9月3日期間停機，xLAM-8x22b-r的結果不可用。

📄 許可證

該模型遵循CC-BY-NC-4.0許可證分發。

⚠️ 倫理考量

本次發佈僅用於支持學術論文的研究目的。我們的模型、數據集和代碼並非專門為所有下游用途設計或評估。我們強烈建議用戶在部署此模型之前，評估並解決與準確性、安全性和公平性相關的潛在問題。我們鼓勵用戶考慮AI的常見侷限性，遵守適用法律，並在選擇用例時採用最佳實踐，特別是在錯誤或濫用可能對人們的生活、權利或安全產生重大影響的高風險場景中。有關用例的更多指導，請參考我們的AUP和AI AUP。

📝 引用信息

如果您覺得這個倉庫有幫助，請考慮引用我們的論文：

@article{zhang2024xlam,
  title={xLAM: A Family of Large Action Models to Empower AI Agent Systems},
  author={Zhang, Jianguo and Lan, Tian and Zhu, Ming and Liu, Zuxin and Hoang, Thai and Kokane, Shirley and Yao, Weiran and Tan, Juntao and Prabhakar, Akshara and Chen, Haolin and others},
  journal={arXiv preprint arXiv:2409.03215},
  year={2024}
}