xLAM-7b-r: an open-source Large Action Model that turns user intent into actions and strengthens decision-making

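xLAM-7b-r

Developed by Salesforce

xLAM-7b-r is one model in the Large Action Models (LAMs) series released by Salesforce. It specializes in translating user intent into executable actions and in strengthening decision-making.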

Large language model

Transformers

English #Agent decision-making #Function-calling optimization #Multi-turn dialogue support

Downloads: 1,645

Release date: 8/28/2024

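Model overview

The xLAM model series aims to achieve specific goals by autonomously planning and executing tasks. These models act as the core of AI agents and have the potential to automate workflows across many domains.

Model features

Function-calling optimization

Specifically optimized for function-calling tasks, supporting efficient API request generation and response parsing.

Multi-turn dialogue support

Supports multi-turn dialogue and task execution, adjusting its behavior dynamically according to context.

Efficient deployment

Quantized GGUF files are provided so the models can run efficiently on resource-constrained devices.

Model capabilities

Function calling

Task planning

Multi-turn dialogue

Workflow automation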

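Use cases

Smart assistant

Weather lookup

Generates a weather API call from the user's request and returns the result.

Produces an accurate API request, for example looking up the weather in New York in Fahrenheit.

Information retrieval

Generates a search request from the user's input and retrieves information from the internet.

Produces a search API request, for example searching for images of "cute puppies".

Disaster information lookup

Wildfire information lookup

Generates a wildfire-information API request from the user's input.

Produces an API request, for example querying wildfire information for California.

Earthquake information lookup

Generates an earthquake-information API request from the user's input.

Produces an API request, for example querying earthquake information for California.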

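🚀 xLAM model family

Large Action Models (LAMs) are advanced large language models designed to enhance decision-making and to translate user intent into executable actions that interact with the world. LAMs autonomously plan and execute tasks to achieve specific goals, serving as the brains of AI agents. They have the potential to automate workflow processes across various domains, which makes them useful for a wide range of applications.

This model release is for research purposes only. A new and improved version of xLAM will soon be available exclusively to customers on our platform.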

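🚀 Quick start

This section gives an overview of the xLAM models, how to use them, and related information.

✨ Key features

Advanced decision support: Large Action Models (LAMs) translate user intent into executable actions and autonomously plan and execute tasks.
Broad applicability: they have the potential to automate workflow processes across various domains and serve a wide range of applications.
Function calling: the models in the Function-calling series also provide quantized GGUF files for efficient deployment and execution.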

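📦 Installation

Basic installation steps

To use the model from Hugging Face, first install the transformers library. The version specifier is quoted so that the shell does not interpret >= as an output redirection.

pip install "transformers>=4.41.0"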

Framework versions

Transformers 4.41.0
PyTorch 2.3.0+cu121
Datasets 2.19.1
Tokenizers 0.19.1
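
Before loading the model, it can help to confirm that the installed transformers release meets the 4.41.0 minimum stated above. The following is a minimal sketch, not part of the original instructions: the helper names `parse_version`, `meets_minimum`, and `check_package` are hypothetical, and the parser only understands plain dotted numeric versions (a local suffix such as `+cu121` is ignored, and pre-release tags are not distinguished from the final release).

```python
import re
from importlib import metadata


def parse_version(version: str) -> tuple:
    """Turn '4.41.0' or '2.3.0+cu121' into a tuple of ints such as (4, 41, 0)."""
    public = version.split("+", 1)[0]      # drop a local suffix such as '+cu121'
    parts = []
    for piece in public.split("."):
        match = re.match(r"\d+", piece)    # keep only the leading digits
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when `installed` is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '4.41' and '4.41.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_package(name: str, minimum: str) -> bool:
    """Look up an installed package and compare it with the minimum version."""
    try:
        return meets_minimum(metadata.version(name), minimum)
    except metadata.PackageNotFoundError:
        return False
```

Calling `check_package("transformers", "4.41.0")` returns `False` both when the package is missing and when it is too old, so a caller can treat either case as a prompt to run the install command again.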

💻 Usage examples

Basic usage

Single-turn example

import json
import torch 
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.random.manual_seed(0) 

model_name = "Salesforce/xLAM-7b-r"
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name) 

# Use the provided prompt format for best performance.
task_instruction = """
Based on the previous context and API request history, generate an API request or a response as an AI assistant.""".strip()

format_instruction = """
The output should be of the JSON format, which specifies a list of generated function calls. The example format is as follows, please make sure the parameter type is correct. If no function call is needed, please make 
tool_calls an empty list "[]".

{"thought": "the thought process, or an empty string", "tool_calls": [{"name": "api_name1", "arguments": {"argument1": "value1", "argument2": "value2"}}]}

""".strip()

# Define the input query and the available tools.
query = "What's the weather like in New York in fahrenheit?"

get_weather_api = {
    "name": "get_weather",
    "description": "Get the current weather for a location",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, New York"
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "The unit of temperature to return"
            }
        },
        "required": ["location"]
    }
}

search_api = {
    "name": "search",
    "description": "Search for information on the internet",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query, e.g. 'latest news on AI'"
            }
        },
        "required": ["query"]
    }
}

openai_format_tools = [get_weather_api, search_api]

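# Helper function to convert OpenAI-format tools to the more concise xLAM format.
def convert_to_xlam_tool(tools):
    """Convert one tool (dict) or a list of tools from OpenAI format to xLAM format."""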
    if isinstance(tools, dict):
        return {
            "name": tools["name"],
            "description": tools["description"],
            "parameters": {k: v for k, v in tools["parameters"].get("properties", {}).items()}
        }
    elif isinstance(tools, list):
        return [convert_to_xlam_tool(tool) for tool in tools]
    else:
        return tools

def build_conversation_history_prompt(conversation_history: list):
    parsed_history = []
    for step_data in conversation_history:
        parsed_history.append({
            "step_id": step_data["step_id"],
            "thought": step_data["thought"],
            "tool_calls": step_data["tool_calls"],
            "next_observation": step_data["next_observation"],
            "user_input": step_data['user_input']
        })

    history_string = json.dumps(parsed_history)
    return f"\n[BEGIN OF HISTORY STEPS]\n{history_string}\n[END OF HISTORY STEPS]\n"
    
    
# Helper function to build the input prompt for the model.
def build_prompt(task_instruction: str, format_instruction: str, tools: list, query: str, conversation_history: list):
    prompt = f"[BEGIN OF TASK INSTRUCTION]\n{task_instruction}\n[END OF TASK INSTRUCTION]\n\n"
    prompt += f"[BEGIN OF AVAILABLE TOOLS]\n{json.dumps(tools)}\n[END OF AVAILABLE TOOLS]\n\n"
    prompt += f"[BEGIN OF FORMAT INSTRUCTION]\n{format_instruction}\n[END OF FORMAT INSTRUCTION]\n\n"
    prompt += f"[BEGIN OF QUERY]\n{query}\n[END OF QUERY]\n\n"

    # Append earlier steps only when there is history to show.
    if len(conversation_history) > 0:
        prompt += build_conversation_history_prompt(conversation_history)
    return prompt


# Build the input and start inference.
xlam_format_tools = convert_to_xlam_tool(openai_format_tools)

conversation_history = []
content = build_prompt(task_instruction, format_instruction, xlam_format_tools, query, conversation_history)

messages=[
    { 'role': 'user', 'content': content}
]

inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

# tokenizer.eos_token_id is the id of the <|EOT|> token.
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
agent_action = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)

Running this code produces JSON output like the following.

{"thought": "I need to get the current weather for New York in fahrenheit.", "tool_calls": [{"name": "get_weather", "arguments": {"location": "New York", "unit": "fahrenheit"}}]}
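
Because the model's output is free-form text, it is worth validating each tool call before anything is executed. The sketch below assumes the OpenAI-format tool definitions shown above; `validate_tool_call` is a hypothetical helper rather than part of the model's API, and it checks only the tool name, required parameters, unknown parameters, and enum values.

```python
import json


def validate_tool_call(call: dict, openai_tools: list) -> list:
    """Return a list of problems found in one tool call (empty list means valid)."""
    by_name = {tool["name"]: tool for tool in openai_tools}
    name = call.get("name")
    if name not in by_name:
        return [f"unknown tool: {name!r}"]

    schema = by_name[name]["parameters"]
    properties = schema.get("properties", {})
    arguments = call.get("arguments", {})
    if not isinstance(arguments, dict):
        return ["arguments must be a JSON object"]

    problems = []
    # Every parameter listed under "required" must be present.
    for required in schema.get("required", []):
        if required not in arguments:
            problems.append(f"missing required argument: {required}")

    # Arguments must be declared, and enum-typed ones must use an allowed value.
    for key, value in arguments.items():
        if key not in properties:
            problems.append(f"unexpected argument: {key}")
        elif "enum" in properties[key] and value not in properties[key]["enum"]:
            problems.append(f"invalid value for {key}: {value!r}")

    return problems


# Tool definition from the example above, trimmed to the fields used here.
weather_tool = {
    "name": "get_weather",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    },
}

output = '{"thought": "", "tool_calls": [{"name": "get_weather", "arguments": {"location": "New York", "unit": "fahrenheit"}}]}'
for call in json.loads(output)["tool_calls"]:
    print(call["name"], validate_tool_call(call, [weather_tool]))
```

An empty list means the call can be dispatched; a non-empty list could instead be fed back to the model as the environment's error message on the next turn.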

Multi-turn example

def parse_agent_action(agent_action: str):
    """
    Parse the agent action so it can be added to the conversation history.
    """
    try:
        parsed_agent_action_json = json.loads(agent_action)
    except (json.JSONDecodeError, TypeError):
        # Output that is not valid JSON is treated as an empty action.
        return "", []
    if not isinstance(parsed_agent_action_json, dict):
        return "", []

    thought = parsed_agent_action_json.get("thought", "")
    tool_calls = parsed_agent_action_json.get("tool_calls", [])
    return thought, tool_calls

def update_conversation_history(conversation_history: list, agent_action: str, environment_response: dict, user_input: str):
    """
    Update the conversation history list with the new agent action, environment response, and/or user input.
    """
    thought, tool_calls = parse_agent_action(agent_action)
    new_step_data = {
        # Step ids are 1-based, matching the example prompt shown later.
        "step_id": len(conversation_history) + 1,
        "thought": thought,
        "tool_calls": tool_calls,
        "next_observation": environment_response,
        "user_input": user_input,
    }
    
    conversation_history.append(new_step_data)

def get_environment_response(agent_action: str):
    """
    Get the environment response for the agent action.
    """
    # TODO: add your custom implementation here.
    error_message, response_message = "", ""
    return {"error": error_message, "response": response_message}

# ------------- The steps before this point obtain the agent's response, as in the example above ----------

# 1. Get the next state after the agent's response.
#   The next two lines show how to obtain the environment response and the user input.
#   Depending on the use case, you may have either one or both.
environment_response = get_environment_response(agent_action)
user_input = "Now, search on the Internet for cute puppies"

# 2. After obtaining the environment response and/or the user input, add them to the conversation history.
update_conversation_history(conversation_history, agent_action, environment_response, user_input)

# 3. The prompt can now be built.
content = build_prompt(task_instruction, format_instruction, xlam_format_tools, query, conversation_history)

# 4. Next, prepare the input for the LLM.
messages=[
    { 'role': 'user', 'content': content}
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

# 5. Generate the output and decode it.
#   tokenizer.eos_token_id is the id of the <|EOT|> token.
outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
agent_action = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True)

Running this code produces output like the following.

{"thought": "I need to get the current weather for New York in fahrenheit.", "tool_calls": [{"name": "get_weather", "arguments": {"location": "New York", "unit": "fahrenheit"}}]}
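
The `get_environment_response` stub above leaves tool execution to the reader. One possible shape, offered only as a sketch: a registry maps tool names to Python callables, and each call's result or error is collected into the same `{"error": ..., "response": ...}` dictionary. The registry, the stand-in tool functions, and the name `run_tool_calls` are all hypothetical; real tools would call actual APIs instead of returning canned data.

```python
import json


def fake_get_weather(location: str, unit: str = "celsius") -> dict:
    """Stand-in tool: returns a placeholder instead of calling a weather API."""
    return {"location": location, "unit": unit, "temperature": None}


def fake_search(query: str) -> dict:
    """Stand-in tool: echoes the query instead of searching the internet."""
    return {"query": query, "results": []}


# Hypothetical registry: tool name -> callable that implements it.
TOOL_REGISTRY = {"get_weather": fake_get_weather, "search": fake_search}


def run_tool_calls(agent_action: str, registry: dict = TOOL_REGISTRY) -> dict:
    """Execute every tool call in the model output and gather results and errors."""
    try:
        tool_calls = json.loads(agent_action).get("tool_calls", [])
    except (json.JSONDecodeError, AttributeError, TypeError):
        return {"error": "agent action is not a JSON object", "response": ""}

    responses, errors = [], []
    for call in tool_calls:
        func = registry.get(call.get("name"))
        if func is None:
            errors.append(f"unknown tool: {call.get('name')}")
            continue
        try:
            responses.append(func(**call.get("arguments", {})))
        except TypeError as exc:  # wrong or missing arguments
            errors.append(f"{call.get('name')}: {exc}")

    return {"error": "; ".join(errors), "response": responses}
```

With this in place, the body of `get_environment_response` could simply return `run_tool_calls(agent_action)`, and the resulting dictionary becomes the `next_observation` of the following history step.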

Example multi-turn prompt and output

Prompt

[BEGIN OF TASK INSTRUCTION]
Based on the previous context and API request history, generate an API request or a response as an AI assistant. 
[END OF TASK INSTRUCTION]

[BEGIN OF AVAILABLE TOOLS]
[
    {
        "name": "get_fire_info",
        "description": "Query the latest wildfire information",
        "parameters": {
            "location": {
                "type": "string",
                "description": "Location of the wildfire, for example: 'California'",
                "required": true,
                "format": "free"
            },
            "radius": {
                "type": "number",
                "description": "The radius (in miles) around the location where the wildfire is occurring, for example: 10",
                "required": false,
                "format": "free"
            }
        }
    },
    {
        "name": "get_hurricane_info",
        "description": "Query the latest hurricane information",
        "parameters": {
            "name": {
                "type": "string",
                "description": "Name of the hurricane, for example: 'Irma'",
                "required": true,
                "format": "free"
            }
        }
    },
    {
        "name": "get_earthquake_info",
        "description": "Query the latest earthquake information",
        "parameters": {
            "magnitude": {
                "type": "number",
                "description": "The minimum magnitude of the earthquake that needs to be queried.",
                "required": false,
                "format": "free"
            },
            "location": {
                "type": "string",
                "description": "Location of the earthquake, for example: 'California'",
                "required": false,
                "format": "free"
            }
        }
    }
]
[END OF AVAILABLE TOOLS]

[BEGIN OF FORMAT INSTRUCTION]
Your output should be in the JSON format, which specifies a list of function calls. The example format is as follows. Please make sure the parameter type is correct. If no function call is needed, please make tool_calls an empty list '[]'.
```{"thought": "the thought process, or an empty string", "tool_calls": [{"name": "api_name1", "arguments": {"argument1": "value1", "argument2": "value2"}}]}```
[END OF FORMAT INSTRUCTION]

[BEGIN OF QUERY]
User: Can you give me the latest information on the wildfires occurring in California?
[END OF QUERY]

[BEGIN OF HISTORY STEPS]
[
    {
        "thought": "Sure, what is the radius (in miles) around the location of the wildfire?",
        "tool_calls": [],
        "step_id": 1,
        "next_observation": "",
        "user_input": "User: Let me think... 50 miles."
    },
    {
        "thought": "",
        "tool_calls": [
            {
                "name": "get_fire_info",
                "arguments": {
                    "location": "California",
                    "radius": 50
                }
            }
        ],
        "step_id": 2,
        "next_observation": [
            {
                "location": "Los Angeles",
                "acres_burned": 1500,
                "status": "contained"
            },
            {
                "location": "San Diego",
                "acres_burned": 12000,
                "status": "active"
            }
        ]
    },
    {
        "thought": "Based on the latest information, there are wildfires in Los Angeles and San Diego. The wildfire in Los Angeles has burned 1,500 acres and is contained, while the wildfire in San Diego has burned 12,000 acres and is still active.",
        "tool_calls": [],
        "step_id": 3,
        "next_observation": "",
        "user_input": "User: Can you tell me about the latest earthquake?"
    }
]

[END OF HISTORY STEPS]

Output

{"thought": "", "tool_calls": [{"name": "get_earthquake_info", "arguments": {"location": "California"}}]}
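
When debugging prompts like the one above, it can be handy to split the text back into its bracketed sections. The following is an illustrative sketch: `split_prompt_sections` is a hypothetical helper, and it assumes every section is delimited by matching `[BEGIN OF X]` and `[END OF X]` markers with upper-case names.

```python
import re

# Matches "[BEGIN OF NAME]" ... "[END OF NAME]" and captures the text in between.
SECTION_PATTERN = re.compile(
    r"\[BEGIN OF (?P<name>[A-Z ]+)\]\n(?P<body>.*?)\n?\[END OF (?P=name)\]",
    re.DOTALL,
)


def split_prompt_sections(prompt: str) -> dict:
    """Map each section name (for example 'QUERY') to the text between its markers."""
    return {
        match.group("name"): match.group("body").strip()
        for match in SECTION_PATTERN.finditer(prompt)
    }


example = (
    "[BEGIN OF TASK INSTRUCTION]\nGenerate an API request.\n[END OF TASK INSTRUCTION]\n\n"
    "[BEGIN OF QUERY]\nUser: Can you tell me about the latest earthquake?\n[END OF QUERY]\n"
)
sections = split_prompt_sections(example)
print(sections["QUERY"])
```

The `AVAILABLE TOOLS` and `HISTORY STEPS` sections hold JSON, so their extracted text can be passed to `json.loads` to inspect exactly what the model was shown.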

📚 Documentation

Model series

Model	Total params	Context length	Release date	Category	Model download	GGUF files download
xLAM-7b-r	7.24B	32k	Sep. 5, 2024	General, Function-calling	🤗 Link	--
xLAM-8x7b-r	46.7B	32k	Sep. 5, 2024	General, Function-calling	🤗 Link	--
xLAM-8x22b-r	141B	64k	Sep. 5, 2024	General, Function-calling	🤗 Link	--
xLAM-1b-fc-r	1.35B	16k	Jul. 17, 2024	Function-calling	🤗 Link	🤗 Link
xLAM-7b-fc-r	6.91B	4k	Jul. 17, 2024	Function-calling	🤗 Link	🤗 Link
xLAM-v0.1-r	46.7B	32k	Mar. 18, 2024	General, Function-calling	🤗 Link	--

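The Function-calling series is documented in more detail separately and also provides quantized GGUF files for efficient deployment and execution. GGUF is a file format designed to store and load large language models efficiently. It is well suited to running AI models on local devices with limited resources, which enables offline use and improves privacy.

For more details, see the GitHub repository and the paper.

Latest examples of interacting with xLAM

Up-to-date examples and the tokenizer for interacting with xLAM models are provided alongside the project.

Repository overview

This repository covers the general tool-use series. For the more specialized Function-calling models, see the fc series.

This guide walks through setup, usage, and Hugging Face integration for the model series.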
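
As a small illustration of the GGUF format mentioned above: a GGUF file begins with the four ASCII bytes `GGUF` followed by a 32-bit version number. The sketch below inspects only that header and assumes the common little-endian layout; `read_gguf_version` is a hypothetical helper name, and it does not load or validate the model itself.

```python
import struct


def read_gguf_version(path: str) -> int:
    """Return the GGUF format version, or raise ValueError if the file is not GGUF."""
    with open(path, "rb") as handle:
        header = handle.read(8)  # 4-byte magic followed by a 4-byte version
    if len(header) < 8 or header[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")
    (version,) = struct.unpack("<I", header[4:8])  # little-endian unsigned 32-bit int
    return version
```

A check like this is a cheap way to catch a truncated or mislabeled download before handing the file to a GGUF-capable runtime.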

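🔧 Technical details

Benchmark results

Berkeley Function-Calling Leaderboard (BFCL)

Table 1: Performance comparison on the BFCL-v2 leaderboard (cutoff date: September 3, 2024). The rank is based on overall accuracy, which is a weighted average of the different evaluation categories. "FC" denotes function-calling mode, as opposed to using a customized "prompt" to extract the function calls.

Webshop and ToolQuery

Table 2: Test results on Webshop and ToolQuery. Bold and underlined results denote the best and second-best success rates, respectively.

Unified ToolQuery

Table 3: Test results on ToolQuery-Unified. Bold and underlined results denote the best and second-best success rates, respectively. Values in parentheses give the corresponding performance on ToolQuery.

ToolBench

Table 4: Pass rates on ToolBench in three distinct scenarios. Bold and underlined results denote the best and second-best results for each setting, respectively. Results for xLAM-8x22b-r are unavailable because the ToolBench server was down from July 28, 2024 until the evaluation cutoff date of September 3, 2024.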

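📄 License

This model is distributed under the CC-BY-NC-4.0 license.

Ethical considerations

This release is for research purposes only, in support of an academic paper. Our models, datasets, and code are not specifically designed or evaluated for all downstream purposes. We strongly recommend that users evaluate and address potential concerns about accuracy, safety, and fairness before deploying this model. We encourage users to consider the common limitations of AI, comply with applicable laws, and apply best practices when selecting use cases, particularly in high-risk scenarios where errors or misuse could significantly affect people's lives, rights, or safety. For further guidance on use cases, refer to our AUP and AI AUP.

Citation

If you find this repository helpful, please consider citing our paper.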

@article{zhang2024xlam,
  title={xLAM: A Family of Large Action Models to Empower AI Agent Systems},
  author={Zhang, Jianguo and Lan, Tian and Zhu, Ming and Liu, Zuxin and Hoang, Thai and Kokane, Shirley and Yao, Weiran and Tan, Juntao and Prabhakar, Akshara and Chen, Haolin and others},
  journal={arXiv preprint arXiv:2409.03215},
  year={2024}
}