Executable Code Actions Elicit Better LLM Agents
This project proposes using executable Python code to consolidate LLM agents' actions into a unified action space (CodeAct). Integrated with a Python interpreter, CodeAct can execute code actions and dynamically revise prior actions or emit new actions based on new observations.
Quick Start
The project provides several resources to explore, detailed in the sections below: the CodeAct framework, the CodeActInstruct dataset, and the CodeActAgent models.
Features
CodeAct Concept
We propose to use executable Python code to consolidate LLM agents' actions into a unified action space (CodeAct). Integrated with a Python interpreter, CodeAct can execute code actions and dynamically revise prior actions or emit new actions upon new observations (e.g., code execution results) through multi-turn interactions.
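For intuition, below is a minimal, self-contained sketch of such an interaction loop. Everything in it is illustrative rather than the project's actual implementation: `query_llm` is a hypothetical stand-in for a real model call, and a production agent would sandbox code execution instead of calling `exec` in-process.

```python
import contextlib
import io

def query_llm(history):
    # Hypothetical stand-in: a real agent prompts an LLM with the full
    # interaction history and receives a Python code action back.
    return "print(sum(range(1, 11)))"

def execute_action(code, namespace):
    """Run one code action and capture its stdout as the observation."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)  # sandbox this in any real deployment
        return buffer.getvalue() or "(no output)"
    except Exception as exc:
        # Errors are returned as observations so the LLM can self-correct.
        return f"Error: {exc}"

history, namespace = [], {}
for turn in range(3):  # multi-turn interaction budget
    action = query_llm(history)
    observation = execute_action(action, namespace)
    history.append({"action": action, "observation": observation})
```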

Why CodeAct?
Our extensive analysis of 17 LLMs on API-Bank and a newly curated benchmark, M3ToolEval, shows that CodeAct outperforms widely used alternatives such as Text and JSON actions (up to a 20% higher success rate).
Figure: Comparison between CodeAct and Text/JSON as action formats.
Figure: Quantitative results comparing CodeAct and {Text, JSON} on M3ToolEval.
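To make the comparison concrete, consider one illustrative turn. The tool names below are invented for this sketch and are not drawn from M3ToolEval: a JSON action encodes a single structured call per turn, while a CodeAct action can compose calls, branch, and reuse intermediate results within the same turn.

```python
# Hypothetical tool stubs, defined here so the sketch runs standalone.
def search(query):
    return [f"result for {query!r}"]

def summarize(items):
    return " | ".join(items)

# A JSON-style agent emits one structured call per turn, e.g.:
#   {"tool": "search", "arguments": {"query": "weather in Paris"}}
# and must wait for the result before deciding its next call.
#
# An equivalent CodeAct action composes tool calls, control flow, and
# variable reuse in a single turn:
results = search("weather in Paris")
if not results:
    results = search("Paris forecast")   # fallback without an extra turn
print(summarize(results[:3]))            # printed output becomes the observation
```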
CodeActInstruct
We collect an instruction-tuning dataset, CodeActInstruct, consisting of 7k multi-turn interactions using CodeAct. The dataset is released as a Hugging Face dataset 🤗.
Dataset statistics. Token statistics are computed using the Llama-2 tokenizer.
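A minimal loading sketch with the `datasets` library, using the Hub ID from this card's metadata; the `"train"` split name and record fields are assumptions to verify against the dataset card.

```python
# Requires `pip install datasets`. The Hub ID comes from this card's
# metadata; the "train" split name is an assumption.
from datasets import load_dataset

ds = load_dataset("xingyaoww/code-act", split="train")
print(ds)        # number of rows and column names
print(ds[0])     # one multi-turn CodeAct interaction
```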
CodeActAgent
Trained on CodeActInstruct and general conversations, CodeActAgent excels at out-of-domain agent tasks compared to open-source models of the same size, while not sacrificing generic performance (e.g., knowledge, dialog). We release two variants of CodeActAgent (a usage sketch follows the list):
- CodeActAgent-Mistral-7b-v0.1 (recommended, model link): uses Mistral-7b-v0.1 as the base model with a 32k context window.
- CodeActAgent-Llama-7b (model link): uses Llama-2-7b as the base model with a 4k context window.
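As a sketch of basic usage with `transformers`, assuming the checkpoint is hosted on the Hugging Face Hub under the repository ID below (an assumption; confirm it against the model links above):

```python
# Requires `pip install transformers accelerate`. The repository ID is
# an assumption, not confirmed by this card; check the model link above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "xingyaoww/CodeActAgent-Mistral-7b-v0.1"  # assumed Hub ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Write Python code that prints the 10th Fibonacci number."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```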
Evaluation results for CodeActAgent. ID and OD stand for in-domain and out-of-domain evaluation, respectively. Overall averaged performance normalizes the MT-Bench score to be consistent with other tasks and excludes in-domain tasks for a fair comparison.
Documentation
Please check out our paper and code for more details about data collection, model training, and evaluation.
License
The project is released under the Apache-2.0 license.
Citation
@misc{wang2024executable,
title={Executable Code Actions Elicit Better LLM Agents},
author={Xingyao Wang and Yangyi Chen and Lifan Yuan and Yizhe Zhang and Yunzhu Li and Hao Peng and Heng Ji},
year={2024},
eprint={2402.01030},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
| Property | Details |
| --- | --- |
| Pipeline Tag | Text Generation |
| Tags | LLM-Agent |
| License | Apache-2.0 |
| Datasets | xingyaoww/code-act |