BPO Open-Source Technology - Improve Model Output Quality by Optimizing Prompts without Training Large Models

BPO

Developed by THUDM

BPO is a black-box alignment technique that improves model output quality by optimizing user input prompts, without training the target model.

Large Language Model

Transformers

English

#Prompt Optimization #Training-Free Alignment #Multi-Model Compatibility

Downloads 155

Release Time: 11/20/2023

Model Overview

BPO is a black-box alignment technique that differs from training-based methods. It only requires training a plug-and-play prompt optimization model that rewrites user input, so it can be applied to various open-source or API-based large language models.

Model Features

No Model Training Required

Improves large language model outputs solely by optimizing user input prompts, without training the base model.

Broad Applicability

Applicable to various open-source or API-based large language models, including GPT-3.5, Claude-2, etc.

Significant Performance Improvement

Experiments show it improves output quality across multiple models: in the reported pairwise comparisons, the BPO side wins between 52.5% and 61.3% of the time.

Model Capabilities

Prompt Optimization

Large Language Model Alignment

Text Generation Improvement

Use Cases

Large Language Model Applications

GPT-3.5 Output Optimization

Using BPO to optimize GPT-3.5 input prompts for superior outputs

Achieves a 60% win rate compared to original GPT-3.5

Claude-2 Output Enhancement

Optimizing Claude-2 input prompts via BPO

Post-optimization win rate reaches 57.5%

🚀 Black-Box Prompt Optimization: Aligning Large Language Models without Model Training

Black-Box Prompt Optimization (BPO) aligns large language models without training them: instead of changing the model, it optimizes the user's input prompt.

🚀 Quick Start

BPO is a black-box alignment technique, distinct from training-based methods (such as PPO or DPO). It only requires training a plug-and-play model, and it improves LLM outputs by optimizing user inputs. It can therefore be applied to various open-source or API-based LLMs.

Repository: https://github.com/thu-coai/BPO
Paper: https://arxiv.org/abs/2311.04155
Data: https://huggingface.co/datasets/THUDM/BPO
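
The plug-and-play flow described above can be sketched as a thin wrapper around any text-in, text-out model. This is a minimal illustration, not the authors' implementation: `answer_with_bpo`, `toy_optimizer`, and `toy_llm` are hypothetical names, and the fallback to the original prompt is an assumption added for robustness.

```python
from typing import Callable


def answer_with_bpo(
    user_prompt: str,
    optimize_prompt: Callable[[str], str],
    target_llm: Callable[[str], str],
) -> str:
    """Rewrite the prompt first, then query the unchanged target LLM."""
    optimized = optimize_prompt(user_prompt)
    # Assumption: if the optimizer returns nothing usable, keep the original prompt.
    if not optimized.strip():
        optimized = user_prompt
    return target_llm(optimized)


# Hypothetical stand-ins so the sketch runs without loading any model.
def toy_optimizer(prompt: str) -> str:
    return prompt + " Please give a detailed, accurate answer."


def toy_llm(prompt: str) -> str:
    return f"<response to: {prompt}>"


print(answer_with_bpo("Tell me about Harry Potter", toy_optimizer, toy_llm))
```

Because the target model is only ever called through `target_llm`, the same wrapper works whether that callable wraps a local open-source model or a remote API.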

✨ Features

BPO offers a novel approach to aligning large language models without the need for traditional model training, making it applicable to a wide range of LLMs.

📚 Documentation

📦 Model Details

Data

The prompt optimization model is trained on prompt optimization pairs that contain human preference features. Detailed information on the dataset can be found at https://huggingface.co/datasets/THUDM/BPO.

Backbone Model

The prompt preference optimizer is built on Llama-2-7b-chat-hf.

Language

English

Performance

Model A	Model B	A win (%)	Tie (%)	B win (%)
gpt-3.5-turbo + BPO	gpt-3.5-turbo	60.0	8.7	31.3
claude-2 + BPO	claude-2	57.5	5.0	37.5
llama-2-13b-chat + BPO	llama-2-70b-chat	61.3	0.0	38.7
vicuna-13b + BPO	vicuna-13b + PPO	52.5	3.7	43.7
vicuna-13b + BPO	vicuna-13b + DPO	53.8	2.5	43.7
vicuna-13b + DPO + BPO	vicuna-13b + DPO	60.0	2.5	37.5
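
To make the table easier to compare row by row, the short sketch below computes the margin between the two sides (A win minus B win, in percentage points). The numbers are copied from the table; the `net_margin` helper and the margin metric itself are illustrative additions, not something the source reports.

```python
# (model A, model B, A win %, tie %, B win %), copied from the table above
ROWS = [
    ("gpt-3.5-turbo + BPO", "gpt-3.5-turbo", 60.0, 8.7, 31.3),
    ("claude-2 + BPO", "claude-2", 57.5, 5.0, 37.5),
    ("llama-2-13b-chat + BPO", "llama-2-70b-chat", 61.3, 0.0, 38.7),
    ("vicuna-13b + BPO", "vicuna-13b + PPO", 52.5, 3.7, 43.7),
    ("vicuna-13b + BPO", "vicuna-13b + DPO", 53.8, 2.5, 43.7),
    ("vicuna-13b + DPO + BPO", "vicuna-13b + DPO", 60.0, 2.5, 37.5),
]


def net_margin(a_win: float, b_win: float) -> float:
    """Percentage-point gap between A's and B's win rates."""
    return round(a_win - b_win, 1)


for model_a, model_b, a_win, tie, b_win in ROWS:
    # Each row should sum to roughly 100%, allowing for rounding in the source.
    assert abs(a_win + tie + b_win - 100.0) < 0.15
    print(f"{model_a} vs {model_b}: {net_margin(a_win, b_win):+.1f} points")
```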

💻 Usage Examples

Basic Usage

We adopt the following prompt template, where {user prompt} is replaced with the prompt to be optimized:

[INST] You are an expert prompt engineer. Please help me improve this prompt to get a more helpful and harmless response:\n{user prompt} [/INST]
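
As a small illustration of how the template is filled in, the sketch below wraps a user prompt before it is sent to the optimizer. The helper name `build_optimizer_input` is hypothetical, and stripping surrounding whitespace is an assumption rather than a documented requirement.

```python
PROMPT_TEMPLATE = (
    "[INST] You are an expert prompt engineer. Please help me improve this prompt "
    "to get a more helpful and harmless response:\n{} [/INST]"
)


def build_optimizer_input(user_prompt: str) -> str:
    """Insert the user's prompt into the BPO instruction template."""
    # The user text is passed as a format argument, so braces inside it are kept verbatim.
    return PROMPT_TEMPLATE.format(user_prompt.strip())


print(build_optimizer_input("Tell me about Harry Potter"))
```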

Advanced Usage

Here is example code for inference:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Local path or Hub ID of the BPO prompt optimizer checkpoint
model_path = 'Your-Model-Path'

prompt_template = "[INST] You are an expert prompt engineer. Please help me improve this prompt to get a more helpful and harmless response:\n{} [/INST]"

# Load the optimizer and its tokenizer (this example assumes a CUDA GPU is available)
model = AutoModelForCausalLM.from_pretrained(model_path).cuda()
tokenizer = AutoTokenizer.from_pretrained(model_path)

text = 'Tell me about Harry Potter'

prompt = prompt_template.format(text)
# Put the inputs on the same device as the model
model_inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Sample the optimized prompt
output = model.generate(**model_inputs, max_new_tokens=1024, do_sample=True, top_p=0.9, temperature=0.6, num_beams=1)
# Keep only the text after the instruction block, i.e. the optimized prompt
resp = tokenizer.decode(output[0], skip_special_tokens=True).split('[/INST]')[1].strip()

print(resp)
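
The inference example isolates the optimizer's reply by splitting the decoded text on `[/INST]` and taking the second piece, which returns the wrong span if the user's own prompt happens to contain that marker. One possible variant, shown here as a sketch with a hypothetical helper name, splits on the last occurrence instead and fails loudly when the marker is missing. The trade-off is an assumption to weigh: this version would misbehave if the generated text itself contained `[/INST]`.

```python
def extract_optimized_prompt(decoded: str, marker: str = "[/INST]") -> str:
    """Return the text that follows the last instruction marker."""
    _, sep, tail = decoded.rpartition(marker)
    if not sep:
        # No marker found: the decoded text does not follow the expected template.
        raise ValueError(f"marker {marker!r} not found in decoded output")
    return tail.strip()


decoded_example = "[INST] ...improve this prompt:\nTell me about Harry Potter [/INST] Give an overview of the Harry Potter series. "
print(extract_optimized_prompt(decoded_example))
```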

See our GitHub repository (https://github.com/thu-coai/BPO) for more detailed usage (e.g., more aggressive optimization).

Other Known Limitations

⚠️ Important Note

Task coverage is limited: we used only open-source data and obtained about 14k optimized prompts. This cannot cover the full range of user queries, so the current model may not perform well on every prompt.

Because long-context tasks and mathematical problems make up only a small share of the data, the prompt optimizer underperforms on these tasks.

📚 Citation

If you find our model useful in your work, please cite it with:

@article{cheng2023black,
  title={Black-Box Prompt Optimization: Aligning Large Language Models without Model Training},
  author={Cheng, Jiale and Liu, Xiao and Zheng, Kehan and Ke, Pei and Wang, Hongning and Dong, Yuxiao and Tang, Jie and Huang, Minlie},
  journal={arXiv preprint arXiv:2311.04155},
  year={2023}
}
