Black-Box Prompt Optimization: Aligning Large Language Models without Model Training
Black-Box Prompt Optimization (BPO) aligns large language models (LLMs) without training them. Instead of updating the model, it optimizes the user's input so that the unchanged LLM produces a more helpful and harmless response.
Quick Start
BPO is a black-box alignment technique, distinct from training-based methods such as PPO or DPO. It requires training only a plug-and-play prompt optimization model and improves LLM outputs by optimizing user inputs. It can therefore be applied to a variety of open-source or API-based LLMs.
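Because the target LLM is treated as a black box, the overall flow can be sketched as two calls: one to the prompt optimizer and one to the unchanged target model. The minimal Python sketch below illustrates that flow. The names `bpo_pipeline`, `toy_optimizer`, and `toy_llm` are hypothetical and used only for illustration; they are not part of the BPO codebase. In practice the two callables would wrap the trained optimizer and a local or API-based LLM.

```python
from typing import Callable


def bpo_pipeline(
    user_prompt: str,
    optimize: Callable[[str], str],
    generate: Callable[[str], str],
) -> str:
    """Rewrite the prompt with the optimizer, then query the untouched target LLM."""
    optimized_prompt = optimize(user_prompt)
    return generate(optimized_prompt)


# Stand-in callables for demonstration only (assumptions, not real models).
def toy_optimizer(prompt: str) -> str:
    return prompt + " Please give a detailed, accurate answer."


def toy_llm(prompt: str) -> str:
    return f"[response to: {prompt}]"


result = bpo_pipeline("Tell me about Harry Potter", toy_optimizer, toy_llm)
print(result)
```

Keeping the optimizer and the target model behind separate callables reflects the plug-and-play idea: the same optimizer can sit in front of different LLMs without changing either side.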
⨠Features
BPO offers a novel approach to aligning large language models without the need for traditional model training, making it applicable to a wide range of LLMs.
Documentation
Model Details
Data
The Prompt Optimization Model is trained on prompt optimization pairs which contain human preference features. Detailed information on the dataset can be found here.
Backbone Model
The prompt preference optimizer is built on Llama-2-7b-chat-hf.
Language
English
Performance
| Model A | Model B | A win (%) | Tie (%) | B win (%) |
|---|---|---|---|---|
| gpt-3.5-turbo + BPO | gpt-3.5-turbo | 60.0 | 8.7 | 31.3 |
| claude-2 + BPO | claude-2 | 57.5 | 5.0 | 37.5 |
| llama-2-13b-chat + BPO | llama-2-70b-chat | 61.3 | 0.0 | 38.7 |
| vicuna-13b + BPO | vicuna-13b + PPO | 52.5 | 3.7 | 43.7 |
| vicuna-13b + BPO | vicuna-13b + DPO | 53.8 | 2.5 | 43.7 |
| vicuna-13b + DPO + BPO | vicuna-13b + DPO | 60.0 | 2.5 | 37.5 |
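One way to read these results is the net win rate, that is, the A-win percentage minus the B-win percentage for each pairing. The short Python sketch below computes it from the figures reported above. The helper name `net_win_rate` is an illustrative assumption, and the metric itself is a convenience summary rather than something reported in the original results.

```python
# (Model A, Model B, A win %, tie %, B win %), copied from the results above.
results = [
    ("gpt-3.5-turbo + BPO", "gpt-3.5-turbo", 60.0, 8.7, 31.3),
    ("claude-2 + BPO", "claude-2", 57.5, 5.0, 37.5),
    ("llama-2-13b-chat + BPO", "llama-2-70b-chat", 61.3, 0.0, 38.7),
    ("vicuna-13b + BPO", "vicuna-13b + PPO", 52.5, 3.7, 43.7),
    ("vicuna-13b + BPO", "vicuna-13b + DPO", 53.8, 2.5, 43.7),
    ("vicuna-13b + DPO + BPO", "vicuna-13b + DPO", 60.0, 2.5, 37.5),
]


def net_win_rate(a_win: float, b_win: float) -> float:
    """Return A's win percentage minus B's, rounded to one decimal place."""
    return round(a_win - b_win, 1)


for model_a, model_b, a_win, tie, b_win in results:
    print(f"{model_a} vs {model_b}: {net_win_rate(a_win, b_win):+.1f} points")
```

A positive value means Model A was preferred more often than Model B in that comparison.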
Usage Examples
Basic Usage
We adopt the following prompt template:
[INST] You are an expert prompt engineer. Please help me improve this prompt to get a more helpful and harmless response:\n{user prompt} [/INST]
Advanced Usage
Here is example code for inference:
from transformers import AutoModelForCausalLM, AutoTokenizer
# Path to the BPO prompt optimizer checkpoint
model_path = 'Your-Model-Path'
prompt_template = "[INST] You are an expert prompt engineer. Please help me improve this prompt to get a more helpful and harmless response:\n{} [/INST]"
# Load the optimizer and its tokenizer (a CUDA GPU is assumed)
model = AutoModelForCausalLM.from_pretrained(model_path).cuda()
tokenizer = AutoTokenizer.from_pretrained(model_path)
# Wrap the user prompt in the template and move the inputs to the model's device
text = 'Tell me about Harry Potter'
prompt = prompt_template.format(text)
model_inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# Sample the optimized prompt
output = model.generate(**model_inputs, max_new_tokens=1024, do_sample=True, top_p=0.9, temperature=0.6, num_beams=1)
# The decoded text includes the input, so keep only what follows the [/INST] marker
resp = tokenizer.decode(output[0], skip_special_tokens=True).split('[/INST]')[1].strip()
print(resp)
See our GitHub repo for more detailed usage (e.g., more aggressive optimization).
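The template formatting and the post-processing step in the inference example can also be factored into small helpers that are testable without loading a model. The sketch below is one possible arrangement. The function names `build_optimizer_input` and `extract_optimized_prompt` are hypothetical, and splitting on the last `[/INST]` marker (rather than the first, as in the example above) is an assumption made here so that a user prompt containing the marker does not break extraction.

```python
PROMPT_TEMPLATE = (
    "[INST] You are an expert prompt engineer. Please help me improve this prompt "
    "to get a more helpful and harmless response:\n{} [/INST]"
)


def build_optimizer_input(user_prompt: str) -> str:
    """Wrap a raw user prompt in the optimizer's instruction template."""
    return PROMPT_TEMPLATE.format(user_prompt)


def extract_optimized_prompt(decoded: str) -> str:
    """Return the text after the last [/INST] marker in the decoded output."""
    marker = "[/INST]"
    if marker not in decoded:
        raise ValueError("decoded text does not contain the [/INST] marker")
    return decoded.rsplit(marker, 1)[1].strip()


# Simulated decoded output: the input prompt followed by generated text.
decoded = build_optimizer_input("Tell me about Harry Potter") + " Provide a detailed overview."
print(extract_optimized_prompt(decoded))
```

If the generated text itself could contain `[/INST]`, this string-based approach would truncate it; slicing the generated token IDs past the input length before decoding is a common alternative.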
Other Known Limitations
Important Note
- Task coverage is limited: we used only open-source data, which yielded about 14k optimized prompts. This cannot cover the full range of user queries, so the current model may not perform well on every prompt.
- Because long-context tasks and mathematical problems make up only a small share of the training data, the prompt optimizer underperforms on these tasks.
Citation
If you find our model useful in your work, please cite it with:
@article{cheng2023black,
title={Black-Box Prompt Optimization: Aligning Large Language Models without Model Training},
author={Cheng, Jiale and Liu, Xiao and Zheng, Kehan and Ke, Pei and Wang, Hongning and Dong, Yuxiao and Tang, Jie and Huang, Minlie},
journal={arXiv preprint arXiv:2311.04155},
year={2023}
}