# Self-RAG 7B Model
This 7B Self-RAG model can generate outputs for a wide range of user queries. It uses reflection tokens to adaptively call the retrieval system, and it can critique its own outputs and retrieved passages.
## Quick Start
Self-RAG is trained on our instruction-following corpora with interleaved passages and reflection tokens, using the standard next-token prediction objective. This enables efficient and stable learning with fine-grained feedback. At inference time, we use reflection tokens covering various aspects of the generation to sample the best output that aligns with users' preferences. For full details, refer to our paper.
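To make the interleaving concrete, here is a minimal, hypothetical sketch of what one such training sequence can look like. The reflection tokens follow the inventory described in our paper; the text itself is invented for illustration and is not taken from the training data.

```python
# Illustrative sketch of an interleaved sequence with reflection tokens.
# The instruction/response framing and <paragraph> tags follow the input
# format documented below; the concrete text is made up for illustration.
example_sequence = (
    "### Instruction:\nWhat is the capital of France?\n\n### Response:\n"
    "[Retrieval]<paragraph>Paris is the capital and largest city of France."
    "</paragraph>[Relevant]The capital of France is Paris."
    "[Fully supported][Utility:5]</s>"
)
```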
## Features
- Generates diverse outputs for user queries.
- Uses reflection tokens for adaptive retrieval and self-critique.
- Learns efficiently and stably from fine-grained feedback during training.
- Samples the best output according to user preferences at inference.
## Installation
Make sure to install the dependencies listed in `self-rag/requirements.txt`.
## Usage Examples
### Basic Usage
Here is a quick way to download our model from HuggingFace and run it with `vllm`, using pre-given passages.
```python
from vllm import LLM, SamplingParams

# Download the model from HuggingFace and load it with vLLM in half precision.
model = LLM("selfrag/selfrag_llama2_7b", download_dir="/gscratch/h2lab/akari/model_cache", dtype="half")
# Keep special tokens so that reflection tokens appear in the output.
sampling_params = SamplingParams(temperature=0.0, top_p=1.0, max_tokens=100, skip_special_tokens=False)

def format_prompt(input, paragraph=None):
    prompt = "### Instruction:\n{0}\n\n### Response:\n".format(input)
    if paragraph is not None:
        prompt += "[Retrieval]<paragraph>{0}</paragraph>".format(paragraph)
    return prompt

query_1 = "Leave odd one out: twitter, instagram, whatsapp."
query_2 = "Can you tell me the difference between llamas and alpacas?"
queries = [query_1, query_2]

# Queries without a passage; the model decides whether to retrieve.
preds = model.generate([format_prompt(query) for query in queries], sampling_params)
for pred in preds:
    print("Model prediction: {0}".format(pred.outputs[0].text))
```
To ground the generation in a specific passage, pass it via the `paragraph` argument:

```python
prompt = format_prompt("Can you tell me the difference between llamas and alpacas?", paragraph="The alpaca (Lama pacos) is a species of South American camelid mammal. It is similar to, and often confused with, the llama. Alpacas are considerably smaller than llamas, and unlike llamas, they were not bred to be working animals, but were bred specifically for their fiber.")
preds = model.generate([prompt], sampling_params)
print([pred.outputs[0].text for pred in preds])
```
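Because `skip_special_tokens=False`, the predictions above include reflection tokens (e.g., `[No Retrieval]`, `[Relevant]`, `[Utility:5]`) alongside the answer text. If you only want the plain answer, a small post-processing helper can strip them. The sketch below is not part of the released code and assumes the reflection-token inventory described in our paper:

```python
import re

# Reflection tokens used by Self-RAG, per the paper (assumed inventory).
REFLECTION_TOKEN_PATTERN = re.compile(
    r"\[No Retrieval\]|\[Retrieval\]|\[Continue to Use Evidence\]|"
    r"\[Relevant\]|\[Irrelevant\]|\[Fully supported\]|\[Partially supported\]|"
    r"\[No support / Contradictory\]|\[Utility:[1-5]\]"
)

def strip_reflection_tokens(text):
    """Remove reflection tokens and the end-of-sequence marker from an output."""
    return REFLECTION_TOKEN_PATTERN.sub("", text).replace("</s>", "").strip()

# Example: strip_reflection_tokens(pred.outputs[0].text)
```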
### Advanced Usage
To run our full inference pipeline with a retrieval system and fine-grained tree decoding, please use our code.
## Documentation
### Input Format
As described in the `format_prompt` function, your input should follow one of these formats:

```
### Instruction:\n{instruction}\n\n### Response:\n
```

or, if you have an additional input:

```
### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n
```

You can insert paragraphs anywhere after `### Response:\n`, but make sure to mark them with paragraph tokens (i.e., `<paragraph>{0}</paragraph>`).
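For reference, a hypothetical helper (not part of the released code) covering both formats, with an optional retrieved paragraph, might look like this:

```python
def build_prompt(instruction, input=None, paragraph=None):
    # Hypothetical helper covering both documented prompt formats.
    if input is not None:
        prompt = "### Instruction:\n{0}\n\n### Input:\n{1}\n\n### Response:\n".format(instruction, input)
    else:
        prompt = "### Instruction:\n{0}\n\n### Response:\n".format(instruction)
    if paragraph is not None:
        # Retrieved passages go after "### Response:\n", wrapped in paragraph tokens.
        prompt += "[Retrieval]<paragraph>{0}</paragraph>".format(paragraph)
    return prompt
```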
### Training details
Our training data is available as the HuggingFace dataset `selfrag_train_data`. For detailed training information, refer to our official repository. We used 8 A100 40GB GPUs for training on the Stability HPC server.
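To inspect the data, you can load it with the `datasets` library. A minimal sketch, assuming the dataset's Hub ID is `selfrag/selfrag_train_data`:

```python
from datasets import load_dataset

# Hub ID assumed to be "selfrag/selfrag_train_data"; adjust if it differs.
train_data = load_dataset("selfrag/selfrag_train_data", split="train")
print(train_data[0])  # one instruction-following example with reflection tokens
```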
## License
This project is licensed under the MIT License.
## Citation and contact
If you use this model, please cite our work:
```bibtex
@article{asai2023selfrag,
  author  = {Asai, Akari and Wu, Zeqiu and Wang, Yizhong and Sil, Avirup and Hajishirzi, Hannaneh},
  title   = {{Self-RAG}: Learning to Retrieve, Generate, and Critique through Self-Reflection},
  year    = {2023},
  journal = {arXiv preprint arXiv:2310.11511},
  url     = {https://arxiv.org/abs/2310.11511}
}
```