🚀 Model Card for the NQ Question Encoder in Re2G
This model encodes questions into vectors that serve as query vectors against an approximate nearest-neighbor index. Used together with a context encoder that encodes and indexes passages, it performs well on information retrieval and question-answering tasks.
🚀 Quick Start
Code for training, evaluation, and inference is available in the re2g branch of our GitHub repository. The best way to use the model is to adapt dpr_apply.py.
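As a minimal sketch of using the encoder directly with Hugging Face transformers, the snippet below loads the question encoder and produces a query vector for a single question. The model identifier `ibm/re2g-qry-encoder-nq` is an assumption for illustration; replace it with the actual repository name of this model.

```python
import torch
from transformers import DPRQuestionEncoder, DPRQuestionEncoderTokenizer

# Hypothetical model id -- replace with the actual repository name of this encoder.
model_id = "ibm/re2g-qry-encoder-nq"

tokenizer = DPRQuestionEncoderTokenizer.from_pretrained(model_id)
encoder = DPRQuestionEncoder.from_pretrained(model_id)
encoder.eval()

inputs = tokenizer("Who wrote the opera Carmen?", return_tensors="pt")
with torch.no_grad():
    # pooler_output is the fixed-size question embedding used as the ANN query vector.
    query_vector = encoder(**inputs).pooler_output  # shape: (1, hidden_size)

print(query_vector.shape)
```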
✨ Key Features
The approach of RAG, Multi-DPR, and KGI is to train a neural IR (information retrieval) component and to further train it end-to-end through its impact on generating the correct output.

📚 Documentation
Model Details
The approach of RAG, Multi-DPR, and KGI is to train a neural IR (information retrieval) component and to further train it end-to-end through its impact on generating the correct output.
Training, Evaluation, and Inference
Code for training, evaluation, and inference is in the re2g branch of our GitHub repository.
Usage
The best way to use the model is to adapt dpr_apply.py.
Uses
Direct Use
The model can be used to encode questions into vectors that serve as query vectors against an approximate nearest-neighbor index. It must be combined with a context encoder that encodes passages into vectors and indexes them.
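To illustrate how the two encoders fit together, the sketch below encodes a few passages with a DPR context encoder, encodes a question with this model, and ranks passages by inner product as a stand-in for an approximate nearest-neighbor lookup. The model identifiers are assumptions; for a full pipeline, adapt dpr_apply.py as described above.

```python
import torch
from transformers import (
    DPRContextEncoder, DPRContextEncoderTokenizer,
    DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
)

# Hypothetical model ids -- replace with the actual question/context encoder repositories.
qry_id = "ibm/re2g-qry-encoder-nq"
ctx_id = "ibm/re2g-ctx-encoder-nq"

qry_tok = DPRQuestionEncoderTokenizer.from_pretrained(qry_id)
qry_enc = DPRQuestionEncoder.from_pretrained(qry_id).eval()
ctx_tok = DPRContextEncoderTokenizer.from_pretrained(ctx_id)
ctx_enc = DPRContextEncoder.from_pretrained(ctx_id).eval()

passages = [
    "Georges Bizet composed the opera Carmen, first performed in 1875.",
    "The Eiffel Tower is a wrought-iron lattice tower in Paris.",
]

with torch.no_grad():
    # Encode passages; in a real deployment these vectors populate an ANN index (e.g., FAISS).
    ctx_inputs = ctx_tok(passages, padding=True, truncation=True, return_tensors="pt")
    passage_vectors = ctx_enc(**ctx_inputs).pooler_output   # (num_passages, hidden_size)

    # Encode the question with this model.
    q_inputs = qry_tok("Who wrote the opera Carmen?", return_tensors="pt")
    query_vector = qry_enc(**q_inputs).pooler_output         # (1, hidden_size)

# Inner-product scoring stands in for the approximate nearest-neighbor search.
scores = query_vector @ passage_vectors.T
print(passages[scores.argmax().item()])
```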
Model Description
The model creators note in the associated paper: "As demonstrated by GPT-3 and T5, transformers grow in capability as parameter spaces become larger and larger. However, for tasks that require a large amount of knowledge, non-parametric memory allows models to grow dramatically with a sub-linear increase in computational cost and GPU memory requirements. Recent models such as RAG and REALM have introduced retrieval into conditional generation. These models incorporate neural initial retrieval from a corpus of passages. We build on this line of research, proposing Re2G, which combines both neural initial retrieval and reranking into a BART-based sequence-to-sequence generation. Our reranking approach also permits merging retrieval results from sources with incomparable scores, enabling an ensemble of BM25 and neural initial retrieval. To train our system end-to-end, we introduce a novel variation of knowledge distillation to train the initial retrieval, reranker, and generation using only ground truth on the target sequence output. We find large gains in four diverse tasks: zero-shot slot filling, question answering, fact checking, and dialog, with relative gains of 9% to 34% over the previous state-of-the-art on the KILT leaderboard. We make our code available as open source."
Citation
@inproceedings{glass-etal-2022-re2g,
title = "{R}e2{G}: Retrieve, Rerank, Generate",
author = "Glass, Michael and
Rossiello, Gaetano and
Chowdhury, Md Faisal Mahbub and
Naik, Ankita and
Cai, Pengshan and
Gliozzo, Alfio",
booktitle = "Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
month = jul,
year = "2022",
address = "Seattle, United States",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.naacl-main.194",
doi = "10.18653/v1/2022.naacl-main.194",
pages = "2701--2715",
abstract = "As demonstrated by GPT-3 and T5, transformers grow in capability as parameter spaces become larger and larger. However, for tasks that require a large amount of knowledge, non-parametric memory allows models to grow dramatically with a sub-linear increase in computational cost and GPU memory requirements. Recent models such as RAG and REALM have introduced retrieval into conditional generation. These models incorporate neural initial retrieval from a corpus of passages. We build on this line of research, proposing Re2G, which combines both neural initial retrieval and reranking into a BART-based sequence-to-sequence generation. Our reranking approach also permits merging retrieval results from sources with incomparable scores, enabling an ensemble of BM25 and neural initial retrieval. To train our system end-to-end, we introduce a novel variation of knowledge distillation to train the initial retrieval, reranker and generation using only ground truth on the target sequence output. We find large gains in four diverse tasks: zero-shot slot filling, question answering, fact checking and dialog, with relative gains of 9{\%} to 34{\%} over the previous state-of-the-art on the KILT leaderboard. We make our code available as open source.",
}
📄 License
This model is released under the Apache 2.0 license.