Dragon Multiturn Query Encoder
Model Overview
This model is built on the Dragon retriever and uses a dual-encoder architecture, with a query encoder and a context encoder, designed for multi-turn dialogue scenarios.
Model Features
- Multi-turn dialogue support: handles conversational queries that include the dialogue history, making it well suited to multi-turn question answering.
- Effective retrieval: performs strongly on multiple benchmarks, with clear gains in average top-1 and top-5 recall.
- Dual-encoder architecture: separates the query encoder from the context encoder, so context embeddings can be precomputed for efficient retrieval.
Model Capabilities
- Conversational query processing
- Multi-turn dialogue understanding
- Context-aware retrieval
- Efficient information matching
Use Cases
- Customer service / Social Security consultation: handle multi-turn user enquiries about Social Security benefits and accurately retrieve the relevant benefit policy information.
- Intelligent assistants / multi-turn dialogue systems: provide assistants with context-aware retrieval that improves dialogue coherence and accuracy.
🚀 Dragon-multiturn: Multi-turn Dialogue Retriever
Dragon-multiturn is a retriever designed specifically for conversational question answering; it handles conversational queries that combine the dialogue history with the current query. Built on the Dragon retriever, it substantially improves retrieval for multi-turn dialogue QA.
🚀 Quick Start
Model Introduction
We introduce Dragon-multiturn, a retriever specifically designed for conversational QA. It handles conversational queries that combine the dialogue history with the current query, and is built on top of the Dragon retriever. More details on Dragon-multiturn can be found here. Note that Dragon-multiturn is a dual encoder consisting of a query encoder and a context encoder. This repository holds only the Dragon-multiturn query encoder for obtaining query embeddings; you also need the context encoder, available here, to obtain context embeddings. The query encoder and the context encoder share the same tokenizer.
Other Resources
Benchmark Results
Model | Avg. top-1 | Avg. top-5 | Doc2Dial top-1 | Doc2Dial top-5 | QuAC top-1 | QuAC top-5 | QReCC top-1 | QReCC top-5 | TopiOCQA top-5* | TopiOCQA top-20* | INSCIT top-5* | INSCIT top-20* |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Dragon | 46.3 | 73.1 | 43.3 | 75.6 | 56.8 | 82.9 | 46.2 | 82.0 | 57.7 | 78.8 | 27.5 | 46.2 |
Dragon-multiturn | 53.0 | 81.2 | 48.6 | 83.5 | 54.8 | 83.2 | 49.6 | 86.7 | 64.5 | 85.2 | 47.4 | 67.1 |
The table above reports retrieval results on five multi-turn QA datasets (Doc2Dial, QuAC, QReCC, TopiOCQA, INSCIT), with average top-1 and top-5 recall scores. *Because the average context length in TopiOCQA and INSCIT is smaller than in the other datasets, we report top-5 and top-20 for them, which roughly matches the context lengths of top-1 and top-5 in the other datasets.
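In this table, top-k recall counts a query as a hit when at least one gold passage appears among the k highest-scoring retrieved passages. A minimal sketch of that metric is shown below; the variable names (a similarities score matrix of shape (num_queries, num_ctx) and one gold passage index per query) are illustrative assumptions, not the released evaluation code.

import torch

def recall_at_k(similarities: torch.Tensor, gold_ids: list, k: int) -> float:
    ## fraction of queries whose gold passage appears among the top-k scored passages
    topk = similarities.topk(k, dim=-1).indices  # (num_queries, k)
    hits = [int(gold in topk[i].tolist()) for i, gold in enumerate(gold_ids)]
    return sum(hits) / len(hits)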
How to Use
import torch
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained('nvidia/dragon-multiturn-query-encoder')
query_encoder = AutoModel.from_pretrained('nvidia/dragon-multiturn-query-encoder')
context_encoder = AutoModel.from_pretrained('nvidia/dragon-multiturn-context-encoder')
query = [
{"role": "user", "content": "I need help planning my Social Security benefits for my survivors."},
{"role": "agent", "content": "Are you currently planning for your future?"},
{"role": "user", "content": "Yes, I am."}
]
contexts = [
"Benefits Planner: Survivors | Planning For Your Survivors \nAs you plan for the future , you'll want to think about what your family would need if you should die now. Social Security can help your family if you have earned enough Social Security credits through your work. You can earn up to four credits each year. In 2019 , for example , you earn one credit for each $1,360 of wages or self - employment income. When you have earned $5,440 , you have earned your four credits for the year. The number of credits needed to provide benefits for your survivors depends on your age when you die. No one needs more than 40 credits 10 years of work to be eligible for any Social Security benefit. But , the younger a person is , the fewer credits they must have for family members to receive survivors benefits. Benefits can be paid to your children and your spouse who is caring for the children even if you don't have the required number of credits. They can get benefits if you have credit for one and one - half years of work 6 credits in the three years just before your death. For Your Widow Or Widower \nThere are about five million widows and widowers receiving monthly Social Security benefits based on their deceased spouse's earnings record.",
"Benefits Planner: Retirement \nOther Things to Consider \nWhat Is The Best Age To Start Your Benefits? The answer is that there is no one \" best age \" for everyone and, ultimately, it is your choice. You should make an informed decision about when to apply for benefits based on your individual and family circumstances. Your monthly benefit amount can differ substantially based on the age when you start receiving benefits. If you decide to start benefits : before your full retirement age , your benefit will be smaller but you will receive it for a longer period of time. at your full retirement age or later , you will receive a larger monthly benefit for a shorter period of time. The amount you receive when you first get benefits sets the base for the amount you will receive for the rest of your life. You may want to consider the following when you make that decision : If you plan to continue working , there are limits on how much you can earn each year between age 62 and full retirement age and still get all your benefits. Depending on the amount of your benefit and your earnings for the year , you may have to give up some of your benefits."
]
## convert query into a format as follows:
## user: {user}\nagent: {agent}\nuser: {user}
formatted_query = '\n'.join([turn['role'] + ": " + turn['content'] for turn in query]).strip()
## get query and context embeddings
query_input = tokenizer(formatted_query, return_tensors='pt')
ctx_input = tokenizer(contexts, padding=True, truncation=True, max_length=512, return_tensors='pt')
query_emb = query_encoder(**query_input).last_hidden_state[:, 0, :]  # [CLS] token embedding as the query vector
ctx_emb = context_encoder(**ctx_input).last_hidden_state[:, 0, :]  # [CLS] token embedding for each context
## Compute similarity scores using dot product
similarities = query_emb.matmul(ctx_emb.transpose(0, 1)) # (1, num_ctx)
## rank the similarity (from highest to lowest)
ranked_results = torch.argsort(similarities, dim=-1, descending=True) # (1, num_ctx)
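As a brief, hypothetical continuation of the snippet above (not part of the original example), the ranked indices can be mapped back to the passages to inspect the best match:

## look up the highest-ranked passage for the query (illustrative continuation)
best_idx = ranked_results[0, 0].item()
print(f"top-1 score: {similarities[0, best_idx].item():.2f}")
print(contexts[best_idx][:200])  # preview of the best-matching passage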
Evaluation on Multi-turn QA Retrieval Benchmarks
(UPDATE!!) We evaluate multi-turn QA retrieval on five datasets: Doc2Dial, QuAC, QReCC, TopiOCQA, and INSCIT, which can be found in ChatRAG Bench. The evaluation scripts can be found here.
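For a rough picture of how such an evaluation can be run locally, the sketch below reuses the tokenizer and the two encoders from the snippet above, encodes a small corpus once, scores each formatted conversational query against it, and accumulates top-k recall. The examples/corpus data layout here is a hypothetical stand-in, not the format used by the ChatRAG Bench evaluation scripts.

import torch

## hypothetical evaluation data: each example holds a formatted multi-turn query
## (user:/agent: turns joined by newlines) and the index of its gold passage in `corpus`
examples = [{"query": "user: ...\nagent: ...\nuser: ...", "gold": 0}]
corpus = ["passage text ..."]

## encode the corpus once; the same context embeddings are reused for every query
ctx_input = tokenizer(corpus, padding=True, truncation=True, max_length=512, return_tensors='pt')
with torch.no_grad():
    ctx_emb = context_encoder(**ctx_input).last_hidden_state[:, 0, :]

hits, k = 0, 5
for ex in examples:
    query_input = tokenizer(ex["query"], return_tensors='pt')
    with torch.no_grad():
        query_emb = query_encoder(**query_input).last_hidden_state[:, 0, :]
    scores = query_emb.matmul(ctx_emb.transpose(0, 1)).squeeze(0)  # (num_ctx,)
    topk = scores.topk(min(k, scores.size(0))).indices.tolist()
    hits += int(ex["gold"] in topk)

print(f"recall@{k}: {hits / len(examples):.3f}")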
License
Dragon-multiturn is built on top of Dragon. We refer users to the original license of the Dragon model. Dragon-multiturn is also subject to the Terms of Use.
Contact
- Zihan Liu (zihanl@nvidia.com)
- Wei Ping (wping@nvidia.com)
Citation
@article{liu2024chatqa,
title={ChatQA: Surpassing GPT-4 on Conversational QA and RAG},
author={Liu, Zihan and Ping, Wei and Roy, Rajarshi and Xu, Peng and Lee, Chankyu and Shoeybi, Mohammad and Catanzaro, Bryan},
journal={arXiv preprint arXiv:2401.10225},
year={2024}
}