ConvoSenseGenerator
Developed by sefinch
ConvoSenseGenerator is a generative model that produces commonsense inferences for dialogue contexts, covering 10 common types of social commonsense.
Model Overview
The model is based on the T5-3b architecture and generates high-quality commonsense inferences for a dialogue history, covering many types, including emotional reactions, motivations, causes, and subsequent events.
Model Features
- Multi-type commonsense inference: supports 10 different types of commonsense inference, including emotional reactions, motivations, and cause analysis.
- High-quality generation: the generated inferences are judged by human evaluators to be highly reasonable, to carry a high rate of novel information, and to be rich in detail.
- Diverse generation: supports diverse outputs through beam search with a diversity penalty.
Model Capabilities
- Dialogue context understanding
- Commonsense inference generation
- Multi-type inference output
- Diverse result generation
Use Cases
Dialogue system enhancement
- Commonsense augmentation for chatbots: give a chatbot commonsense reasoning so that its responses are more sensible, improving the naturalness and coherence of the conversation.
Social analysis
- Dialogue emotion analysis: analyze the emotions and motivations implied in a conversation, for a deeper view of the participants' mental states.
🚀 ConvoSenseGenerator Overview
ConvoSenseGenerator is a generative model that produces commonsense inferences for dialogue contexts, covering 10 common types of social commonsense, such as emotional reactions, motivations, causes, and subsequent events!
The model is trained on the large-scale ConvoSense dataset, which was collected synthetically using ChatGPT 3.5.
Human evaluators rate the inferences produced by ConvoSenseGenerator highly for reasonableness, for the rate at which they provide information that is new to the corresponding dialogue context, and for their level of detail, outperforming models trained on previous human-written datasets.
🚀 Quick Start
ConvoSenseGenerator covers the following commonsense types, using the provided questions:
```python
commonsense_questions = {
    "cause": 'What could have caused the last thing said to happen?',
    "prerequisities": 'What prerequisites are required for the last thing said to occur?',
    "motivation": 'What is an emotion or human drive that motivates Speaker based on what they just said?',
    "subsequent": 'What might happen after what Speaker just said?',
    "desire": 'What does Speaker want to do next?',
    "desire_o": 'What will Listener want to do next based on what Speaker just said?',
    "react": 'How is Speaker feeling after what they just said?',
    "react_o": 'How does Listener feel because of what Speaker just said?',
    "attribute": 'What is a likely characteristic of Speaker based on what they just said?',
    "constituents": 'What is a breakdown of the last thing said into a series of required subevents?'
}
```
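Each question is embedded, together with the dialogue history, in a fixed prompt template (the same template that `format_input` builds in the full snippet below). As a minimal sketch of the string the model actually reads, using two turns from the worked example further down:

```python
# Example input; "Speaker" always labels the most recent turn.
history = (
    "Listener: Well, you could offer to do everything for taking care of it.\n"
    "Speaker: But I don't want to have to take the dog out for walks when it is the winter!"
)
question = 'How is Speaker feeling after what they just said?'  # the "react" question

# Prompt template used by format_input in the snippet below
input_text = f"provide a reasonable answer to the question based on the dialogue:\n{history}\n\n[Question] {question}\n[Answer]"
print(input_text)
```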
According to the experiments in the paper, the best-performing configuration of ConvoSenseGenerator uses the following generation hyperparameters:
```python
generation_config = {
    "repetition_penalty": 1.0,
    "num_beams": 10,
    "num_beam_groups": 10,
    "diversity_penalty": 0.5
}
```
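To reuse these hyperparameters across calls, one option (my sketch, not part of the original card) is to bundle them into a transformers `GenerationConfig`; note that `diversity_penalty` only takes effect with group beam search, i.e. when `num_beam_groups > 1`:

```python
# Sketch: packaging the recommended hyperparameters into a reusable GenerationConfig.
from transformers import GenerationConfig

gen_cfg = GenerationConfig(
    repetition_penalty=1.0,
    num_beams=10,
    num_beam_groups=10,      # group beam search; required for diversity_penalty to apply
    diversity_penalty=0.5,
    num_return_sequences=5,  # as in the snippet below
    max_new_tokens=400,
)
# later: outputs = model.generate(inputs["input_ids"], generation_config=gen_cfg)
```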
Here is a simple code snippet for running ConvoSenseGenerator:
```python
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained("sefinch/ConvoSenseGenerator")
model = T5ForConditionalGeneration.from_pretrained("sefinch/ConvoSenseGenerator").to(device)

# ConvoSenseGenerator covers these commonsense types, using the provided questions
commonsense_questions = {
    "cause": 'What could have caused the last thing said to happen?',
    "prerequisities": 'What prerequisites are required for the last thing said to occur?',
    "motivation": 'What is an emotion or human drive that motivates Speaker based on what they just said?',
    "subsequent": 'What might happen after what Speaker just said?',
    "desire": 'What does Speaker want to do next?',
    "desire_o": 'What will Listener want to do next based on what Speaker just said?',
    "react": 'How is Speaker feeling after what they just said?',
    "react_o": 'How does Listener feel because of what Speaker just said?',
    "attribute": 'What is a likely characteristic of Speaker based on what they just said?',
    "constituents": 'What is a breakdown of the last thing said into a series of required subevents?'
}

def format_input(conversation_history, commonsense_type):
    # prefix last turn with Speaker, and alternately prefix each previous turn with either Listener or Speaker
    prefixed_turns = list(
        reversed(
            [
                f"{'Speaker' if i % 2 == 0 else 'Listener'}: {u}"
                for i, u in enumerate(reversed(conversation_history))
            ]
        )
    )

    # model expects a maximum of 7 total conversation turns to be given
    truncated_turns = prefixed_turns[-7:]

    # conversation representation separates the turns with newlines
    conversation_string = '\n'.join(truncated_turns)

    # format the full input including the commonsense question
    input_text = f"provide a reasonable answer to the question based on the dialogue:\n{conversation_string}\n\n[Question] {commonsense_questions[commonsense_type]}\n[Answer]"

    return input_text

def generate(conversation_history, commonsense_type):
    # convert the input into the expected format to run the model
    input_text = format_input(conversation_history, commonsense_type)

    # tokenize the input_text
    inputs = tokenizer([input_text], return_tensors="pt").to(device)

    # get multiple model generations using the best-performing generation configuration (based on experiments detailed in paper)
    outputs = model.generate(
        inputs["input_ids"],
        repetition_penalty=1.0,
        num_beams=10,
        num_beam_groups=10,
        diversity_penalty=0.5,
        num_return_sequences=5,
        max_new_tokens=400
    )

    # decode the generated inferences
    inferences = tokenizer.batch_decode(outputs, skip_special_tokens=True, clean_up_tokenization_spaces=False)

    return inferences

conversation = [
    "Hey, I'm trying to convince my parents to get a dog, but they say it's too much work.",
    "Well, you could offer to do everything for taking care of it. Have you tried that?",
    "But I don't want to have to take the dog out for walks when it is the winter!"
]

inferences = generate(conversation, "cause")

print('\n'.join(inferences))

# Outputs:
# the speaker's fear of the cold and the inconvenience of having to take the dog out in the winter.
# the speaker's preference for indoor activities during winter, such as watching movies or playing video games.
# the speaker's fear of getting sick from taking the dog out in the cold.
# a previous negative experience with taking dogs for walks in the winter.
# the listener's suggestion to offer to help with taking care of the dog, which the speaker may have considered but was not willing to do.
```
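Building on the snippet above, here is a short usage sketch (my addition, not from the original card) that collects the top-ranked inference for every supported commonsense type:

```python
# Sketch: gather the top inference for each commonsense type,
# reusing generate() and commonsense_questions from the snippet above.
def commonsense_profile(conversation_history):
    return {
        cs_type: generate(conversation_history, cs_type)[0]  # keep the top-ranked beam
        for cs_type in commonsense_questions
    }

for cs_type, inference in commonsense_profile(conversation).items():
    print(f"{cs_type}: {inference}")
```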
✨ Key Features
- Generates multiple types of commonsense inferences for dialogue contexts.
- Trained on the large-scale synthetic dataset ConvoSense.
- Generated inferences score highly on reasonableness, rate of novel information, and level of detail.
📚 Documentation
Model Description
- Repository: Code
- Paper: ConvoSense: Overcoming Monotonous Commonsense Inferences for Conversational AI
- Point of Contact: Sarah E. Finch
Model Training
ConvoSenseGenerator is trained on our recent dataset ConvoSense. Its backbone model is T5-3b.
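The T5-3b backbone has roughly 3 billion parameters (around 11 GB of weights in fp32), so on a single mid-range GPU it can help to load the model in half precision. A sketch under that assumption (not part of the original card; fp16 may slightly change generated outputs):

```python
# Sketch: half-precision loading to reduce the memory footprint of the 3B backbone.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("sefinch/ConvoSenseGenerator")
model = T5ForConditionalGeneration.from_pretrained(
    "sefinch/ConvoSenseGenerator",
    torch_dtype=torch.float16,  # halves weight memory relative to fp32
).to("cuda")
```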
Citation
If you find the resources in this repository useful, please cite our work:
```bibtex
@article{convosense_finch:24,
    author = {Finch, Sarah E. and Choi, Jinho D.},
    title = "{ConvoSense: Overcoming Monotonous Commonsense Inferences for Conversational AI}",
    journal = {Transactions of the Association for Computational Linguistics},
    volume = {12},
    pages = {467-483},
    year = {2024},
    month = {05},
    issn = {2307-387X},
    doi = {10.1162/tacl_a_00659},
    url = {https://doi.org/10.1162/tacl\_a\_00659},
    eprint = {https://direct.mit.edu/tacl/article-pdf/doi/10.1162/tacl\_a\_00659/2369521/tacl\_a\_00659.pdf},
}
```
📄 License
This project is licensed under the Apache-2.0 license.