ConvoSenseGenerator
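Developed by sefinch
ConvoSenseGenerator is a generative model that produces commonsense inferences for dialogue contexts, covering 10 common types of social commonsense.
Downloads: 32
Released: 1/24/2024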
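Model Overview
The model is built on the T5-3b architecture and generates high-quality commonsense inferences for a conversation history, including emotional reactions, motivations, causes, subsequent events, and other types.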
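Model Features
Multi-type commonsense inference
Supports 10 different types of commonsense inference, including emotional reactions, motivations, and cause analysis
High-quality generation
Human evaluators rated the generated inferences as highly reasonable, highly novel, and highly detailed
Diverse generation
Supports diverse outputs through beam search with a diversity penalty parameter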
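Model Capabilities
Dialogue context understanding
Commonsense inference generation
Multi-type inference output
Diverse result generation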
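Use Cases
Dialogue system enhancement
Commonsense augmentation for chatbots
Adds commonsense reasoning to chatbots so that their responses are more reasonable
Improves the naturalness and plausibility of conversations
Social analysis
Dialogue emotion analysis
Analyzes the emotions and motivations implicit in a conversation
Gives a deeper understanding of the participants' mental states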
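🚀 Introduction to ConvoSenseGenerator
ConvoSenseGenerator is a generative model that produces commonsense inferences for dialogue contexts, covering 10 common social commonsense types such as emotional reactions, motivations, causes, and subsequent events.
The model is trained on the large-scale dataset ConvoSense, which was collected synthetically using ChatGPT 3.5.
Human evaluators rated the inferences generated by ConvoSenseGenerator highly for reasonability, for the rate of novel information they add to the corresponding dialogue context, and for level of detail, and rated them above those of models trained on earlier human-written datasets.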
🚀 Quick Start
ConvoSenseGenerator covers the following commonsense types, each prompted with the question shown:
commonsense_questions = {
    "cause": 'What could have caused the last thing said to happen?',
    "prerequisities": 'What prerequisites are required for the last thing said to occur?',
    "motivation": 'What is an emotion or human drive that motivates Speaker based on what they just said?',
    "subsequent": 'What might happen after what Speaker just said?',
    "desire": 'What does Speaker want to do next?',
    "desire_o": 'What will Listener want to do next based on what Speaker just said?',
    "react": 'How is Speaker feeling after what they just said?',
    "react_o": 'How does Listener feel because of what Speaker just said?',
    "attribute": 'What is a likely characteristic of Speaker based on what they just said?',
    "constituents": 'What is a breakdown of the last thing said into a series of required subevents?'
}
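According to the experiments in the paper, the best-performing configuration of ConvoSenseGenerator uses the following generation hyperparameters: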
generation_config = {
    "repetition_penalty": 1.0,
    "num_beams": 10,
    "num_beam_groups": 10,
    "diversity_penalty": 0.5
}
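These settings interact: diverse beam search splits the `num_beams` beams into `num_beam_groups` groups and applies `diversity_penalty` between groups. The sketch below is a pre-flight check on such a configuration. The helper name `check_diverse_beam_config` is hypothetical (it is not part of the model card or of `transformers`), and the three rules it encodes are assumptions based on the constraints that the `transformers` generation utilities are understood to enforce; treat it as an illustration rather than a reference.

```python
def check_diverse_beam_config(config, num_return_sequences=1):
    """Return a list of problems found in a diverse beam search configuration.

    Hypothetical helper for illustration; an empty list means no problem was found.
    """
    problems = []
    num_beams = config.get("num_beams", 1)
    num_beam_groups = config.get("num_beam_groups", 1)
    diversity_penalty = config.get("diversity_penalty", 0.0)

    # Beams are split evenly across groups, so the division must be exact.
    if num_beams % num_beam_groups != 0:
        problems.append("num_beams must be divisible by num_beam_groups")
    # With several groups, a zero penalty would make the groups identical.
    if num_beam_groups > 1 and diversity_penalty <= 0.0:
        problems.append("diversity_penalty should be > 0 when num_beam_groups > 1")
    # Only beams that were searched can be returned.
    if num_return_sequences > num_beams:
        problems.append("num_return_sequences cannot exceed num_beams")
    return problems


generation_config = {
    "repetition_penalty": 1.0,
    "num_beams": 10,
    "num_beam_groups": 10,
    "diversity_penalty": 0.5,
}

# The configuration from the paper, returning 5 sequences, passes all three checks.
print(check_diverse_beam_config(generation_config, num_return_sequences=5))  # prints []
```

With 10 beams in 10 groups, each group holds a single beam, so every returned sequence comes from a different group.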
The following short code snippet runs ConvoSenseGenerator:
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained("sefinch/ConvoSenseGenerator")
model = T5ForConditionalGeneration.from_pretrained("sefinch/ConvoSenseGenerator").to(device)

# ConvoSenseGenerator covers these commonsense types, using the provided questions
commonsense_questions = {
    "cause": 'What could have caused the last thing said to happen?',
    "prerequisities": 'What prerequisites are required for the last thing said to occur?',
    "motivation": 'What is an emotion or human drive that motivates Speaker based on what they just said?',
    "subsequent": 'What might happen after what Speaker just said?',
    "desire": 'What does Speaker want to do next?',
    "desire_o": 'What will Listener want to do next based on what Speaker just said?',
    "react": 'How is Speaker feeling after what they just said?',
    "react_o": 'How does Listener feel because of what Speaker just said?',
    "attribute": 'What is a likely characteristic of Speaker based on what they just said?',
    "constituents": 'What is a breakdown of the last thing said into a series of required subevents?'
}

def format_input(conversation_history, commonsense_type):
    # prefix last turn with Speaker, and alternately prefix each previous turn with either Listener or Speaker
    prefixed_turns = list(
        reversed(
            [
                f"{'Speaker' if i % 2 == 0 else 'Listener'}: {u}"
                for i, u in enumerate(reversed(conversation_history))
            ]
        )
    )
    # model expects a maximum of 7 total conversation turns to be given
    truncated_turns = prefixed_turns[-7:]
    # conversation representation separates the turns with newlines
    conversation_string = '\n'.join(truncated_turns)
    # format the full input including the commonsense question
    input_text = f"provide a reasonable answer to the question based on the dialogue:\n{conversation_string}\n\n[Question] {commonsense_questions[commonsense_type]}\n[Answer]"
    return input_text

def generate(conversation_history, commonsense_type):
    # convert the input into the expected format to run the model
    input_text = format_input(conversation_history, commonsense_type)
    # tokenize the input_text
    inputs = tokenizer([input_text], return_tensors="pt").to(device)
    # get multiple model generations using the best-performing generation configuration (based on experiments detailed in paper)
    outputs = model.generate(
        inputs["input_ids"],
        repetition_penalty=1.0,
        num_beams=10,
        num_beam_groups=10,
        diversity_penalty=0.5,
        num_return_sequences=5,
        max_new_tokens=400
    )
    # decode the generated inferences
    inferences = tokenizer.batch_decode(outputs, skip_special_tokens=True, clean_up_tokenization_spaces=False)
    return inferences

conversation = [
    "Hey, I'm trying to convince my parents to get a dog, but they say it's too much work.",
    "Well, you could offer to do everything for taking care of it. Have you tried that?",
    "But I don't want to have to take the dog out for walks when it is the winter!"
]

inferences = generate(conversation, "cause")
print('\n'.join(inferences))
# Outputs:
# the speaker's fear of the cold and the inconvenience of having to take the dog out in the winter.
# the speaker's preference for indoor activities during winter, such as watching movies or playing video games.
# the speaker's fear of getting sick from taking the dog out in the cold.
# a previous negative experience with taking dogs for walks in the winter.
# the listener's suggestion to offer to help with taking care of the dog, which the speaker may have considered but was not willing to do.
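To see what the model actually receives, the input formatting can be tried on its own without loading the model. The sketch below mirrors the logic of `format_input` from the snippet above; the function names `label_turns` and `build_prompt` are illustrative and do not appear in the original code.

```python
def label_turns(conversation_history, max_turns=7):
    """Label turns so the last one is 'Speaker', alternating backwards,
    then keep only the last max_turns turns."""
    labeled = []
    for distance_from_end, utterance in enumerate(reversed(conversation_history)):
        role = "Speaker" if distance_from_end % 2 == 0 else "Listener"
        labeled.append(f"{role}: {utterance}")
    labeled.reverse()  # restore chronological order
    return labeled[-max_turns:]


def build_prompt(conversation_history, question):
    """Join the labeled turns with newlines and append the commonsense question."""
    conversation_string = "\n".join(label_turns(conversation_history))
    return (
        "provide a reasonable answer to the question based on the dialogue:\n"
        f"{conversation_string}\n\n[Question] {question}\n[Answer]"
    )


conversation = [
    "Hey, I'm trying to convince my parents to get a dog, but they say it's too much work.",
    "Well, you could offer to do everything for taking care of it. Have you tried that?",
    "But I don't want to have to take the dog out for walks when it is the winter!",
]
question = "What could have caused the last thing said to happen?"
print(build_prompt(conversation, question))
```

For this three-turn conversation the turns are labeled Speaker, Listener, Speaker, because the labeling always starts from the final turn. In a conversation longer than 7 turns, only the last 7 are kept.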
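✨ Key Features
- Generates multiple types of commonsense inferences for dialogue contexts.
- Trained on the large-scale synthetic dataset ConvoSense.
- Produces inferences that score well on reasonability, rate of novel information, and level of detail.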
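Because each commonsense type is requested with its own question, covering all ten types for one conversation means one generation call per type. The sketch below shows one possible way to organize that. `collect_inferences` and `fake_generate` are hypothetical names introduced here; the stand-in generator exists only so the example runs without downloading the model, and in practice the quick-start `generate` function would be passed instead.

```python
from typing import Callable, Dict, List, Sequence

# Keys as spelled in the model card's commonsense_questions dictionary.
COMMONSENSE_TYPES = (
    "cause", "prerequisities", "motivation", "subsequent", "desire",
    "desire_o", "react", "react_o", "attribute", "constituents",
)


def collect_inferences(
    conversation: Sequence[str],
    generate_fn: Callable[[Sequence[str], str], List[str]],
    types: Sequence[str] = COMMONSENSE_TYPES,
) -> Dict[str, List[str]]:
    """Call generate_fn once per commonsense type and group the results by type."""
    return {t: generate_fn(conversation, t) for t in types}


def fake_generate(conversation, commonsense_type):
    """Stand-in for the real model call; returns a single placeholder string."""
    return [f"<{commonsense_type} inference for: {conversation[-1]}>"]


results = collect_inferences(["Hi!", "I just adopted a puppy."], fake_generate)
for commonsense_type, inferences in results.items():
    print(commonsense_type, inferences)
```

Passing the generator as an argument keeps the loop independent of how the model is loaded, which also makes it easy to test with a stub.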
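📚 Documentation
Model Description
- Code repository: Code
- Paper: ConvoSense: Overcoming Monotonous Commonsense Inferences for Conversational AI
- Contact: Sarah E. Finch
Model Training
ConvoSenseGenerator is trained on our recent dataset, ConvoSense. Its backbone model is T5-3b.
Citation
If you find the resources in this repository useful, please cite our work: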
@article{convosense_finch:24,
author = {Finch, Sarah E. and Choi, Jinho D.},
title = "{ConvoSense: Overcoming Monotonous Commonsense Inferences for Conversational AI}",
journal = {Transactions of the Association for Computational Linguistics},
volume = {12},
pages = {467-483},
year = {2024},
month = {05},
issn = {2307-387X},
doi = {10.1162/tacl_a_00659},
url = {https://doi.org/10.1162/tacl\_a\_00659},
eprint = {https://direct.mit.edu/tacl/article-pdf/doi/10.1162/tacl\_a\_00659/2369521/tacl\_a\_00659.pdf},
}
📄 License
This project is licensed under Apache-2.0.