ConvoSenseGenerator open-source model - generates 10 common types of social commonsense inference for dialogue contexts


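Developed by sefinch

ConvoSenseGenerator is a generative model that produces commonsense inferences for dialogue contexts, covering 10 common types of social commonsense.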

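Dialogue systems

Transformers

Language: English · License: Apache-2.0 · #multi-type commonsense inference #dialogue context understanding #high-diversity generation

Downloads: 32

Released: 1/24/2024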

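Model overview

The model is built on the T5-3b architecture and generates high-quality commonsense inferences for a dialogue history, including emotional reactions, motivations, causes, and subsequent events.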

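Model features

Multi-type commonsense inference

Supports 10 types of commonsense inference, including emotional reactions, motivations, and causes.

High-quality generation

Human evaluators rated the generated inferences as highly reasonable, rich in novel information, and highly detailed.

Diverse generation

Produces varied outputs through beam search with a diversity penalty.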

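Model capabilities

Dialogue context understanding

Commonsense inference generation

Multi-type inference output

Diverse result generation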

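Use cases

Dialogue system enhancement

Commonsense augmentation for chatbots

Adds commonsense inference to a chatbot so that its responses are more reasonable.

Improves the naturalness and plausibility of the dialogue.

Social analysis

Dialogue emotion analysis

Analyzes the emotions and motivations implied in a dialogue.

Gives a deeper understanding of the participants' mental states.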

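🚀 About ConvoSenseGenerator

ConvoSenseGenerator is a generative model that produces commonsense inferences for dialogue contexts. It covers 10 common types of social commonsense, such as emotional reactions, motivations, causes, and subsequent events.

The model is trained on ConvoSense, a large-scale dataset collected synthetically using ChatGPT 3.5.

Human evaluators rated ConvoSenseGenerator's inferences highly for reasonability, for the rate of novel information they add to the dialogue context, and for level of detail. On these measures it outperforms models trained on earlier human-written datasets.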

🚀 Quick start

ConvoSenseGenerator covers the following commonsense types, each elicited with the question shown:

commonsense_questions = {
    "cause": 'What could have caused the last thing said to happen?', 
    "prerequisities": 'What prerequisites are required for the last thing said to occur?', 
    "motivation": 'What is an emotion or human drive that motivates Speaker based on what they just said?', 
    "subsequent": 'What might happen after what Speaker just said?', 
    "desire": 'What does Speaker want to do next?',
    "desire_o": 'What will Listener want to do next based on what Speaker just said?',
    "react": 'How is Speaker feeling after what they just said?',
    "react_o": 'How does Listener feel because of what Speaker just said?',
    "attribute": 'What is a likely characteristic of Speaker based on what they just said?',
    "constituents": 'What is a breakdown of the last thing said into a series of required subevents?' 
}

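According to the experiments in the paper, the best-performing configuration of ConvoSenseGenerator uses the following generation hyperparameters: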

generation_config = {
    "repetition_penalty": 1.0,
    "num_beams": 10,
    "num_beam_groups": 10,
    "diversity_penalty": 0.5
}
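
These hyperparameters select diverse (group) beam search: the beams are split into groups, and the diversity penalty discourages different groups from producing the same tokens. The sketch below is a plain-Python sanity check of such a configuration before it is handed to `model.generate`. The helper `check_diverse_beam_config` is hypothetical and not part of any library, and the three rules it encodes are assumptions based on how group beam search is commonly constrained, so the installed version of transformers remains the authority.

```python
def check_diverse_beam_config(config, num_return_sequences):
    """Return a list of problems found in a diverse beam search configuration.

    Hypothetical helper for illustration; the rules below are assumptions.
    """
    problems = []
    num_beams = config.get("num_beams", 1)
    num_beam_groups = config.get("num_beam_groups", 1)
    diversity_penalty = config.get("diversity_penalty", 0.0)

    # Beams are divided evenly among the groups.
    if num_beams % num_beam_groups != 0:
        problems.append("num_beams must be divisible by num_beam_groups")
    # Without a positive penalty, the groups have no pressure to differ.
    if num_beam_groups > 1 and diversity_penalty <= 0.0:
        problems.append("diversity_penalty should be > 0 when using beam groups")
    # Only finished beams can be returned.
    if num_return_sequences > num_beams:
        problems.append("num_return_sequences cannot exceed num_beams")
    return problems


generation_config = {
    "repetition_penalty": 1.0,
    "num_beams": 10,
    "num_beam_groups": 10,
    "diversity_penalty": 0.5,
}

problems = check_diverse_beam_config(generation_config, num_return_sequences=5)
beams_per_group = generation_config["num_beams"] // generation_config["num_beam_groups"]
print(problems)         # []
print(beams_per_group)  # 1
```

With 10 beams in 10 groups, each group holds a single beam, so every returned candidate comes from a different group.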

The following snippet shows a simple way to run ConvoSenseGenerator:

import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = AutoTokenizer.from_pretrained("sefinch/ConvoSenseGenerator")
model = T5ForConditionalGeneration.from_pretrained("sefinch/ConvoSenseGenerator").to(device)

# ConvoSenseGenerator covers these commonsense types, using the provided questions
commonsense_questions = {
    "cause": 'What could have caused the last thing said to happen?', 
    "prerequisities": 'What prerequisites are required for the last thing said to occur?', 
    "motivation": 'What is an emotion or human drive that motivates Speaker based on what they just said?', 
    "subsequent": 'What might happen after what Speaker just said?', 
    "desire": 'What does Speaker want to do next?',
    "desire_o": 'What will Listener want to do next based on what Speaker just said?',
    "react": 'How is Speaker feeling after what they just said?',
    "react_o": 'How does Listener feel because of what Speaker just said?',
    "attribute": 'What is a likely characteristic of Speaker based on what they just said?',
    "constituents": 'What is a breakdown of the last thing said into a series of required subevents?' 
}

def format_input(conversation_history, commonsense_type):

    # prefix last turn with Speaker, and alternately prefix each previous turn with either Listener or Speaker
    prefixed_turns = list(
        reversed(
            [
                f"{'Speaker' if i % 2 == 0 else 'Listener'}: {u}"
                for i, u in enumerate(reversed(conversation_history))
            ]
        )
    )

    # model expects a maximum of 7 total conversation turns to be given
    truncated_turns = prefixed_turns[-7:]

    # conversation representation separates the turns with newlines
    conversation_string = '\n'.join(truncated_turns)

    # format the full input including the commonsense question
    input_text = f"provide a reasonable answer to the question based on the dialogue:\n{conversation_string}\n\n[Question] {commonsense_questions[commonsense_type]}\n[Answer]"

    return input_text

def generate(conversation_history, commonsense_type):
    # convert the input into the expected format to run the model
    input_text = format_input(conversation_history, commonsense_type) 

    # tokenize the input_text
    inputs = tokenizer([input_text], return_tensors="pt").to(device)

    # get multiple model generations using the best-performing generation configuration (based on experiments detailed in paper)
    # **inputs passes both input_ids and attention_mask to the model
    outputs = model.generate(
        **inputs,
        repetition_penalty=1.0,
        num_beams=10,
        num_beam_groups=10,
        diversity_penalty=0.5,
        num_return_sequences=5,
        max_new_tokens=400
    )

    # decode the generated inferences
    inferences = tokenizer.batch_decode(outputs, skip_special_tokens=True, clean_up_tokenization_spaces=False)

    return inferences

conversation = [
    "Hey, I'm trying to convince my parents to get a dog, but they say it's too much work.",
    "Well, you could offer to do everything for taking care of it. Have you tried that?",
    "But I don't want to have to take the dog out for walks when it is the winter!"
]

inferences = generate(conversation, "cause")
print('\n'.join(inferences))

# Outputs:
# the speaker's fear of the cold and the inconvenience of having to take the dog out in the winter.
# the speaker's preference for indoor activities during winter, such as watching movies or playing video games.
# the speaker's fear of getting sick from taking the dog out in the cold.
# a previous negative experience with taking dogs for walks in the winter.
# the listener's suggestion to offer to help with taking care of the dog, which the speaker may have considered but was not willing to do.
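
To inspect the exact text the model receives, the prompt construction can be reproduced without loading the model. The sketch below mirrors the labeling and template used by `format_input` in the snippet. The names `label_turns` and `build_prompt` are hypothetical and introduced only for illustration.

```python
def label_turns(conversation_history, max_turns=7):
    """Prefix turns so the final turn is always 'Speaker', alternating backwards."""
    labeled = [
        f"{'Speaker' if i % 2 == 0 else 'Listener'}: {turn}"
        for i, turn in enumerate(reversed(conversation_history))
    ]
    labeled.reverse()
    # Keep only the most recent turns, as the snippet does (at most 7).
    return labeled[-max_turns:]


def build_prompt(conversation_history, question):
    """Assemble the instruction, the labeled dialogue, and the question."""
    conversation_string = "\n".join(label_turns(conversation_history))
    return (
        "provide a reasonable answer to the question based on the dialogue:\n"
        f"{conversation_string}\n\n[Question] {question}\n[Answer]"
    )


conversation = [
    "Hey, I'm trying to convince my parents to get a dog, but they say it's too much work.",
    "Well, you could offer to do everything for taking care of it. Have you tried that?",
    "But I don't want to have to take the dog out for walks when it is the winter!",
]

prompt = build_prompt(conversation, "What could have caused the last thing said to happen?")
print(prompt)
```

Because labels are assigned from the last turn backwards, the first turn is labeled Speaker for an odd number of turns and Listener for an even number.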

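✨ Key features

Generates multiple types of commonsense inference for dialogue contexts.
Trained on the large-scale synthetic dataset ConvoSense.
Produces inferences that score well on reasonability, rate of novel information, and level of detail.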

📚 Detailed documentation

Model description

Code repository: Code
Paper: ConvoSense: Overcoming Monotonous Commonsense Inferences for Conversational AI
Contact: Sarah E. Finch

Model training

ConvoSenseGenerator is trained on our recent dataset, ConvoSense. Its backbone model is T5-3b.

Citation

If you find the resources in this repository useful, please cite our work:

@article{convosense_finch:24,
    author = {Finch, Sarah E. and Choi, Jinho D.},
    title = "{ConvoSense: Overcoming Monotonous Commonsense Inferences for Conversational AI}",
    journal = {Transactions of the Association for Computational Linguistics},
    volume = {12},
    pages = {467-483},
    year = {2024},
    month = {05},
    issn = {2307-387X},
    doi = {10.1162/tacl_a_00659},
    url = {https://doi.org/10.1162/tacl\_a\_00659},
    eprint = {https://direct.mit.edu/tacl/article-pdf/doi/10.1162/tacl\_a\_00659/2369521/tacl\_a\_00659.pdf},
}