ConvoSenseGenerator
Developed by sefinch
ConvoSenseGenerator is a generative model that produces commonsense inferences for dialogue contexts, covering 10 common types of social commonsense.
Model Overview
The model is based on the T5-3b architecture and generates high-quality commonsense inferences for a dialogue history, covering many types, including emotional reactions, motivations, causes, and subsequent events.
Model Features
- Multi-type commonsense inference: supports 10 different types of commonsense inference, including emotional reactions, motivations, and cause analysis.
- High-quality generation: the generated inferences are judged by human evaluators to be highly reasonable, to carry a high rate of novel information, and to be rich in detail.
- Diverse generation: supports diverse outputs through beam search with a diversity penalty.
Model Capabilities
- Dialogue context understanding
- Commonsense inference generation
- Multi-type inference output
- Diverse result generation
Use Cases
Dialogue system enhancement
- Commonsense augmentation for chatbots: give a chatbot commonsense reasoning so that its responses are more sensible, improving the naturalness and coherence of the conversation.
Social analysis
- Dialogue emotion analysis: analyze the emotions and motivations implied in a conversation, for a deeper view of the participants' mental states.
🚀 ConvoSenseGenerator Overview
ConvoSenseGenerator is a generative model that produces commonsense inferences for dialogue contexts, covering 10 common types of social commonsense, such as emotional reactions, motivations, causes, and subsequent events!
The model is trained on the large-scale ConvoSense dataset, which was collected synthetically using ChatGPT 3.5.
Human evaluators rate the inferences produced by ConvoSenseGenerator highly for reasonableness, for the rate at which they provide information that is new to the corresponding dialogue context, and for their level of detail, outperforming models trained on previous human-written datasets.
🚀 Quick Start
ConvoSenseGenerator covers the following commonsense types, using the provided questions:
```python
commonsense_questions = {
    "cause": 'What could have caused the last thing said to happen?',
    "prerequisities": 'What prerequisites are required for the last thing said to occur?',
    "motivation": 'What is an emotion or human drive that motivates Speaker based on what they just said?',
    "subsequent": 'What might happen after what Speaker just said?',
    "desire": 'What does Speaker want to do next?',
    "desire_o": 'What will Listener want to do next based on what Speaker just said?',
    "react": 'How is Speaker feeling after what they just said?',
    "react_o": 'How does Listener feel because of what Speaker just said?',
    "attribute": 'What is a likely characteristic of Speaker based on what they just said?',
    "constituents": 'What is a breakdown of the last thing said into a series of required subevents?'
}
```
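Each question is embedded, together with the dialogue history, in a fixed prompt template (the same template that `format_input` builds in the full snippet below). As a minimal sketch of the string the model actually reads, using two turns from the worked example further down:

```python
# Example input; "Speaker" always labels the most recent turn.
history = (
    "Listener: Well, you could offer to do everything for taking care of it.\n"
    "Speaker: But I don't want to have to take the dog out for walks when it is the winter!"
)
question = 'How is Speaker feeling after what they just said?'  # the "react" question

# Prompt template used by format_input in the snippet below
input_text = f"provide a reasonable answer to the question based on the dialogue:\n{history}\n\n[Question] {question}\n[Answer]"
print(input_text)
```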
According to the experiments in the paper, the best-performing configuration of ConvoSenseGenerator uses the following generation hyperparameters:
```python
generation_config = {
    "repetition_penalty": 1.0,
    "num_beams": 10,
    "num_beam_groups": 10,
    "diversity_penalty": 0.5
}
```
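To reuse these hyperparameters across calls, one option (my sketch, not part of the original card) is to bundle them into a transformers `GenerationConfig`; note that `diversity_penalty` only takes effect with group beam search, i.e. when `num_beam_groups > 1`:

```python
# Sketch: packaging the recommended hyperparameters into a reusable GenerationConfig.
from transformers import GenerationConfig

gen_cfg = GenerationConfig(
    repetition_penalty=1.0,
    num_beams=10,
    num_beam_groups=10,      # group beam search; required for diversity_penalty to apply
    diversity_penalty=0.5,
    num_return_sequences=5,  # as in the snippet below
    max_new_tokens=400,
)
# later: outputs = model.generate(inputs["input_ids"], generation_config=gen_cfg)
```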
Here is a simple code snippet for running ConvoSenseGenerator:
```python
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained("sefinch/ConvoSenseGenerator")
model = T5ForConditionalGeneration.from_pretrained("sefinch/ConvoSenseGenerator").to(device)

# ConvoSenseGenerator covers these commonsense types, using the provided questions
commonsense_questions = {
    "cause": 'What could have caused the last thing said to happen?',
    "prerequisities": 'What prerequisites are required for the last thing said to occur?',
    "motivation": 'What is an emotion or human drive that motivates Speaker based on what they just said?',
    "subsequent": 'What might happen after what Speaker just said?',
    "desire": 'What does Speaker want to do next?',
    "desire_o": 'What will Listener want to do next based on what Speaker just said?',
    "react": 'How is Speaker feeling after what they just said?',
    "react_o": 'How does Listener feel because of what Speaker just said?',
    "attribute": 'What is a likely characteristic of Speaker based on what they just said?',
    "constituents": 'What is a breakdown of the last thing said into a series of required subevents?'
}

def format_input(conversation_history, commonsense_type):
    # prefix last turn with Speaker, and alternately prefix each previous turn with either Listener or Speaker
    prefixed_turns = list(
        reversed(
            [
                f"{'Speaker' if i % 2 == 0 else 'Listener'}: {u}"
                for i, u in enumerate(reversed(conversation_history))
            ]
        )
    )

    # model expects a maximum of 7 total conversation turns to be given
    truncated_turns = prefixed_turns[-7:]

    # conversation representation separates the turns with newlines
    conversation_string = '\n'.join(truncated_turns)

    # format the full input including the commonsense question
    input_text = f"provide a reasonable answer to the question based on the dialogue:\n{conversation_string}\n\n[Question] {commonsense_questions[commonsense_type]}\n[Answer]"

    return input_text

def generate(conversation_history, commonsense_type):
    # convert the input into the expected format to run the model
    input_text = format_input(conversation_history, commonsense_type)

    # tokenize the input_text
    inputs = tokenizer([input_text], return_tensors="pt").to(device)

    # get multiple model generations using the best-performing generation configuration (based on experiments detailed in paper)
    outputs = model.generate(
        inputs["input_ids"],
        repetition_penalty=1.0,
        num_beams=10,
        num_beam_groups=10,
        diversity_penalty=0.5,
        num_return_sequences=5,
        max_new_tokens=400
    )

    # decode the generated inferences
    inferences = tokenizer.batch_decode(outputs, skip_special_tokens=True, clean_up_tokenization_spaces=False)

    return inferences

conversation = [
    "Hey, I'm trying to convince my parents to get a dog, but they say it's too much work.",
    "Well, you could offer to do everything for taking care of it. Have you tried that?",
    "But I don't want to have to take the dog out for walks when it is the winter!"
]

inferences = generate(conversation, "cause")

print('\n'.join(inferences))

# Outputs:
# the speaker's fear of the cold and the inconvenience of having to take the dog out in the winter.
# the speaker's preference for indoor activities during winter, such as watching movies or playing video games.
# the speaker's fear of getting sick from taking the dog out in the cold.
# a previous negative experience with taking dogs for walks in the winter.
# the listener's suggestion to offer to help with taking care of the dog, which the speaker may have considered but was not willing to do.
```
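Building on the snippet above, here is a short usage sketch (my addition, not from the original card) that collects the top-ranked inference for every supported commonsense type:

```python
# Sketch: gather the top inference for each commonsense type,
# reusing generate() and commonsense_questions from the snippet above.
def commonsense_profile(conversation_history):
    return {
        cs_type: generate(conversation_history, cs_type)[0]  # keep the top-ranked beam
        for cs_type in commonsense_questions
    }

for cs_type, inference in commonsense_profile(conversation).items():
    print(f"{cs_type}: {inference}")
```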
✨ Key Features
- Generates multiple types of commonsense inferences for dialogue contexts.
- Trained on the large-scale synthetic dataset ConvoSense.
- Generated inferences score highly on reasonableness, rate of novel information, and level of detail.
📚 Documentation
Model Description
- Repository: Code
- Paper: ConvoSense: Overcoming Monotonous Commonsense Inferences for Conversational AI
- Point of Contact: Sarah E. Finch
Model Training
ConvoSenseGenerator is trained on our recent dataset ConvoSense. Its backbone model is T5-3b.
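The T5-3b backbone has roughly 3 billion parameters (around 11 GB of weights in fp32), so on a single mid-range GPU it can help to load the model in half precision. A sketch under that assumption (not part of the original card; fp16 may slightly change generated outputs):

```python
# Sketch: half-precision loading to reduce the memory footprint of the 3B backbone.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("sefinch/ConvoSenseGenerator")
model = T5ForConditionalGeneration.from_pretrained(
    "sefinch/ConvoSenseGenerator",
    torch_dtype=torch.float16,  # halves weight memory relative to fp32
).to("cuda")
```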
Citation
If you find the resources in this repository useful, please cite our work:
```bibtex
@article{convosense_finch:24,
    author = {Finch, Sarah E. and Choi, Jinho D.},
    title = "{ConvoSense: Overcoming Monotonous Commonsense Inferences for Conversational AI}",
    journal = {Transactions of the Association for Computational Linguistics},
    volume = {12},
    pages = {467-483},
    year = {2024},
    month = {05},
    issn = {2307-387X},
    doi = {10.1162/tacl_a_00659},
    url = {https://doi.org/10.1162/tacl\_a\_00659},
    eprint = {https://direct.mit.edu/tacl/article-pdf/doi/10.1162/tacl\_a\_00659/2369521/tacl\_a\_00659.pdf},
}
```
📄 License
This project is licensed under the Apache-2.0 license.