Cadet-Tiny open-source dialogue model: ultra-small footprint for easy inference on edge devices

Cadet Tiny

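Developed by ToddGoldfarb

Cadet-Tiny is an ultra-small dialogue model trained on the SODA dataset. It is designed for inference on edge devices and is only about 2% the size of the Cosmo-3B model.

Dialogue systems

Transformers

Language: English · License: OpenRAIL · #edge-device-dialogue #ultra-small-model #low-resource-inference

Downloads: 2,691

Released: 4/7/2023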

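Model overview

Cadet-Tiny is a dialogue model fine-tuned from the pretrained t5-small model, suited to lightweight conversational tasks on edge devices such as the Raspberry Pi.

Model features

Lightweight design

Optimized for low-resource devices; runs on devices with as little as 2GB of RAM

Conversation memory

Tracks the dialogue history and uses it as context for each reply

Tunable parameters

Exposes generation parameters such as temperature to control the diversity of the output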
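
To make the temperature setting more concrete, here is a minimal sketch of the general idea: scores are divided by the temperature before being turned into probabilities. It is plain Python for illustration only, not the Transformers library's internal code; the function name `softmax_with_temperature` and the example scores are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate tokens (illustrative values only).
example_logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(example_logits, temperature=0.5)
flat = softmax_with_temperature(example_logits, temperature=1.5)
```

A lower temperature concentrates probability on the top-scoring token, which tends to give more predictable replies; a higher temperature flattens the distribution and increases variety.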

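Model capabilities

Dialogue generation

Context understanding

Role-play dialogue

Use cases

Edge-device applications

Raspberry Pi chatbot

Deploy a lightweight conversational assistant on resource-constrained devices

Runs smoothly on devices with 2GB of RAM

Educational applications

Programming study assistant

A conversational assistant that helps students understand programming concepts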

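🚀 What is Cadet-Tiny?

Inspired by Allen AI's Cosmo-XL, Cadet-Tiny is a very small conversational model trained on the SODA dataset. Cadet-Tiny is intended for inference at the edge (it can even run on a Raspberry Pi with only 2GB of RAM).

Cadet-Tiny is trained from Google's pretrained t5-small model and is, as a result, about 2% of the size of the Cosmo-3B model.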
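
The "about 2%" figure and the 2GB RAM claim can be sanity-checked with some rough arithmetic. The sketch below assumes round parameter counts (about 60 million for t5-small, and 3 billion taken from the "3B" in the Cosmo-3B name); these are approximations, not values measured from the checkpoints, and the helper names are hypothetical.

```python
# Approximate parameter counts (assumptions: round published figures, not measured here).
T5_SMALL_PARAMS = 60_000_000       # t5-small is commonly listed at about 60M parameters
COSMO_3B_PARAMS = 3_000_000_000    # taken from the "3B" in the model name


def size_ratio(small, large):
    """Return the smaller model's size as a fraction of the larger one."""
    return small / large


def weight_memory_mb(num_params, bytes_per_param=4):
    """Rough memory needed for the weights alone (float32 uses 4 bytes per value)."""
    return num_params * bytes_per_param / 1_000_000


ratio = size_ratio(T5_SMALL_PARAMS, COSMO_3B_PARAMS)
weights_mb = weight_memory_mb(T5_SMALL_PARAMS)
print(f"size ratio: {ratio:.1%}")
print(f"float32 weights: about {weights_mb:.0f} MB")
```

Under these assumptions the ratio comes out at 2% and the float32 weights at roughly 240 MB. That estimate covers the weights only; activations, the tokenizer, and the Python runtime add overhead on top, but the total is still plausibly within a 2GB budget.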

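This is the first SEQ2SEQ NLP model I have ever made! I am very excited to share it here on HuggingFace! 😊

If you have any questions or suggestions for improvement, please contact me at: tcgoldfarb@gmail.com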

📦 Model information

Property	Details
License	OpenRAIL
Training data	allenai/soda
Language	English
Model type	Conversational

📚 Google Colab link

Here is the link to the Google Colab file, where I walk through the process of training the model and how to use AI2's public SODA dataset. Click to visit

🚀 Quick start

Use the code snippet below to get started with Cadet-Tiny!

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import colorful as cf

cf.use_true_colors()
cf.use_style('monokai')


class CadetTinyAgent:
    def __init__(self):
        print(cf.bold | cf.purple("Waking up Cadet-Tiny..."))
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.tokenizer = AutoTokenizer.from_pretrained("t5-small", model_max_length=512)
        self.model = AutoModelForSeq2SeqLM.from_pretrained("ToddGoldfarb/Cadet-Tiny", low_cpu_mem_usage=True).to(self.device)
        self.conversation_history = ""

    def observe(self, observation):
        self.conversation_history = self.conversation_history + observation
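        # The 400-character cap is a truncation safety net for the 512-token input limit.
        # Note that len() counts characters, not tokens; once the cap is exceeded,
        # the oldest 112 characters of the history are dropped.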
        if len(self.conversation_history) > 400:
            self.conversation_history = self.conversation_history[112:]

    def set_input(self, situation_narrative="", role_instruction=""):
        input_text = "dialogue: "

        if situation_narrative != "":
            input_text = input_text + situation_narrative

        if role_instruction != "":
            input_text = input_text + " <SEP> " + role_instruction

        input_text = input_text + " <TURN> " + self.conversation_history

        # Uncomment the line below to see what is fed to the model.
        # print(input_text)

        return input_text

    def generate(self, situation_narrative, role_instruction, user_response):
        user_response = user_response + " <TURN> "
        self.observe(user_response)

        input_text = self.set_input(situation_narrative, role_instruction)

        inputs = self.tokenizer([input_text], return_tensors="pt").to(self.device)
        
        # I encourage you to change the hyperparameters of the model! Start by trying to modify the temperature.
        outputs = self.model.generate(inputs["input_ids"], max_new_tokens=512, temperature=0.75, top_p=.95,
                                      do_sample=True)
        cadet_response = self.tokenizer.decode(outputs[0], skip_special_tokens=True, clean_up_tokenization_spaces=False)
        added_turn = cadet_response + " <TURN> "
        self.observe(added_turn)

        return cadet_response

    def reset_history(self):
        # Reset to an empty string (not a list): observe() concatenates strings onto it.
        self.conversation_history = ""

    def run(self):
        def get_valid_input(prompt, default):
            while True:
                user_input = input(prompt)
                if user_input in ["Y", "N", "y", "n"]:
                    return user_input
                if user_input == "":
                    return default

        while True:
            continue_chat = ""

            # MODIFY THESE STRINGS TO YOUR LIKING :)
            situation_narrative = "Imagine you are Cadet-Tiny talking to ???."
            role_instruction = "You are Cadet-Tiny, and you are talking to ???."

            self.chat(situation_narrative, role_instruction)
            continue_chat = get_valid_input(cf.purple("Start a new conversation with new setup? [Y/N]:"), "Y")
            if continue_chat in ["N", "n"]:
                break

        print(cf.blue("CT: See you!"))

    def chat(self, situation_narrative, role_instruction):
        print(cf.green(
            "Cadet-Tiny is running! Input [RESET] to reset the conversation history and [END] to end the conversation."))
        while True:
            user_input = input("You: ")
            if user_input == "[RESET]":
                self.reset_history()
                print(cf.green("[Conversation history cleared. Chat with Cadet-Tiny!]"))
                continue
            if user_input == "[END]":
                break
            response = self.generate(situation_narrative, role_instruction, user_input)
            print(cf.blue("CT: " + response))


def main():
    print(cf.bold | cf.blue("LOADING MODEL"))

    CadetTiny = CadetTinyAgent()
    CadetTiny.run()


if __name__ == '__main__':
    main()
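
To see what the model is actually fed without downloading any weights, the prompt assembly and history trimming can be reproduced in isolation. This is a minimal sketch that mirrors `set_input()` and `observe()` from the snippet above; the standalone function names `build_prompt` and `append_turn`, and the example narrative and instruction strings, are hypothetical.

```python
def build_prompt(situation_narrative, role_instruction, history):
    """Assemble the input string the same way set_input() does."""
    prompt = "dialogue: "
    if situation_narrative:
        prompt += situation_narrative
    if role_instruction:
        prompt += " <SEP> " + role_instruction
    return prompt + " <TURN> " + history


def append_turn(history, utterance, limit=400, drop=112):
    """Append one utterance, then trim the oldest characters if the history is too long."""
    history = history + utterance + " <TURN> "
    # As in observe(), the limit is measured in characters, not tokens.
    if len(history) > limit:
        history = history[drop:]
    return history


history = append_turn("", "Hi there!")
prompt = build_prompt(
    "Cadet-Tiny is chatting with a new user.",  # hypothetical situation narrative
    "You are Cadet-Tiny.",                      # hypothetical role instruction
    history,
)
print(prompt)
```

Each utterance is followed by a ` <TURN> ` separator, and the optional role instruction is joined to the situation narrative with ` <SEP> `. Because trimming slices raw characters, a long conversation can lose the start of an utterance mid-word; trimming on whole turns would be a possible refinement.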

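📄 Citations and special thanks

Special thanks to Hyunwoo Kim for discussing with me the best way to use the SODA dataset. If you have not yet looked into their work on SODA, Prosocial-Dialog, or COSMO, I recommend you do so! Also, please read the paper on SODA! The citation is below: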

@article{kim2022soda,
    title={SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization},
    author={Hyunwoo Kim and Jack Hessel and Liwei Jiang and Peter West and Ximing Lu and Youngjae Yu and Pei Zhou and Ronan Le Bras and Malihe Alikhani and Gunhee Kim and Maarten Sap and Yejin Choi},
    journal={ArXiv},
    year={2022},
    volume={abs/2212.10465}
}