Cadet-Tiny Open-Source Dialogue Model - Ultra-Small Size for Easy Inference on Edge Devices

Cadet Tiny

Developed by ToddGoldfarb

Cadet-Tiny is an ultra-compact dialogue model trained on the SODA dataset, specifically designed for edge device inference, with a size of only about 2% of the Cosmo-3B model.

Dialogue System

Transformers

English | Open Source License: OpenRAIL | #Edge Device Dialogue #Ultra-Compact Model #Low-Resource Inference

Downloads 2,691

Release Time: 4/7/2023

Model Overview

Cadet-Tiny is a dialogue model fine-tuned from the t5-small pre-trained model, suitable for lightweight dialogue tasks on edge devices (such as Raspberry Pi).

Model Features

Lightweight Design

Optimized for low-resource devices; runs on devices with as little as 2GB of memory

Dialogue Memory

Supports dialogue history tracking and context understanding

Adjustable Parameters

Provides adjustable parameters like temperature to control generation diversity
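To illustrate why temperature controls generation diversity, here is a minimal sketch of temperature-scaled softmax in plain Python. It is not taken from the model's code: the function name `softmax_with_temperature` and the toy scores are assumptions made for this example. Lower temperatures concentrate probability on the top-scoring token, while higher temperatures spread it across the alternatives.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate tokens (illustration only).
toy_logits = [2.0, 1.0, 0.1]

for t in (0.5, 0.75, 1.5):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}:", [round(p, 3) for p in probs])
```

When sampling is enabled, a lower temperature therefore tends to give more predictable replies and a higher one more varied replies.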

Model Capabilities

Dialogue Generation

Context Understanding

Role-Playing Dialogue

Use Cases

Edge Device Applications

Raspberry Pi Chatbot

Deploy a lightweight dialogue assistant on resource-constrained devices

Can run smoothly on devices with 2GB of memory

Educational Applications

Programming Learning Assistant

A dialogue assistant to help students understand programming concepts

🚀 Cadet-Tiny

Cadet-Tiny is a very small conversational model inspired by Allen AI's Cosmo-XL. It is trained on the SODA dataset and designed for edge inference, even on a device as small as a Raspberry Pi with 2GB of RAM. Fine-tuned from Google's t5-small pretrained model, it is about 2% of the size of the Cosmo-3B model. This is the creator's first seq2seq NLP model, and they're excited to share it on HuggingFace!
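The "about 2%" figure can be checked with rough arithmetic. The sketch below assumes approximate, publicly reported parameter counts (about 60 million for t5-small and about 3 billion for Cosmo-3B); these are round numbers, not measurements of the actual checkpoints, and the helper names are hypothetical. It also estimates the memory needed for the weights alone in 32-bit floats, which helps explain why a 2GB device is plausible.

```python
# Approximate parameter counts (assumptions for illustration, not measured here).
T5_SMALL_PARAMS = 60_000_000      # t5-small: roughly 60M parameters
COSMO_3B_PARAMS = 3_000_000_000   # Cosmo-3B: roughly 3B parameters


def size_ratio(small_params, large_params):
    """Fraction of the larger model's parameter count."""
    return small_params / large_params


def weight_memory_mb(num_params, bytes_per_param=4):
    """Memory for the weights alone, in MB (10**6 bytes).

    Excludes activations, the tokenizer, and framework overhead.
    """
    return num_params * bytes_per_param / 1e6


ratio = size_ratio(T5_SMALL_PARAMS, COSMO_3B_PARAMS)
print(f"Size relative to Cosmo-3B: {ratio:.1%}")
print(f"fp32 weights: about {weight_memory_mb(T5_SMALL_PARAMS):.0f} MB")
```

Actual memory use at inference time will be higher than the weight estimate, since the runtime and intermediate tensors also need space.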

🚀 Quick Start

Use the following code snippet to start using Cadet-Tiny:

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import colorful as cf

cf.use_true_colors()
cf.use_style('monokai')


class CadetTinyAgent:
    def __init__(self):
        print(cf.bold | cf.purple("Waking up Cadet-Tiny..."))
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.tokenizer = AutoTokenizer.from_pretrained("t5-small", model_max_length=512)
        self.model = AutoModelForSeq2SeqLM.from_pretrained("ToddGoldfarb/Cadet-Tiny", low_cpu_mem_usage=True).to(self.device)
        self.conversation_history = ""

    def observe(self, observation):
        self.conversation_history = self.conversation_history + observation
        # Character-based truncation safety net: once the history exceeds 400 characters,
        # drop the oldest 112 characters so the model input stays short.
        if len(self.conversation_history) > 400:
            self.conversation_history = self.conversation_history[112:]

    def set_input(self, situation_narrative="", role_instruction=""):
        input_text = "dialogue: "

        if situation_narrative != "":
            input_text = input_text + situation_narrative

        if role_instruction != "":
            input_text = input_text + " <SEP> " + role_instruction

        input_text = input_text + " <TURN> " + self.conversation_history

        # Uncomment the line below to see what is fed to the model.
        # print(input_text)

        return input_text

    def generate(self, situation_narrative, role_instruction, user_response):
        user_response = user_response + " <TURN> "
        self.observe(user_response)

        input_text = self.set_input(situation_narrative, role_instruction)

        inputs = self.tokenizer([input_text], return_tensors="pt").to(self.device)
        
        # I encourage you to change the hyperparameters of the model! Start by trying to modify the temperature.
        outputs = self.model.generate(inputs["input_ids"], max_new_tokens=512, temperature=0.75, top_p=.95,
                                      do_sample=True)
        cadet_response = self.tokenizer.decode(outputs[0], skip_special_tokens=True, clean_up_tokenization_spaces=False)
        added_turn = cadet_response + " <TURN> "
        self.observe(added_turn)

        return cadet_response

    def reset_history(self):
        # Reset to an empty string (not a list): observe() concatenates strings onto the history.
        self.conversation_history = ""

    def run(self):
        def get_valid_input(prompt, default):
            while True:
                user_input = input(prompt)
                if user_input in ["Y", "N", "y", "n"]:
                    return user_input
                if user_input == "":
                    return default

        while True:
            continue_chat = ""

            # MODIFY THESE STRINGS TO YOUR LIKING :)
            situation_narrative = "Imagine you are Cadet-Tiny talking to ???."
            role_instruction = "You are Cadet-Tiny, and you are talking to ???."

            self.chat(situation_narrative, role_instruction)
            continue_chat = get_valid_input(cf.purple("Start a new conversation with new setup? [Y/N]:"), "Y")
            if continue_chat in ["N", "n"]:
                break

        print(cf.blue("CT: See you!"))

    def chat(self, situation_narrative, role_instruction):
        print(cf.green(
            "Cadet-Tiny is running! Input [RESET] to reset the conversation history and [END] to end the conversation."))
        while True:
            user_input = input("You: ")
            if user_input == "[RESET]":
                self.reset_history()
                print(cf.green("[Conversation history cleared. Chat with Cadet-Tiny!]"))
                continue
            if user_input == "[END]":
                break
            response = self.generate(situation_narrative, role_instruction, user_input)
            print(cf.blue("CT: " + response))


def main():
    print(cf.bold | cf.blue("LOADING MODEL"))

    CadetTiny = CadetTinyAgent()
    CadetTiny.run()


if __name__ == '__main__':
    main()
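To see what the model actually receives, the prompt format used by `set_input` and the history handling in `observe` can be reproduced without loading the model. The following is a minimal sketch: the function names `build_prompt` and `append_turn` and the example strings are hypothetical, chosen for this illustration. Note that the slice in `observe` operates on characters of the history string, not on tokens.

```python
def build_prompt(history, situation_narrative="", role_instruction=""):
    """Mirror of set_input: 'dialogue: ' + narrative + ' <SEP> ' + role + ' <TURN> ' + history."""
    text = "dialogue: "
    if situation_narrative:
        text += situation_narrative
    if role_instruction:
        text += " <SEP> " + role_instruction
    return text + " <TURN> " + history


def append_turn(history, utterance, max_chars=400, drop_chars=112):
    """Mirror of generate + observe: append the turn marker, then trim the oldest characters."""
    history = history + utterance + " <TURN> "
    if len(history) > max_chars:
        history = history[drop_chars:]
    return history


# Hypothetical setup strings for illustration.
history = append_turn("", "Hi there!")
prompt = build_prompt(
    history,
    situation_narrative="Cadet-Tiny is talking to a student.",
    role_instruction="You are Cadet-Tiny.",
)
print(prompt)
```

Because the trim cuts at a fixed character offset, it can split a word or a `<TURN>` marker; a variant that cuts at turn boundaries would be a reasonable refinement.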

📚 Documentation

Google Colab Link

The Google Colab file walks through training the model on the public SODA dataset from AI2: Google Colab File

📄 License

The model is under the OpenRAIL license.

📖 Citations and Special Thanks

Special thanks to Hyunwoo Kim for discussing the best way to use the SODA dataset. If you haven't explored their work on SODA, Prosocial-Dialog, or COSMO, it is well worth a look. The SODA paper is also recommended reading:

@article{kim2022soda,
    title={SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization},
    author={Hyunwoo Kim and Jack Hessel and Liwei Jiang and Peter West and Ximing Lu and Youngjae Yu and Pei Zhou and Ronan Le Bras and Malihe Alikhani and Gunhee Kim and Maarten Sap and Yejin Choi},
    journal={ArXiv},
    year={2022},
    volume={abs/2212.10465}
}

If you have any questions or comments on improvements, please contact the author at: tcgoldfarb@gmail.com
