Intent Classifier: an open-source intent classification model for quickly categorizing customer questions

Intent Classifier

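Developed by Serj

An intent classification model fine-tuned from Flan-T5-Base that assigns customer questions to predefined categories.

Text classification

Transformers

#DynamicIntentClassification #MultiDomainAdaptation #FewShotFineTuning

Downloads: 364

Released: 4/2/2024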

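Model overview

The model is a T5 model fine-tuned on synthetic data. It dynamically classifies customer requests into predefined topic categories and is intended for intent recognition in customer service settings.

Model features

Dynamic classification

All candidate categories are listed in the prompt, so the category set is supplied at inference time.

Multiple scenarios

Supports intent classification for different businesses, such as a pizza restaurant or an online bank.

Few-shot fine-tuning

Fine-tuning on a small number of examples (10-20 per class) is enough to reach good performance.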
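
The dynamic classification feature comes down to rendering the candidate class names into the prompt for each request. The sketch below assumes the `OPTIONS:` layout used in the usage example on this page (one class per line, each preceded by a space). `format_options` is a hypothetical helper for illustration and is not part of the model's repository.

```python
def format_options(class_names):
    """Render class names as an OPTIONS block, one class per line.

    The layout mirrors the options string in the usage example;
    this helper itself is an illustrative assumption.
    """
    cleaned = [name.strip() for name in class_names if name.strip()]
    if not cleaned:
        raise ValueError("at least one class name is required")
    return "OPTIONS:\n" + "".join(f" {name}\n" for name in cleaned)


# Different businesses can pass different class lists to the same model.
pizza_options = format_options(["menu question", "order status", "refund"])
bank_options = format_options(["open account", "card blocked"])
print(pizza_options)
```

Because the classes live in the prompt rather than in an output layer, switching domains only requires passing a different list.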

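Model capabilities

Customer intent recognition

Topic classification

Text classification

Customer service automation

Use cases

Customer service

Refund request handling

Automatically identifies the intent behind a customer's refund request.

Result: classified into the 'refund request' category.

Subscription management

Identifies customer requests to cancel or resume a subscription.

Result: classified into the 'cancel subscription' or 'resume subscription' category.

Online services

Banking service inquiries

Classifies customer inquiries about online banking services.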

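🚀 Intent classification model

This model classifies the intent of customer requests. It is a fine-tuned T5 model that uses prompts built from synthetic data to dynamically assign each request to one of a set of predefined categories.

🚀 Quick start

Usage example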

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer


def build_prompt(text, prompt="", company_name="", company_specific=""):
    # Built-in descriptions for the two example companies
    if company_name == "Pizza Mia":
        company_specific = "This company is a pizzeria place."
    if company_name == "Online Banking":
        company_specific = "This company is an online banking."

    # Note: the template appends a period after the customer text
    return f"Company name: {company_name} is doing: {company_specific}\nCustomer: {text}.\nEND MESSAGE\nChoose one topic that matches customer's issue.\n{prompt}\nClass name: "


class IntentClassifier:
    def __init__(self, model_name="serj/intent-classifier", device=None):
        # Use the GPU when available, otherwise fall back to the CPU
        self.device = device or ("cuda" if torch.cuda.is_available() else "cpu")
        self.model = T5ForConditionalGeneration.from_pretrained(model_name).to(self.device)
        self.tokenizer = T5Tokenizer.from_pretrained(model_name)

    def predict(self, text, prompt_options, company_name, company_portion) -> str:
        input_text = build_prompt(text, prompt_options, company_name, company_portion)
        # Tokenize the prompt, truncating it to 512 tokens
        input_ids = self.tokenizer.encode(input_text, return_tensors="pt", max_length=512, truncation=True).to(self.device)

        # Generate the output
        output = self.model.generate(input_ids)

        # Decode the output tokens into the predicted class name
        decoded_output = self.tokenizer.decode(output[0], skip_special_tokens=True)

        return decoded_output


m = IntentClassifier("serj/intent-classifier")
print(m.predict("Hey, after recent changes, I want to cancel subscription, please help.",
                "OPTIONS:\n refund\n cancel subscription\n damaged item\n return item\n", "Company",
                "Products and subscriptions"))
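
Since `predict` returns free-form generated text, downstream code may want to confirm that the result is one of the classes it offered. The source does not describe the exact output format, so the following is only a sketch under that assumption. `match_label` and the `"unknown"` fallback are hypothetical names.

```python
def match_label(generated, class_names, fallback="unknown"):
    """Map generated text onto one of the offered class names.

    Comparison ignores case, surrounding whitespace and a trailing
    period. Returns `fallback` when nothing matches.
    """
    normalized = generated.strip().rstrip(".").strip().lower()
    for name in class_names:
        if name.strip().lower() == normalized:
            return name
    return fallback


options = ["refund", "cancel subscription", "damaged item", "return item"]
print(match_label(" Cancel subscription ", options))
print(match_label("something else entirely", options))
```

An explicit fallback keeps unexpected generations from being routed as if they were valid intents.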

Prompt structure

Topic %% Customer: text. END MESSAGE OPTIONS: each class separated by % Choose one topic that matches customer's issue. Class name:

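The customer text must end with a period; otherwise the model produces odd results. This is a consequence of how the model was trained.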
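
When a prompt is assembled by hand rather than through `build_prompt` (which already appends a period after the customer text), the period rule can be enforced with a small normalization step. This is a minimal sketch; `ensure_trailing_period` is a hypothetical helper, and collapsing repeated periods into one is an assumption rather than a documented requirement.

```python
def ensure_trailing_period(text):
    """Return the text ending in exactly one period.

    Trailing whitespace and any run of trailing periods are removed
    first, then a single period is appended.
    """
    stripped = text.rstrip().rstrip(".").rstrip()
    return stripped + "."


print(ensure_trailing_period("I want to cancel my subscription"))
print(ensure_trailing_period("Please help...  "))
```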

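✨ Key features

The model is a fine-tuned Flan-T5-Base that classifies the intent of customer requests using prompts built from synthetic data, dynamically assigning each request to one of a set of predefined categories.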

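📚 Detailed documentation

Model details

Model description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. The model card was generated automatically.

Developer: Serj Smorodinsky
Model type: Flan-T5-Base
Language(s) (NLP): [More information needed]
License: [More information needed]
Fine-tuned from: Flan-T5-Base

Model sources

Repository: https://github.com/SerjSmor/intent_classification

Training details

Training data

Training data repository: https://github.com/SerjSmor/intent_classification
An HF dataset will be added in the future.

Training procedure

Training script: https://github.com/SerjSmor/intent_classification/blob/main/t5_generator_trainer.py
Training uses the HF Trainer: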

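from transformers import Trainer, TrainingArguments

# model, tokenizer, train_dataset, val_dataset, epochs and batch_size
# must be defined beforehand; they are not shown in this excerpt.
training_args = TrainingArguments(
    output_dir='./results',
    num_train_epochs=epochs,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir='./logs',
    logging_steps=10,
    # Evaluate after every epoch; recent transformers releases
    # name this argument eval_strategy.
    evaluation_strategy="epoch"
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    tokenizer=tokenizer,
    # compute_metrics=compute_metrics
)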
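
The `compute_metrics` hook is commented out in the configuration above, so the source does not say which metric, if any, was tracked. For a generative classifier, one simple option is exact-match accuracy over decoded strings. The sketch below works on plain lists of predicted and reference class names; `exact_match_accuracy` is a hypothetical helper, and wiring it into the `Trainer` (decoding token ids first) is left out.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference class name.

    Comparison ignores case and surrounding whitespace.
    Returns 0.0 for empty input.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        pred.strip().lower() == ref.strip().lower()
        for pred, ref in zip(predictions, references)
    )
    return hits / len(references)


preds = ["refund", "cancel subscription", "return item", "refund"]
refs = ["refund", "cancel subscription", "damaged item", "refund"]
print(exact_match_accuracy(preds, refs))
```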