🚀 Response Quality Classifier Large
This classification model is designed to evaluate the relevance and specificity of the last message in a dialogue context.
🚀 Quick Start
This classifier is based on sberbank-ai/ruRoberta-large and assesses the relevance and specificity of the last message within a dialogue context.
✨ Features
- Relevance Assessment: Determines whether the last message in the dialogue is relevant to the entire dialogue context.
- Specificity Evaluation: Checks whether the last message is interesting and encourages continuation of the dialogue.
📦 Installation
The model is loaded directly from the Hugging Face Hub; the only requirements are PyTorch and the `transformers` library (`pip install torch transformers`).
💻 Usage Examples
Basic Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('tinkoff-ai/response-quality-classifier-large')
model = AutoModelForSequenceClassification.from_pretrained('tinkoff-ai/response-quality-classifier-large')

# Context messages are separated with [SEP]; the response to be scored follows [RESPONSE_TOKEN].
inputs = tokenizer('[CLS]привет[SEP]привет![SEP]как дела?[RESPONSE_TOKEN]норм, у тя как?',
                   max_length=128, add_special_tokens=False, return_tensors='pt')
with torch.inference_mode():
    logits = model(**inputs).logits
    probas = torch.sigmoid(logits)[0].cpu().detach().numpy()
relevance, specificity = probas
```
Advanced Usage
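The original card does not include an advanced example, so the sketch below is only an illustration: it scores several candidate responses against the same context in one batch and keeps the candidate with the highest combined score. The batching, the padding settings, and the choice to average the two scores are assumptions, not part of the released pipeline.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('tinkoff-ai/response-quality-classifier-large')
model = AutoModelForSequenceClassification.from_pretrained('tinkoff-ai/response-quality-classifier-large')
model.eval()

context = '[CLS]привет[SEP]привет![SEP]как дела?'
candidates = ['норм', 'норм, у тя как?', 'супер, вот только проснулся, у тебя как?']

# Build one input string per candidate and score them all in a single batch.
texts = [context + '[RESPONSE_TOKEN]' + candidate for candidate in candidates]
inputs = tokenizer(texts, max_length=128, add_special_tokens=False,
                   padding=True, truncation=True, return_tensors='pt')

with torch.inference_mode():
    probas = torch.sigmoid(model(**inputs).logits)  # shape: (num_candidates, 2)

# Rank candidates by the mean of relevance and specificity (an arbitrary aggregation).
scores = probas.mean(dim=1)
best = candidates[int(scores.argmax())]
print(best, scores.tolist())
```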
📚 Documentation
The labels are explained as follows:
- `relevance`: indicates whether the last message in the dialogue is relevant in the context of the full dialogue.
- `specificity`: indicates whether the last message in the dialogue is interesting and encourages continuation of the dialogue.
The model was pretrained on a large corpus of dialogue data in an unsupervised manner: it was trained to predict whether the last response comes from the real dialogue or was randomly sampled from another dialogue. It was then fine-tuned on manually labelled examples (the dataset will be posted soon).
The model was trained with three messages in the context and one response. Each message was tokenized separately with `max_length = 32`.
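The exact preprocessing code is not included in the card, but based on the format of the usage example above, a helper along these lines (the function name is mine) could assemble the input string from the last three context messages and the candidate response:

```python
def build_classifier_input(context, response):
    """Join the last three context messages and the response into the
    [CLS]...[SEP]...[RESPONSE_TOKEN]... format used in the usage example."""
    last_messages = context[-3:]
    return '[CLS]' + '[SEP]'.join(last_messages) + '[RESPONSE_TOKEN]' + response

text = build_classifier_input(['привет', 'привет!', 'как дела?'], 'норм, у тя как?')
# -> '[CLS]привет[SEP]привет![SEP]как дела?[RESPONSE_TOKEN]норм, у тя как?'
```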
The performance of the model on the validation split (the dataset will be posted soon), using the best thresholds selected on the validation samples, is as follows:
| Property    | Threshold | F0.5 | ROC AUC |
|-------------|-----------|------|---------|
| Relevance   | 0.59      | 0.86 | 0.83    |
| Specificity | 0.61      | 0.85 | 0.86    |
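As an illustration (not part of the original card), the thresholds from the table can be applied to the sigmoid scores produced by the Basic Usage example to obtain binary decisions; the function name is mine:

```python
RELEVANCE_THRESHOLD = 0.59    # best validation threshold for relevance (see table)
SPECIFICITY_THRESHOLD = 0.61  # best validation threshold for specificity (see table)

def binarize(probas):
    """Convert the (relevance, specificity) sigmoid scores into binary labels."""
    relevance, specificity = probas
    return {
        'relevant': bool(relevance >= RELEVANCE_THRESHOLD),
        'specific': bool(specificity >= SPECIFICITY_THRESHOLD),
    }

# e.g. binarize(probas) with `probas` from the Basic Usage example
```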
🔧 Technical Details
The model's training process involves unsupervised pretraining on a large dialogue corpus followed by finetuning on manually labelled examples. Tokenization uses `max_length = 32` for each message.
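To make the per-message tokenization concrete, here is a minimal sketch of how a single message could be truncated to 32 tokens before the input string is assembled. The original card does not describe this step in code, so the helper below is only an illustration, not the released preprocessing pipeline.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('tinkoff-ai/response-quality-classifier-large')

def truncate_message(message: str, max_tokens: int = 32) -> str:
    """Keep at most `max_tokens` tokens of a single dialogue message.

    The message is tokenized without special tokens, truncated, and decoded
    back to text; the decoded text may differ slightly in whitespace.
    """
    ids = tokenizer(message, add_special_tokens=False,
                    truncation=True, max_length=max_tokens)['input_ids']
    return tokenizer.decode(ids)

print(truncate_message('как дела?'))
```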
📄 License
The model is licensed under the MIT license.
Additional Information
Here are some example dialogs used to demonstrate the model:
- Dialog example 1: `[CLS]привет[SEP]привет![SEP]как дела?[RESPONSE_TOKEN]супер, вот только проснулся, у тебя как?`
- Dialog example 2: `[CLS]привет[SEP]привет![SEP]как дела?[RESPONSE_TOKEN]норм`
- Dialog example 3: `[CLS]привет[SEP]привет![SEP]как дела?[RESPONSE_TOKEN]норм, у тя как?`
You can easily interact with this model through the app.
The work was done during an internship at Tinkoff by egoriyaa, mentored by solemn-leader.