🚀 Brendan/refpydst-1p-referredstates-split-v3
This model is designed for sentence similarity tasks: it was initialized from a pre-trained sentence encoder and fine-tuned on a 1% few-shot split of MultiWOZ to serve as an in-context example retriever.
🚀 Quick Start
This model was initialized with sentence-transformers/all-mpnet-base-v2 and then fine-tuned on a 1% few-shot split of the MultiWOZ dataset using a supervised contrastive loss. It is intended for use as an in-context example retriever over this few-shot training set, which is provided in the linked repository. More details are available in the repository and the paper linked within. To cite this model, please consult the citation in the linked GitHub repository README.
The remainder of this README is automatically generated by sentence_transformers and is accurate. Note that this model is not intended as a general-purpose sentence encoder: it expects in-context examples from MultiWOZ to be formatted in a particular way. See the linked repository for details.
This is a sentence-transformers model: it maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search.
✨ Features
- Initialized from sentence-transformers/all-mpnet-base-v2.
- Fine-tuned on a 1% few-shot split of the MultiWOZ dataset.
- Suitable for in-context example retrieval.
- Maps text to a 768-dimensional dense vector space.
📦 Installation
Using this model becomes easy when you have sentence-transformers installed:
```bash
pip install -U sentence-transformers
```
💻 Usage Examples
Basic Usage
```python
from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer('Brendan/refpydst-1p-referredstates-split-v3')
embeddings = model.encode(sentences)
print(embeddings)
```
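Because this model is meant for retrieval rather than generic encoding, a typical next step is nearest-neighbor search over the encoded few-shot pool. The sketch below is a hedged illustration: the pool and query strings are placeholders, and real inputs must follow the MultiWOZ example formatting described in the linked repository.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('Brendan/refpydst-1p-referredstates-split-v3')

# Placeholder pool; real entries must use the MultiWOZ formatting from the linked repo.
example_pool = ["formatted example 1", "formatted example 2", "formatted example 3"]
query = "formatted query turn"

pool_embeddings = model.encode(example_pool, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank pool examples by cosine similarity and take the top-k as in-context examples.
hits = util.semantic_search(query_embedding, pool_embeddings, top_k=2)[0]
for hit in hits:
    print(example_pool[hit['corpus_id']], hit['score'])
```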
Advanced Usage
Without sentence-transformers, you can use the model as follows: first pass your input through the transformer model, then apply the right pooling operation on top of the contextualized word embeddings.
```python
from transformers import AutoTokenizer, AutoModel
import torch


# Mean pooling: average token embeddings, taking the attention mask into account.
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # first element holds all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)


# Sentences we want sentence embeddings for
sentences = ['This is an example sentence', 'Each sentence is converted']

# Load model from the HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('Brendan/refpydst-1p-referredstates-split-v3')
model = AutoModel.from_pretrained('Brendan/refpydst-1p-referredstates-split-v3')

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling (mean pooling, matching the model's pooling config)
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])

print("Sentence embeddings:")
print(sentence_embeddings)
```
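As a quick sanity check (not part of the generated card), the manual pipeline above should produce the same embeddings as the sentence-transformers wrapper, since the module stack shown under "Full Model Architecture" applies only mean pooling with no extra normalization:

```python
import torch
from transformers import AutoTokenizer, AutoModel
from sentence_transformers import SentenceTransformer

MODEL = 'Brendan/refpydst-1p-referredstates-split-v3'
sentences = ['This is an example sentence', 'Each sentence is converted']

# Path 1: the sentence-transformers wrapper.
st_embeddings = torch.tensor(SentenceTransformer(MODEL).encode(sentences))

# Path 2: manual mean pooling, as in the block above.
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL)
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
with torch.no_grad():
    token_embeddings = model(**encoded)[0]
mask = encoded['attention_mask'].unsqueeze(-1).expand(token_embeddings.size()).float()
manual_embeddings = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

# The two paths should agree up to numerical tolerance.
print(torch.allclose(st_embeddings, manual_embeddings, atol=1e-5))
```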
📚 Documentation
Evaluation Results
For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net
Training
The model was trained with the following parameters:
DataLoader:
`torch.utils.data.dataloader.DataLoader` of length 483 with parameters:
```
{'batch_size': 24, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
```
Loss:
`sentence_transformers.losses.OnlineContrastiveLoss.OnlineContrastiveLoss`
Parameters of the fit() method:
```json
{
    "epochs": 15,
    "evaluation_steps": 200,
    "evaluator": "refpydst.retriever.code.st_evaluator.RetrievalEvaluator",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
    "optimizer_params": {
        "lr": 2e-05
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 100,
    "weight_decay": 0.01
}
```
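For anyone reproducing a comparable run, these parameters translate roughly into the fit() call sketched below. This is an assumed reconstruction: train_examples is a placeholder, and the actual pair construction and the custom RetrievalEvaluator live in the linked refpydst repository.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer('sentence-transformers/all-mpnet-base-v2')

# Placeholder pairs: the real positive/negative examples are built from the
# 1% MultiWOZ few-shot split in the linked repository.
train_examples = [
    InputExample(texts=['formatted example A', 'formatted example B'], label=1),
    InputExample(texts=['formatted example A', 'formatted example C'], label=0),
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=24)
train_loss = losses.OnlineContrastiveLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=15,
    warmup_steps=100,
    scheduler='WarmupLinear',
    optimizer_params={'lr': 2e-05},
    weight_decay=0.01,
    max_grad_norm=1,
    evaluation_steps=200,  # the original run also passed a custom RetrievalEvaluator
)
```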
Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
)
```
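One practical consequence of this stack is that inputs longer than 512 tokens are truncated before mean pooling. The max_seq_length attribute used below is standard sentence-transformers API; the expected value follows from the config above:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('Brendan/refpydst-1p-referredstates-split-v3')
print(model.max_seq_length)  # 512, per the Transformer config above
```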
Citing & Authors
For more information, please refer to the linked GitHub repository and the paper linked within.