refpydst-5p-referredstates-split-v1: An Open-source Model for Retrieving Few-shot In-context Examples in MultiWOZ

Refpydst 5p Referredstates Split V1

Developed by Brendan

A sentence transformer model initialized from sentence-transformers/all-mpnet-base-v2, specifically designed for few-shot context example retrieval in the MultiWOZ dataset

Text Embedding

Transformers

#Dialogue State Tracking #Few-shot Learning #Semantic Retrieval

Downloads 13

Release Time: 6/19/2023

Model Overview

This model is fine-tuned on a 5% few-shot split of the MultiWOZ dataset using a supervised contrastive loss. Its main purpose is retrieving in-context examples for dialogue state tracking.

Model Features

Few-shot Optimization

Trained on a 5% few-shot split of MultiWOZ, targeting settings where little labeled data is available

Domain-specific Optimization

Optimized for the MultiWOZ dialogue dataset, particularly suitable for dialogue state tracking tasks

Supervised Contrastive Learning

Fine-tuned with a supervised contrastive loss, which pulls the embeddings of matching examples together and pushes non-matching ones apart

Model Capabilities

Sentence embedding generation

Semantic similarity computation

Context example retrieval

Feature extraction

Use Cases

Dialogue Systems

Dialogue State Tracking

Retrieving relevant context examples in MultiWOZ dialogue systems

Intended to improve state tracking accuracy in few-shot scenarios

Information Retrieval

Similar Dialogue Retrieval

Retrieving semantically similar dialogue segments from conversation history

Supports example-based dialogue system development

🚀 Brendan/refpydst-5p-referredstates-split-v1

This model is designed for sentence similarity tasks. It was initialized with sentence-transformers/all-mpnet-base-v2 and fine-tuned on a 5% few-shot split of the MultiWOZ dataset using supervised contrastive loss. It serves as an in-context example retriever with the few-shot training set provided in the linked repository. More details can be found in the repo and the associated paper. To cite this model, refer to the citation in the linked GitHub repository README.

This README is partially auto-generated from sentence_transformers. Note that this model is not a general-purpose sentence encoder; it expects in-context examples from MultiWOZ to be formatted in a specific way. Check the linked repo for details.

It is a sentence-transformers model that maps sentences and paragraphs to a 768-dimensional dense vector space, useful for tasks like clustering or semantic search.

✨ Features

Initialized with sentence-transformers/all-mpnet-base-v2.
Fine-tuned on a 5% few-shot split of the MultiWOZ dataset.
Suitable for in-context example retrieval.

📦 Installation

Using this model becomes easy when you have sentence-transformers installed:

pip install -U sentence-transformers

💻 Usage Examples

Basic Usage

If you have sentence-transformers installed:

from sentence_transformers import SentenceTransformer
sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer('Brendan/refpydst-5p-referredstates-split-v1')
embeddings = model.encode(sentences)
print(embeddings)
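
For retrieval, the embeddings are usually compared with a similarity measure, and the closest items in a pool are kept as in-context examples. The following is a minimal sketch of that step using cosine similarity, which is a common choice but not confirmed here as the method used in the linked repository. The helper names `cosine_similarity`, `top_k_examples`, and `retrieve` are hypothetical, and the input strings are assumed to be already formatted the way the repository expects.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k_examples(query_vec, pool_vecs, k=2):
    """Return the indices of the k pool vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, v), i) for i, v in enumerate(pool_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]


def retrieve(query, pool, k=2, model_name="Brendan/refpydst-5p-referredstates-split-v1"):
    """Embed the query and the pool with the model, then return the top-k pool items.

    Hypothetical helper: inputs must already follow the formatting
    described in the linked repository.
    """
    # Imported lazily so the helpers above work without the library installed.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    vectors = model.encode([query] + list(pool)).tolist()
    best = top_k_examples(vectors[0], vectors[1:], k)
    return [pool[i] for i in best]
```

Calling `retrieve(query, pool, k=2)` downloads the model on first use and returns the two pool entries whose embeddings are closest to the query.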

Advanced Usage

Without sentence-transformers, pass the input through the transformer model and then apply mean pooling over the token embeddings:

from transformers import AutoTokenizer, AutoModel
import torch


# Mean pooling: take the attention mask into account for correct averaging
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)


# Sentences we want sentence embeddings for
sentences = ['This is an example sentence', 'Each sentence is converted']

# Load model from HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('Brendan/refpydst-5p-referredstates-split-v1')
model = AutoModel.from_pretrained('Brendan/refpydst-5p-referredstates-split-v1')

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling. In this case, mean pooling.
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])

print("Sentence embeddings:")
print(sentence_embeddings)
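
To see why the attention mask matters in the pooling step, here is a toy worked example in plain Python, with no model involved. It applies the same masked average to one "sentence" of three token vectors, the last of which is padding. The function name `masked_mean` and the numbers are made up for illustration.

```python
def masked_mean(token_embeddings, attention_mask):
    """Average the token vectors of one sentence, skipping positions where mask == 0."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    for vec, keep in zip(token_embeddings, attention_mask):
        for j in range(dim):
            totals[j] += vec[j] * keep
    # Same guard against division by zero as torch.clamp(..., min=1e-9)
    count = max(sum(attention_mask), 1e-9)
    return [t / count for t in totals]


# Two real tokens followed by one padding token (mask 0).
tokens = [[1.0, 3.0], [3.0, 5.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = masked_mean(tokens, mask)
print(pooled)  # [2.0, 4.0]
```

The padding vector does not affect the result. An unmasked average over all three positions would instead be dominated by the padding values.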

📚 Documentation

Evaluation Results

For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net

Training

The model was trained with the following parameters:

DataLoader: torch.utils.data.dataloader.DataLoader of length 2276 with parameters:

{'batch_size': 24, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}

Loss: sentence_transformers.losses.OnlineContrastiveLoss.OnlineContrastiveLoss
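
As a rough illustration of what a contrastive loss with hard-pair selection computes, here is a simplified plain-Python sketch. It is not the library's implementation, which works on tensors and handles edge cases this sketch ignores. The function name is hypothetical, and the margin of 0.5 is an assumption because the model card does not state the value used in training.

```python
def online_contrastive_loss(distances, labels, margin=0.5):
    """Contrastive loss over a batch of pairs, restricted to hard pairs.

    distances: one distance per pair (for example 1 - cosine similarity)
    labels:    1 for a positive pair, 0 for a negative pair
    margin:    assumed value; not stated in the model card
    """
    positives = [d for d, y in zip(distances, labels) if y == 1]
    negatives = [d for d, y in zip(distances, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("this sketch needs at least one positive and one negative pair")

    # Hard positives: positive pairs farther apart than the closest negative pair.
    hard_pos = [d for d in positives if d > min(negatives)]
    # Hard negatives: negative pairs closer together than the farthest positive pair.
    hard_neg = [d for d in negatives if d < max(positives)]

    pos_loss = sum(d ** 2 for d in hard_pos)
    neg_loss = sum(max(0.0, margin - d) ** 2 for d in hard_neg)
    return pos_loss + neg_loss


# Positives at distances 0.1 and 0.6, negatives at 0.2 and 0.9.
loss = online_contrastive_loss([0.1, 0.6, 0.2, 0.9], [1, 1, 0, 0])
```

Only the pairs that are on the wrong side of each other contribute: here the positive at 0.6 and the negative at 0.2. A batch whose positives are all closer than its negatives yields zero loss.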

Parameters of the fit() method:

{
    "epochs": 15,
    "evaluation_steps": 800,
    "evaluator": "refpydst.retriever.code.st_evaluator.RetrievalEvaluator",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
    "optimizer_params": {
        "lr": 2e-05
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 100,
    "weight_decay": 0.01
}
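
The numbers above imply a training length and a learning-rate curve that can be worked out directly. The sketch below assumes that `"steps_per_epoch": null` means one full pass over the DataLoader per epoch, and that `WarmupLinear` means a linear warmup followed by a linear decay to zero. The function names are hypothetical.

```python
def total_training_steps(batches_per_epoch, epochs):
    """With steps_per_epoch unset, one epoch is assumed to be one pass over the DataLoader."""
    return batches_per_epoch * epochs


def warmup_linear_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# DataLoader length 2276, 15 epochs, lr 2e-05, 100 warmup steps (values from the card).
total = total_training_steps(2276, 15)
print(total)  # 34140
for step in (0, 50, 100, total):
    print(step, warmup_linear_lr(step, 2e-05, 100, total))
```

Under these assumptions the learning rate climbs from zero to 2e-05 over the first 100 steps and then falls linearly to zero at step 34140. With a batch size of 24, each epoch covers at most 2276 × 24 = 54624 training pairs.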

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
)

📄 License

The original model card does not provide license information.

Citing & Authors

Refer to the citation in the linked GitHub repository README for citing this model.
