🚀 Brendan/refpydst-1p-referredstates-split-v3
This model is designed for sentence similarity tasks: it was initialized from a pre-trained sentence encoder and fine-tuned on a 1% few-shot split of MultiWOZ to serve as an in-context example retriever.
🚀 Quick Start
This model was initialized with sentence-transformers/all-mpnet-base-v2 and then fine-tuned on a 1% few-shot split of the MultiWOZ dataset using a supervised contrastive loss. It is intended for use as an in-context example retriever over this few-shot training set, which is provided in the linked repository. More details are available in the repository and the paper linked within. To cite this model, please consult the citation in the linked GitHub repository README.
The remainder of this README is automatically generated by sentence_transformers and is accurate. Note that this model is not intended as a general-purpose sentence encoder: it expects in-context examples from MultiWOZ to be formatted in a particular way. See the linked repository for details.
This is a sentence-transformers model: it maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search.
✨ Features
- Initialized from sentence-transformers/all-mpnet-base-v2.
- Fine-tuned on a 1% few-shot split of the MultiWOZ dataset.
- Suitable for in-context example retrieval.
- Maps text to a 768-dimensional dense vector space.
📦 Installation
Using this model becomes easy when you have sentence-transformers installed:
```bash
pip install -U sentence-transformers
```
💻 Usage Examples
Basic Usage
```python
from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]

model = SentenceTransformer('Brendan/refpydst-1p-referredstates-split-v3')
embeddings = model.encode(sentences)
print(embeddings)
```
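Because this model is meant for retrieval rather than generic encoding, a typical next step is nearest-neighbor search over the encoded few-shot pool. The sketch below is a hedged illustration: the pool and query strings are placeholders, and real inputs must follow the MultiWOZ example formatting described in the linked repository.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer('Brendan/refpydst-1p-referredstates-split-v3')

# Placeholder pool; real entries must use the MultiWOZ formatting from the linked repo.
example_pool = ["formatted example 1", "formatted example 2", "formatted example 3"]
query = "formatted query turn"

pool_embeddings = model.encode(example_pool, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank pool examples by cosine similarity and take the top-k as in-context examples.
hits = util.semantic_search(query_embedding, pool_embeddings, top_k=2)[0]
for hit in hits:
    print(example_pool[hit['corpus_id']], hit['score'])
```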
Advanced Usage
Without sentence-transformers, you can use the model as follows: first pass your input through the transformer model, then apply the right pooling operation on top of the contextualized word embeddings.
```python
from transformers import AutoTokenizer, AutoModel
import torch


# Mean pooling: average token embeddings, taking the attention mask into account.
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # first element holds all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)


# Sentences we want sentence embeddings for
sentences = ['This is an example sentence', 'Each sentence is converted']

# Load model from the HuggingFace Hub
tokenizer = AutoTokenizer.from_pretrained('Brendan/refpydst-1p-referredstates-split-v3')
model = AutoModel.from_pretrained('Brendan/refpydst-1p-referredstates-split-v3')

# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)

# Perform pooling (mean pooling, matching the model's pooling config)
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])

print("Sentence embeddings:")
print(sentence_embeddings)
```
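As a quick sanity check (not part of the generated card), the manual pipeline above should produce the same embeddings as the sentence-transformers wrapper, since the module stack shown under "Full Model Architecture" applies only mean pooling with no extra normalization:

```python
import torch
from transformers import AutoTokenizer, AutoModel
from sentence_transformers import SentenceTransformer

MODEL = 'Brendan/refpydst-1p-referredstates-split-v3'
sentences = ['This is an example sentence', 'Each sentence is converted']

# Path 1: the sentence-transformers wrapper.
st_embeddings = torch.tensor(SentenceTransformer(MODEL).encode(sentences))

# Path 2: manual mean pooling, as in the block above.
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL)
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
with torch.no_grad():
    token_embeddings = model(**encoded)[0]
mask = encoded['attention_mask'].unsqueeze(-1).expand(token_embeddings.size()).float()
manual_embeddings = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)

# The two paths should agree up to numerical tolerance.
print(torch.allclose(st_embeddings, manual_embeddings, atol=1e-5))
```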
📚 Documentation
Evaluation Results
For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net
Training
The model was trained with the following parameters:
DataLoader:
`torch.utils.data.dataloader.DataLoader` of length 483 with parameters:
```
{'batch_size': 24, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
```
Loss:
`sentence_transformers.losses.OnlineContrastiveLoss.OnlineContrastiveLoss`
Parameters of the fit() method:
```json
{
    "epochs": 15,
    "evaluation_steps": 200,
    "evaluator": "refpydst.retriever.code.st_evaluator.RetrievalEvaluator",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
    "optimizer_params": {
        "lr": 2e-05
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 100,
    "weight_decay": 0.01
}
```
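For anyone reproducing a comparable run, these parameters translate roughly into the fit() call sketched below. This is an assumed reconstruction: train_examples is a placeholder, and the actual pair construction and the custom RetrievalEvaluator live in the linked refpydst repository.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer('sentence-transformers/all-mpnet-base-v2')

# Placeholder pairs: the real positive/negative examples are built from the
# 1% MultiWOZ few-shot split in the linked repository.
train_examples = [
    InputExample(texts=['formatted example A', 'formatted example B'], label=1),
    InputExample(texts=['formatted example A', 'formatted example C'], label=0),
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=24)
train_loss = losses.OnlineContrastiveLoss(model)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=15,
    warmup_steps=100,
    scheduler='WarmupLinear',
    optimizer_params={'lr': 2e-05},
    weight_decay=0.01,
    max_grad_norm=1,
    evaluation_steps=200,  # the original run also passed a custom RetrievalEvaluator
)
```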
Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
)
```
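One practical consequence of this stack is that inputs longer than 512 tokens are truncated before mean pooling. The max_seq_length attribute used below is standard sentence-transformers API; the expected value follows from the config above:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('Brendan/refpydst-1p-referredstates-split-v3')
print(model.max_seq_length)  # 512, per the Transformer config above
```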
Citing & Authors
For more information, please refer to the linked GitHub repository and the paper linked within.