MeaningBERT
MeaningBERT is an automatic and trainable metric designed to assess meaning preservation between sentences. It aims to produce scores that correlate highly with human judgments while passing automated sanity checks. For more details, refer to our publicly available article.
Quick Start
MeaningBERT can be used in multiple ways. You can either use it as a model for retraining or inference, or as a metric for evaluation.
Use as a Model
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("davebulaval/MeaningBERT")
model = AutoModelForSequenceClassification.from_pretrained("davebulaval/MeaningBERT")
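For retraining, a minimal fine-tuning sketch follows. It assumes the datasets library is installed and that you have your own document/simplification pairs annotated with continuous meaning-preservation ratings on a 0-100 scale; the example pair, rating, and training arguments below are purely illustrative, not part of the released training setup.

from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("davebulaval/MeaningBERT")
model = AutoModelForSequenceClassification.from_pretrained("davebulaval/MeaningBERT")

# Hypothetical training pair and rating (0 = meaning lost, 100 = meaning preserved)
train_data = Dataset.from_dict({
    "document": ["He wanted to make them pay."],
    "simplification": ["He wanted them to pay."],
    "label": [85.0],
})

def tokenize(batch):
    # Tokenize the document/simplification pairs together, as the model expects
    return tokenizer(batch["document"], batch["simplification"],
                     truncation=True, padding="max_length")

train_data = train_data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="meaningbert-finetuned", num_train_epochs=1),
    train_dataset=train_data,
)
trainer.train()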
Use as a Metric
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("davebulaval/MeaningBERT")
scorer = AutoModelForSequenceClassification.from_pretrained("davebulaval/MeaningBERT")
scorer.eval()
documents = ["He wanted to make them pay.", "This sandwich looks delicious.", "He wants to eat."]
simplifications = ["He wanted to make them pay.", "This sandwich looks delicious.",
"Whatever, whenever, this is a sentence."]
# Tokenize the document/simplification pairs together
tokenize_text = tokenizer(documents, simplifications, truncation=True, padding=True, return_tensors="pt")

with torch.no_grad():
    scores = scorer(**tokenize_text)

# Each prediction is a meaning-preservation score on a 0-100 scale
print(scores.logits.tolist())
Use the HuggingFace Metric Module
import evaluate
documents = ["He wanted to make them pay.", "This sandwich looks delicious.", "He wants to eat."]
simplifications = ["He wanted to make them pay.", "This sandwich looks delicious.",
"Whatever, whenever, this is a sentence."]
# Load MeaningBERT through the evaluate library
meaning_bert = evaluate.load("davebulaval/meaningbert")

print(meaning_bert.compute(references=documents, predictions=simplifications))
Features
- Objective Evaluation: MeaningBERT provides an objective way to assess meaning preservation between sentences, reducing the subjectivity associated with human judgment.
- Automated Sanity Checks: It includes two automated tests to ensure the metric's reliability and performance.
- Flexible Usage: Can be used as a model for retraining or inference, or as a metric for evaluation.
Technical Details
Sanity Check
Correlation with human judgment is a common way to evaluate meaning preservation metrics. However, collecting human judgments is subjective and expensive. As an alternative, we designed two automated tests:
Identical Sentences
This test evaluates meaning preservation between identical sentences. We calculate the ratio of times the metric rating is greater than or equal to a threshold value X ∈ [95, 99] to the total number of sentences. To account for computer floating-point inaccuracy, we round the ratings to the nearest integer and do not use a threshold value of 100%.
Unrelated Sentences
This test evaluates meaning preservation between a source sentence and an unrelated sentence generated by a large language model. We check that the metric rating is lower than or equal to a threshold value X ∈ [1, 5]. Again, we round the ratings to the nearest integer to account for floating-point inaccuracy and do not use a threshold value of 0%.
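Below is a minimal sketch of both sanity checks, assuming scorer and tokenizer are loaded and set to eval mode as in the Quick Start. The threshold values and the "unrelated" sentences here are only illustrative; in our evaluation the unrelated sentences are generated by a large language model.

import torch

def sanity_check_ratio(scorer, tokenizer, sources, targets, threshold, check="identical"):
    # Return the fraction of pairs that pass a sanity check.
    # check="identical": rounded score must be >= threshold (e.g. 95).
    # check="unrelated": rounded score must be <= threshold (e.g. 5).
    inputs = tokenizer(sources, targets, truncation=True, padding=True, return_tensors="pt")
    with torch.no_grad():
        scores = scorer(**inputs).logits.squeeze(-1)
    rounded = torch.round(scores)  # round to the nearest integer to absorb floating-point noise
    if check == "identical":
        passed = (rounded >= threshold).sum().item()
    else:
        passed = (rounded <= threshold).sum().item()
    return passed / len(sources)

# Identical-sentences check: scores between a sentence and itself should be near 100
sentences = ["He wanted to make them pay.", "This sandwich looks delicious."]
print(sanity_check_ratio(scorer, tokenizer, sentences, sentences, threshold=95, check="identical"))

# Unrelated-sentences check: scores against unrelated sentences should be near 0
unrelated = ["Whatever, whenever, this is a sentence.", "The moon orbits in silence."]
print(sanity_check_ratio(scorer, tokenizer, sentences, unrelated, threshold=5, check="unrelated"))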
Documentation
License
MeaningBERT is MIT licensed, as found in the LICENSE file.
Contributing
We welcome user input, whether it concerns bugs found in the library or feature proposals! Make sure to have a look at our contributing guidelines for more details on this matter.
Cite
Use the following citation to cite MeaningBERT:
@ARTICLE{10.3389/frai.2023.1223924,
AUTHOR={Beauchemin, David and Saggion, Horacio and Khoury, Richard},
TITLE={MeaningBERT: assessing meaning preservation between sentences},
JOURNAL={Frontiers in Artificial Intelligence},
VOLUME={6},
YEAR={2023},
URL={https://www.frontiersin.org/articles/10.3389/frai.2023.1223924},
DOI={10.3389/frai.2023.1223924},
ISSN={2624-8212},
}