# ComVE-gpt2
A model fine-tuned on the Commonsense Validation and Explanation (ComVE) dataset that generates reasons why statements go against commonsense.
## 🚀 Quick Start
You can use this model directly to generate reasons why a given statement goes against commonsense using the `generate.sh` script.
## ⚠️ Important Note
Make sure you are using version 2.4.1 of the `transformers` package. Newer versions have issues in text generation that can cause the model to repeat the last generated token over and over.
## ✨ Features
- Fine-tuned on the Commonsense Validation and Explanation (ComVE) dataset using a causal language modeling (CLM) objective.
- Capable of generating a reason why a given natural language statement is against commonsense.
## 📦 Installation
No dedicated installation steps are required beyond the `transformers` package itself; given the note above, install the pinned version, e.g. `pip install transformers==2.4.1`.
## 💻 Usage Examples
### Basic Usage
You can use the raw model for text generation to produce reasons why natural language statements go against commonsense. For example, run the provided script:
```bash
bash generate.sh
```
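If you prefer to call the model from Python instead of the shell script, the following minimal sketch shows one way to do it. It assumes the checkpoint is available under the model id `aliosm/ComVE-gpt2` (adjust to wherever the weights are stored), that `transformers` 2.4.1 is installed as advised above, and that the `<|continue|>` separator described under Training procedure below marks where the generated reason starts; the exact prompting used by `generate.sh` may differ.

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_name = "aliosm/ComVE-gpt2"  # assumed model id; point this at wherever the checkpoint lives
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
model.eval()

# An against-commonsense statement followed by the separator used during fine-tuning;
# everything generated after the separator is taken as the reason.
statement = "He put an elephant into the fridge."
prompt = statement + " <|continue|> "
input_ids = tokenizer.encode(prompt, return_tensors="pt")

output_ids = model.generate(
    input_ids,
    max_length=64,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
reason = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(reason)
```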
## 📚 Documentation
### Model description
This is a fine-tuned model on the Commonsense Validation and Explanation (ComVE) dataset introduced in SemEval-2020 Task 4, trained with a causal language modeling (CLM) objective. The model can generate a reason why a given natural language statement goes against commonsense.
### Intended uses & limitations
- Intended uses: Use the raw model for text generation to produce reasons why natural language statements go against commonsense.
- Limitations and bias: The model is biased toward simply negating the input sentence rather than producing a factual reason.
### Training data
The model is initialized from the `gpt2` model and fine-tuned on the ComVE dataset, which contains 10K against-commonsense sentences, each paired with three reference reasons.
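Conceptually, each training record pairs one against-commonsense statement with its three reference reasons, along these lines (the field names and reasons below are illustrative, not the dataset's actual schema):

```python
example = {
    "statement": "He put an elephant into the fridge.",   # against-commonsense statement
    "reference_reasons": [                                 # three human-written reasons
        "An elephant is much bigger than a fridge.",
        "A fridge is too small to hold an elephant.",
        "Elephants cannot fit inside fridges.",
    ],
}
```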
### Training procedure
Each natural language statement that goes against commonsense is concatenated with its reference reason using `<|continue|>` as a separator, and the model is then fine-tuned with the CLM objective. Training was done on an Nvidia Tesla P100 GPU from the Google Colab platform with a learning rate of 5e-5, 5 epochs, a maximum sequence length of 128, and a batch size of 64.
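The following sketch illustrates how a single training example could be assembled under that description; it is not the authors' exact preprocessing code, and treating `<|continue|>` as an added special token is an assumption.

```python
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
# Treating <|continue|> as an added special token is an assumption; the authors
# may simply have inserted it as plain text between statement and reason.
tokenizer.add_special_tokens({"additional_special_tokens": ["<|continue|>"]})

statement = "He put an elephant into the fridge."      # against-commonsense statement
reason = "An elephant is much bigger than a fridge."   # one of its three reference reasons

# Statement and reason are concatenated with the separator; under the CLM objective
# the labels are the input ids themselves, so the model learns to continue the
# statement with a plausible reason after <|continue|>.
text = statement + " <|continue|> " + reason + tokenizer.eos_token
input_ids = tokenizer.encode(text, max_length=128)      # max sequence length used in training
```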
### Eval results
The model achieves BLEU scores of 14.0547 and 13.6534 on the SemEval-2020 Task 4 (Commonsense Validation and Explanation) development and test sets, respectively.
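For reference, multi-reference BLEU against the three reasons per example can be approximated with NLTK as below; the official SemEval-2020 Task 4 scorer may tokenize and smooth differently, so this is only a rough recipe, not a way to reproduce the exact numbers above.

```python
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# Illustrative data: one generated reason scored against its three reference reasons.
generated_reasons = ["an elephant is too big to fit in a fridge"]
reference_sets = [[
    "an elephant is much bigger than a fridge",
    "a fridge is too small to hold an elephant",
    "elephants cannot fit inside fridges",
]]

hypotheses = [gen.split() for gen in generated_reasons]
references = [[ref.split() for ref in refs] for refs in reference_sets]
print(corpus_bleu(references, hypotheses, smoothing_function=SmoothingFunction().method1))
```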
### BibTeX entry and citation info
```bibtex
@article{fadel2020justers,
  title={JUSTers at SemEval-2020 Task 4: Evaluating Transformer Models Against Commonsense Validation and Explanation},
  author={Fadel, Ali and Al-Ayyoub, Mahmoud and Cambria, Erik},
  year={2020}
}
```
## 🔧 Technical Details
The model is fine-tuned on the ComVE dataset with a CLM objective, using `gpt2` as the base model; the hardware and hyperparameters are listed under Training procedure above.
## 📄 License
This project is licensed under the MIT license.