🚀 Polite Bert
A BERT model trained to classify sentences on a scale of politeness.
🚀 Quick Start
The widget on the model page provides a few examples to quickly show how the model works. You can also enter your own sentences to see their politeness classifications.
✨ Features
- Politeness Classification: Classifies a given sentence into one of four politeness levels: Not Polite, Neutral, Somewhat Polite, and Polite.
- Fine-Tuned BERT: A pre-trained BERT model fine-tuned on annotated politeness-level data.
📦 Installation
The original card does not list specific installation steps.
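Assuming a standard Hugging Face setup (not specified on the original card), installing `transformers` with a PyTorch backend should be enough to run the model:

```bash
pip install transformers torch
```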
💻 Usage Examples
Basic Usage
You can use the widget on the model page to test the model; a scripted alternative using the `transformers` pipeline is sketched after the examples. Here are some example sentences:
- Example 1: "I am good. Just got back from vacation"
- Example 2: "I am doing good, I appreciate you asking. I just got back from vacation."
- Example 3: "I am doing good, thank you for asking. I just got back from vacation, and loved it."
- Example 4: "I am doing good, but why do fucking you care? I just got back from vacation."
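Beyond the widget, the model can presumably be loaded with the standard `transformers` text-classification pipeline. A minimal sketch follows; the repository ID below is a placeholder assumption, so substitute the actual model ID from the Hub page.

```python
from transformers import pipeline

# Placeholder repo ID - replace with the actual Polite Bert model ID from the Hub.
model_id = "username/polite-bert"

classifier = pipeline("text-classification", model=model_id)

examples = [
    "I am good. Just got back from vacation",
    "I am doing good, I appreciate you asking. I just got back from vacation.",
    "I am doing good, thank you for asking. I just got back from vacation, and loved it.",
]

for sentence in examples:
    result = classifier(sentence)[0]
    print(f"{result['label']:>16}  ({result['score']:.2f})  {sentence}")
```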
📚 Documentation
Model Details
Polite Bert, as the name implies, is a BERT model trained to classify a given sentence on a scale of politeness:
- Not Polite (aka Rude or Impolite)
- Neutral
- Somewhat Polite
- Polite
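If you load the model directly rather than through the pipeline, the four classes are normally exposed through the config's `id2label` mapping. The index ordering shown below is an assumption; check the actual config for the real mapping.

```python
from transformers import AutoConfig

# Placeholder repo ID - replace with the actual Polite Bert model ID.
config = AutoConfig.from_pretrained("username/polite-bert")

# Assumed ordering of the four politeness levels; verify against config.id2label.
assumed_labels = {0: "Not Polite", 1: "Neutral", 2: "Somewhat Polite", 3: "Polite"}
print(config.id2label)
```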
Training
Polite Bert was trained by fine-tuning a BERT model on annotated politeness-level data.
The model was trained using supervised fine-tuning (SFT) for 4 epochs, with a batch size of 16 and a max sequence length of 128 tokens.
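The training script itself is not published. As a hedged illustration only, a comparable run could be set up with the `transformers` Trainer using the stated hyperparameters (4 epochs, batch size 16, max length 128); the base checkpoint, dataset contents, and label ordering below are assumptions.

```python
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base_model = "bert-base-uncased"  # assumption: the exact BERT checkpoint is not stated on the card
labels = ["Not Polite", "Neutral", "Somewhat Polite", "Polite"]

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=len(labels))

# Tiny stand-in for the 2000 annotated sentences described in the Data Details section.
train_data = Dataset.from_dict({
    "text": ["Thank you so much for your help!", "Whatever, just do it."],
    "label": [3, 0],
})

def tokenize(batch):
    # Max sequence length of 128 tokens, as stated on the card.
    return tokenizer(batch["text"], truncation=True, max_length=128)

train_data = train_data.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="polite-bert",
    num_train_epochs=4,              # 4 epochs, as stated
    per_device_train_batch_size=16,  # batch size 16, as stated
)

Trainer(model=model, args=args, train_dataset=train_data, tokenizer=tokenizer).train()
```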
Data Details
The training data consisted of 2000 annotated sentences. This training data was composed of the following:
Manually annotated data:
- 250 sentences sampled from the EUROPARL dataset, specifically from the English side of the PT-EN data.
- 250 sentences sampled from the SIMMC2.0 dataset, from either domain (Fashion or Furniture) and either speaker (System or User).
- 250 sentences sampled from the Philosophy and Politics data of the StackExchange dataset.
- 250 sentences sampled from a collection of hotel review replies from Trip Advisor.
Automatically annotated data:
- 1000 sentences from the 4Chan Pol dataset. Specifically, we only considered sentences annotated with TOXICITY > 0.85, SEVERE_TOXICITY > 0.85, and INSULT > 0.5.
While we manually labelled the first 1000 sentences, the 1000 sentences from 4ChanPol were automatically set to Not Polite.
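The preprocessing script is not published; as a rough illustration, the toxicity-threshold rule for the automatically annotated portion could look like the sketch below. The column names follow the Perspective-style attributes mentioned above, and the data frame layout is an assumption.

```python
import pandas as pd

# Hypothetical layout: one row per sentence with Perspective-style attribute scores.
df = pd.DataFrame({
    "sentence": ["example post 1", "example post 2"],
    "TOXICITY": [0.91, 0.40],
    "SEVERE_TOXICITY": [0.88, 0.10],
    "INSULT": [0.72, 0.05],
})

# Selection rule described on the card: keep posts that clear all three thresholds
# and label them Not Polite without manual review.
mask = (df["TOXICITY"] > 0.85) & (df["SEVERE_TOXICITY"] > 0.85) & (df["INSULT"] > 0.5)
not_polite = df.loc[mask].assign(label="Not Polite")
print(not_polite)
```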
These source datasets were chosen because they are likely to contain distinct, pronounced politeness levels (hate speech from 4chan, formal and polite speech from hotel staff and parliament members, etc.).
🔧 Technical Details
The model is fine-tuned from a pre-trained BERT model. It uses SFT (Supervised Fine-Tuning) for 4 epochs, with a batch size of 16 and a max sequence length of 128 tokens.
📄 License
This project is under the Apache 2.0 license.
Author
Made by Diogo Glória-Silva, PhD Student at NOVA FCT and Affiliated PhD Student at CMU.