# bert-43-multilabel-emotion-detection
This model is a fine-tuned version of `bert-base-uncased`, designed to classify the emotional content of English sentences into 43 categories. It can be applied in a range of text-based emotion analysis scenarios.
## 🚀 Quick Start
This model, "bert-43-multilabel-emotion-detection", is a fine-tuned version of "bert-base-uncased". It is trained to classify sentences based on their emotional content into one of 43 categories in the English language. You can use it for sentiment analysis, social media monitoring, customer feedback analysis, etc.
```python
from transformers import pipeline

model_id = "borisn70/bert-43-multilabel-emotion-detection"
nlp = pipeline("sentiment-analysis", model=model_id, tokenizer=model_id)

result = nlp("I feel great about this!")
print(result)
```
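The pipeline returns one dictionary per input with the predicted label and its score, e.g. `[{'label': 'joy', 'score': 0.99}]` (illustrative values; the exact label string depends on the checkpoint's `id2label` mapping).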
⨠Features
- Classifies English sentences into one of 43 emotion categories.
- Applicable in scenarios such as sentiment analysis, social media monitoring, and customer feedback analysis.
## 📦 Installation
The provided code uses the `transformers` library. You can install it with:

```bash
pip install transformers
```
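The `pipeline` API also needs a deep learning backend; if one is not already installed, adding PyTorch is the simplest option:

```bash
pip install transformers torch
```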
## 💻 Usage Examples

### Basic Usage
```python
from transformers import pipeline

# The same repository provides both the model weights and the tokenizer.
model_id = "borisn70/bert-43-multilabel-emotion-detection"
nlp = pipeline("sentiment-analysis", model=model_id, tokenizer=model_id)

# Classify a single sentence; the highest-scoring emotion is returned.
result = nlp("I feel great about this!")
print(result)
```
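To inspect scores for all 43 categories rather than only the top prediction, recent versions of transformers accept `top_k=None` on text-classification pipelines; a minimal sketch:

```python
from transformers import pipeline

model_id = "borisn70/bert-43-multilabel-emotion-detection"
nlp = pipeline("text-classification", model=model_id, top_k=None)

# With top_k=None the pipeline returns, per input, a list of
# {'label', 'score'} dicts covering every category, sorted by score.
all_scores = nlp("I feel great about this!")
for entry in all_scores[0][:5]:  # five highest-scoring emotions
    print(entry["label"], round(entry["score"], 4))
```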
## 📚 Documentation

### Model Description
This model, "bert-43-multilabel-emotion-detection", is a fine-tuned version of "bert-base-uncased", trained to classify sentences based on their emotional content into one of 43 categories in the English language. The model was trained on a combination of datasets including tweet_emotions, GoEmotions, and synthetic data, amounting to approximately 271,000 records with around 6,306 records per label.
### Intended Use
This model is intended for any application that requires understanding or categorizing the emotional content of English text, such as sentiment analysis, social media monitoring, and customer feedback analysis.
### Training Data
The training data comprises the following datasets:
- Tweet Emotions
- GoEmotions
- Synthetic data
### Training Procedure

The model was trained for 20 epochs, taking about 6 hours on a Google Colab V100 GPU with 16 GB of RAM. The following `TrainingArguments` were used:
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="results",
    optim="adamw_torch",
    learning_rate=2e-5,
    num_train_epochs=20,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=100,
)
```
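These arguments would then be handed to a `Trainer`. A minimal sketch, assuming tokenized `train_dataset` and `eval_dataset` splits (hypothetical names; the underlying datasets are described above):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, Trainer

# Fine-tune bert-base-uncased with one output per emotion category.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=43
)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

trainer = Trainer(
    model=model,
    args=training_args,           # the TrainingArguments defined above
    train_dataset=train_dataset,  # hypothetical tokenized training split
    eval_dataset=eval_dataset,    # hypothetical tokenized validation split
    tokenizer=tokenizer,
)
trainer.train()
```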
### Performance
The model achieved the following performance metrics on the validation set:
- Accuracy: 92.02%
- Weighted F1-Score: 91.93%
- Weighted Precision: 91.88%
- Weighted Recall: 92.02%
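These weighted figures can be recomputed with scikit-learn; a minimal sketch, assuming `y_true` and `y_pred` are arrays of gold and predicted label ids (hypothetical names, not part of the released model):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# y_true / y_pred: hypothetical arrays of gold and predicted label ids (0-42)
accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted"
)
print(f"accuracy={accuracy:.4f}  precision={precision:.4f}  "
      f"recall={recall:.4f}  f1={f1:.4f}")
```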
Per-label performance figures are given in the Accuracy Report below.
### Labels Mapping
| Label ID | Emotion |
|----------|---------|
| 0 | admiration |
| 1 | amusement |
| 2 | anger |
| 3 | annoyance |
| 4 | approval |
| 5 | caring |
| 6 | confusion |
| 7 | curiosity |
| 8 | desire |
| 9 | disappointment |
| 10 | disapproval |
| 11 | disgust |
| 12 | embarrassment |
| 13 | excitement |
| 14 | fear |
| 15 | gratitude |
| 16 | grief |
| 17 | joy |
| 18 | love |
| 19 | nervousness |
| 20 | optimism |
| 21 | pride |
| 22 | realization |
| 23 | relief |
| 24 | remorse |
| 25 | sadness |
| 26 | surprise |
| 27 | neutral |
| 28 | worry |
| 29 | happiness |
| 30 | fun |
| 31 | hate |
| 32 | autonomy |
| 33 | safety |
| 34 | understanding |
| 35 | empty |
| 36 | enthusiasm |
| 37 | recreation |
| 38 | sense of belonging |
| 39 | meaning |
| 40 | sustenance |
| 41 | creativity |
| 42 | boredom |
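Assuming the hosted config stores this mapping in `id2label` (if it does not, the pipeline reports generic `LABEL_n` names instead), it can be inspected programmatically:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("borisn70/bert-43-multilabel-emotion-detection")
# id2label maps integer label ids to emotion names, e.g. 17 -> "joy"
print(config.id2label[17])
```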
### Accuracy Report

| Label | Precision | Recall | F1-Score |
|-------|-----------|--------|----------|
| 0 | 0.8625 | 0.7969 | 0.8284 |
| 1 | 0.9128 | 0.9558 | 0.9338 |
| 2 | 0.9028 | 0.8749 | 0.8886 |
| 3 | 0.8570 | 0.8639 | 0.8605 |
| 4 | 0.8584 | 0.8449 | 0.8516 |
| 5 | 0.9343 | 0.9667 | 0.9502 |
| 6 | 0.9492 | 0.9696 | 0.9593 |
| 7 | 0.9234 | 0.9462 | 0.9347 |
| 8 | 0.9644 | 0.9924 | 0.9782 |
| 9 | 0.9481 | 0.9377 | 0.9428 |
| 10 | 0.9250 | 0.9267 | 0.9259 |
| 11 | 0.9653 | 0.9914 | 0.9782 |
| 12 | 0.9948 | 0.9976 | 0.9962 |
| 13 | 0.9474 | 0.9676 | 0.9574 |
| 14 | 0.8926 | 0.8853 | 0.8889 |
| 15 | 0.9501 | 0.9515 | 0.9508 |
| 16 | 0.9976 | 0.9990 | 0.9983 |
| 17 | 0.9114 | 0.8716 | 0.8911 |
| 18 | 0.7825 | 0.7821 | 0.7823 |
| 19 | 0.9962 | 0.9990 | 0.9976 |
| 20 | 0.9516 | 0.9638 | 0.9577 |
| 21 | 0.9953 | 0.9995 | 0.9974 |
| 22 | 0.9630 | 0.9791 | 0.9710 |
| 23 | 0.9134 | 0.9134 | 0.9134 |
| 24 | 0.9753 | 0.9948 | 0.9849 |
| 25 | 0.7374 | 0.7469 | 0.7421 |
| 26 | 0.7864 | 0.7583 | 0.7721 |
| 27 | 0.6000 | 0.5666 | 0.5828 |
| 28 | 0.7369 | 0.6836 | 0.7093 |
| 29 | 0.8066 | 0.7222 | 0.7620 |
| 30 | 0.9116 | 0.9225 | 0.9170 |
| 31 | 0.9108 | 0.9524 | 0.9312 |
| 32 | 0.9611 | 0.9634 | 0.9622 |
| 33 | 0.9592 | 0.9724 | 0.9657 |
| 34 | 0.9700 | 0.9686 | 0.9693 |
| 35 | 0.9459 | 0.9734 | 0.9594 |
| 36 | 0.9359 | 0.9857 | 0.9601 |
| 37 | 0.9986 | 0.9986 | 0.9986 |
| 38 | 0.9943 | 0.9990 | 0.9967 |
| 39 | 0.9990 | 1.0000 | 0.9995 |
| 40 | 0.9905 | 0.9914 | 0.9910 |
| 41 | 0.9981 | 0.9948 | 0.9964 |
| 42 | 0.9929 | 0.9986 | 0.9957 |
| weighted avg | 0.9188 | 0.9202 | 0.9193 |
## 🔧 Technical Details

The model is fine-tuned from `bert-base-uncased` on a combination of multiple datasets, using the training settings listed above, to achieve strong performance on emotion classification.
## 📄 License
This project is licensed under the MIT license.
## ⚠️ Important Note
- The model's performance can vary significantly across different emotional categories, especially those with less representation in the training data.
- Users should be cautious about potential biases in the training data, which may be reflected in the model's predictions.
## 💡 Usage Tip
If you have any questions, feedback, or would like to report any issues regarding the model, please feel free to reach out.