HebEMO_disgust Open-source Sentiment Detection Tool - Free Detection of Sentiment Polarity in Hebrew Content

HebEMO Disgust

Developed by avichr

HebEMO is a tool for detecting polarity and extracting emotions from modern Hebrew user-generated content, trained on COVID-19 related datasets.

Text Classification

Transformers

Release Time : 3/2/2022

Model Overview

HebEMO can identify emotional polarity and eight basic emotions (anger, disgust, anticipation, fear, joy, sadness, surprise, and trust) in Hebrew texts, excelling in polarity classification tasks.

Model Features

High-Performance Sentiment Analysis

Achieves a weighted average F1 score of 0.96 in polarity classification tasks.

Multi-Emotion Recognition

Capable of identifying eight basic emotions, including anger, disgust, anticipation, etc.

Hebrew-Optimized

Specifically optimized for modern Hebrew user-generated content.

Large-Scale Training Data

Trained on a COVID-19 related dataset containing 350,000 sentences and 7 million words.

Model Capabilities

Text Sentiment Polarity Analysis

Multi-Emotion Recognition

Hebrew Text Processing

User-Generated Content Analysis

Use Cases

Social Media Analysis

News Comment Sentiment Analysis

Analyze the emotional tendencies of user comments on news websites.

Classifies comments as negative, neutral, or positive (weighted average F1-score of 0.96; the neutral class is the weakest, with an F1-score of 0.67).

Market Research

Product Feedback Analysis

Analyze Hebrew users' emotional reactions to products.

Identifies specific emotions such as joy and anger.

🚀 HebEMO - Emotion Recognition Model for Modern Hebrew

HebEMO is a tool for detecting polarity and extracting emotions from modern Hebrew User-Generated Content (UGC). It was trained on a unique Covid-19 related dataset that we collected and annotated. The model achieved high performance in polarity classification, with a weighted average F1-score of 0.96. In emotion detection, it reached F1-scores ranging from 0.78 to 0.97, except for surprise, which the model had difficulty capturing (F1 = 0.41). These results exceed the best previously reported performance, even when compared with English-language models.

✨ Features

Polarity and Emotion Detection: HebEMO can accurately detect the polarity (sentiment) and extract emotions from modern Hebrew UGC.
High Performance: Demonstrates excellent performance in both polarity classification and emotion detection tasks.
Unique Dataset: Trained on a specially collected and annotated Covid-19 related dataset.

📦 Installation

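The commands below are written for a notebook environment, where a leading ! runs a shell command from a cell; the same steps are repeated in the usage examples section. The pip lines are commented out, so uncomment them if the packages are not already installed. You need to install the following packages: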

# !pip install pyplutchik==0.0.7
# !pip install transformers==4.14.1

And clone the repository:

!git clone https://github.com/avichaychriqui/HeBERT.git

💻 Usage Examples

Basic Usage

# Install necessary packages
# !pip install pyplutchik==0.0.7
# !pip install transformers==4.14.1

# Clone the repository
!git clone https://github.com/avichaychriqui/HeBERT.git

# Import the HebEMO class
from HeBERT.src.HebEMO import *

# Initialize the HebEMO model
HebEMO_model = HebEMO()

# Analyze text from a file; returns the results as a pandas.DataFrame
HebEMO_model.hebemo(input_path = 'data/text_example.txt')

# Analyze text directly
hebEMO_df = HebEMO_model.hebemo(text='החיים יפים ומאושרים', plot=True)

[Figure: HebEMO example output]

Advanced Usage - Sentiment Classification Model (Polarity ONLY)

from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

# Load the tokenizer and the model with its classification head
# (AutoModel alone would load only the encoder, without the sentiment classifier)
tokenizer = AutoTokenizer.from_pretrained("avichr/heBERT_sentiment_analysis") # same as the 'avichr/heBERT' tokenizer
model = AutoModelForSequenceClassification.from_pretrained("avichr/heBERT_sentiment_analysis")

# Create a sentiment analysis pipeline from the loaded objects
sentiment_analysis = pipeline(
    "sentiment-analysis",
    model=model,
    tokenizer=tokenizer,
    return_all_scores = True  # return a score for every label, not only the top one
)

# Analyze text
sentiment_analysis('אני מתלבט מה לאכול לארוחת צהריים')
# Output: [[{'label': 'neutral', 'score': 0.9978172183036804},
#           {'label': 'positive', 'score': 0.0014792329166084528},
#           {'label': 'negative', 'score': 0.0007035882445052266}]]

sentiment_analysis('קפה זה טעים')
# Output: [[{'label': 'neutral', 'score': 0.00047328314394690096},
#           {'label': 'positive', 'score': 0.9994067549705505},
#           {'label': 'negative', 'score': 0.00011996887042187154}]]

sentiment_analysis('אני לא אוהב את העולם')
# Output: [[{'label': 'neutral', 'score': 9.214012970915064e-05},
#           {'label': 'positive', 'score': 8.876807987689972e-05},
#           {'label': 'negative', 'score': 0.9998190999031067}]]
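
With return_all_scores = True the pipeline returns, for each input text, a list with one score per label. Downstream code usually wants only the winning label. The helper below is a minimal sketch of that step; top_label is a hypothetical name, not part of transformers or HeBERT, and the scores are rounded copies of the first example output above so that the snippet runs without downloading the model.

```python
def top_label(scores):
    """Return (label, score) for the highest-scoring entry of one input text."""
    best = max(scores, key=lambda item: item["score"])
    return best["label"], best["score"]


# Rounded copy of the first example output above (one inner list per input text).
example_output = [[
    {"label": "neutral", "score": 0.9978},
    {"label": "positive", "score": 0.0015},
    {"label": "negative", "score": 0.0007},
]]

for per_text_scores in example_output:
    label, score = top_label(per_text_scores)
    print(f"{label} ({score:.4f})")
```

In real use, the list returned by sentiment_analysis(...) would take the place of example_output.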

📚 Documentation

Emotion UGC Data Description

Our UGC data consists of comments posted on news articles, collected from 3 major Israeli news sites between January 2020 and August 2020. The total data size is approximately 150 MB, containing over 7 million words and 350K sentences. Around 2,000 sentences were annotated by crowd members (3-10 annotators per sentence) for overall sentiment (polarity) and eight emotions: anger, disgust, anticipation, fear, joy, sadness, surprise, and trust. The table below shows the ratio of annotated sentences in which each emotion appeared.

	anger	disgust	anticipation	fear	joy	sadness	surprise	trust	sentiment
ratio	0.78	0.83	0.58	0.45	0.12	0.59	0.17	0.11	0.25
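
Each sentence can carry several emotions at once, which is why the emotion ratios in the table sum to well above 1. The sketch below illustrates how such per-emotion ratios can be computed from multi-label annotations. The toy annotations are invented for illustration and are not taken from the HebEMO dataset, and emotion_ratios is a hypothetical helper name.

```python
EMOTIONS = ["anger", "disgust", "anticipation", "fear",
            "joy", "sadness", "surprise", "trust"]

# Hypothetical toy data: the set of emotions annotated for each sentence.
toy_annotations = [
    {"anger", "disgust"},
    {"anger", "sadness"},
    {"joy", "trust"},
    {"anger", "disgust", "fear"},
]


def emotion_ratios(annotations, emotions=EMOTIONS):
    """Fraction of sentences in which each emotion appears."""
    total = len(annotations)
    return {e: sum(e in labels for labels in annotations) / total
            for e in emotions}


ratios = emotion_ratios(toy_annotations)
print(ratios["anger"], ratios["disgust"], ratios["surprise"])
```

How the 3-10 crowd annotations per sentence were merged into a single label set is not described here, so the sketch starts from already-merged labels.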

Performance

Emotion Recognition

emotion	f1-score	precision	recall
anger	0.96	0.99	0.93
disgust	0.97	0.98	0.96
anticipation	0.82	0.80	0.87
fear	0.79	0.88	0.72
joy	0.90	0.97	0.84
sadness	0.90	0.86	0.94
surprise	0.40	0.44	0.37
trust	0.83	0.86	0.80

The above metrics are for the positive class (meaning, the emotion is reflected in the text).
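
The F1-score is the harmonic mean of precision and recall. As a rough check, the sketch below recomputes it for a few rows of the table. Because the table values are already rounded to two decimals, a recomputed score can differ from the reported one by about 0.01 on some rows.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
reported = {
    "anger": (0.99, 0.93, 0.96),
    "disgust": (0.98, 0.96, 0.97),
    "surprise": (0.44, 0.37, 0.40),
}

for emotion, (precision, recall, f1) in reported.items():
    print(emotion, round(f1_score(precision, recall), 2), f1)
```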

Sentiment (Polarity) Analysis

	precision	recall	f1-score
neutral	0.83	0.56	0.67
positive	0.96	0.92	0.94
negative	0.97	0.99	0.98
accuracy			0.97
macro avg	0.92	0.82	0.86
weighted avg	0.96	0.97	0.96
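
The macro avg row is the unweighted mean of the three per-class rows, so it can be reproduced directly from the table, as sketched below. The weighted avg row weights each class by its number of test sentences (its support); those counts are not listed here, so that row is not recomputed.

```python
# (precision, recall, f1-score) per class, copied from the table above.
per_class = {
    "neutral": (0.83, 0.56, 0.67),
    "positive": (0.96, 0.92, 0.94),
    "negative": (0.97, 0.99, 0.98),
}


def macro_average(rows):
    """Unweighted mean of each metric column, rounded to two decimals."""
    n = len(rows)
    columns = zip(*rows.values())
    return tuple(round(sum(column) / n, 2) for column in columns)


print(macro_average(per_class))  # (0.92, 0.82, 0.86)
```

The gap between the macro and weighted averages reflects the weak neutral class, which pulls the unweighted mean down.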

The sentiment (polarity) analysis model is also available on AWS. For more information, see the AWS Git repository.

📄 License

No license information is provided in the original document.

Contact Us

Avichay Chriqui
Inbal Yahav
The Coller Semitic Languages AI Lab

Thank you, תודה, شكرا

Citation

If you used this model, please cite us as: Chriqui, A., & Yahav, I. (2022). HeBERT & HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition. INFORMS Journal on Data Science, forthcoming.

@article{chriqui2021hebert,
  title={HeBERT \& HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition},
  author={Chriqui, Avichay and Yahav, Inbal},
  journal={INFORMS Journal on Data Science},
  year={2022}
}
