hebEMO_joy Open-source Sentiment Detection Tool - Free Detection of Sentiment Polarity in Hebrew UGC

Hebemo Joy

Developed by avichr

HebEMO is a tool for detecting sentiment polarity and extracting emotions from modern Hebrew user-generated content (UGC). It was trained on a unique COVID-19-related dataset.

Text Classification

Transformers

#Hebrew sentiment analysis #High-precision F1 - 0.96 #News comment processing

Downloads 125

Release Time : 3/2/2022

Model Overview

HebEMO can identify the sentiment polarity (positive/neutral/negative) and eight basic emotions (anger, disgust, anticipation, fear, joy, sadness, surprise, and trust) of Hebrew texts.

Model Features

High-performance sentiment analysis

Achieved an excellent weighted average F1 score of 0.96 in the sentiment polarity classification task

Multi-emotion recognition

Can recognize eight basic emotions, with F1 scores for all emotions except surprise ranging from 0.78 to 0.97

Specialized dataset

Built on a unique COVID-19-related corpus of Hebrew news comments containing about 350,000 sentences, of which approximately 2,000 were annotated by crowd members

Ease of use

Provides a Hugging Face space demo and a Colab notebook, and supports simple API calls

Model Capabilities

Hebrew text sentiment analysis

Multi-emotion recognition

User-generated content analysis

Sentiment polarity classification

Use Cases

Social media analysis

Sentiment analysis of news comments

Analyze the sentiment tendencies of user comments on news websites

Can identify positive, neutral, and negative sentiment in comments

Market research

Product feedback analysis

Analyze Hebrew users' evaluations and feedback on products

Can identify specific emotions expressed by users, such as anger and joy

🚀 HebEMO - Emotion Recognition Model for Modern Hebrew

HebEMO is a tool for detecting polarity and extracting emotions from modern Hebrew user-generated content (UGC). It was trained on a unique COVID-19-related dataset that we collected and annotated. The model achieves a weighted average F1-score of 0.96 for polarity classification. For emotion detection, F1-scores range from 0.78 to 0.97, except for surprise, where performance is lower (F1 = 0.41). These results exceed the best previously reported performance, even when compared with English-language models.

🚀 Quick Start

Emotion Recognition Model

The model is available online as a Hugging Face Space or as a Colab notebook.

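# Notebook syntax: lines starting with "!" are shell commands.
# Uncomment to install the pinned dependencies:
# !pip install pyplutchik==0.0.7
# !pip install transformers==4.14.1

!git clone https://github.com/avichaychriqui/HeBERT.git
from HeBERT.src.HebEMO import *
HebEMO_model = HebEMO()

# Analyze a text file; returns the analysis as a pandas.DataFrame
HebEMO_model.hebemo(input_path = 'data/text_example.txt')

# Analyze a single string and plot the result
hebEMO_df = HebEMO_model.hebemo(text='החיים יפים ומאושרים', plot=True)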

[Figure: HebEMO example output]

For the sentiment classification model (polarity only):

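from transformers import AutoTokenizer, AutoModel, pipeline

# Optional: load the tokenizer and base model directly.
# The pipeline below loads its own copies from the model name, so these two lines are not required for it.
tokenizer = AutoTokenizer.from_pretrained("avichr/heBERT_sentiment_analysis")  # same as the 'avichr/heBERT' tokenizer
model = AutoModel.from_pretrained("avichr/heBERT_sentiment_analysis")

# Build a sentiment-analysis pipeline that returns the score of every label
sentiment_analysis = pipeline(
    "sentiment-analysis",
    model="avichr/heBERT_sentiment_analysis",
    tokenizer="avichr/heBERT_sentiment_analysis",
    return_all_scores=True,
)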

sentiment_analysis('אני מתלבט מה לאכול לארוחת צהריים')	
>>>  [[{'label': 'neutral', 'score': 0.9978172183036804},
>>>  {'label': 'positive', 'score': 0.0014792329166084528},
>>>  {'label': 'negative', 'score': 0.0007035882445052266}]]

sentiment_analysis('קפה זה טעים')
>>>  [[{'label': 'neutral', 'score': 0.00047328314394690096},
>>>  {'label': 'positive', 'score': 0.9994067549705505},
>>>  {'label': 'negative', 'score': 0.00011996887042187154}]]

sentiment_analysis('אני לא אוהב את העולם')
>>>  [[{'label': 'neutral', 'score': 9.214012970915064e-05},
>>>  {'label': 'positive', 'score': 8.876807987689972e-05},
>>>  {'label': 'negative', 'score': 0.9998190999031067}]]
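
With return_all_scores enabled, the pipeline returns one list of label/score dictionaries per input sentence, as in the outputs above. The following is a minimal sketch of reducing such a list to its most likely label. The helper name top_label is an illustrative assumption and is not part of HeBERT or transformers; the sample scores are copied from the first example output.

```python
from typing import Dict, List, Union

ScoreEntry = Dict[str, Union[str, float]]


def top_label(scores: List[ScoreEntry]) -> ScoreEntry:
    """Return the label/score entry with the highest score.

    Hypothetical helper for illustration; not part of HeBERT or transformers.
    """
    return max(scores, key=lambda entry: entry["score"])


# Scores for one input sentence, copied from the first example output above.
example_scores = [
    {"label": "neutral", "score": 0.9978172183036804},
    {"label": "positive", "score": 0.0014792329166084528},
    {"label": "negative", "score": 0.0007035882445052266},
]

best = top_label(example_scores)
print(best["label"], round(best["score"], 4))
```

Because the pipeline wraps the per-sentence lists in an outer list, a real call would apply the helper to each element of the result, for example top_label(result[0]) for a single input.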

✨ Features

Emotion Detection: HebEMO can accurately detect eight basic emotions (anger, disgust, anticipation, fear, joy, sadness, surprise, and trust) from modern Hebrew UGC.
Polarity Classification: It can classify the sentiment (positive, negative, or neutral) of the input text.
High Performance: Achieved high F1-scores in both emotion recognition and polarity classification.

📦 Installation

# !pip install pyplutchik==0.0.7
# !pip install transformers==4.14.1

!git clone https://github.com/avichaychriqui/HeBERT.git

💻 Usage Examples

Basic Usage

from HeBERT.src.HebEMO import *
HebEMO_model = HebEMO()

HebEMO_model.hebemo(input_path = 'data/text_example.txt')
# returns the analysis as a pandas.DataFrame

hebEMO_df = HebEMO_model.hebemo(text='החיים יפים ומאושרים', plot=True)

Advanced Usage

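from transformers import AutoTokenizer, AutoModel, pipeline

# Optional: load the tokenizer and base model directly.
# The pipeline below loads its own copies from the model name, so these two lines are not required for it.
tokenizer = AutoTokenizer.from_pretrained("avichr/heBERT_sentiment_analysis")  # same as the 'avichr/heBERT' tokenizer
model = AutoModel.from_pretrained("avichr/heBERT_sentiment_analysis")

# Build a sentiment-analysis pipeline that returns the score of every label
sentiment_analysis = pipeline(
    "sentiment-analysis",
    model="avichr/heBERT_sentiment_analysis",
    tokenizer="avichr/heBERT_sentiment_analysis",
    return_all_scores=True,
)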

sentiment_analysis('אני מתלבט מה לאכול לארוחת צהריים')

📚 Documentation

Emotion UGC Data Description

Our UGC data consists of comments posted on news articles collected from 3 major Israeli news sites between January 2020 and August 2020. The total size of the data is approximately 150 MB, containing over 7 million words and 350K sentences.

Approximately 2,000 sentences were annotated by crowd members (3-10 annotators per sentence) for overall sentiment (polarity) and eight emotions based on Robert Plutchik's wheel of emotions: anger, disgust, anticipation, fear, joy, sadness, surprise, and trust.

The table below shows the proportion of sentences in which each emotion appeared.

	anger	disgust	anticipation	fear	joy	sadness	surprise	trust	sentiment
ratio	0.78	0.83	0.58	0.45	0.12	0.59	0.17	0.11	0.25

Performance

Emotion Recognition

emotion	f1-score	precision	recall
anger	0.96	0.99	0.93
disgust	0.97	0.98	0.96
anticipation	0.82	0.80	0.87
fear	0.79	0.88	0.72
joy	0.90	0.97	0.84
sadness	0.90	0.86	0.94
surprise	0.40	0.44	0.37
trust	0.83	0.86	0.80

The above metrics are for the positive class (meaning, the emotion is reflected in the text).
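
Each F1-score in the table is the harmonic mean of the precision and recall in the same row. As a rough consistency check, the sketch below recomputes F1 from the rounded precision and recall values copied from the table; because those inputs are already rounded to two decimals, a recomputed value can differ from the reported one by about 0.01. The function name f1_score is chosen here for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (reported F1, precision, recall) per emotion, copied from the table above.
reported = {
    "anger": (0.96, 0.99, 0.93),
    "disgust": (0.97, 0.98, 0.96),
    "anticipation": (0.82, 0.80, 0.87),
    "fear": (0.79, 0.88, 0.72),
    "joy": (0.90, 0.97, 0.84),
    "sadness": (0.90, 0.86, 0.94),
    "surprise": (0.40, 0.44, 0.37),
    "trust": (0.83, 0.86, 0.80),
}

for emotion, (f1, precision, recall) in reported.items():
    recomputed = f1_score(precision, recall)
    print(f"{emotion:<13} reported={f1:.2f} recomputed={recomputed:.3f}")
```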

Sentiment (Polarity) Analysis

	precision	recall	f1-score
neutral	0.83	0.56	0.67
positive	0.96	0.92	0.94
negative	0.97	0.99	0.98
accuracy			0.97
macro avg	0.92	0.82	0.86
weighted avg	0.96	0.97	0.96
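
The macro avg row is the unweighted mean of the three per-class scores, which is why the weak neutral class pulls it well below the weighted avg row. The weighted average weights each class by its number of test sentences (its support), which the table does not list, so only the macro average can be reproduced from the numbers shown. A short sketch of that calculation, using values copied from the table:

```python
from statistics import mean

# (precision, recall, f1-score) per class, copied from the table above.
per_class = {
    "neutral": (0.83, 0.56, 0.67),
    "positive": (0.96, 0.92, 0.94),
    "negative": (0.97, 0.99, 0.98),
}

# Transpose the rows into one column per metric, then average each column.
macro_precision, macro_recall, macro_f1 = (
    round(mean(column), 2) for column in zip(*per_class.values())
)

print(macro_precision, macro_recall, macro_f1)
```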

The sentiment (polarity) analysis model is also available on AWS. For more information, see the AWS Git repository.

📄 License

No license information provided in the original README.

📖 Citation

If you use this model, please cite us as follows: Chriqui, A., & Yahav, I. (2021). HeBERT & HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition. arXiv preprint arXiv:2102.01909.

@article{chriqui2021hebert,
  title={HeBERT \& HebEMO: a Hebrew BERT Model and a Tool for Polarity Analysis and Emotion Recognition},
  author={Chriqui, Avihay and Yahav, Inbal},
  journal={arXiv preprint arXiv:2102.01909},
  year={2021}
}

📞 Contact us

Avichay Chriqui
Inbal Yahav
The Coller Semitic Languages AI Lab
Thank you, תודה, شكرا
