# 🎭 KoELECTRA Fine-tuned for Korean Emotion Classification
This model is fine-tuned from KoELECTRA for Korean emotion classification, capable of classifying six major emotions: anger, happiness, anxiety, embarrassment, sadness, and heartache.
## 🚀 Quick Start

Install the library below, then run the usage examples to classify Korean text into the six emotion labels.
- Base Model: KoELECTRA (Korean ELECTRA)
- Task: Multi-class Emotion Classification
- Language: Korean
- License: MIT
⨠Features
- Capable of classifying six major emotions: anger, happiness, anxiety, embarrassment, sadness, and heartache.
- Can be used in various applications such as social media emotion analysis, customer review analysis, chatbot emotion recognition, content recommendation, music recommendation, and literary analysis.
## 📦 Installation
To use this model, install the `transformers` library (the examples below also require PyTorch):

```bash
pip install transformers torch
```
## 💻 Usage Examples

### Basic Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "Jinuuuu/KoELECTRA_fine_tunning_emotion"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def analyze_emotion(text):
    # Tokenize with truncation at the model's 512-token limit.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=512,
        padding=True
    )
    with torch.no_grad():
        outputs = model(**inputs)

    # Convert logits to a probability distribution over the six labels.
    probs = torch.softmax(outputs.logits, dim=1)

    emotion_labels = ['angry', 'anxious', 'embarrassed', 'happy', 'heartache', 'sad']
    emotion_probs = {}
    for i, label in enumerate(emotion_labels):
        emotion_probs[label] = float(probs[0][i])
    return emotion_probs

text = "오늘은 정말 행복한 하루였다."  # "Today was a really happy day."
result = analyze_emotion(text)

print("Emotion analysis result:")
for emotion, prob in sorted(result.items(), key=lambda x: x[1], reverse=True):
    print(f"{emotion}: {prob:.3f}")
```
### Advanced Usage
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="Jinuuuu/KoELECTRA_fine_tunning_emotion",
    tokenizer="Jinuuuu/KoELECTRA_fine_tunning_emotion"
)

texts = [
    "오늘은 정말 행복한 하루였다.",  # "Today was a really happy day."
    "너무 화가 나서 참을 수 없다.",  # "I'm so angry I can't stand it."
    "내일 시험이 걱정된다."          # "I'm worried about tomorrow's exam."
]

results = classifier(texts)
for text, result in zip(texts, results):
    print(f"Text: {text}")
    print(f"Emotion: {result['label']} (Probability: {result['score']:.3f})")
    print()
```
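By default the pipeline returns only the top label per text. To get the full six-way distribution, recent versions of transformers accept `top_k=None` (older versions used the now-deprecated `return_all_scores=True`); a short sketch:

```python
classifier_all = pipeline(
    "text-classification",
    model="Jinuuuu/KoELECTRA_fine_tunning_emotion",
    top_k=None,  # return scores for all six labels instead of just the top one
)
for scores in classifier_all(["오늘은 정말 행복한 하루였다."]):
    for entry in sorted(scores, key=lambda s: s["score"], reverse=True):
        print(f"{entry['label']}: {entry['score']:.3f}")
```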
## 📚 Documentation

### Emotion Labels
The model classifies the following six emotions:
| Label | Korean | Description |
|---|---|---|
| angry | 분노 | Anger, irritation, annoyance |
| happy | 행복 | Happiness, joy, satisfaction |
| anxious | 불안 | Anxiety, worry, fear |
| embarrassed | 당황 | Embarrassment, confusion, discomfiture |
| sad | 슬픔 | Sadness, depression, disappointment |
| heartache | 상처 | Heartache, betrayal, disappointment |
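The index-to-label mapping used at inference time should match the `id2label` entry in the uploaded config. If the mapping was stored when the model was saved, it can be read directly instead of hard-coding the label list (a sketch; if it was not stored, generic `LABEL_0`…`LABEL_5` names will appear):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("Jinuuuu/KoELECTRA_fine_tunning_emotion")
# id2label maps class index -> label string when it was saved during fine-tuning.
print(config.id2label)
```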
### Model Architecture
- Base Model: KoELECTRA-base
- Model Type: Sequence Classification
- Hidden Size: 768
- Num Attention Heads: 12
- Num Hidden Layers: 12
- Max Sequence Length: 512
- Vocab Size: 35000
- Num Labels: 6
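These figures can be verified against the downloaded checkpoint's configuration; a quick sketch:

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "Jinuuuu/KoELECTRA_fine_tunning_emotion"
)
cfg = model.config

# Each attribute corresponds to one row of the list above.
print(cfg.hidden_size)              # 768
print(cfg.num_attention_heads)      # 12
print(cfg.num_hidden_layers)        # 12
print(cfg.max_position_embeddings)  # 512
print(cfg.vocab_size)               # 35000
print(cfg.num_labels)               # 6
```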
### Training Details

#### Training Data
- Dataset: Custom Korean Emotion Dataset
- Training Samples: ~50,000 sentences
- Validation Samples: ~10,000 sentences
- Data Source: Korean social media posts, reviews, and literature
#### Training Hyperparameters
- Learning Rate: 2e-5
- Batch Size: 16
- Epochs: 3-5
- Warmup Steps: 500
- Weight Decay: 0.01
- Max Sequence Length: 512
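The card does not include the training script itself; the following is a hypothetical `Trainer` setup that mirrors the hyperparameters above. The base checkpoint name and the `train_dataset`/`eval_dataset` variables are placeholders (`monologg/koelectra-base-v3-discriminator` is a common KoELECTRA checkpoint whose 35,000-token vocabulary matches the figure in Model Architecture):

```python
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

# Hypothetical base checkpoint: the card does not name the exact one.
model = AutoModelForSequenceClassification.from_pretrained(
    "monologg/koelectra-base-v3-discriminator", num_labels=6
)

args = TrainingArguments(
    output_dir="koelectra-emotion",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=4,        # the card reports 3-5 epochs
    warmup_steps=500,
    weight_decay=0.01,         # applied by the default AdamW optimizer
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,  # placeholder: your tokenized training split
    eval_dataset=eval_dataset,    # placeholder: your tokenized validation split
)
trainer.train()
```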
#### Training Environment
- Framework: PyTorch + Transformers
- Hardware: GPU (CUDA enabled)
- Optimizer: AdamW
### Performance

#### Overall Performance
| Metric | Score |
|---|---|
| Accuracy | 0.85+ |
| F1-Score (Macro) | 0.83+ |
| F1-Score (Weighted) | 0.85+ |
#### Per-Class Performance
| Emotion | Precision | Recall | F1-Score |
|---|---|---|---|
| angry | 0.87 | 0.84 | 0.85 |
| happy | 0.89 | 0.91 | 0.90 |
| anxious | 0.82 | 0.79 | 0.80 |
| embarrassed | 0.78 | 0.76 | 0.77 |
| sad | 0.85 | 0.87 | 0.86 |
| heartache | 0.81 | 0.83 | 0.82 |
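The evaluation script is not part of the card; per-class numbers of this kind can be reproduced on a labeled test set with scikit-learn, roughly as follows (the texts and gold labels below are placeholders):

```python
from sklearn.metrics import classification_report
from transformers import pipeline

classifier = pipeline("text-classification", model="Jinuuuu/KoELECTRA_fine_tunning_emotion")

# Placeholders: substitute your own held-out texts and gold label strings.
texts = ["오늘은 정말 행복한 하루였다.", "내일 시험이 걱정된다."]
gold = ["happy", "anxious"]

pred = [classifier(t)[0]["label"] for t in texts]
print(classification_report(gold, pred, digits=2))
```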
### Applications
This model can be used for the following purposes:
- Social Media Emotion Analysis: Understanding the emotions in posts and comments.
- Customer Review Analysis: Classifying the emotions in product/service reviews.
- Chatbot Emotion Recognition: Understanding the user's emotions in a conversation system.
- Content Recommendation: Recommending content based on emotions.
- Music Recommendation: Recommending music based on text emotions.
- Literary Analysis: Analyzing the emotions in novels, poems, etc.
### Limitations
- The model is optimized for Korean text.
- It can process a maximum of 512 tokens per input; for longer texts, see the chunking sketch at the end of this section.
- The accuracy of emotion classification may vary depending on the context.
- The performance on slang, neologisms, and dialects may be limited.
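For inputs beyond the 512-token limit, one common workaround, not part of the released model, is to score overlapping chunks and average the resulting distributions; a rough sketch:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "Jinuuuu/KoELECTRA_fine_tunning_emotion"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def analyze_long_text(text, window=510, stride=256):
    """Score overlapping 510-token windows and average the distributions."""
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    prob_chunks = []
    for start in range(0, max(len(ids), 1), stride):
        chunk = ids[start:start + window]
        # Re-add [CLS] ... [SEP] around each window.
        input_ids = torch.tensor(
            [[tokenizer.cls_token_id] + chunk + [tokenizer.sep_token_id]]
        )
        with torch.no_grad():
            logits = model(input_ids=input_ids).logits
        prob_chunks.append(torch.softmax(logits, dim=1))
        if start + window >= len(ids):
            break
    return torch.cat(prob_chunks).mean(dim=0)  # averaged six-way distribution
```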
### Bias and Fairness
This model may reflect the biases in the training data. It may show biased results for certain topics or expressions. Therefore, sufficient validation and monitoring are required when applying it to real services.
## 🔧 Technical Details
The model is fine-tuned from KoELECTRA, a Korean ELECTRA model. It uses a sequence classification architecture to classify six major emotions in Korean text. The training data consists of Korean social media posts, reviews, and literature. The model is trained using PyTorch and the Transformers library with the AdamW optimizer.
## 📄 License
This model is released under the MIT license.
## Citation
```bibtex
@misc{koelectra_emotion_2024,
  title={KoELECTRA Fine-tuned for Korean Emotion Classification},
  author={Jinuuuu},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/Jinuuuu/KoELECTRA_fine_tunning_emotion}}
}
```
## Model Card Authors
- Developer: Jinuuuu
- Model Type: Text Classification
- Language: Korean
- License: MIT
## Contact
If you have any questions or suggestions for improvement regarding the model, please contact us through GitHub issues or the Hugging Face model page.
## 💡 Usage Tip
This model was developed for research and educational purposes. Before using it commercially, please conduct sufficient verification and testing.