🚀 Luke Japanese Large Sentiment Analysis Model
This model is fine-tuned from luke-japanese-large-lite and analyzes which of eight emotions (joy, sadness, anticipation, surprise, anger, fear, disgust, trust) are expressed in a text.
🚀 Quick Start
This model is a fine-tuned version of studio-ousia/luke-japanese-large-lite that predicts which of the eight emotions (joy, sadness, anticipation, surprise, anger, fear, disgust, trust) is present in a given text. It was trained on the wrime dataset (https://huggingface.co/datasets/shunk031/wrime).
✨ Features
- Emotion Analysis: Capable of analyzing eight different emotions in text.
- Fine-Tuned Model: Based on luke-japanese-large-lite and fine-tuned on the wrime dataset.
📦 Installation
- Upgrade transformers and install sentencepiece and PyTorch (a working Python environment is assumed), for example: pip install -U transformers sentencepiece torch
💻 Usage Examples
Basic Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification, LukeConfig
import torch

# Load the tokenizer and the fine-tuned classification model
tokenizer = AutoTokenizer.from_pretrained("Mizuiro-sakura/luke-japanese-large-sentiment-analysis-wrime")
config = LukeConfig.from_pretrained('Mizuiro-sakura/luke-japanese-large-sentiment-analysis-wrime', output_hidden_states=True)
model = AutoModelForSequenceClassification.from_pretrained('Mizuiro-sakura/luke-japanese-large-sentiment-analysis-wrime', config=config)

text = 'すごく楽しかった。また行きたい。'  # "It was a lot of fun. I want to go again."
max_seq_length = 512

# Tokenize and build a batch of size 1
token = tokenizer(text,
                  truncation=True,
                  max_length=max_seq_length,
                  padding="max_length")
output = model(torch.tensor(token['input_ids']).unsqueeze(0),
               torch.tensor(token['attention_mask']).unsqueeze(0))

# Index of the emotion with the highest logit
max_index = torch.argmax(output.logits).item()

if max_index == 0:
    print('joy、うれしい')
elif max_index == 1:
    print('sadness、悲しい')
elif max_index == 2:
    print('anticipation、期待')
elif max_index == 3:
    print('surprise、驚き')
elif max_index == 4:
    print('anger、怒り')
elif max_index == 5:
    print('fear、恐れ')
elif max_index == 6:
    print('disgust、嫌悪')
elif max_index == 7:
    print('trust、信頼')
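If scores for all eight emotions are needed rather than only the top label, the logits can be converted into probabilities with a softmax. The snippet below is a minimal sketch that reuses the tokenizer, model, text, and max_seq_length defined above; the label order follows the index mapping in the example.

# Optional: probabilities for every emotion (labels follow the index order used above)
labels = ['joy', 'sadness', 'anticipation', 'surprise', 'anger', 'fear', 'disgust', 'trust']

with torch.no_grad():
    encoded = tokenizer(text,
                        truncation=True,
                        max_length=max_seq_length,
                        padding="max_length",
                        return_tensors="pt")
    probabilities = torch.softmax(model(**encoded).logits, dim=-1).squeeze(0)

for label, probability in zip(labels, probabilities):
    print(f'{label}: {probability.item():.3f}')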
🔧 Technical Details
What is Luke?
LUKE (Language Understanding with Knowledge-based Embeddings) is a new pre-trained contextualized representation of words and entities based on the transformer. LUKE treats words and entities in a given text as independent tokens and outputs contextualized representations of them. It adopts an entity-aware self-attention mechanism, which is an extension of the self-attention mechanism of the transformer and considers the types of tokens (words or entities) when computing attention scores.
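Purely as an illustration of these word-and-entity inputs (it is not needed for the sentiment model above), the following sketch uses the English base checkpoint studio-ousia/luke-base and transformers' LukeTokenizer, which accepts explicit entity_spans; the text, span, and checkpoint are illustrative assumptions, not part of this model.

from transformers import LukeTokenizer, LukeModel
import torch

# Illustrative only: English base LUKE checkpoint, not the Japanese sentiment model
tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-base")
model = LukeModel.from_pretrained("studio-ousia/luke-base")

text = "Beyoncé lives in Los Angeles."
entity_spans = [(0, 7)]  # character span covering "Beyoncé"

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.last_hidden_state.shape)         # contextualized word-token representations
print(outputs.entity_last_hidden_state.shape)  # contextualized entity-token representations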
LUKE achieves state-of-the-art results on five popular NLP benchmarks including SQuAD v1.1 (extractive question answering), CoNLL-2003 (named entity recognition), ReCoRD (cloze-style question answering), TACRED (relation classification), and Open Entity (entity typing). luke-japanese is the Japanese version of the knowledge-enhanced pre-trained Transformer model LUKE for words and entities.
📄 License
This project is licensed under the MIT License.
Acknowledgments
I would like to thank Mr. Yamada (@ikuyamada) and Studio Ousia (@StudioOusia).
Citation
[1] @inproceedings{yamada2020luke,
  title={LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention},
  author={Ikuya Yamada and Akari Asai and Hiroyuki Shindo and Hideaki Takeda and Yuji Matsumoto},
  booktitle={EMNLP},
  year={2020}
}