🚀 naija-twitter-sentiment-afriberta-large
naija-twitter-sentiment-afriberta-large is the first multilingual Twitter sentiment classification model for four Nigerian languages (Hausa, Igbo, Nigerian Pidgin, and Yorùbá). It is a fine-tuned version of the castorini/afriberta_large model and achieves state-of-the-art performance on the Twitter sentiment classification task.
✨ Features
- Supports four Nigerian languages: Hausa, Igbo, Nigerian Pidgin, and Yorùbá.
- Achieves state-of-the-art performance on the Twitter sentiment classification task.
- Classifies tweets into three sentiment classes: negative, neutral, and positive.
💻 Usage Examples
Basic Usage
from transformers import AutoModelForSequenceClassification
from transformers import AutoTokenizer
import numpy as np
from scipy.special import softmax

# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub
MODEL = "Davlan/naija-twitter-sentiment-afriberta-large"
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)

# Tokenize the input text and run a forward pass
text = "I like you"
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)

# Convert the logits of the first (and only) example into probabilities
scores = output[0][0].detach().numpy()
scores = softmax(scores)

# Print the labels from most to least probable
id2label = {0:"positive", 1:"neutral", 2:"negative"}
ranking = np.argsort(scores)
ranking = ranking[::-1]
for i in range(scores.shape[0]):
    l = id2label[ranking[i]]
    s = scores[ranking[i]]
    print(f"{i+1}) {l} {np.round(float(s), 4)}")
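The post-processing in the example above (softmax over the logits, then sorting by probability) can be factored into a small helper that also handles a batch of tweets. The following is a minimal sketch, not part of the original model card: the function name `rank_sentiments` is hypothetical, the label mapping is copied from the example above, and synthetic logits stand in for real model output so the snippet runs without downloading the model.

```python
import numpy as np

# Label mapping copied from the usage example above.
ID2LABEL = {0: "positive", 1: "neutral", 2: "negative"}


def rank_sentiments(logits, id2label=ID2LABEL):
    """Hypothetical helper: turn a (batch, num_classes) array of logits into
    per-example lists of (label, probability), most probable first."""
    logits = np.atleast_2d(np.asarray(logits, dtype=np.float64))
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=1, keepdims=True)
    # For each row, sort class indices by descending probability.
    order = np.argsort(-probs, axis=1)
    return [
        [(id2label[int(j)], float(row[j])) for j in idx]
        for row, idx in zip(probs, order)
    ]


# Synthetic logits (assumed values) standing in for real model output.
example_logits = [[2.0, 0.5, -1.0], [-0.3, 0.1, 1.7]]
for ranked in rank_sentiments(example_logits):
    top_label, top_prob = ranked[0]
    print(top_label, round(top_prob, 4))
```

With the real model, the logits from the example above could be passed in as `output[0].detach().numpy()`; for several tweets at once, the tokenizer would typically be called with a list of strings and `padding=True`.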
📚 Documentation
Intended uses & limitations
How to use
You can use this model with the Transformers library for sentiment classification. The Basic Usage example above shows how to load the model and classify a tweet.
Limitations and bias
This model is limited by its training dataset and domain (Twitter). It may not generalize well for all use cases in different domains.
Training procedure
This model was trained on a single Nvidia RTX 2080 GPU using the hyperparameters recommended in the original NaijaSenti paper.
Evaluation results on the test set (F1-score), averaged over 5 runs:
| Language | F1-score |
|---|---|
| Hausa | 81.2 |
| Igbo | 80.8 |
| Nigerian Pidgin | 74.5 |
| Yorùbá | 80.4 |
BibTeX entry and citation info
@inproceedings{Muhammad2022NaijaSentiAN,
title={NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis},
author={Shamsuddeen Hassan Muhammad and David Ifeoluwa Adelani and Sebastian Ruder and Ibrahim Said Ahmad and Idris Abdulmumin and Bello Shehu Bello and Monojit Choudhury and Chris C. Emezue and Saheed Salahudeen Abdullahi and Anuoluwapo Aremu and Alipio Jeorge and Pavel B. Brazdil},
year={2022}
}