Sentiment Classification for Hinglish Text: gk-hinglish-sentiment
This model classifies the sentiment of Hinglish (code-mixed Hindi-English) text.
Quick Start
Basic Usage
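from transformers import BertTokenizer, BertForSequenceClassification
# Load the tokenizer and the fine-tuned classifier from a local directory.
tokenizerg = BertTokenizer.from_pretrained("/content/model")
modelg = BertForSequenceClassification.from_pretrained("/content/model")
# Tokenize a Hinglish sentence into PyTorch tensors.
text = "kuch bhi type karo hinglish mai"
encoded_input = tokenizerg(text, return_tensors='pt')
# Run the model; output.logits holds one raw score per sentiment label.
output = modelg(**encoded_input)
print(output)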
⨠Features
- This model is designed to work well with Hinglish data, which is widely used in India.
- It can classify sentiment into three labels: negative, neutral, and positive.
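The model returns raw logits rather than a label name, so a small post-processing step is needed to get one of the three sentiments. The following is a minimal sketch of that step in plain Python. The label order in `ASSUMED_LABELS` and the sample logits are assumptions made for illustration and are not confirmed by this model card; check the `id2label` mapping in the model's config before relying on them.

```python
import math

# Assumed label order (hypothetical); verify against the checkpoint's
# config.id2label mapping before using it.
ASSUMED_LABELS = ["negative", "neutral", "positive"]


def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_sentiment(logits, labels=ASSUMED_LABELS):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up logits for a single sentence, standing in for real model output.
label, prob = logits_to_sentiment([-1.2, 0.3, 2.1])
print(label, round(prob, 3))
```

With a real model output, the list of scores for the first sentence can be obtained with `output.logits[0].tolist()`.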
Installation
No installation steps are given for the model itself. The usage example depends on the transformers library and PyTorch, since it requests 'pt' tensors.
Documentation
Model description
The model was trained on a small dataset of reviews.
Intended uses & limitations
The author wanted a model that could work well with Hinglish data, which is mostly used in India. However, the training data was limited.
Limitations and bias
The training data contains only Hinglish code-mixed text and is very limited in size. The author may update the model if more data becomes available.
Training data
The training data consists of examples labeled with one of three sentiment classes. The model was fine-tuned from the pre-trained model at https://huggingface.co/rohanrajpal/bert-base-multilingual-codemixed-cased-sentiment.
BibTeX entry and citation info
@inproceedings{khanuja-etal-2020-gluecos,
title = "{GLUEC}o{S}: An Evaluation Benchmark for Code-Switched {NLP}",
author = "Khanuja, Simran and
Dandapat, Sandipan and
Srinivasan, Anirudh and
Sitaram, Sunayana and
Choudhury, Monojit",
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
month = jul,
year = "2020",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2020.acl-main.329",
pages = "3575--3585"
}
License
The model is licensed under the Apache 2.0 license.