# Indonesian Sentiment Analysis Model
This model is a fine-tuned version of an Indonesian pre-trained BERT model, designed to perform sentiment analysis on Indonesian comments and reviews.
## 🚀 Quick Start
You can load the model and run inference as follows:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("taufiqdp/indonesian-sentiment")
model = AutoModelForSequenceClassification.from_pretrained("taufiqdp/indonesian-sentiment")

class_names = ['negatif', 'netral', 'positif']

# "Pelayanan lama dan tidak ramah" ~ "The service is slow and unfriendly"
text = "Pelayanan lama dan tidak ramah"
tokenized_text = tokenizer(text, return_tensors='pt')

with torch.inference_mode():
    logits = model(**tokenized_text)['logits']

result = class_names[logits.argmax(dim=1).item()]
print(result)
```
## ✨ Features

- **Fine-tuned BERT**: based on IndoBERT Base Uncased, a BERT model pre-trained on Indonesian text data.
- **Multi-class classification**: classifies Indonesian review text into three sentiment categories: negative, neutral, and positive.
## 📦 Installation

The example code requires PyTorch and the Transformers library, which can be installed with `pip install torch transformers`.
## 💻 Usage Examples

### Basic Usage
The end-to-end snippet in the Quick Start section above covers the basic single-sentence workflow: tokenize the text, run the model under `torch.inference_mode()`, and map the argmax of the logits to a label name.
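To score several reviews at once, the same model can be run on a padded batch. The following is a minimal sketch assuming the same checkpoint as above; the batching, `softmax` confidence scores, and the second example sentence are illustrative additions, not part of the original card:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("taufiqdp/indonesian-sentiment")
model = AutoModelForSequenceClassification.from_pretrained("taufiqdp/indonesian-sentiment")
class_names = ['negatif', 'netral', 'positif']

texts = [
    "Pelayanan lama dan tidak ramah",  # "Slow and unfriendly service"
    "Makanannya enak sekali",          # "The food is delicious" (illustrative)
]

# Pad/truncate so the batch forms a rectangular tensor.
batch = tokenizer(texts, padding=True, truncation=True, return_tensors='pt')

with torch.inference_mode():
    logits = model(**batch).logits
    probs = torch.softmax(logits, dim=-1)  # per-class confidence scores

for text, p in zip(texts, probs):
    label = class_names[p.argmax().item()]
    print(f"{text!r} -> {label} ({p.max().item():.3f})")
```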
## 📚 Documentation

### Model Details
This model is a fine-tuned version of IndoBERT Base Uncased, a BERT model pre-trained on Indonesian text data, and was further fine-tuned for sentiment analysis of Indonesian comments and reviews.
It was trained on the IndoNLU SmSA and indonesian_sentiment datasets.
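If you want to inspect the training data, the SmSA subset can be loaded with the `datasets` library. This is a sketch under the assumption that the public IndoNLU release on the Hugging Face Hub (`indonlu`, config `smsa`) is the same data used for fine-tuning:

```python
from datasets import load_dataset

# Assumption: the public IndoNLU SmSA release on the Hugging Face Hub;
# these identifiers are not taken from the model card itself.
smsa = load_dataset("indonlu", "smsa")
print(smsa["train"][0])
```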
The model classifies a given Indonesian review text into one of three categories:
- Negative
- Neutral
- Positive
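These labels correspond, in order, to the `class_names` list used in the examples. For quick experiments the checkpoint can also be wrapped in a Transformers `pipeline`; this is an illustrative alternative, and if the checkpoint's config lacks an `id2label` mapping the pipeline will return generic `LABEL_0`/`LABEL_1`/`LABEL_2` names in that same order:

```python
from transformers import pipeline

# High-level alternative to the manual tokenize/forward/argmax loop.
classifier = pipeline("text-classification", model="taufiqdp/indonesian-sentiment")
print(classifier("Pelayanan lama dan tidak ramah"))
```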
### Training hyperparameters

- train_batch_size: 32
- eval_batch_size: 32
- learning_rate: 1e-4
- optimizer: AdamW with betas=(0.9, 0.999), eps=1e-8, and weight_decay=0.01
- epochs: 3
- learning_rate_scheduler: StepLR with step_size=592, gamma=0.1
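These settings map directly onto PyTorch's built-in optimizer and scheduler. The sketch below reconstructs the setup from the listed values, reusing `model` from the Quick Start snippet; the per-step placement of `scheduler.step()` is an assumption inferred from `step_size=592`, since the original training script is not part of this card:

```python
import torch

# Reconstructed from the hyperparameters listed above (not the original script).
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1e-4,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=0.01,
)
# step_size=592 suggests the learning rate was multiplied by gamma=0.1
# every 592 optimizer steps, i.e. scheduler.step() after each batch.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=592, gamma=0.1)
```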
### Training Results

The following table shows the training results for the model:

| Epoch | Loss   | Accuracy |
|-------|--------|----------|
| 1     | 0.2936 | 0.9310   |
| 2     | 0.1212 | 0.9526   |
| 3     | 0.0795 | 0.9569   |
### How to Use

Load the model and run inference as shown in the Quick Start and Usage Examples sections.
## 🔧 Technical Details

The model is a fine-tuned BERT model. It leverages the Indonesian-language pre-training of IndoBERT Base Uncased and adapts it to sentiment classification by fine-tuning on the datasets listed above, using the hyperparameters given in the Documentation section.
## 📄 License
This model is released under the MIT license.
## Citation

```bibtex
@misc{koto2020indolem,
  title={IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP},
  author={Fajri Koto and Afshin Rahimi and Jey Han Lau and Timothy Baldwin},
  year={2020},
  eprint={2011.00677},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

@inproceedings{purwarianti2019improving,
  title={Improving Bi-LSTM Performance for Indonesian Sentiment Analysis Using Paragraph Vector},
  author={Ayu Purwarianti and Ida Ayu Putu Ari Crisdayanti},
  booktitle={Proceedings of the 2019 International Conference of Advanced Informatics: Concepts, Theory and Applications (ICAICTA)},
  pages={1--5},
  year={2019},
  organization={IEEE}
}
```