# Sentiment Analysis Model using DistilBERT
This repository contains a sentiment analysis model fine-tuned on the IMDb movie reviews dataset using the DistilBERT architecture. It classifies text inputs into positive or negative sentiment categories.
## Quick Start
To use the model, you need to install the `transformers` library from Hugging Face. The example below also requires PyTorch (`torch`). You can install the library with:

```bash
pip install transformers
```
Once installed, you can use the following code to classify text with this model:
```python
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
import torch

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
tokenizer = DistilBertTokenizer.from_pretrained("Pranav-10/Sentimental_Analysis")
model = DistilBertForSequenceClassification.from_pretrained("Pranav-10/Sentimental_Analysis")

text = "I loved this movie. The performances were fantastic!"

# Tokenize the input, truncating/padding to the model's 512-token limit
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits

# Convert logits to class probabilities
probabilities = torch.softmax(logits, dim=-1)
print(probabilities)
```
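The printed output is a tensor of class probabilities rather than a label. Below is a minimal follow-up sketch for converting it into a readable prediction, assuming the common convention of index 0 = negative and index 1 = positive (verify against this checkpoint's `model.config.id2label`):

```python
# Assumed label order (0 = negative, 1 = positive); confirm with model.config.id2label
labels = ["negative", "positive"]
predicted_index = int(probabilities.argmax(dim=-1))
print(f"Sentiment: {labels[predicted_index]} ({probabilities[0, predicted_index]:.2%} confidence)")
```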
## ✨ Features
- Based on DistilBERT: The model is built on the DistilBERT architecture, which is a smaller, faster, cheaper, and lighter version of BERT. It retains most of BERT's performance while being more efficient, making it ideal for sentiment analysis tasks where size and speed are crucial.
- Fine-tuned on IMDb: The model has been fine-tuned on the IMDb dataset, which contains 50,000 movie reviews labeled as positive or negative.
## Documentation
### Model Description
The model is based on the DistilBERT architecture, a more efficient alternative to BERT that retains most of its performance. It has been fine-tuned on the IMDb dataset, which consists of 50,000 movie reviews labeled as positive or negative, making it well-suited for sentiment analysis tasks.
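For quick experiments, the model can also be used through the `transformers` `pipeline` API, which bundles tokenization, inference, and label mapping into a single call. A minimal sketch (the label names in the output come from the checkpoint's configuration):

```python
from transformers import pipeline

# The pipeline wraps tokenization, inference, and label mapping
classifier = pipeline("text-classification", model="Pranav-10/Sentimental_Analysis")
print(classifier("I loved this movie. The performances were fantastic!"))
```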
### Evaluation Results
The model achieved the following performance on the IMDb dataset:
| Metric | Value |
|--------|-------|
| Accuracy | 90% |
| Precision | 89% |
| Recall | 91% |
| F1 Score | 90% |

These results indicate strong, balanced performance in classifying reviews as positive or negative.
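The evaluation script itself is not included in this repository, but metrics like these are typically computed with scikit-learn. A minimal sketch, assuming `y_true` (gold labels) and `y_pred` (model predictions) as 0/1 arrays collected over a labeled test split:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Placeholder arrays for illustration; in practice these come from the test split
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 1, 0, 0]

print(f"Accuracy:  {accuracy_score(y_true, y_pred):.2%}")
print(f"Precision: {precision_score(y_true, y_pred):.2%}")
print(f"Recall:    {recall_score(y_true, y_pred):.2%}")
print(f"F1 Score:  {f1_score(y_true, y_pred):.2%}")
```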
### Training Procedure
The model was trained using the following steps:
- Pre-processing: The dataset was pre-processed by converting all reviews to lowercase and tokenizing them with the DistilBERT tokenizer.
- Optimization: The Adam optimizer was used with a learning rate of 2e-5 and a batch size of 16, and the model was trained for 3 epochs (a sketch of a comparable setup follows this list).
- Hardware: Training was performed on a single NVIDIA GTX 1650 GPU.
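The original training script is not part of this repository, but a minimal sketch of a comparable fine-tuning run, using the `datasets` library and the `transformers` `Trainer` with the hyperparameters listed above, might look like this:

```python
from datasets import load_dataset
from transformers import (
    DistilBertTokenizer,
    DistilBertForSequenceClassification,
    Trainer,
    TrainingArguments,
)

# IMDb: 25,000 training and 25,000 test reviews labeled positive/negative
dataset = load_dataset("imdb")

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
model = DistilBertForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

def tokenize(batch):
    # The uncased tokenizer lowercases input; truncate to the 512-token limit
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=512)

dataset = dataset.map(tokenize, batched=True)

# Hyperparameters as described above: lr 2e-5, batch size 16, 3 epochs
args = TrainingArguments(
    output_dir="./results",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()
```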
## License
This project is licensed under the Apache 2.0 license.