🚀 Model Card for Imasha17/News_classification.4
This model is designed to classify news articles from a Sri Lankan news source, Daily Mirror Online, into five categories.
🚀 Quick Start
Use the code below to get started with the model.
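from transformers import pipeline
# Load the fine-tuned classifier from the Hugging Face Hub
model_id = "Imasha17/News_classification.4"
pipe = pipeline("text-classification", model=model_id)
text = "Enter your news here"
# The pipeline returns a list with one dict holding the predicted label and its score
print(pipe(text))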
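News articles are often longer than the 512-token input limit of DistilBERT-based models, so a slightly more defensive variant of the snippet above may be useful when classifying several articles at once. This is a minimal sketch, not the author's code: the helper names `load_classifier` and `classify_batch` are hypothetical, and the label strings returned depend on how the model's configuration was saved.

```python
from typing import Callable, List, Tuple

MODEL_ID = "Imasha17/News_classification.4"


def load_classifier(model_id: str = MODEL_ID):
    """Build the text-classification pipeline (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily; requires the transformers package
    return pipeline("text-classification", model=model_id)


def classify_batch(pipe: Callable, texts: List[str]) -> List[Tuple[str, str, float]]:
    """Classify several articles at once and return (text, label, score) tuples."""
    # truncation=True cuts inputs that exceed the model's maximum sequence length
    results = pipe(texts, truncation=True)
    return [(t, r["label"], r["score"]) for t, r in zip(texts, results)]
```

With network access, `classify_batch(load_classifier(), ["First article text", "Second article text"])` would return one tuple per article.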
✨ Features
- Automatic categorization of Sri Lankan news articles.
- Can be used in news filtering and recommendation systems.
- Enables preliminary analysis of sentiment in news articles.
- News aggregation platforms can use it to categorize and sort articles.
- Journalists and researchers can analyze media trends based on category distributions.
📦 Installation
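No specific installation steps are provided. The Quick Start code requires the Hugging Face transformers library and a backend such as PyTorch, for example via pip install transformers torch.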
📚 Documentation
Model Details
Model Description
This model is designed to classify news articles from Daily Mirror Online, a Sri Lankan news source, into five categories: Business, Opinion, Political Gossip, Sports, and World News. It was also developed to support the analysis and processing of news content for tasks such as sentiment analysis or summarization.
Data Sources
The original dataset contained real news content from Daily Mirror Online. After preprocessing, 1,015 records were selected for training. The data was split into 80% training and 20% validation.
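The 80/20 split described above can be sketched as follows. This is an illustration under stated assumptions: the card does not say whether the split was random, stratified, or seeded, so the shuffle, the seed value, and the function name `train_val_split` are all hypothetical, and the records are placeholders.

```python
import random
from typing import List, Sequence, Tuple


def train_val_split(records: Sequence, val_fraction: float = 0.2,
                    seed: int = 42) -> Tuple[List, List]:
    """Shuffle the records and split them into train and validation lists."""
    indices = list(range(len(records)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_val = round(len(records) * val_fraction)
    val = [records[i] for i in indices[:n_val]]
    train = [records[i] for i in indices[n_val:]]
    return train, val


# Placeholder records standing in for the 1,015 preprocessed articles
records = [f"article_{i}" for i in range(1015)]
train, val = train_val_split(records)
print(len(train), len(val))  # 812 203
```

With only about 200 articles per category on average, a stratified split (keeping category proportions equal in both parts) might be preferable, but the card does not state which approach was used.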
Uses
Direct Use
The model can be used for:
- Automatic categorization of Sri Lankan news articles.
- News filtering and recommendation systems.
- Preliminary analysis of sentiment in news articles.
Downstream Use
News aggregation platforms can use the model to categorize and sort articles. Journalists and researchers can analyze media trends based on category distributions.
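As a rough illustration of the trend analysis mentioned above, the share of articles per predicted category can be computed from the pipeline output. This is a sketch only: the function name `category_distribution` is hypothetical, the predictions are made up, and the label strings assume the model returns the category names rather than generic identifiers.

```python
from collections import Counter
from typing import Dict, Iterable


def category_distribution(predictions: Iterable[dict]) -> Dict[str, float]:
    """Return the share of articles assigned to each predicted label."""
    counts = Counter(p["label"] for p in predictions)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.most_common()}


# Hypothetical predictions in the format returned by the pipeline
predictions = [
    {"label": "Sports", "score": 0.97},
    {"label": "Business", "score": 0.88},
    {"label": "Sports", "score": 0.91},
    {"label": "World News", "score": 0.76},
]
distribution = category_distribution(predictions)
print(distribution)
```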
Out-of-Scope Use
This model should not be used for critical decision-making tasks such as political analysis, stock market predictions, or legal judgments. It may not generalize well to non-Sri Lankan news sources.
Bias, Risks, and Limitations
- The dataset is limited to Daily Mirror Online, which may introduce biases in classification.
- The model might misclassify articles if they contain mixed topics.
- The dataset size is small (1,015 articles), which may impact performance on diverse news sources.
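One way to cope with mixed-topic articles is to flag predictions whose two highest class scores are close together and route them to manual review. The sketch below assumes the per-class scores have already been obtained (in recent versions of transformers, calling the pipeline with `top_k=None` should return a score for every label). The function names, the 0.2 margin, and the example scores are hypothetical.

```python
from typing import List, Tuple


def top_two_margin(scores: List[dict]) -> Tuple[str, str, float]:
    """Return the two highest-scoring labels and the gap between their scores."""
    ranked = sorted(scores, key=lambda s: s["score"], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    return best["label"], runner_up["label"], best["score"] - runner_up["score"]


def is_ambiguous(scores: List[dict], min_margin: float = 0.2) -> bool:
    """Flag a prediction whose top two classes are closer than min_margin."""
    return top_two_margin(scores)[2] < min_margin


# Hypothetical scores for one article that mixes business and politics
scores = [
    {"label": "Business", "score": 0.46},
    {"label": "Political Gossip", "score": 0.41},
    {"label": "World News", "score": 0.07},
    {"label": "Opinion", "score": 0.04},
    {"label": "Sports", "score": 0.02},
]
print(is_ambiguous(scores))  # True
```

The margin threshold is a tuning choice and would need to be calibrated on held-out articles.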
Training Details
Training Data
The dataset comprises 1,015 preprocessed news articles from Daily Mirror Online.
Training Hyperparameters
- Training regime: [More Information Needed]
- Model Architecture: distilbert-base-uncased
- Batch Size: 4
- Epochs: 3
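The stated values imply a fairly short training run. The sketch below works out the number of optimizer steps and shows how the values might map onto `transformers.TrainingArguments`. It assumes a single device, no gradient accumulation, and the 80% training share (812 articles); the card does not confirm that the `Trainer` API was used, and the learning rate and other settings are not stated. The function name and `output_dir` value are hypothetical.

```python
import math

# Values stated in this card
BASE_MODEL = "distilbert-base-uncased"
BATCH_SIZE = 4
EPOCHS = 3
TRAIN_SIZE = 812  # 80% of 1,015 articles

# Assumes a single device and no gradient accumulation
steps_per_epoch = math.ceil(TRAIN_SIZE / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
print(steps_per_epoch, total_steps)  # 203 609


def build_training_args(output_dir: str = "news_classifier_out"):
    """Map the stated values onto TrainingArguments (output_dir is hypothetical)."""
    from transformers import TrainingArguments  # imported lazily
    return TrainingArguments(
        output_dir=output_dir,
        per_device_train_batch_size=BATCH_SIZE,
        num_train_epochs=EPOCHS,
    )
```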
Testing Data, Factors & Metrics
Testing Data
20% of the dataset (203 articles) was used for validation/testing.
Results
The model performed well on the validation data, although no quantitative metrics are reported here. Misclassification occurs when articles have overlapping content.
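To quantify this kind of result, accuracy and the most frequently confused category pairs could be computed on the validation articles. The following is a sketch only: the function names are hypothetical, and the labels are invented for illustration and are not results from this model.

```python
from collections import Counter
from typing import List, Tuple


def accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of articles whose predicted category matches the reference."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confused_pairs(y_true: List[str], y_pred: List[str]) -> List[Tuple[Tuple[str, str], int]]:
    """Count (true, predicted) pairs for misclassified articles, most frequent first."""
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return errors.most_common()


# Hypothetical labels for illustration only; not results from this model
y_true = ["Business", "Opinion", "Sports", "World News", "Opinion"]
y_pred = ["Business", "Political Gossip", "Sports", "World News", "Political Gossip"]
print(accuracy(y_true, y_pred))        # 0.6
print(confused_pairs(y_true, y_pred))  # [(('Opinion', 'Political Gossip'), 2)]
```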
Model Examination
The model effectively classifies Sri Lankan news articles. It can be fine-tuned on larger datasets for improved accuracy.
Model Architecture and Objective
- Model Architecture: distilbert-base-uncased
- Objective: Multiclass text classification
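In a multiclass setup of this kind, the classification head typically produces one logit per category, a softmax turns the logits into probabilities, and the highest-probability category is returned. The sketch below illustrates that step in plain Python; the index order of the categories and the example logits are assumptions, not values taken from the model.

```python
import math
from typing import List

# The five categories named in this card; the index order is an assumption
CATEGORIES = ["Business", "Opinion", "Political Gossip", "Sports", "World News"]


def softmax(logits: List[float]) -> List[float]:
    """Convert raw logits to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits: List[float]) -> str:
    """Pick the category with the highest probability."""
    probs = softmax(logits)
    return CATEGORIES[probs.index(max(probs))]


# Hypothetical logits for one article
print(predict([0.3, -1.2, 0.1, 2.4, -0.5]))  # Sports
```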