YouTube-XLM-Roberta-Base-Sentiment-Multilingual Open-Source Model - Accurately Analyze Sentiments of YouTube Comments

YouTube XLM-RoBERTa Base Sentiment Multilingual

Developed by AmaanP314

A sentiment analysis model for YouTube comments, fine-tuned from cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual, that reaches 80.17% accuracy on a held-out test set of YouTube comments.

Text Classification

Safetensors

#YouTube Comment Sentiment Analysis #Multilingual Support #Fine-tuned RoBERTa

Downloads 91

Release Time : 2/23/2025

Model Overview

A model fine-tuned specifically for sentiment analysis of YouTube comments. It classifies each comment as positive, neutral, or negative, and is suited to video recommendation systems and content analysis.

Model Features

Domain Adaptation Optimization

Specifically optimized for the slang and structural characteristics of YouTube comments

Multilingual Support

Based on the XLM-RoBERTa architecture, which supports multilingual sentiment analysis

High Accuracy

Achieves 80.17% accuracy on a held-out test set of YouTube comments

Model Capabilities

Text Sentiment Classification

Multilingual Text Processing

Short Text Analysis

Use Cases

Content Recommendation

Video Recommendation System

Optimize video recommendation algorithms based on comment sentiment

Enhance user viewing experience

Content Analysis

Audience Feedback Dashboard

Automatically analyze sentiment tendencies in video comments

Help creators understand audience reactions

🚀 Fine-tuned RoBERTa Sentiment Model

This is a fine-tuned version of the XLM-RoBERTa model for sentiment analysis of YouTube comments. It reaches 80.17% accuracy on a held-out test set of YouTube comments.

🚀 Quick Start

The model can be used via an API endpoint or loaded locally using the Hugging Face Transformers library. For example, using Python:

Basic Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "AmaanP314/youtube-xlm-roberta-base-sentiment-multilingual"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example input
comments = [
    "This video aged like honey.", # Positive
    "This video aged like milk.", # Negative
    "It was just okay." # Neutral
]

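# Tokenize the whole batch; pad to the longest comment and truncate overlong ones
inputs = tokenizer(comments, return_tensors="pt", padding=True, truncation=True)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Pick the highest-scoring class for each comment and map it to a label
predictions = torch.argmax(outputs.logits, dim=1)
label_mapping = {0: "Negative", 1: "Neutral", 2: "Positive"}
sentiments = [label_mapping[p.item()] for p in predictions]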
print(sentiments)
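
The example above returns only the top label. When a confidence score is also useful, the raw logits can be turned into probabilities with a softmax. The following is a minimal sketch in plain Python, assuming the logits have been converted to lists with outputs.logits.tolist(); the helper names and the sample logits are hypothetical, not part of the model's API.

```python
import math

# Same id-to-label mapping as in the example above.
LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Convert one row of raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Hypothetical logits for one comment; real values would come from
# outputs.logits.tolist() in the example above.
label, confidence = label_with_confidence([-1.2, 0.3, 2.5])
print(label, round(confidence, 3))
```

A low top probability can be used to flag comments for manual review rather than trusting the label outright.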

✨ Features

Domain-Specific Fine-Tuning: Fine-tuned on a custom dataset of YouTube comments to improve sentiment analysis performance in this specific domain.
High Accuracy: Achieved an accuracy of 80.17% on the YouTube comments dataset.
Multiple Sentiment Labels: Returns a sentiment label of Positive, Neutral, or Negative for each input comment.

📦 Installation

No separate installation step is needed: the model is downloaded from the Hugging Face model hub the first time it is loaded with the Transformers library. The only prerequisites are the transformers and torch libraries in your Python environment.
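
As a quick sanity check before loading the model, the two prerequisites can be looked up with the standard library. This is only a sketch; the helper name missing_packages is hypothetical.

```python
import importlib.util

# Packages imported by the Quick Start example.
REQUIRED = ("transformers", "torch")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Install first: pip install " + " ".join(missing))
else:
    print("transformers and torch are available.")
```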

📚 Documentation

Model Overview

This model is a version of cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual that has been fine-tuned on a custom dataset of YouTube comments. Fine-tuning was intended to improve sentiment analysis of YouTube comments, which often differ in tone, slang, and structure from comments on other social media platforms. After fine-tuning, the model achieved an accuracy of 80.17%.

Intended Use

The model is designed for sentiment analysis of YouTube comments. It accepts a list of text inputs (comments) and returns a sentiment label for each comment:

Positive
Neutral
Negative

This model can be used in applications such as video recommendation systems, content analysis dashboards, and other data analysis tasks where understanding audience sentiment is important.
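
For a dashboard-style use, the per-comment labels usually need to be aggregated per video. A minimal sketch, assuming the labels are the strings produced by the Quick Start example; the function name and the sample predictions are hypothetical.

```python
from collections import Counter

LABELS = ("Positive", "Neutral", "Negative")


def summarize_sentiments(sentiments):
    """Return each label's count and share of the total for one video."""
    counts = Counter(sentiments)
    total = len(sentiments)
    return {
        label: {
            "count": counts[label],
            "share": counts[label] / total if total else 0.0,
        }
        for label in LABELS
    }


# Hypothetical predictions for five comments on one video.
summary = summarize_sentiments(
    ["Positive", "Negative", "Positive", "Neutral", "Positive"]
)
for label, stats in summary.items():
    print(f"{label}: {stats['count']} ({stats['share']:.0%})")
```

Reporting shares alongside counts keeps videos with very different comment volumes comparable.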

How It Was Trained

Base Model: cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual
Dataset: A custom dataset consisting of over 1 million YouTube comments, each annotated with one of three sentiment labels (Positive, Neutral, Negative).
Fine-Tuning Process: The model was fine-tuned using the following steps:
- Data Cleaning and Preprocessing
- Model Fine-Tuning
- Evaluation and Testing
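
The source does not say what the data cleaning step involved. The sketch below shows a few cleaning operations that are common for scraped comments (unescaping HTML entities, dropping links, collapsing whitespace, removing empty and duplicate rows). These are assumptions for illustration, not the author's pipeline, and all names and sample rows are hypothetical.

```python
import html
import re

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
WHITESPACE_PATTERN = re.compile(r"\s+")


def clean_comment(text):
    """Apply illustrative cleaning steps (assumed, not the author's)."""
    text = html.unescape(text)                # "&amp;" -> "&"
    text = URL_PATTERN.sub(" ", text)         # drop links
    text = WHITESPACE_PATTERN.sub(" ", text)  # collapse runs of whitespace
    return text.strip()


def prepare_dataset(rows):
    """Clean (comment, label) pairs and drop empty or duplicate comments."""
    seen = set()
    cleaned = []
    for comment, label in rows:
        comment = clean_comment(comment)
        if comment and comment not in seen:
            seen.add(comment)
            cleaned.append((comment, label))
    return cleaned


# Hypothetical raw rows; the real dataset's format is not described.
rows = [
    ("Great   video!! https://example.com", "Positive"),
    ("Great video!!", "Positive"),
    ("   ", "Neutral"),
]
print(prepare_dataset(rows))
```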

Evaluation

The model was evaluated on a held-out test set of YouTube comments. Accuracy rose from a baseline of approximately 69.3% (the base model, fine-tuned only on Twitter data) to 80.17% on this dataset, a gain of about 10.9 percentage points. This improvement demonstrates the benefit of domain-specific fine-tuning.
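
Accuracy here is the fraction of test comments whose predicted label matches the annotated one. A minimal sketch of that computation follows; the five sample labels are hypothetical and are not drawn from the real test set.

```python
def accuracy(predicted, expected):
    """Fraction of predictions that match the reference labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Hypothetical labels for five test comments (not the real test set).
expected = ["Positive", "Negative", "Neutral", "Positive", "Negative"]
predicted = ["Positive", "Negative", "Positive", "Positive", "Negative"]
print(f"accuracy: {accuracy(predicted, expected):.2%}")

# Gain between the two accuracies reported in the text, in percentage points.
baseline, fine_tuned = 69.3, 80.17
print(f"gain: {fine_tuned - baseline:.2f} percentage points")
```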

🔧 Technical Details

The model is based on the XLM-RoBERTa architecture and fine-tuned on a large dataset of YouTube comments. Fine-tuning adjusts the model's weights to better fit the characteristics of YouTube comments, such as their distinctive language style and sentiment distribution. On the test set, the fine-tuned model is substantially more accurate than the model fine-tuned only on Twitter data.

📄 License

The model is licensed under the CC-BY-4.0 license.

Information Table

Property	Details
Model Type	Fine-tuned XLM-RoBERTa for sentiment analysis
Training Data	A custom dataset of over 1 million YouTube comments from AmaanP314/youtube-comment-sentiment
Metrics	Accuracy
Base Model	cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual
Pipeline Tag	Text Classification
Tags	youtube, comments, sentiment, roberta

Citation

If you use this model in your research, please cite the original base model and this project:

@misc{cardiffnlp,
  title={Twitter-XLM-RoBERTa-Base-Sentiment-Multilingual},
  author={Cardiff NLP},
  year={2020},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual}}
}

@misc{AmaanP314,
  title={Youtube-XLM-RoBERTa-Base-Sentiment-Multilingual},
  author={Amaan Poonawala},
  year={2025},
  howpublished={\url{https://huggingface.co/AmaanP314/youtube-xlm-roberta-base-sentiment-multilingual}}
}
