bert-sentiment-analisis-indo: a free, open-source model for positive/negative sentiment classification of Indonesian text

Bert Sentiment Analisis Indo

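Developed by bibrani

A BERT-based sentiment analysis model for Indonesian that classifies text as positive or negative.

Text classification

Safetensors

License: MIT #Indonesian sentiment analysis #fine-tuned BERT #high-accuracy classification

Downloads: 39

Released: 3/20/2025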

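Model overview

The model is fine-tuned specifically for sentiment analysis of Indonesian text and identifies whether a text expresses positive or negative sentiment.

Model features

High accuracy

Achieves an accuracy of 0.91 on the evaluation dataset, with an F1 score of 0.93 for the positive class.

Optimized for Indonesian

Fine-tuned specifically on Indonesian text so that it better captures the characteristics of the language.

Efficient inference

Built on the BERT architecture, it classifies text in a reasonable amount of time.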
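
For readers less familiar with the two metrics quoted under "High accuracy", the sketch below shows how accuracy and the positive-class F1 score are computed from true and predicted labels. The label lists are invented toy values, not the model's evaluation data, and the helper name `accuracy_and_f1` is hypothetical.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 of the `positive` class) for two equal-length label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives

    accuracy = correct / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy labels for illustration only (0 = negative, 1 = positive).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2f}, positive-class F1={f1:.2f}")
```

Reporting F1 alongside accuracy is useful when the two classes are not equally frequent, since accuracy alone can hide poor performance on the rarer class.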

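Model capabilities

Sentiment analysis of Indonesian text

Binary sentiment classification (positive/negative)

Natural language processing

Use cases

Social media analysis

Comment sentiment analysis

Analyze the sentiment of user comments on social media

Identify positive and negative comments

Customer feedback analysis

Product review classification

Automatically classify product reviews on e-commerce platforms

Help businesses quickly gauge customer satisfaction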
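
As a rough illustration of the customer feedback use case, the sketch below aggregates per-review predictions into a simple satisfaction summary. It assumes each review has already been classified as 0 (negative) or 1 (positive); the function name `satisfaction_summary` and the sample predictions are made up for this example.

```python
from collections import Counter


def satisfaction_summary(predicted_ids):
    """Summarise 0/1 predictions as counts plus the share of positive reviews."""
    counts = Counter(predicted_ids)
    total = sum(counts.values())
    positive_share = counts[1] / total if total else 0.0
    return {
        "negative": counts[0],
        "positive": counts[1],
        "positive_share": positive_share,
    }


# Invented predictions for eight reviews (0 = negative, 1 = positive).
summary = satisfaction_summary([1, 1, 0, 1, 0, 1, 1, 1])
print(summary)
```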

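🚀 BERT-based sentiment analysis model for Indonesian

This repository contains a BERT model fine-tuned for sentiment analysis. The model is trained to classify text into two sentiment classes: 0 (negative) and 1 (positive). Its performance and training details are summarized below.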
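
The 0/1 class IDs can be turned into readable output with a small mapping. The sketch below assumes the classifier returns two raw logits in the order [negative, positive], as the class description above implies. `ID2LABEL` and `label_from_logits` are illustrative names, and the logit values are invented.

```python
import math

# Class IDs as described above: 0 = negative, 1 = positive.
ID2LABEL = {0: "negative", 1: "positive"}


def label_from_logits(logits):
    """Map a pair of raw logits to (label, softmax probability of that label)."""
    if len(logits) != len(ID2LABEL):
        raise ValueError("expected one logit per class")

    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Invented logits for illustration.
label, prob = label_from_logits([-1.2, 2.3])
print(label, round(prob, 3))
```

Returning the probability next to the label lets downstream code treat low-confidence predictions differently, for example by routing them to manual review.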

🚀 Quick start

Install dependencies

Make sure the required libraries are installed:

pip install transformers torch

Load the model

Load the fine-tuned BERT model with the transformers library:

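from transformers import BertForSequenceClassification, BertTokenizer

# Load the fine-tuned model and tokenizer.
# "path_to_model" and "path_to_tokenizer" are placeholders for a local directory or a Hub model ID.
model = BertForSequenceClassification.from_pretrained("path_to_model")
tokenizer = BertTokenizer.from_pretrained("path_to_tokenizer")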
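
As an alternative to loading the model and tokenizer by hand, the transformers `pipeline` helper wraps both steps. The sketch below is one possible way to use it, not the author's documented method. It assumes the Hub ID `bibrani/bert-sentiment-analisis-indo` used elsewhere on this page; `build_classifier`, `describe`, and `main` are hypothetical helper names, and the two sample sentences are invented.

```python
def build_classifier(model_id="bibrani/bert-sentiment-analisis-indo"):
    """Create a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def describe(result):
    """Format one pipeline result dict of the form {"label": ..., "score": ...}."""
    return f"{result['label']} ({result['score']:.2f})"


def main():
    classifier = build_classifier()
    # Invented sample sentences for illustration.
    samples = ["makanannya enak sekali", "pelayanannya sangat lambat"]
    for text, result in zip(samples, classifier(samples)):
        print(f"{text}: {describe(result)}")
```

Calling `main()` downloads the model from the Hugging Face Hub on first use. The label strings come from the model's configuration and may appear as generic names such as `LABEL_0` and `LABEL_1`, which would correspond to the 0 (negative) and 1 (positive) classes.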

Preprocess and predict

Tokenize the input text and run a prediction:

from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load the saved model and tokenizer
model_path = 'bibrani/bert-sentiment-analisis-indo'
tokenizer = BertTokenizer.from_pretrained(model_path)
model = BertForSequenceClassification.from_pretrained(model_path)

# Select the device and switch the model to inference mode
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()
print(device)

def predict_sentiment(text):
    """Predict the sentiment of the given text.

    Args:
        text (str): Input text.

    Returns:
        str: "Negative sentiment" or "Positive sentiment".
    """
    # Tokenize the input (truncated to 512 tokens) and move the tensors to the device
    inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt").to(device)

    # Run inference without tracking gradients
    with torch.no_grad():
        logits = model(**inputs).logits

    # Class 0 is negative, class 1 is positive
    predicted_class = torch.argmax(logits, dim=1).item()
    return "Negative sentiment" if predicted_class == 0 else "Positive sentiment"

# Example usage
text_to_predict = "jadi cerita nya saya sedang ingin makan spaghetti dengan meatball yang kalau menurut ekspektasi saya adalah bakso yang terbuat dari cingcang yang biasa digunakan di menu pasta , setelah sampai , ternyata bakso yang digunakan adalah bakso olahan yang biasa dipakai di tukang bakso , bahkan bentuk nya tidak bulat"
sentiment = predict_sentiment(text_to_predict)
print(f"Text: {text_to_predict}")
print(f"Sentiment: {sentiment}")
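
When many texts need to be classified, tokenizing and scoring them in batches is usually faster than calling the model once per text. The sketch below is one way to do this under the assumption that `model`, `tokenizer`, and `device` have been created as in the quick-start code; `chunked` and `predict_batch` are hypothetical helper names, and the default batch size of 16 is an arbitrary choice.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` containing at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def predict_batch(texts, model, tokenizer, device, batch_size=16):
    """Return a class ID (0 = negative, 1 = positive) for every text."""
    # Imported lazily so the chunking helper works without torch installed.
    import torch

    predictions = []
    for batch in chunked(list(texts), batch_size):
        # Pad only to the longest text in the batch; truncate at 512 tokens.
        inputs = tokenizer(
            batch,
            padding=True,
            truncation=True,
            max_length=512,
            return_tensors="pt",
        ).to(device)
        with torch.no_grad():
            logits = model(**inputs).logits
        predictions.extend(torch.argmax(logits, dim=1).tolist())
    return predictions
```

With the objects from the quick start in scope, a call such as `predict_batch(reviews, model, tokenizer, device)` returns one class ID per entry of `reviews`, a list of strings.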

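✨ Key features

The model performs sentiment analysis on Indonesian text, classifying it as positive or negative, and achieves an accuracy of 0.91 on the evaluation dataset.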
