roberta-base-go_emotions-onnx open-source model: free to deploy for multi-label sentiment analysis tasks

Roberta Base Go Emotions Onnx

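Developed by SamLowe

This is the ONNX version of the RoBERTa-base-go_emotions model. It is available in full-precision and INT8-quantized form and is used for multi-label sentiment analysis tasks.

Text Classification

Transformers

Language: English. License: MIT. Tags: #multi-label emotion classification #ONNX acceleration #INT8 quantization

Downloads: 41.50k

Released: 9/28/2023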

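Model Overview

A sentiment analysis model based on the RoBERTa architecture. It performs multi-label classification, so it can identify several emotions in a single text. It is provided in two ONNX formats, full precision and INT8-quantized, to speed up inference.

Model Features

ONNX optimization

Provided in two ONNX formats, full precision and INT8-quantized, for faster inference than the original Transformers model.

Efficient inference

On an 8-core 11th-generation i7 CPU, the quantized model is about 5x faster than the original Transformers model at a batch size of 1.

Multi-label classification

Identifies multiple emotions in the same text, which suits complex sentiment analysis scenarios.

Accuracy retained

The quantized model is much smaller while retaining nearly the same accuracy.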
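
To make the multi-label feature above concrete, the sketch below contrasts a per-label sigmoid with a softmax over the same logits. The logit values and the three-label subset are made up for illustration and are not outputs of this model; the 0.5 cut-off is likewise an assumption. With sigmoid, each label is scored independently, so several labels can clear the cut-off at once, whereas softmax makes the labels compete for a total of 1.

```python
import math


def sigmoid(x: float) -> float:
    """Independent per-label score in the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    """Scores that compete with each other and sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values, chosen only to illustrate the difference.
labels = ["admiration", "gratitude", "neutral"]
logits = [2.0, 1.5, -3.0]

sigmoid_scores = [sigmoid(x) for x in logits]
softmax_scores = softmax(logits)

# Labels whose score reaches the (assumed) 0.5 cut-off under each scheme.
multi_label = [l for l, s in zip(labels, sigmoid_scores) if s >= 0.5]
single_label = [l for l, s in zip(labels, softmax_scores) if s >= 0.5]

print("sigmoid:", multi_label)
print("softmax:", single_label)
```

In this toy case two labels pass under sigmoid but only one under softmax, which is why a sigmoid is the usual choice for multi-label heads.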

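Model Capabilities

Sentiment analysis

Multi-label classification

Text understanding

Use Cases

Sentiment analysis

Social media sentiment monitoring

Analyzes user sentiment in social media posts and identifies multiple emotional tendencies.

Can identify multiple emotion labels, such as admiration and gratitude.

Customer feedback analysis

Processes customer feedback text and automatically classifies it along multiple emotion dimensions.

Helps businesses quickly understand the distribution of customer sentiment.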

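🚀 Text Classification ONNX Model

This project provides an ONNX-based text classification model for multi-class and multi-label emotion classification tasks. The model is based on the RoBERTa architecture and trained on the go_emotions dataset, and it offers high accuracy and fast inference.

✨ Key Features

ONNX versions: available as full-precision and INT8-quantized ONNX files; choose whichever fits your needs.
High accuracy: the full-precision ONNX version has the same accuracy and metrics as the original Transformers model.
Fast inference: faster than the plain Transformers model, especially for small batches.
Small model size: the quantized version is only about a quarter the size of the full-precision model (125MB versus 499MB).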

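📦 Installation Guide

The project does not list specific installation steps. Install the libraries you plan to use (such as transformers, optimum, and onnxruntime), for example with the following command: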

pip install transformers optimum onnxruntime
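
After installing, it can help to confirm which versions ended up in the environment. The snippet below is a minimal sketch using only the standard library; `installed_versions` is a hypothetical helper name, and the package list simply mirrors the command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    report = installed_versions(["transformers", "optimum", "onnxruntime"])
    for name, ver in report.items():
        print(f"{name}: {ver or 'not installed'}")
```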

💻 Usage Examples

Basic Usage

Text classification using the ONNX classes from the Optimum library:

sentences = ["ONNX is seriously fast for small batches. Impressive"]

from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSequenceClassification

model_id = "SamLowe/roberta-base-go_emotions-onnx"
file_name = "onnx/model_quantized.onnx"

model = ORTModelForSequenceClassification.from_pretrained(model_id, file_name=file_name)
tokenizer = AutoTokenizer.from_pretrained(model_id)

onnx_classifier = pipeline(
    task="text-classification",
    model=model,
    tokenizer=tokenizer,
    top_k=None,
    function_to_apply="sigmoid",  # optional as is the default for the task
)

model_outputs = onnx_classifier(sentences)
# gives a list of outputs, each a list of dicts (one per label)

print(model_outputs)
# E.g.
# [[{'label': 'admiration', 'score': 0.9203393459320068},
#   {'label': 'approval', 'score': 0.0560273639857769},
#   {'label': 'neutral', 'score': 0.04265536740422249},
#   {'label': 'gratitude', 'score': 0.015126707963645458},
# ...
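
The pipeline returns one dict per label for every input. For multi-label use, a common follow-up is to keep every label whose score clears a threshold. The sketch below assumes that approach; `labels_above_threshold` is a hypothetical helper, the 0.5 default is an illustrative choice rather than a value recommended by the model card, and the sample data is shaped like the example output above with scores rounded and the remaining labels omitted.

```python
def labels_above_threshold(model_outputs, threshold=0.5):
    """For each input, keep (label, score) pairs with score >= threshold, highest first."""
    selected = []
    for per_label in model_outputs:
        kept = [(d["label"], d["score"]) for d in per_label if d["score"] >= threshold]
        kept.sort(key=lambda pair: pair[1], reverse=True)
        selected.append(kept)
    return selected


# Shaped like the pipeline output: a list (one entry per input) of lists of dicts.
example_outputs = [[
    {"label": "admiration", "score": 0.9203},
    {"label": "approval", "score": 0.0560},
    {"label": "neutral", "score": 0.0427},
    {"label": "gratitude", "score": 0.0151},
]]

print(labels_above_threshold(example_outputs))
print(labels_above_threshold(example_outputs, threshold=0.05))
```

With the default threshold only admiration is kept for this input; lowering the threshold to 0.05 also admits approval. A suitable threshold depends on the application and may differ per label.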

Advanced Usage

Text classification using ONNX Runtime directly:

from tokenizers import Tokenizer
import onnxruntime as ort

from os import cpu_count
import numpy as np  # only used for the postprocessing sigmoid

sentences = ["hello world"]  # for example a batch of 1

# labels as (ordered) list - from the go_emotions dataset
labels = ['admiration', 'amusement', 'anger', 'annoyance', 'approval', 'caring', 'confusion', 'curiosity', 'desire', 'disappointment', 'disapproval', 'disgust', 'embarrassment', 'excitement', 'fear', 'gratitude', 'grief', 'joy', 'love', 'nervousness', 'optimism', 'pride', 'realization', 'relief', 'remorse', 'sadness', 'surprise', 'neutral']

tokenizer = Tokenizer.from_pretrained("SamLowe/roberta-base-go_emotions")

# Optional - set pad to only pad to longest in batch, not a fixed length.
# (without this, the model will run slower, esp for shorter input strings)
params = {**tokenizer.padding, "length": None}
tokenizer.enable_padding(**params)

tokens_obj = tokenizer.encode_batch(sentences)

def load_onnx_model(model_filepath):
    _options = ort.SessionOptions()
    _options.inter_op_num_threads, _options.intra_op_num_threads = cpu_count(), cpu_count()
    _providers = ["CPUExecutionProvider"]  # could use ort.get_available_providers()
    return ort.InferenceSession(path_or_bytes=model_filepath, sess_options=_options, providers=_providers)

model = load_onnx_model("path_to_model_dot_onnx_or_model_quantized_dot_onnx")
output_names = [model.get_outputs()[0].name]  # E.g. ["logits"]

input_feed_dict = {
    "input_ids": [t.ids for t in tokens_obj],
    "attention_mask": [t.attention_mask for t in tokens_obj],
}

logits = model.run(output_names=output_names, input_feed=input_feed_dict)[0]
# produces a numpy array, one row per input item, one col per label

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Post-processing: map the logits to per-label scores in the range (0, 1).
# Transformers' pipeline does this automatically, but with ORT we must do it manually.
model_outputs = sigmoid(logits)

# for example, just to show the top result per input item
for probas in model_outputs:
    top_result_index = np.argmax(probas)
    print(labels[top_result_index], "with score:", probas[top_result_index])
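
The loop above shows only the single best label per input. Since the model is multi-label, it may be more useful to rank several labels per row, in a shape similar to the pipeline output from the basic example. The sketch below is one way to do that under stated assumptions: `top_k_labels` is a hypothetical helper, and the demo scores and three-label list are invented so the block runs without a model. It accepts any row-iterable of scores, so it should also work with the array produced by `sigmoid(logits)`.

```python
def top_k_labels(score_rows, labels, k=3):
    """Rank labels per row by score and return the top k as label/score dicts."""
    ranked = []
    for row in score_rows:
        if len(row) != len(labels):
            raise ValueError("each row needs exactly one score per label")
        pairs = sorted(zip(labels, row), key=lambda pair: pair[1], reverse=True)
        ranked.append([{"label": l, "score": float(s)} for l, s in pairs[:k]])
    return ranked


# Invented scores for two inputs and three labels, for illustration only.
demo_labels = ["anger", "joy", "surprise"]
demo_scores = [
    [0.10, 0.85, 0.40],
    [0.70, 0.05, 0.65],
]

for item in top_k_labels(demo_scores, demo_labels, k=2):
    print(item)
```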

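📚 Detailed Documentation

Model Versions

Full-precision ONNX version: onnx/model.onnx. It has the same accuracy and metrics as the original Transformers model, is 499MB in size, and runs inference faster than the plain Transformers model, especially for small batches.
Quantized (INT8) ONNX version: onnx/model_quantized.onnx. It is 125MB in size, retains almost all of the accuracy of the full-precision model, and runs inference faster than both the full-precision ONNX version and the plain Transformers model.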
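
Speed differences such as these depend on hardware, batch size, and input length, so it is worth measuring them in your own environment. The harness below is a minimal sketch for doing so: `median_latency_ms` and `speedup` are hypothetical helpers, and the dummy classifier is a stand-in so the block runs without downloading a model. In practice you would pass in callables that wrap the Transformers pipeline and the ONNX classifiers.

```python
import time
from statistics import median


def median_latency_ms(classify, batch, warmup=3, repeats=20):
    """Median wall-clock time in milliseconds of classify(batch) over `repeats` runs."""
    for _ in range(warmup):  # warm-up runs are not timed
        classify(batch)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        classify(batch)
        timings.append((time.perf_counter() - start) * 1000.0)
    return median(timings)


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    # Stand-in callable; replace with a real classifier to get meaningful numbers.
    def dummy_classifier(batch):
        return [len(text) for text in batch]

    ms = median_latency_ms(dummy_classifier, ["hello world"])
    print(f"median latency: {ms:.4f} ms")
```

Using the median rather than the mean keeps a single slow run (for example, one interrupted by another process) from skewing the comparison.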