🚀 ONNX Model for Emotion Classification
This model is designed for text classification, specifically multi-class and multi-label emotion classification. It leverages the ONNX format to offer faster inference, especially at small batch sizes, while maintaining high accuracy.
✨ Features
- ONNX versions: available in both full-precision and quantized (INT8) variants.
- Performance: faster inference than the regular Transformers model, especially at small batch sizes.
- Accuracy: metrics comparable to the original Transformers model.
- Size: the quantized model is one-quarter the size of the full-precision model.
📦 Installation
The examples below assume transformers, optimum (with ONNX Runtime support), tokenizers, onnxruntime and numpy are installed.
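A typical setup (suggested here, not taken from the original model card) installs the libraries the examples rely on:

```bash
pip install transformers "optimum[onnxruntime]" tokenizers numpy
```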
💻 Usage Examples
Basic Usage
sentences = ["ONNX is seriously fast for small batches. Impressive"]
from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSequenceClassification
model_id = "SamLowe/roberta-base-go_emotions-onnx"
file_name = "onnx/model_quantized.onnx"
model = ORTModelForSequenceClassification.from_pretrained(model_id, file_name=file_name)
tokenizer = AutoTokenizer.from_pretrained(model_id)
onnx_classifier = pipeline(
task="text-classification",
model=model,
tokenizer=tokenizer,
top_k=None,
function_to_apply="sigmoid",
)
model_outputs = onnx_classifier(sentences)
print(model_outputs)
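Because the task is multi-label, a sentence can express several emotions at once. A minimal post-processing sketch, continuing from the example above, that keeps every label scoring above the same fixed 0.5 threshold used in the Metrics sections below:

```python
# model_outputs is a list (one entry per input sentence) of lists of
# {"label": ..., "score": ...} dicts, one dict per emotion label
threshold = 0.5  # the fixed threshold used for the reported metrics
for sentence, scores in zip(sentences, model_outputs):
    predicted = [d["label"] for d in scores if d["score"] >= threshold]
    print(sentence, "->", predicted)
```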
Advanced Usage
```python
from os import cpu_count

import numpy as np
import onnxruntime as ort
from tokenizers import Tokenizer

sentences = ["hello world"]  # e.g. a batch of 1

labels = ['admiration', 'amusement', 'anger', 'annoyance', 'approval', 'caring', 'confusion', 'curiosity', 'desire', 'disappointment', 'disapproval', 'disgust', 'embarrassment', 'excitement', 'fear', 'gratitude', 'grief', 'joy', 'love', 'nervousness', 'optimism', 'pride', 'realization', 'relief', 'remorse', 'sadness', 'surprise', 'neutral']

tokenizer = Tokenizer.from_pretrained("SamLowe/roberta-base-go_emotions")

# Pad only to the longest sequence in the batch, not to a fixed length
params = {**tokenizer.padding, "length": None}
tokenizer.enable_padding(**params)

tokens_obj = tokenizer.encode_batch(sentences)

def load_onnx_model(model_filepath):
    # Use all CPU cores for both inter- and intra-op parallelism
    _options = ort.SessionOptions()
    _options.inter_op_num_threads, _options.intra_op_num_threads = cpu_count(), cpu_count()
    _providers = ["CPUExecutionProvider"]
    return ort.InferenceSession(path_or_bytes=model_filepath, sess_options=_options, providers=_providers)

model = load_onnx_model("path_to_model_dot_onnx_or_model_quantized_dot_onnx")

output_names = [model.get_outputs()[0].name]
# Explicit int64 arrays keep the input dtypes consistent across platforms
input_feed_dict = {
    "input_ids": np.array([t.ids for t in tokens_obj], dtype=np.int64),
    "attention_mask": np.array([t.attention_mask for t in tokens_obj], dtype=np.int64),
}

logits = model.run(output_names=output_names, input_feed=input_feed_dict)[0]

# Multi-label model, so apply a sigmoid to each logit independently
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

model_outputs = sigmoid(logits)

# Print only the single highest-scoring label per sentence
for probas in model_outputs:
    top_result_index = np.argmax(probas)
    print(labels[top_result_index], "with score:", probas[top_result_index])
```
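Since the quantized model delivers almost all of the full-precision accuracy (see the Documentation section below), its outputs should closely track the full-precision ones. A quick agreement check, continuing from the example above (the local file paths are assumed to point at the two downloaded .onnx files):

```python
full = load_onnx_model("onnx/model.onnx")
quant = load_onnx_model("onnx/model_quantized.onnx")

p_full = sigmoid(full.run(output_names=[full.get_outputs()[0].name], input_feed=input_feed_dict)[0])
p_quant = sigmoid(quant.run(output_names=[quant.get_outputs()[0].name], input_feed=input_feed_dict)[0])

# Small per-label differences are expected from INT8 quantization
print("max abs difference in probabilities:", np.abs(p_full - p_quant).max())
```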
📚 Documentation
Full precision ONNX version
`onnx/model.onnx` is the full-precision ONNX version:
- It has identical accuracy/metrics to the original Transformers model.
- It has the same model size (499MB).
- It is faster in inference than the regular Transformers model, particularly at smaller batch sizes. In tests on an 8-core 11th-gen i7 CPU using ONNXRuntime, it is about 2x to 3x as fast for a batch size of 1 (a rough timing sketch follows below).
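The speed figures quoted here come from the author's own tests. A sketch for reproducing a similar batch-size-1 comparison on your own hardware (a simple harness, not the author's benchmark):

```python
import time

from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSequenceClassification

model_id = "SamLowe/roberta-base-go_emotions-onnx"
onnx_pipe = pipeline(
    "text-classification",
    model=ORTModelForSequenceClassification.from_pretrained(model_id, file_name="onnx/model.onnx"),
    tokenizer=AutoTokenizer.from_pretrained(model_id),
    top_k=None,
    function_to_apply="sigmoid",
)
# Plain PyTorch Transformers pipeline for comparison
torch_pipe = pipeline(
    "text-classification",
    model="SamLowe/roberta-base-go_emotions",
    top_k=None,
    function_to_apply="sigmoid",
)

def mean_latency_seconds(pipe, n=50):
    batch = ["ONNX is seriously fast for small batches. Impressive"]  # batch size 1
    pipe(batch)  # warm-up run before timing
    start = time.perf_counter()
    for _ in range(n):
        pipe(batch)
    return (time.perf_counter() - start) / n

print("transformers:", mean_latency_seconds(torch_pipe))
print("onnx        :", mean_latency_seconds(onnx_pipe))
```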
Metrics
Using a fixed threshold of 0.5 to convert the scores to binary predictions for each label:
| Metric | Value |
|-----------|-------|
| Accuracy | 0.474 |
| Precision | 0.575 |
| Recall | 0.396 |
| F1 | 0.450 |
See the SamLowe/roberta-base-go_emotions model card for more details on the gains possible from selecting label-specific thresholds to maximise F1 (or another metric).
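The averaging convention behind the reported precision/recall/F1 is not stated in this section. A sketch of how such multi-label metrics could be computed with scikit-learn, with placeholder random arrays standing in for the real go_emotions test labels and model outputs, and macro averaging taken as an assumption:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Placeholder arrays: in practice y_true comes from the go_emotions test split
# and y_pred from thresholding the model's sigmoid outputs at 0.5
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(100, 28))
probas = rng.random(size=(100, 28))
y_pred = (probas >= 0.5).astype(int)

print("accuracy :", accuracy_score(y_true, y_pred))  # exact-match (subset) accuracy
print("precision:", precision_score(y_true, y_pred, average="macro", zero_division=0))
print("recall   :", recall_score(y_true, y_pred, average="macro", zero_division=0))
print("f1       :", f1_score(y_true, y_pred, average="macro", zero_division=0))
```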
Quantized (INT8) ONNX version
`onnx/model_quantized.onnx` is the INT8-quantized version (one possible production recipe is sketched after the list below):
- It is one-quarter the size (125MB) of the full-precision model.
- It delivers almost all of the accuracy.
- It is faster in inference than both the full-precision ONNX version and the regular Transformers model. On an 8-core 11th-gen i7 CPU using ONNXRuntime, it is about 2x as fast as the full-precision model for a batch size of 1, which makes it circa 5x as fast as the full-precision Transformers model.
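The recipe used to produce `onnx/model_quantized.onnx` is not documented here; dynamic INT8 quantization with Hugging Face Optimum would be one plausible approach along these lines (the output directories and quantization config are assumptions):

```python
from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Export the original Transformers model to full-precision ONNX
model = ORTModelForSequenceClassification.from_pretrained(
    "SamLowe/roberta-base-go_emotions", export=True
)
model.save_pretrained("go_emotions_onnx")  # hypothetical local directory

# Dynamic (weight-only) INT8 quantization; this config choice is an assumption
quantizer = ORTQuantizer.from_pretrained("go_emotions_onnx")
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir="go_emotions_onnx_int8", quantization_config=qconfig)
```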
Metrics for Quantized (INT8) Model
Using a fixed threshold of 0.5 to convert the scores to binary predictions for each label:
| Metric | Value |
|-----------|-------|
| Accuracy | 0.475 |
| Precision | 0.582 |
| Recall | 0.398 |
| F1 | 0.447 |

Note that these metrics are almost identical to the full-precision metrics.
Example notebook
A notebook showing usage, accuracy and performance in more detail is to follow.
📄 License
This model is released under the MIT license.