llama-2-7b-absa open-source model: free to deploy, identifies aspects in text and analyzes their sentiment

Llama 2 7b Absa

Developed by Orkhan

An ABSA model fine-tuned from Llama-2-7b that identifies the aspects mentioned in a text and analyzes the sentiment expressed toward each one.

Large Language Model

Transformers

Multilingual | Open-source license: Apache-2.0 | #Aspect-Based Sentiment Analysis #Zero-Shot Generalization #Fine-Grained Sentiment Recognition

Downloads: 124

Release date: 8/7/2023

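Model Overview

This model is a fine-tuned version of Llama-2-7b built specifically for aspect-based sentiment analysis (ABSA). Given a sentence, it identifies the specific aspects mentioned and determines the sentiment polarity of each.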

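Model Features

Domain generalization

Performs sentiment analysis without annotated data from the target domain.

Multi-element recognition

Recognizes aspects, opinions, sentiments, and phrase combinations at the same time.

Efficient fine-tuning

Uses LoRA for parameter-efficient fine-tuning.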
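To give a sense of why LoRA counts as parameter-efficient, the arithmetic below compares the size of one full projection matrix with a low-rank adapter on it. The hidden size of 4096 is Llama-2-7b's; the rank of 64 is an assumption chosen for illustration, since this page does not state the LoRA settings used for this model.

```python
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters of a LoRA adapter: B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_out + d_in)

HIDDEN = 4096  # Llama-2-7b hidden size
RANK = 64      # hypothetical rank, not taken from the model card

full = HIDDEN * HIDDEN                          # one full projection matrix
adapter = lora_param_count(HIDDEN, HIDDEN, RANK)
ratio = adapter / full

print(f"full matrix:  {full:,} parameters")
print(f"LoRA adapter: {adapter:,} parameters ({ratio:.2%} of the full matrix)")
```

A smaller rank shrinks the adapter proportionally, at the cost of a less expressive update.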

Model Capabilities

Aspect recognition

Sentiment analysis

Opinion extraction

Phrase combination generation

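Use Cases

Sentiment analysis

Product review analysis

Analyze the sentiment that user reviews express toward different aspects of a product.

Identifies the product features mentioned in a review and the user sentiment attached to each.

Social media monitoring

Monitor user feedback on the various aspects of a brand.

Extracts the brand aspects users are discussing and their sentiment polarity.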
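For review analysis or brand monitoring, per-sentence results usually need to be rolled up per aspect. The following is a minimal sketch of that step, assuming each parsed result is a dict with `aspects` and `sentiments` lists, the shape produced by the usage example further down this page. The function name and the sample data are hypothetical.

```python
from collections import Counter, defaultdict

def aggregate_aspect_sentiment(results):
    """Count sentiment labels per aspect across many parsed sentences."""
    summary = defaultdict(Counter)
    for result in results:
        # Aspects and sentiments are aligned by position
        for aspect, sentiment in zip(result["aspects"], result["sentiments"]):
            summary[aspect.lower()][sentiment] += 1
    return {aspect: dict(counts) for aspect, counts in summary.items()}

# Hypothetical parsed results for three review sentences
reviews = [
    {"aspects": ["battery", "screen"], "sentiments": ["Negative", "Positive"]},
    {"aspects": ["battery"], "sentiments": ["Negative"]},
    {"aspects": ["Screen", "price"], "sentiments": ["Positive", "Negative"]},
]

summary = aggregate_aspect_sentiment(reviews)
print(summary)
```

Aspects are lower-cased before counting so that "Screen" and "screen" land in the same bucket; real data would likely need further normalization, such as merging synonyms.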

🚀 Orkhan/llama-2-7b-absa

"Orkhan/llama-2-7b-absa" is a fine-tuned version of the Llama-2-7b model optimized for Aspect-Based Sentiment Analysis (ABSA). It was trained on a manually labelled dataset of 2,000 sentences to identify aspects and analyze the sentiment expressed toward them.

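🚀 Quick Start

"Orkhan/llama-2-7b-absa" is the Llama-2-7b model fine-tuned for Aspect-Based Sentiment Analysis (ABSA) on a manually labelled dataset of 2,000 sentences. The fine-tuning teaches the model to pick out aspects and judge the sentiment toward each, which makes it useful for nuanced sentiment analysis across a range of applications.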

Note for inference: the model was trained on individual sentences, not paragraphs, so pass it one sentence at a time. It runs in a free Google Colab notebook with a T4 GPU. Google Colab link
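Because the model expects single sentences, longer text has to be split before inference. The sketch below shows one naive way to do that with a regular expression. It is an illustration, not part of the model card: `split_sentences` is a hypothetical helper, and it will mis-split abbreviations such as "e.g." or "Dr.".

```python
import re
from typing import List

def split_sentences(paragraph: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [part for part in parts if part]

paragraph = (
    "The food was delicious. The service was slow! "
    "Would I come back? Probably."
)

for sentence in split_sentences(paragraph):
    # Each sentence would be passed to the model separately
    print(sentence)
```

For production text, a dedicated sentence tokenizer is likely a better fit than this regex.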

✨ Key Features

Given a sentence as input, the model returns the aspects, opinions, and sentiments it contains, along with phrases formed as opinion + aspect.

📦 Installation

!pip install -q accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7

💻 Usage Examples

Basic Usage

# process_prompt and base_model are defined in the later sections of this page
prompt = "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere."
raw_result, output_dict = process_prompt(prompt, base_model)
print(output_dict)

>>>{'user_prompt': "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
    'interpreted_input': " Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
    'aspects': ['weather', 'birds', 'smell'],
    'opinions': ['nice', 'flying', 'bad'],
    'sentiments': ['Positive', 'Positive', 'Negative'],
    'phrases': ['nice weather', 'flying birds', 'bad smell']}
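The lists in the output are aligned by position: the first aspect goes with the first opinion and the first sentiment. A short sketch of turning them into one record per aspect follows; `to_triplets` is a hypothetical helper, not part of the model card.

```python
def to_triplets(output_dict):
    """Pair each aspect with its opinion and sentiment by position."""
    return [
        {"aspect": aspect, "opinion": opinion, "sentiment": sentiment}
        for aspect, opinion, sentiment in zip(
            output_dict["aspects"],
            output_dict["opinions"],
            output_dict["sentiments"],
        )
    ]

# The dictionary printed above, reduced to the fields used here
output_dict = {
    "aspects": ["weather", "birds", "smell"],
    "opinions": ["nice", "flying", "bad"],
    "sentiments": ["Positive", "Positive", "Negative"],
}

for row in to_triplets(output_dict):
    print(row)
```

Note that `zip` stops at the shortest list, so if the model ever emits lists of unequal length the extra items are silently dropped. Checking the lengths first would make that case visible.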

Advanced Usage

Loading the Model

from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    pipeline,
)
import torch

model_name = "Orkhan/llama-2-7b-absa"
# Load the fine-tuned model in FP16 and place it on GPU 0
base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map={"": 0},
)
base_model.config.use_cache = False
base_model.config.pretraining_tp = 1
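As a rough check on why FP16 loading suits a 16 GB T4, the arithmetic below estimates the memory taken by the weights alone at different precisions. The parameter count of about 6.74 billion is an approximation for Llama-2-7b and is not stated on this page. The estimate ignores activations, the KV cache, and framework overhead, so real usage is higher.

```python
def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Memory needed for the model weights alone, in GiB."""
    return n_params * bytes_per_param / 1024**3

# Approximate Llama-2-7b parameter count (assumption, not from the model card)
N_PARAMS = 6.74e9

for label, nbytes in [("FP32", 4), ("FP16", 2), ("4-bit", 0.5)]:
    print(f"{label}: {weight_memory_gib(N_PARAMS, nbytes):.1f} GiB")
```

By this estimate FP32 weights would not fit on a T4, while FP16 weights fit with a few GiB left over for generation.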

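Configuring the Tokenizer

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
# The Llama tokenizer has no pad token by default, so reuse the end-of-sequence token
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"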

Input/Output Processing Functions

def process_output(result, user_prompt):
    generated_text = result[0]['generated_text']
    # Text between '### Human:' and the first '### Assistant:' is the prompt as the model saw it
    interpreted_input = generated_text.split('### Assistant:')[0].split('### Human:')[1]
    # Keep only the first answer, which ends at ')' in the example output
    answer = generated_text.split('### Assistant:')[1].split(')')[0].strip()

    aspects = answer.split('Aspect detected:')[1].split('##')[0]
    opinions = answer.split('Opinion detected:')[1].split('## Sentiment detected:')[0]
    sentiments = answer.split('## Sentiment detected:')[1]

    # Split on commas and drop empty items, so single-item answers are kept too
    aspect_list = [aspect.strip() for aspect in aspects.split(',') if aspect.strip()]
    opinion_list = [opinion.strip() for opinion in opinions.split(',') if opinion.strip()]
    sentiments_list = [sentiment.strip() for sentiment in sentiments.split(',') if sentiment.strip()]
    # A phrase is the opinion followed by the aspect it describes
    phrases = [opinion + ' ' + aspect for opinion, aspect in zip(opinion_list, aspect_list)]

    output_dict = {
        'user_prompt': user_prompt,
        'interpreted_input': interpreted_input,
        'aspects': aspect_list,
        'opinions': opinion_list,
        'sentiments': sentiments_list,
        'phrases': phrases
    }

    return output_dict


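def process_prompt(user_prompt, model):
    # Prompt template: '### Human: <sentence>.###'. Strip a trailing period first so it is not doubled.
    edited_prompt = "### Human: " + user_prompt.rstrip(' .') + '.###'
    # Cap the total length at four times the prompt's token count
    max_length = len(tokenizer.encode(user_prompt)) * 4
    pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=max_length)
    result = pipe(edited_prompt)

    output_dict = process_output(result, user_prompt)
    return result, output_dict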

Running Inference

prompt = "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere."
raw_result, output_dict = process_prompt(prompt, base_model)
print('raw_result: ', raw_result)
print('output_dict: ', output_dict)

Output

raw_result:
  [{'generated_text': "### Human: Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.### Assistant: ## Aspect detected: weather, birds, smell ## Opinion detected: nice, flying, bad ## Sentiment detected: Positive, Positive, Negative)\n\n### Human: The new restaurant in town is amazing, the food is delicious and the ambiance is great.### Assistant: ## Aspect detected"}]
output_dict:
  {'user_prompt': "Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
  'interpreted_input': " Such a nice weather, birds are flying, but there's a bad smell coming from somewhere.",
  'aspects': ['weather', 'birds', 'smell'],
  'opinions': ['nice', 'flying', 'bad'],
  'sentiments': ['Positive', 'Positive', 'Negative'],
  'phrases': ['nice weather', 'flying birds', 'bad smell']}
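The raw result shows the model running past its answer and starting an invented new "### Human:" turn, so any parser has to stop at the end of the first answer. The sketch below is an alternative parser built on a regular expression. It is not the author's implementation: `parse_absa` is a hypothetical name, and it assumes the three labelled fields always appear in the order shown in the raw result. The sample string is an abridged copy of that result.

```python
import re

# Matches the three labelled lists; the sentiment list stops at ')', a newline or '#'
FIELD_PATTERN = re.compile(
    r"## Aspect detected:(?P<aspects>.*?)"
    r"## Opinion detected:(?P<opinions>.*?)"
    r"## Sentiment detected:(?P<sentiments>[^)\n#]*)",
    re.DOTALL,
)

def _split_items(text):
    """Split a comma-separated list and drop empty items."""
    return [item.strip() for item in text.split(",") if item.strip()]

def parse_absa(generated_text):
    """Parse the first '### Assistant:' answer; return None if the format is absent."""
    _, separator, answer = generated_text.partition("### Assistant:")
    if not separator:
        return None
    match = FIELD_PATTERN.search(answer)
    if match is None:
        return None
    return {name: _split_items(value) for name, value in match.groupdict().items()}

raw = (
    "### Human: Such a nice weather, birds are flying, but there's a bad smell "
    "coming from somewhere.### Assistant: ## Aspect detected: weather, birds, smell "
    "## Opinion detected: nice, flying, bad "
    "## Sentiment detected: Positive, Positive, Negative)\n\n"
    "### Human: The new restaurant in town is amazing.### Assistant: ## Aspect detected"
)

print(parse_absa(raw))
```

Returning `None` instead of raising lets a caller skip sentences where the model did not follow the expected format.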