GNER-LLaMA-7B Open-Source Model - Free Deployment for Zero-Shot Entity Recognition Tasks

GNER LLaMA 7B

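Developed by dyyyyyyyy

GNER-LLaMA-7B is a generative named entity recognition model based on the LLaMA architecture, specialized for entity recognition in zero-shot scenarios.

Sequence Labeling

Transformers

English | Open-source license: Apache-2.0 | #GenerativeNER #ZeroShotLearning #MultiDomainEntityRecognition

Downloads: 38

Release date: 2/27/2024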

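Model Overview

This model performs named entity recognition as a generation task. Incorporating negative instances (non-entity text) into training improves its recognition ability in unseen domains, and it supports recognizing multiple entity types.

Model Features

Zero-shot recognition

Shows strong zero-shot recognition even in entity domains it has not seen during training.

Training with negative instances

Incorporating negative instances into the training process improves performance substantially.

Multi-model support

Built on two representative generative models, LLaMA and Flan-T5.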

Model Capabilities

Text generation

Named entity recognition

Zero-shot learning

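Use Cases

Information Extraction

Entity recognition in the movie domain

Identifies entities such as actors, directors, and release years in movie-related text.

Achieves an F1 score of 66.1 on test data (the zero-shot average reported for GNER-LLaMA).

Cross-domain entity recognition

Entity recognition in unseen domains.

Reported to outperform the previous best approaches by 8 to 11 F1 points.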

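🚀 Rethinking Negative Instances for Generative Named Entity Recognition

This project introduces the Generative Named Entity Recognition (GNER) framework, which improves zero-shot capability on unseen entity domains. Experiments with generative models such as LLaMA and Flan-T5 show that incorporating negative instances into the training process yields substantial performance gains.

🚀 Quick Start

GNER is a framework for generative named entity recognition. It strengthens zero-shot capability on unseen entity domains and, when applied to representative generative models such as LLaMA and Flan-T5, is reported to outperform state-of-the-art approaches by a large margin.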

💻 Code: https://github.com/yyDing1/GNER/
📖 Paper: Rethinking Negative Instances for Generative Named Entity Recognition
💾 Models on the 🤗 HuggingFace Hub: GNER-Models
🧪 Reproduction materials: Reproduction Materials
🎨 Example Jupyter notebook: GNER Notebook

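✨ Key Features

Provides a framework for generative named entity recognition.
Strengthens zero-shot capability on unseen entity domains.
Improves performance substantially by incorporating negative instances into the training process.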
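
As described in the GNER paper, "negative instances" are the words that belong to no entity. The model is trained to label every word, including those tagged O, rather than to emit only the entities. The sketch below shows how such a target string could be assembled from words and BIO labels. The helper name `build_gner_target` is hypothetical and not part of the GNER codebase, and the space-separated `word(label)` layout is assumed from the example output shown in this card.

```python
def build_gner_target(words, labels):
    """Join words and BIO labels into a `word(label)` string.

    Hypothetical helper for illustration only. Non-entity words keep
    their "O" label, so negative instances stay in the target text.
    """
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")
    return " ".join(f"{word}({label})" for word, label in zip(words, labels))


words = "did george clooney make a musical in the 1980s".split()
labels = ["O", "B-actor", "I-actor", "O", "O", "B-genre", "O", "O", "B-year"]

target = build_gner_target(words, labels)
print(target)

# Count how many words are negative instances (labelled "O").
negative_count = sum(1 for label in labels if label == "O")
print(f"{negative_count} of {len(words)} words are negative instances")
```

In this sentence five of the nine words are negative instances, which gives a sense of how much context an entity-only output format would discard.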

📦 Installation

Install the required dependencies:

pip install torch datasets deepspeed accelerate transformers protobuf
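
After installation, it can help to confirm which versions were actually resolved, since the command above pins none. The following is a small sketch using only the Python standard library; the function name `installed_versions` is an illustrative choice and not something the GNER repository provides.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the pip command above.
REQUIRED = ["torch", "datasets", "deepspeed", "accelerate", "transformers", "protobuf"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```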

💻 Usage Example

Basic usage

The following is a simple inference example using GNER-LLaMA.

>>> import torch
>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> # Load the tokenizer and the model in bfloat16, then move the model to the GPU
>>> tokenizer = AutoTokenizer.from_pretrained("dyyyyyyyy/GNER-LLaMA-7B")
>>> model = AutoModelForCausalLM.from_pretrained("dyyyyyyyy/GNER-LLaMA-7B", torch_dtype=torch.bfloat16).cuda()
>>> model = model.eval()
>>> # Task instruction describing the token-by-token BIO output format
>>> instruction_template = "Please analyze the sentence provided, identifying the type of entity for each word on a token-by-token basis.\nOutput format is: word_1(label_1), word_2(label_2), ...\nWe'll use the BIO-format to label the entities, where:\n1. B- (Begin) indicates the start of a named entity.\n2. I- (Inside) is used for words within a named entity but are not the first word.\n3. O (Outside) denotes words that are not part of a named entity.\n"
>>> sentence = "did george clooney make a musical in the 1980s"
>>> entity_labels = ["genre", "rating", "review", "plot", "song", "average ratings", "director", "character", "trailer", "year", "actor", "title"]
>>> # Append the candidate entity labels and the sentence, then wrap the prompt in [INST] tags
>>> instruction = f"{instruction_template}\nUse the specific entity tags: {', '.join(entity_labels)} and O.\nSentence: {sentence}"
>>> instruction = f"[INST] {instruction} [/INST]"
>>> inputs = tokenizer(instruction, return_tensors="pt").to("cuda")
>>> outputs = model.generate(**inputs, max_new_tokens=640)
>>> # Decode and keep only the text generated after the [/INST] marker
>>> response = tokenizer.decode(outputs[0], skip_special_tokens=True)
>>> response = response[response.find("[/INST]") + len("[/INST]"):].strip()
>>> print(response)
did(O) george(B-actor) clooney(I-actor) make(O) a(O) musical(B-genre) in(O) the(O) 1980s(B-year)

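The response is a flat string of `word(label)` pairs, so downstream code usually needs to turn it back into entity spans. The sketch below is one possible way to do that with a regular expression and a simple BIO decoder. The function names `parse_response` and `bio_to_entities` are hypothetical, and the handling of a stray `I-` tag (it closes any open entity and is dropped) is an assumption of this example, not a rule documented by the authors.

```python
import re

# Matches `word(label)`, where label is "O" or a B-/I- tag.
# Labels may contain spaces, e.g. "B-average ratings".
TOKEN_RE = re.compile(r"(\S+)\((O|[BI]-[^()]+)\)")


def parse_response(response):
    """Return the (word, label) pairs found in a model response."""
    return [(m.group(1), m.group(2)) for m in TOKEN_RE.finditer(response)]


def bio_to_entities(pairs):
    """Group BIO-labelled words into (entity_text, entity_type) tuples."""
    entities = []
    current_words, current_type = [], None

    def flush():
        # Store the entity collected so far, if there is one.
        if current_words:
            entities.append((" ".join(current_words), current_type))

    for word, label in pairs:
        if label.startswith("B-"):
            flush()
            current_words, current_type = [word], label[2:]
        elif label.startswith("I-") and current_type == label[2:]:
            current_words.append(word)
        else:
            # "O", or an I- tag that does not continue the open entity.
            flush()
            current_words, current_type = [], None
    flush()
    return entities


response = (
    "did(O) george(B-actor) clooney(I-actor) make(O) a(O) "
    "musical(B-genre) in(O) the(O) 1980s(B-year)"
)
pairs = parse_response(response)
entities = bio_to_entities(pairs)
print(entities)
```

For the response above this yields three entities: "george clooney" as actor, "musical" as genre, and "1980s" as year. A generative model can occasionally drop or repeat words, so comparing `len(pairs)` with the number of input words is a cheap sanity check before trusting the spans.
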
📚 Documentation

Pre-trained Models

We release five GNER models based on LLaMA (7B) and Flan-T5 (base, large, xl, xxl).

Model	# Params	Zero-shot average $F_1$	Supervised average $F_1$	🤗 HuggingFace download link
GNER-LLaMA	7B	66.1	86.09	Link
GNER-T5-base	248M	59.5	83.21	Link
GNER-T5-large	783M	63.5	85.45	Link
GNER-T5-xl	3B	66.1	85.94	Link
GNER-T5-xxl	11B	69.1	86.15	Link
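
The table reports $F_1$ scores but this card does not spell out the scoring procedure. A common convention for NER is entity-level micro $F_1$, where a prediction counts only if its span and type both match a gold entity exactly. The sketch below illustrates that convention under this assumption. The function name `entity_f1` and the `(start, end, type)` tuple layout are illustrative choices, and the toy data is invented for the example.

```python
def entity_f1(gold, predicted):
    """Entity-level precision, recall and F1 with exact span-and-type matching.

    `gold` and `predicted` are sets of (start, end, entity_type) tuples.
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: word-index spans for one sentence.
gold = {(1, 2, "actor"), (5, 5, "genre"), (8, 8, "year")}
predicted = {(1, 2, "actor"), (5, 5, "title"), (8, 8, "year")}

precision, recall, f1 = entity_f1(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here two of three predictions match, so precision, recall, and $F_1$ all equal 2/3. The mistyped span counts as both a false positive and a false negative.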

Citation

@misc{ding2024rethinking,
      title={Rethinking Negative Instances for Generative Named Entity Recognition}, 
      author={Yuyang Ding and Juntao Li and Pinzheng Wang and Zecheng Tang and Bowen Yan and Min Zhang},
      year={2024},
      eprint={2402.16602},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}