GNER-LLaMA-7B: an open-source, freely deployable model for zero-shot named entity recognition

GNER LLaMA 7B

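Developed by dyyyyyyyy

GNER-LLaMA-7B is a generative named entity recognition (NER) model built on the LLaMA architecture and focused on entity recognition in zero-shot settings.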

Sequence labeling

Transformers

Language: English | License: Apache-2.0 | #GenerativeNER #ZeroShotLearning #MultiDomainNER

Downloads: 38

Released: 2/27/2024

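Model overview

The model treats named entity recognition as a generation task. It improves recognition in unseen domains by incorporating negative instances into training, and it supports many entity types.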

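Model features

Zero-shot recognition

Shows stronger zero-shot recognition in unseen entity domains

Training with negative instances

Including negative instances in the training process significantly improves performance

Multiple backbones

Built on two representative generative models, LLaMA and Flan-T5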

Model capabilities

Text generation

Named entity recognition

Zero-shot learning

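Use cases

Information extraction

Movie-domain entity recognition

Identifies entities such as actors, directors, and years in text about films and TV shows

Reaches an F1 score of 66.1 on test data

Cross-domain entity recognition

Recognizes entities in domains not seen during training

Outperforms the previous state of the art by 8 to 11 F1 points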

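🚀 Rethinking Negative Instances for Generative Named Entity Recognition

We introduce GNER, a Generative Named Entity Recognition framework that shows strong zero-shot capability across unseen entity domains. Experiments on two representative generative models, LLaMA and Flan-T5, show that incorporating negative instances into training yields substantial performance gains. The resulting models, GNER-LLaMA and GNER-T5, outperform state-of-the-art approaches by a large margin, improving the $F_1$ score by 8 and 11 points respectively. Code and models are publicly available.

💻 Code: https://github.com/yyDing1/GNER/
📖 Paper: Rethinking Negative Instances for Generative Named Entity Recognition
💾 HuggingFace model hub: GNER models
🧪 Reproduction materials
🎨 Jupyter example notebook: GNER notebook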

🚀 Quick start

This project provides five GNER models based on LLaMA (7B) and Flan-T5 (base, large, xl, and xxl).

Property	Details
Model type	Generative named entity recognition model
Training data	Universal-NER/Pile-NER-type
Evaluation metric	$F_1$ score
Library	transformers
Task type	Text generation
License	Apache-2.0
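
The table names the $F_1$ score as the evaluation metric. As a rough illustration, the sketch below computes an entity-level $F_1$ under the assumption of exact span-and-type matching, a common NER convention that this page does not spell out. The `Span` tuple layout, the `entity_f1` helper, and the example spans are hypothetical and are not taken from the GNER codebase.

```python
from typing import Iterable, Tuple

# Assumed span layout: (start_token, end_token_exclusive, entity_type)
Span = Tuple[int, int, str]


def entity_f1(gold: Iterable[Span], pred: Iterable[Span]) -> float:
    """Entity-level F1: a prediction counts only if span and type both match."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Made-up example: the second prediction has the right span but the wrong type.
gold = [(1, 3, "actor"), (5, 6, "genre"), (8, 9, "year")]
pred = [(1, 3, "actor"), (5, 6, "title"), (8, 9, "year")]
score = entity_f1(gold, pred)
print(round(100 * score, 1))  # 66.7
```

Here two of three predictions match exactly, so precision and recall are both 2/3 and so is their harmonic mean.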

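✨ Key features

Proposes GNER, a generative named entity recognition framework with strong zero-shot capability in unseen entity domains.
Incorporates negative instances into training, which significantly improves model performance.
GNER-LLaMA and GNER-T5, built on LLaMA and Flan-T5, outperform the current state-of-the-art methods by a large margin.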
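
To make the role of negative instances concrete, here is a minimal sketch of a generation target in which every token is labeled, including the non-entity ones tagged `O`. It assumes the training target uses the same `word(label)` format that the inference example on this page prints. The helper name `build_gner_target` is hypothetical and is not part of the GNER codebase.

```python
from typing import List


def build_gner_target(words: List[str], labels: List[str]) -> str:
    """Join words and BIO labels into a `word(label)` sequence."""
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")
    return " ".join(f"{word}({label})" for word, label in zip(words, labels))


words = "did george clooney make a musical in the 1980s".split()
labels = ["O", "B-actor", "I-actor", "O", "O", "B-genre", "O", "O", "B-year"]

target = build_gner_target(words, labels)
print(target)

# Tokens tagged "O" are the negative instances kept in the target.
negatives = sum(label == "O" for label in labels)
print(f"{negatives} of {len(words)} tokens are negative instances")
```

In this sentence five of the nine tokens carry the `O` tag. A target that listed only the entities would drop that context.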

📦 Installation

Install the following dependencies:

pip install torch datasets deepspeed accelerate transformers protobuf

💻 Usage examples

Basic usage

See the Jupyter example notebook to learn how to use the GNER models.

Advanced usage

The following is a simple inference example using GNER-LLaMA:

>>> import torch
>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("dyyyyyyyy/GNER-LLaMA-7B")
>>> model = AutoModelForCausalLM.from_pretrained("dyyyyyyyy/GNER-LLaMA-7B", torch_dtype=torch.bfloat16).cuda()
>>> model = model.eval()
>>> instruction_template = "Please analyze the sentence provided, identifying the type of entity for each word on a token-by-token basis.\nOutput format is: word_1(label_1), word_2(label_2), ...\nWe'll use the BIO-format to label the entities, where:\n1. B- (Begin) indicates the start of a named entity.\n2. I- (Inside) is used for words within a named entity but are not the first word.\n3. O (Outside) denotes words that are not part of a named entity.\n"
>>> sentence = "did george clooney make a musical in the 1980s"
>>> entity_labels = ["genre", "rating", "review", "plot", "song", "average ratings", "director", "character", "trailer", "year", "actor", "title"]
>>> instruction = f"{instruction_template}\nUse the specific entity tags: {', '.join(entity_labels)} and O.\nSentence: {sentence}"
>>> instruction = f"[INST] {instruction} [/INST]"
>>> inputs = tokenizer(instruction, return_tensors="pt").to("cuda")
>>> outputs = model.generate(**inputs, max_new_tokens=640)
>>> response = tokenizer.decode(outputs[0], skip_special_tokens=True)
>>> response = response[response.find("[/INST]") + len("[/INST]"):].strip()
>>> print(response)
did(O) george(B-actor) clooney(I-actor) make(O) a(O) musical(B-genre) in(O) the(O) 1980s(B-year)

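The response above is a flat `word(label)` string, so downstream code usually has to turn it back into entity spans. The sketch below shows one way to do this, assuming the model follows the output format exactly. The helpers `parse_gner_response` and `bio_to_entities` and the regular expression are illustrative assumptions, not functions from the GNER repository.

```python
import re
from typing import List, Tuple

# One token looks like word(O), word(B-type) or word(I-type).
# Types may contain spaces, e.g. "average ratings".
TOKEN_RE = re.compile(r"(\S+?)\((O|[BI]-[^()]+)\)")


def parse_gner_response(response: str) -> List[Tuple[str, str]]:
    """Split a `word(label)` response into (word, label) pairs."""
    return [(m.group(1), m.group(2)) for m in TOKEN_RE.finditer(response)]


def bio_to_entities(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged tokens into (entity_text, entity_type) tuples."""
    entities: List[Tuple[str, str]] = []
    current_words: List[str] = []
    current_type = None
    for word, label in pairs:
        if label.startswith("B-"):
            # A new entity starts; flush the one that was open, if any.
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [word], label[2:]
        elif label.startswith("I-") and current_type == label[2:]:
            current_words.append(word)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close the open entity and skip this token.
            if current_words:
                entities.append((" ".join(current_words), current_type))
            current_words, current_type = [], None
    if current_words:
        entities.append((" ".join(current_words), current_type))
    return entities


response = ("did(O) george(B-actor) clooney(I-actor) make(O) a(O) "
            "musical(B-genre) in(O) the(O) 1980s(B-year)")
entities = bio_to_entities(parse_gner_response(response))
print(entities)
# [('george clooney', 'actor'), ('musical', 'genre'), ('1980s', 'year')]
```

Generated text can drift from the requested format, so a production parser would also need to compare the parsed words against the input sentence.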
📄 License

This project is released under the Apache-2.0 license.

📚 Documentation

Pretrained models

We release pretrained models at several scales:

Model	Parameters	Zero-shot average $F_1$	Supervised average $F_1$	🤗 HuggingFace download link
GNER-LLaMA	7B	66.1	86.09	[Link](https://huggingface.co/dyyyyyyyy/GNER-LLaMA-7B)
GNER-T5-base	248M	59.5	83.21	[Link](https://huggingface.co/dyyyyyyyy/GNER-T5-base)
GNER-T5-large	783M	63.5	85.45	[Link](https://huggingface.co/dyyyyyyyy/GNER-T5-large)
GNER-T5-xl	3B	66.1	85.94	[Link](https://huggingface.co/dyyyyyyyy/GNER-T5-xl)
GNER-T5-xxl	11B	69.1	86.15	[Link](https://huggingface.co/dyyyyyyyy/GNER-T5-xxl)
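
Choosing among these checkpoints is mostly a trade-off between size and zero-shot $F_1$. The sketch below encodes the table as data and picks the strongest zero-shot model that fits a parameter budget. The `Checkpoint` type and the `best_under_budget` helper are hypothetical, and the repository ids are assumed to match the table links with the stray spaces removed.

```python
from typing import NamedTuple, Optional


class Checkpoint(NamedTuple):
    repo_id: str            # assumed HuggingFace repository id
    params_millions: int    # parameter count from the table, in millions
    zero_shot_f1: float
    supervised_f1: float


# Values copied from the pretrained-models table.
CHECKPOINTS = [
    Checkpoint("dyyyyyyyy/GNER-LLaMA-7B", 7000, 66.1, 86.09),
    Checkpoint("dyyyyyyyy/GNER-T5-base", 248, 59.5, 83.21),
    Checkpoint("dyyyyyyyy/GNER-T5-large", 783, 63.5, 85.45),
    Checkpoint("dyyyyyyyy/GNER-T5-xl", 3000, 66.1, 85.94),
    Checkpoint("dyyyyyyyy/GNER-T5-xxl", 11000, 69.1, 86.15),
]


def best_under_budget(max_params_millions: int) -> Optional[Checkpoint]:
    """Return the best zero-shot checkpoint within the budget, or None.

    Ties on zero-shot F1 are broken in favor of the smaller model.
    """
    candidates = [c for c in CHECKPOINTS if c.params_millions <= max_params_millions]
    return max(
        candidates,
        key=lambda c: (c.zero_shot_f1, -c.params_millions),
        default=None,
    )


choice = best_under_budget(8000)
print(choice.repo_id if choice else "no checkpoint fits")
```

With an 8B budget this selects GNER-T5-xl, which ties GNER-LLaMA on zero-shot $F_1$ (66.1) with fewer parameters.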

📚 Citation

If you use the code or models from this project, please cite the following paper:

@misc{ding2024rethinking,
      title={Rethinking Negative Instances for Generative Named Entity Recognition}, 
      author={Yuyang Ding and Juntao Li and Pinzheng Wang and Zecheng Tang and Bowen Yan and Min Zhang},
      year={2024},
      eprint={2402.16602},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}