GNER-T5-xxl: Open-Source Named Entity Recognition Model with Strong Zero-Shot Performance, Free to Use


Developed by dyyyyyyyy

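GNER-T5-xxl is a generative named entity recognition model built on the Flan-T5 architecture. It has 11B parameters and performs strongly on zero-shot recognition tasks.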

Sequence labeling

Transformers

Language: English. License: Apache-2.0. Tags: #generative-NER #zero-shot-learning #multi-domain-entity-recognition

Downloads: 51

Release date: February 27, 2024

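Model overview

The model takes a generative approach to named entity recognition. It is particularly good at handling entity domains it has not seen before, and including negative instances in training gives it a significant performance boost.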

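Model features

Zero-shot recognition

Shows strong zero-shot recognition on entity domains not seen during training

Negative-instance training

Including negative instances in the training process yields a significant performance gain

Multiple sizes

Available in several parameter scales, from base to xxl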

Model capabilities

Named entity recognition

Zero-shot entity recognition

Text generation

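Use cases

Information extraction

Entity recognition in the film and TV domain

Identifies entities such as actors, directors, and years in film and TV content

F1 of 69.1 on test data (matching the zero-shot average F1 listed for GNER-T5-xxl in the model table)

Cross-domain entity recognition

Recognizes new entity types in unseen domains

Zero-shot performance exceeds the previous state of the art by 8 to 11 F1 points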
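The F1 figures quoted on this page are entity-level scores. The page does not state the exact scoring rules, so the sketch below assumes the common convention of micro-averaged F1 over exact span-and-type matches on BIO labels. The names `bio_to_spans` and `entity_f1` are illustrative and not taken from the GNER code base.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, entity type)


def bio_to_spans(labels: List[str]) -> Set[Span]:
    """Collect (start, end, type) spans from one BIO label sequence."""
    spans: Set[Span] = set()
    start, current = None, None
    for i, label in enumerate(labels + ["O"]):  # sentinel closes the last span
        # An open span ends unless this label continues it with a matching I- tag.
        continues = label.startswith("I-") and current == label[2:]
        if current is not None and not continues:
            spans.add((start, i, current))
            start, current = None, None
        if label.startswith("B-"):
            start, current = i, label[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 over exact span-and-type matches (assumed convention)."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = bio_to_spans(g), bio_to_spans(p)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


gold = [["O", "B-actor", "I-actor", "O", "O", "B-genre", "O", "O", "B-year"]]
pred = [["O", "B-actor", "O", "O", "O", "B-genre", "O", "O", "B-year"]]
score = entity_f1(gold, pred)  # the truncated actor span counts as a miss
```

The function returns a value between 0 and 1; multiply by 100 to compare with the scale used on this page.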

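🚀 Rethinking Negative Instances for Generative Named Entity Recognition

This project proposes the Generative Named Entity Recognition (GNER) framework, which shows strong zero-shot capability on unseen entity domains. Experiments on two representative generative models (LLaMA and Flan-T5) show that introducing negative instances into training significantly improves performance. The resulting GNER-LLaMA and GNER-T5 models outperform the previous state-of-the-art approaches by a large margin, improving the $F_1$ score by 8 and 11 points, respectively. The code and models are publicly available.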
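To make the idea of negative instances concrete, here is a minimal sketch of how a BIO-labelled sentence could be rendered as a generation target. It assumes the `word(label)` output format shown in the inference example on this page. The helper `build_target` and its `keep_negatives` flag are hypothetical, and the entity-only variant is only an illustrative contrast, not necessarily the baseline used by the authors.

```python
from typing import List


def build_target(words: List[str], labels: List[str], keep_negatives: bool = True) -> str:
    """Render a BIO-labelled sentence as a 'word(label)' target string.

    With keep_negatives=True every word is emitted, including 'O' words
    (the negative instances). With keep_negatives=False only entity words remain.
    """
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")
    pairs = [
        f"{word}({label})"
        for word, label in zip(words, labels)
        if keep_negatives or label != "O"
    ]
    return " ".join(pairs)


words = "did george clooney make a musical in the 1980s".split()
labels = ["O", "B-actor", "I-actor", "O", "O", "B-genre", "O", "O", "B-year"]

with_negatives = build_target(words, labels)
entities_only = build_target(words, labels, keep_negatives=False)
```

With negatives kept, the target labels all nine words of the sentence, so the model is trained to state explicitly which words are not entities.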

🚀 Quick start

Install dependencies

Install the following dependencies:

pip install torch datasets deepspeed accelerate transformers protobuf
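As an optional check after installation, the following sketch reports which of the packages from the command above are present in the current environment. It uses only the standard library. The helper name `installed_versions` is illustrative and not part of GNER.

```python
from importlib import metadata
from typing import Dict, List, Optional

# Distribution names copied from the pip command above.
REQUIRED: List[str] = ["torch", "datasets", "deepspeed", "accelerate", "transformers", "protobuf"]


def installed_versions(packages: Optional[List[str]] = None) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages if packages is not None else REQUIRED:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```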

Usage guide

See the example Jupyter notebook to learn how to use the GNER models.

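✨ Key features

Strong zero-shot capability: the GNER framework delivers excellent zero-shot performance on unseen entity domains.
Significant performance gains: introducing negative instances into training lets the models outperform the previous state-of-the-art approaches by a large margin.
Multiple released models: five GNER models are available, based on LLaMA (7B) and Flan-T5 (base, large, xl, and xxl).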

💻 Usage example

Basic usage

The following is a simple inference example using GNER-T5:

>>> import torch
>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
>>> # Load the tokenizer and the model in bfloat16, then move the model to the GPU
>>> tokenizer = AutoTokenizer.from_pretrained("dyyyyyyyy/GNER-T5-xxl")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("dyyyyyyyy/GNER-T5-xxl", torch_dtype=torch.bfloat16).cuda()
>>> model = model.eval()
>>> # Task instruction: label every word of the sentence in BIO format
>>> instruction_template = "Please analyze the sentence provided, identifying the type of entity for each word on a token-by-token basis.\nOutput format is: word_1(label_1), word_2(label_2), ...\nWe'll use the BIO-format to label the entities, where:\n1. B- (Begin) indicates the start of a named entity.\n2. I- (Inside) is used for words within a named entity but are not the first word.\n3. O (Outside) denotes words that are not part of a named entity.\n"
>>> sentence = "did george clooney make a musical in the 1980s"
>>> # Candidate entity types for this sentence
>>> entity_labels = ["genre", "rating", "review", "plot", "song", "average ratings", "director", "character", "trailer", "year", "actor", "title"]
>>> # Combine the instruction, the allowed tags, and the sentence into one prompt
>>> instruction = f"{instruction_template}\nUse the specific entity tags: {', '.join(entity_labels)} and O.\nSentence: {sentence}"
>>> inputs = tokenizer(instruction, return_tensors="pt").to("cuda")
>>> outputs = model.generate(**inputs, max_new_tokens=640)
>>> # Decode the generated token IDs back to text
>>> response = tokenizer.decode(outputs[0], skip_special_tokens=True)
>>> print(response)
did(O) george(B-actor) clooney(I-actor) make(O) a(O) musical(B-genre) in(O) the(O) 1980s(B-year)
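The model returns one `word(label)` item per input word, so a small post-processing step is usually needed to get entity spans. The sketch below assumes the response follows the format printed above. The functions `parse_response` and `extract_entities` are hypothetical helpers, not part of the GNER release, and they silently skip any text that does not match the pattern.

```python
import re
from typing import List, Optional, Tuple

# One item looks like word(label); the label is O, B-<type>, or I-<type>.
# The type may contain spaces, e.g. "average ratings".
ITEM_RE = re.compile(r"(\S+?)\((O|[BI]-[^()]+)\)")


def parse_response(response: str) -> List[Tuple[str, str]]:
    """Return (word, label) pairs found in a 'word(label) word(label) ...' string."""
    return [(m.group(1), m.group(2)) for m in ITEM_RE.finditer(response)]


def extract_entities(response: str) -> List[Tuple[str, str]]:
    """Merge B-/I- runs into (entity text, entity type) tuples."""
    entities: List[Tuple[str, str]] = []
    words: List[str] = []
    current: Optional[str] = None
    # The trailing ("", "O") sentinel flushes an entity that ends the sentence.
    for word, label in parse_response(response) + [("", "O")]:
        continues = label.startswith("I-") and label[2:] == current
        if current is not None and not continues:
            entities.append((" ".join(words), current))
            words, current = [], None
        if label.startswith("B-"):
            words, current = [word], label[2:]
        elif continues:
            words.append(word)
    return entities


response = "did(O) george(B-actor) clooney(I-actor) make(O) a(O) musical(B-genre) in(O) the(O) 1980s(B-year)"
entities = extract_entities(response)
# entities == [("george clooney", "actor"), ("musical", "genre"), ("1980s", "year")]
```

Because generated text can deviate from the expected format, it is worth comparing the number of parsed items with the number of words in the input sentence before trusting the result.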

📚 Documentation

Pretrained models

We release five GNER models based on LLaMA (7B) and Flan-T5 (base, large, xl, and xxl).

Attribute	Details
Model type	GNER-LLaMA, GNER-T5-base, GNER-T5-large, GNER-T5-xl, GNER-T5-xxl
Training data	Universal-NER/Pile-NER-type

Model	Parameters	Zero-shot average $F_1$	Supervised average $F_1$	🤗 HuggingFace download link
GNER-LLaMA	7B	66.1	86.09	link
GNER-T5-base	248M	59.5	83.21	link
GNER-T5-large	783M	63.5	85.45	link
GNER-T5-xl	3B	66.1	85.94	link
GNER-T5-xxl	11B	69.1	86.15	link
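When choosing a model size, a rough estimate of the memory needed for the weights can help. The sketch below multiplies the (rounded) parameter counts from the table above by the number of bytes per parameter for a given dtype. It covers weights only and ignores activations and generation caches, so treat the result as a lower bound. The function name `weight_memory_gb` is illustrative.

```python
# Parameter counts as listed in the table above (rounded figures).
PARAMS = {
    "GNER-LLaMA": 7e9,
    "GNER-T5-base": 248e6,
    "GNER-T5-large": 783e6,
    "GNER-T5-xl": 3e9,
    "GNER-T5-xxl": 11e9,
}

# Storage size of one parameter for common dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gb(model: str, dtype: str = "bfloat16") -> float:
    """Approximate memory for the weights alone, in GB (10**9 bytes)."""
    return PARAMS[model] * BYTES_PER_PARAM[dtype] / 1e9


for name in PARAMS:
    print(f"{name}: ~{weight_memory_gb(name):.1f} GB in bfloat16")
```

For example, GNER-T5-xxl in bfloat16 (the dtype used in the inference example) needs roughly 11e9 × 2 bytes = 22 GB for its weights.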

Citation

@misc{ding2024rethinking,
      title={Rethinking Negative Instances for Generative Named Entity Recognition}, 
      author={Yuyang Ding and Juntao Li and Pinzheng Wang and Zecheng Tang and Bowen Yan and Min Zhang},
      year={2024},
      eprint={2402.16602},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}