GNER-T5-xxl: an open-source named entity recognition model with strong zero-shot performance

Developed by dyyyyyyyy

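GNER-T5-xxl is a generative named entity recognition model built on the Flan-T5 architecture. With 11B parameters, it performs strongly on zero-shot recognition tasks.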

Task: sequence labeling
Library: Transformers
Language: English
License: Apache-2.0
Tags: #GenerativeNER #ZeroShotLearning #MultiDomainEntityRecognition

Released: 2/27/2024

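Model overview

The model takes a generative approach to named entity recognition and is particularly good at handling unseen entity domains. Including negative instances in training significantly improves its performance.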

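Model features

Zero-shot recognition

Shows strong zero-shot recognition on unseen entity domains.

Negative-instance training

Including negative instances in the training process yields a significant performance gain.

Multiple sizes

Available in several parameter scales, from base to xxl.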

Model capabilities

Named entity recognition

Zero-shot entity recognition

Text generation

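Use cases

Information extraction

Movie-domain entity recognition

Identifies entities such as actors, directors, and years in movie-related text.

GNER-T5-xxl reaches an average zero-shot F1 of 69.1 (an average over benchmarks, as listed in the results table).

Cross-domain entity recognition

Recognizes new entity types in unseen domains.

In the zero-shot setting, GNER-LLaMA and GNER-T5 exceed the previous state of the art by 8 and 11 F1 points, respectively.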

🚀 Rethinking Negative Instances for Generative Named Entity Recognition

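This project proposes the Generative Named Entity Recognition (GNER) framework, which shows strong zero-shot capability on unseen entity domains. Experiments on two representative generative models (LLaMA and Flan-T5) show that including negative instances during training significantly improves model performance. The resulting GNER-LLaMA and GNER-T5 models outperform the previous state-of-the-art approaches by a large margin, improving $F_1$ by 8 and 11 points, respectively. The code and models are publicly available.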
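To make "negative instances" concrete, here is a minimal sketch of how a training target could be built with and without them. It assumes that negative instances are the words that belong to no entity (labeled O), which matches the word(label) output format used by the released models. The entities-only format is a hypothetical contrast invented for illustration, and both function names are ours, not the authors'.

```python
from typing import List


def format_with_negatives(words: List[str], labels: List[str]) -> str:
    """Label every word, including non-entity (O) words, as word(label)."""
    return " ".join(f"{word}({label})" for word, label in zip(words, labels))


def format_entities_only(words: List[str], labels: List[str]) -> str:
    """Hypothetical baseline target: keep only entity spans, drop O words."""
    spans = []
    current, current_type = [], None
    for word, label in zip(words, labels):
        if label.startswith("B-"):
            # A new entity starts; close the previous one if it is still open.
            if current:
                spans.append(f"{' '.join(current)} ({current_type})")
            current, current_type = [word], label[2:]
        elif label.startswith("I-") and current and label[2:] == current_type:
            current.append(word)
        else:
            # O label (or an inconsistent I- tag): close any open entity.
            if current:
                spans.append(f"{' '.join(current)} ({current_type})")
            current, current_type = [], None
    if current:
        spans.append(f"{' '.join(current)} ({current_type})")
    return "; ".join(spans)


words = "did george clooney make a musical in the 1980s".split()
labels = ["O", "B-actor", "I-actor", "O", "O", "B-genre", "O", "O", "B-year"]
print(format_with_negatives(words, labels))
print(format_entities_only(words, labels))
```

The first target keeps the full sentence context around each entity, while the second discards it; the reported gains come from training on targets of the first kind.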

🚀 Quick Start

Install dependencies

Install the following dependencies:

pip install torch datasets deepspeed accelerate transformers protobuf
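After installing, it can help to confirm that every package is visible to the interpreter you plan to use. The following is a small sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the package list simply mirrors the pip command above.

```python
from importlib import metadata

# Distribution names taken from the pip command above.
REQUIRED = ["torch", "datasets", "deepspeed", "accelerate", "transformers", "protobuf"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```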

Usage guide

See the example Jupyter notebook to learn how to use the GNER models.

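✨ Key Features

Strong zero-shot capability: the GNER framework shows excellent zero-shot performance on unseen entity domains.
Significant performance gains: including negative instances during training lets the models outperform the previous state-of-the-art approaches by a large margin.
Multiple released models: five GNER models are available, based on LLaMA (7B) and Flan-T5 (base, large, xl, and xxl).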

💻 Usage Examples

Basic usage

A simple inference example using GNER-T5:

>>> import torch
>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
>>> tokenizer = AutoTokenizer.from_pretrained("dyyyyyyyy/GNER-T5-xxl")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("dyyyyyyyy/GNER-T5-xxl", torch_dtype=torch.bfloat16).cuda()
>>> model = model.eval()
>>> instruction_template = "Please analyze the sentence provided, identifying the type of entity for each word on a token-by-token basis.\nOutput format is: word_1(label_1), word_2(label_2), ...\nWe'll use the BIO-format to label the entities, where:\n1. B- (Begin) indicates the start of a named entity.\n2. I- (Inside) is used for words within a named entity but are not the first word.\n3. O (Outside) denotes words that are not part of a named entity.\n"
>>> sentence = "did george clooney make a musical in the 1980s"
>>> entity_labels = ["genre", "rating", "review", "plot", "song", "average ratings", "director", "character", "trailer", "year", "actor", "title"]
>>> instruction = f"{instruction_template}\nUse the specific entity tags: {', '.join(entity_labels)} and O.\nSentence: {sentence}"
>>> inputs = tokenizer(instruction, return_tensors="pt").to("cuda")
>>> outputs = model.generate(**inputs, max_new_tokens=640)
>>> response = tokenizer.decode(outputs[0], skip_special_tokens=True)
>>> print(response)
did(O) george(B-actor) clooney(I-actor) make(O) a(O) musical(B-genre) in(O) the(O) 1980s(B-year)

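The response is a single string in word(label) format with BIO tags, so it usually needs to be converted back into entity spans. The following is a minimal sketch of that step, assuming the model follows the output format exactly. `parse_response` and `extract_entities` are hypothetical helper names, not functions from the GNER repository. A regular expression is used instead of splitting on whitespace because some labels in the example, such as "average ratings", contain a space.

```python
import re
from typing import List, Tuple

# One item looks like word(O), word(B-type) or word(I-type); the type may contain spaces.
ITEM_RE = re.compile(r"(\S+?)\((O|[BI]-[^()]+)\)")


def parse_response(response: str) -> List[Tuple[str, str]]:
    """Split 'word(label) word(label) ...' into (word, label) pairs."""
    return [(m.group(1), m.group(2)) for m in ITEM_RE.finditer(response)]


def extract_entities(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged words into (entity text, entity type) tuples."""
    entities = []
    words, entity_type = [], None
    for word, label in pairs:
        if label.startswith("B-"):
            # A new entity begins; flush the previous one if it is still open.
            if words:
                entities.append((" ".join(words), entity_type))
            words, entity_type = [word], label[2:]
        elif label.startswith("I-") and words and label[2:] == entity_type:
            words.append(word)
        else:
            # O label or an inconsistent I- tag closes the open entity.
            if words:
                entities.append((" ".join(words), entity_type))
            words, entity_type = [], None
    if words:
        entities.append((" ".join(words), entity_type))
    return entities


response = ("did(O) george(B-actor) clooney(I-actor) make(O) a(O) "
            "musical(B-genre) in(O) the(O) 1980s(B-year)")
print(extract_entities(parse_response(response)))
```

For the example sentence this yields the three entities "george clooney" (actor), "musical" (genre), and "1980s" (year). Generated text is not guaranteed to be well formed, so production code should also check that the parsed words match the input sentence.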
📚 Documentation

Pretrained models

We release five GNER models based on LLaMA (7B) and Flan-T5 (base, large, xl, and xxl).

Property	Details
Model type	GNER-LLaMA, GNER-T5-base, GNER-T5-large, GNER-T5-xl, GNER-T5-xxl
Training data	Universal-NER/Pile-NER-type

Model	Parameters	Zero-shot average $F_1$	Supervised average $F_1$	🤗 HuggingFace download link
GNER-LLaMA	7B	66.1	86.09	link
GNER-T5-base	248M	59.5	83.21	link
GNER-T5-large	783M	63.5	85.45	link
GNER-T5-xl	3B	66.1	85.94	link
GNER-T5-xxl	11B	69.1	86.15	link
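The table reports $F_1$ scores, which appear to be on a 0 to 100 scale. As a reminder of what the metric measures, here is a generic sketch of micro-averaged span-level $F_1$, where a predicted entity counts as correct only if its boundaries and type both match a gold entity. This is a textbook definition written for illustration, not the authors' evaluation script, and the tuple layout (sentence id, start, end, type) is an assumption.

```python
from typing import Iterable, Tuple

Span = Tuple[int, int, int, str]  # (sentence id, start word, end word, entity type)


def span_f1(gold: Iterable[Span], pred: Iterable[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over exact-match entity spans."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Gold and predicted spans for one sentence; the prediction mislabels one entity.
gold = [(0, 1, 2, "actor"), (0, 5, 5, "genre"), (0, 8, 8, "year")]
pred = [(0, 1, 2, "actor"), (0, 5, 5, "title"), (0, 8, 8, "year")]
precision, recall, f1 = span_f1(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this example two of three predictions match, so precision, recall, and $F_1$ are all 2/3.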

Citation

@misc{ding2024rethinking,
      title={Rethinking Negative Instances for Generative Named Entity Recognition}, 
      author={Yuyang Ding and Juntao Li and Pinzheng Wang and Zecheng Tang and Bowen Yan and Min Zhang},
      year={2024},
      eprint={2402.16602},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}