🚀 xomad/gliner-model-merge-large-v1.0
xomad/gliner-model-merge-large-v1.0 is built on the pre-trained model knowledgator/gliner-multitask-large-v0.5 and explores what model-merging techniques can achieve. Merging lifts performance by 3.25 percentage points, raising the F1 score from 0.6276 to 0.6601.
The model was trained only on datasets with commercially friendly licenses, ensuring broad applicability under the Apache-2.0 license. The following datasets were used during training:
- knowledgator/GLINER-multi-task-synthetic-data
- EmergentMethods/AskNews-NER-v0
- urchade/pile-mistral-v0.1
- MultiCoNER/multiconer_v2
- DFKI-SLT/few-nerd
🚀 Quick Start
This model is built on the pre-trained model knowledgator/gliner-multitask-large-v0.5 and gains a significant performance boost from model merging. The sections below cover installation and usage.
✨ Key Features
- Improved performance: model merging raises the F1 score from 0.6276 to 0.6601.
- Commercially friendly: trained only on datasets with commercially friendly licenses, suitable for use under the Apache-2.0 license.
- Multi-dataset training: trained on several public datasets, including knowledgator/GLINER-multi-task-synthetic-data and EmergentMethods/AskNews-NER-v0.
📦 Installation
To use this model, install the GLiNER Python library:
```bash
pip install gliner
```
Once the GLiNER library is installed, import the GLiNER class and load this model with GLiNER.from_pretrained.
💻 Usage Example
Basic usage:
```python
from gliner import GLiNER

# Load the merged model from the Hugging Face Hub
model = GLiNER.from_pretrained("xomad/gliner-model-merge-large-v1.0")

text = """
Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975 to develop and sell BASIC interpreters for the Altair 8800. During his career at Microsoft, Gates held the positions of chairman, chief executive officer, president and chief software architect, while also being the largest individual shareholder until May 2014.
"""

labels = ["founder", "computer", "software", "position", "date", "company"]

entities = model.predict_entities(text, labels)

for entity in entities:
    print(entity["text"], "=>", entity["label"])
```
Output:
```
Microsoft => company
Bill Gates => founder
Paul Allen => founder
April 4, 1975 => date
BASIC => software
Altair 8800 => computer
Microsoft => company
chairman => position
chief executive officer => position
president => position
chief software architect => position
May 2014 => date
```
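By default, GLiNER keeps every span scoring above its detection threshold. Continuing the example above, here is a minimal sketch of tuning that precision/recall trade-off, assuming the standard GLiNER predict_entities signature with its optional threshold argument and per-entity score field (both part of the GLiNER library, not specific to this model):

```python
# A higher threshold keeps only high-confidence spans (more precision);
# a lower one surfaces more candidate entities (more recall).
entities = model.predict_entities(text, labels, threshold=0.7)

for entity in entities:
    print(entity["text"], "=>", entity["label"], f"({entity['score']:.2f})")
```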
📚 Documentation
⚙️ Fine-tuning process
The process starts from the base model knowledgator/gliner-multitask-large-v0.5. Our model xomad/gliner-model-merge-large-v1.0 is fine-tuned separately on each of the datasets listed above, saving several checkpoints along the way. All of these checkpoints are pooled, and the Model soups technique is applied to produce different merged models (a sketch of the averaging step follows the list below):
- uniform_merged
- greedy_on_random
- greedy_on_sorted
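A minimal sketch of the soup-averaging step, assuming the checkpoints are ordinary PyTorch state dicts (the file paths and the evaluate_f1 helper are hypothetical; the uniform soup averages every checkpoint, while the greedy variants follow Wortsman et al., 2022, differing only in whether the pool is visited in random or F1-sorted order):

```python
import torch

def uniform_soup(paths):
    """uniform_merged: average the weights of every checkpoint in the pool."""
    state_dicts = [torch.load(p, map_location="cpu") for p in paths]
    return {
        key: torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
        for key in state_dicts[0]
    }

def greedy_soup(paths, evaluate_f1):
    """greedy_on_random / greedy_on_sorted: keep a checkpoint in the soup
    only if adding it does not hurt held-out F1 (evaluate_f1 is a
    hypothetical helper that scores a merged state dict)."""
    soup = [paths[0]]
    best_f1 = evaluate_f1(uniform_soup(soup))
    for path in paths[1:]:
        candidate_f1 = evaluate_f1(uniform_soup(soup + [path]))
        if candidate_f1 >= best_f1:
            soup.append(path)
            best_f1 = candidate_f1
    return uniform_soup(soup)
```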
We then apply the WiSE-FT merging technique to model pairs selected from the three merged models above and the original model, producing the wise_ft_merged model. This completes the first fine-tuning stage.
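A minimal sketch of the WiSE-FT step, assuming plain key-by-key interpolation between the original weights and a merged or fine-tuned model's weights, following Wortsman et al. (CVPR 2022); the mixing coefficient alpha below is an illustrative choice, not the value used for this release:

```python
import torch

def wise_ft(original_sd, finetuned_sd, alpha=0.5):
    """WiSE-FT: linearly interpolate between the original ("zero-shot")
    state dict and the fine-tuned one; alpha=1.0 recovers the fine-tuned
    model, alpha=0.0 the original."""
    return {
        key: (1 - alpha) * original_sd[key].float() + alpha * finetuned_sd[key].float()
        for key in original_sd
    }
```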
In the second stage, the process is repeated with wise_ft_merged as the new starting point to produce the final model. The full fine-tuning pipeline is illustrated in the figure below.
The performance of the fine-tuned checkpoint pool and the merged models is evaluated on the CrossNER and TwitterNER benchmarks and plotted in the two figures below (crossner_f1 and other_f1, respectively).
Stage 1 fine-tuning plot:
Stage 2 fine-tuning plot:
📊 Benchmarks
Performance on several zero-shot NER benchmarks (CrossNER, mit-movie, and mit-restaurant), with figures from https://huggingface.co/knowledgator/gliner-multitask-large-v0.5:

| Model | F1 score |
|---|---|
| xomad/gliner-model-merge-large-v1.0 | 0.6601 |
| knowledgator/gliner-multitask-v0.5 | 0.6276 |
| numind/NuNER_Zero-span | 0.6196 |
| gliner-community/gliner_large-v2.5 | 0.615 |
| EmergentMethods/gliner_large_news-v2.1 | 0.5876 |
| urchade/gliner_large-v2.1 | 0.5754 |
Detailed performance per dataset:

| Model | Dataset | Precision | Recall | F1 | F1 (decimal) |
|---|---|---|---|---|---|
| xomad/gliner-model-merge-large-v1.0 | CrossNER_AI | 62.66% | 57.48% | 59.96% | 0.5996 |
| | CrossNER_literature | 73.28% | 66.42% | 69.68% | 0.6968 |
| | CrossNER_music | 74.89% | 70.67% | 72.72% | 0.7272 |
| | CrossNER_politics | 79.46% | 77.57% | 78.51% | 0.7851 |
| | CrossNER_science | 74.72% | 70.24% | 72.41% | 0.7241 |
| | mit-movie | 67.33% | 57.89% | 62.25% | 0.6225 |
| | mit-restaurant | 54.94% | 40.41% | 46.57% | 0.4657 |
| | Average | | | | 0.6601 |
| numind/NuNER_Zero-span | CrossNER_AI | 63.82% | 56.82% | 60.12% | 0.6012 |
| | CrossNER_literature | 73.53% | 58.06% | 64.89% | 0.6489 |
| | CrossNER_music | 72.69% | 67.40% | 69.95% | 0.6995 |
| | CrossNER_politics | 77.28% | 68.69% | 72.73% | 0.7273 |
| | CrossNER_science | 70.08% | 63.12% | 66.42% | 0.6642 |
| | mit-movie | 63.00% | 48.88% | 55.05% | 0.5505 |
| | mit-restaurant | 54.81% | 37.62% | 44.62% | 0.4462 |
| | Average | | | | 0.6196 |
| knowledgator/gliner-multitask-v0.5 | CrossNER_AI | 51.00% | 51.11% | 51.05% | 0.5105 |
| | CrossNER_literature | 72.65% | 65.62% | 68.96% | 0.6896 |
| | CrossNER_music | 74.91% | 73.70% | 74.30% | 0.7430 |
| | CrossNER_politics | 78.84% | 77.71% | 78.27% | 0.7827 |
| | CrossNER_science | 69.20% | 65.48% | 67.29% | 0.6729 |
| | mit-movie | 61.29% | 52.59% | 56.60% | 0.5660 |
| | mit-restaurant | 50.65% | 38.13% | 43.51% | 0.4351 |
| | Average | | | | 0.6276 |
| gliner-community/gliner_large-v2.5 | CrossNER_AI | 50.85% | 63.03% | 56.29% | 0.5629 |
| | CrossNER_literature | 64.92% | 67.21% | 66.04% | 0.6604 |
| | CrossNER_music | 70.88% | 73.10% | 71.97% | 0.7197 |
| | CrossNER_politics | 72.67% | 72.93% | 72.80% | 0.7280 |
| | CrossNER_science | 61.71% | 68.85% | 65.08% | 0.6508 |
| | mit-movie | 54.63% | 52.83% | 53.71% | 0.5371 |
| | mit-restaurant | 47.99% | 42.13% | 44.87% | 0.4487 |
| | Average | | | | 0.6154 |
| urchade/gliner_large-v2.1 | CrossNER_AI | 54.98% | 52.00% | 53.45% | 0.5345 |
| | CrossNER_literature | 59.33% | 56.47% | 57.87% | 0.5787 |
| | CrossNER_music | 67.39% | 66.77% | 67.08% | 0.6708 |
| | CrossNER_politics | 66.07% | 63.76% | 64.90% | 0.6490 |
| | CrossNER_science | 61.45% | 62.56% | 62.00% | 0.6200 |
| | mit-movie | 55.94% | 47.36% | 51.29% | 0.5129 |
| | mit-restaurant | 53.34% | 40.83% | 46.25% | 0.4625 |
| | Average | | | | 0.5754 |
| EmergentMethods/gliner_large_news-v2.1 | CrossNER_AI | 59.60% | 54.55% | 56.96% | 0.5696 |
| | CrossNER_literature | 65.41% | 56.16% | 60.44% | 0.6044 |
| | CrossNER_music | 67.47% | 63.08% | 65.20% | 0.6520 |
| | CrossNER_politics | 66.05% | 60.07% | 62.92% | 0.6292 |
| | CrossNER_science | 68.44% | 63.57% | 65.92% | 0.6592 |
| | mit-movie | 65.85% | 49.59% | 56.57% | 0.5657 |
| | mit-restaurant | 54.71% | 35.94% | 43.38% | 0.4338 |
| | Average | | | | 0.5876 |
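As a quick sanity check on the tables above: each F1 is the harmonic mean of its precision and recall, and each reported average is the plain mean of the seven per-dataset F1 scores. A short verification sketch using the xomad/gliner-model-merge-large-v1.0 rows:

```python
# Per-dataset F1 scores for xomad/gliner-model-merge-large-v1.0 (table above).
f1_scores = [0.5996, 0.6968, 0.7272, 0.7851, 0.7241, 0.6225, 0.4657]
print(sum(f1_scores) / len(f1_scores))  # -> 0.6601..., the reported average

# F1 from precision and recall, e.g. the CrossNER_AI row:
p, r = 0.6266, 0.5748
print(2 * p * r / (p + r))  # -> 0.5996..., matching the table
```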
Authors
Hoan Nguyen, from xomad.com
Citation
```bibtex
@misc{wortsman2022modelsoupsaveragingweights,
  title={Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time},
  author={Mitchell Wortsman and Gabriel Ilharco and Samir Yitzhak Gadre and Rebecca Roelofs and Raphael Gontijo-Lopes and Ari S. Morcos and Hongseok Namkoong and Ali Farhadi and Yair Carmon and Simon Kornblith and Ludwig Schmidt},
  year={2022},
  eprint={2203.05482},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2203.05482},
}

@InProceedings{Wortsman_2022_CVPR,
  author = {Wortsman, Mitchell and Ilharco, Gabriel and Kim, Jong Wook and Li, Mike and Kornblith, Simon and Roelofs, Rebecca and Lopes, Raphael Gontijo and Hajishirzi, Hannaneh and Farhadi, Ali and Namkoong, Hongseok and Schmidt, Ludwig},
  title = {Robust Fine-Tuning of Zero-Shot Models},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2022},
  pages = {7959-7971}
}

@misc{stepanov2024gliner,
  title={GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks},
  author={Ihor Stepanov and Mykhailo Shtopko},
  year={2024},
  eprint={2406.12925},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

@misc{zaratiana2023gliner,
  title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
  author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
  year={2023},
  eprint={2311.08526},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```
📄 License
This model is released under the Apache-2.0 license.