gliner-biomed-base-v1.0: an open-source biomedical named entity recognition model that accurately recognizes multiple entity types

Gliner Biomed Base V1.0

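Developed by Ihor

GLiNER-biomed is a specialized biomedical named entity recognition model built on the GLiNER framework. It can recognize many biomedical entity types.

Sequence labeling

PyTorch

Language: English
License: Apache-2.0
Tags: #BiomedicalNER #ZeroShotRecognition #MultipleEntityTypes

Downloads: 61

Released: 2/19/2025

Model Overview

The model is trained on synthetic annotations distilled from generative biomedical large language models and achieves state-of-the-art zero-shot and few-shot performance on biomedical entity recognition tasks.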

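Model Features

Biomedical focus

A named entity recognition model optimized specifically for the biomedical domain.

Zero-shot / few-shot learning

Maintains good performance with little or no annotated data.

Multi-type entity recognition

Recognizes multiple biomedical entity types in a single pass.

Efficient inference

Uses fewer resources and runs inference faster than large language models.

Model Capabilities

Biomedical entity recognition

Multi-type entity detection

Zero-shot learning

Few-shot learning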

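Use Cases

Medical text analysis

Entity extraction from electronic health records

Extracts key information such as diseases, drugs, and lab results from electronic health records.

Accurately identifies multiple types of medical entities.

Information extraction from medical literature

Extracts key entity information from medical research literature.

Supports recognition of multiple biomedical entity types.

Clinical decision support

Medication order extraction

Extracts drugs, dosages, frequencies, and related information from medication orders.

Accurately identifies drug-related entities.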

🚀 GLiNER-BioMed

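GLiNER-BioMed is a suite of efficient named entity recognition (NER) models specialized for the biomedical domain. Built on the GLiNER framework and trained with synthetic annotations distilled from large generative biomedical language models, it achieves state-of-the-art zero-shot and few-shot performance on biomedical entity recognition tasks, offering a practical alternative to both traditional NER models and large language models.

🚀 Quick Start

Installation

Install the official GLiNER library:

pip install gliner -U

Usage

Once the GLiNER library is installed, you can load a GLiNER-biomed model and run named entity recognition: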

from gliner import GLiNER

# Load the pretrained checkpoint (downloaded on first use)
model = GLiNER.from_pretrained("Ihor/gliner-biomed-base-v1.0")

text = """
The patient, a 45-year-old male, was diagnosed with type 2 diabetes mellitus and hypertension.
He was prescribed Metformin 500mg twice daily and Lisinopril 10mg once daily.
A recent lab test showed elevated HbA1c levels at 8.2%.
"""

# Entity types are free-form strings chosen at inference time
labels = ["Disease", "Drug", "Drug dosage", "Drug frequency", "Lab test", "Lab test value", "Demographic information"]

# Predictions scoring below the 0.5 confidence threshold are discarded
entities = model.predict_entities(text, labels, threshold=0.5)

# Each entity is a dict; print the matched span and its predicted label
for entity in entities:
    print(entity["text"], "=>", entity["label"])

Expected output:

45-year-old male => Demographic information
type 2 diabetes mellitus => Disease
hypertension => Disease
Metformin => Drug
500mg => Drug dosage
twice daily => Drug frequency
Lisinopril => Drug
10mg => Drug dosage
once daily => Drug frequency
HbA1c levels => Lab test
8.2% => Lab test value
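
A flat list of entities usually needs a little post-processing before it is useful downstream. The sketch below shows one way to attach each dosage and frequency to the drug mentioned just before it. It is a minimal illustration, not part of the GLiNER library: `group_medications` and `make_entity` are hypothetical helpers, the entity list is hand-written to mirror the expected output above instead of coming from the model, and it assumes each entity dict carries a `start` character offset alongside `text` and `label`.

```python
from typing import Dict, List, Optional

TEXT = (
    "The patient, a 45-year-old male, was diagnosed with type 2 diabetes "
    "mellitus and hypertension. He was prescribed Metformin 500mg twice daily "
    "and Lisinopril 10mg once daily. A recent lab test showed elevated HbA1c "
    "levels at 8.2%."
)


def make_entity(span: str, label: str) -> Dict:
    """Build a hand-written stand-in for one predicted entity (hypothetical helper)."""
    start = TEXT.index(span)
    return {"text": span, "label": label, "start": start, "end": start + len(span)}


# Stand-in for model.predict_entities(...) output, mirroring the expected output above.
entities = [
    make_entity("type 2 diabetes mellitus", "Disease"),
    make_entity("Metformin", "Drug"),
    make_entity("500mg", "Drug dosage"),
    make_entity("twice daily", "Drug frequency"),
    make_entity("Lisinopril", "Drug"),
    make_entity("10mg", "Drug dosage"),
    make_entity("once daily", "Drug frequency"),
]

# Maps the label strings used in the example to fields of a medication record.
FIELD_FOR_LABEL = {"Drug dosage": "dosage", "Drug frequency": "frequency"}


def group_medications(entities: List[Dict]) -> List[Dict[str, Optional[str]]]:
    """Attach each dosage/frequency to the nearest preceding drug mention."""
    records: List[Dict[str, Optional[str]]] = []
    for ent in sorted(entities, key=lambda e: e["start"]):
        if ent["label"] == "Drug":
            records.append({"drug": ent["text"], "dosage": None, "frequency": None})
        elif ent["label"] in FIELD_FOR_LABEL and records:
            field = FIELD_FOR_LABEL[ent["label"]]
            # Keep the first value seen for each field of the current drug.
            if records[-1][field] is None:
                records[-1][field] = ent["text"]
    return records


for record in group_medications(entities):
    print(record)
```

The "nearest preceding drug" rule is a deliberately simple heuristic; sentences that list several drugs before their dosages would need a more careful pairing strategy.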

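✨ Key Features

Broad entity recognition: GLiNER is a named entity recognition (NER) model that can identify any entity type using a bidirectional transformer encoder (BERT-like).
Efficient biomedical models: GLiNER-biomed, developed in collaboration with DS4DH at the University of Geneva, introduces a suite of specialized, efficient open biomedical NER models built on the GLiNER framework.
Zero-shot and few-shot learning: Using synthetic annotations distilled from large generative biomedical language models, the models achieve state-of-the-art zero-shot and few-shot performance on biomedical entity recognition tasks.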
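
Because the label set is chosen freely at inference time, it can help to tally predictions per label and drop low-confidence ones when trying out a new set of labels. The sketch below is a hypothetical helper (`entities_by_label` is not a GLiNER function). It assumes each predicted entity is a dict with `text`, `label`, and a `score` confidence value, and the sample scores are made up for illustration only.

```python
from collections import defaultdict
from typing import Dict, List


def entities_by_label(entities: List[Dict], min_score: float = 0.0) -> Dict[str, List[str]]:
    """Group entity texts by label, keeping only those at or above min_score."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for ent in entities:
        # Entities without a score are kept (treated as fully confident).
        if ent.get("score", 1.0) >= min_score:
            grouped[ent["label"]].append(ent["text"])
    return dict(grouped)


# Illustrative stand-in for model output; the scores are invented, not measured.
sample = [
    {"text": "hypertension", "label": "Disease", "score": 0.93},
    {"text": "Metformin", "label": "Drug", "score": 0.97},
    {"text": "lab test", "label": "Lab test", "score": 0.41},
    {"text": "Lisinopril", "label": "Drug", "score": 0.95},
]

print(entities_by_label(sample, min_score=0.5))
# {'Disease': ['hypertension'], 'Drug': ['Metformin', 'Lisinopril']}
```

Filtering after prediction like this makes it possible to compare several cut-offs without re-running the model, provided the model was called with a threshold no higher than the lowest cut-off of interest.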

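📚 Documentation

Model Information

Property	Details
Base model	microsoft/deberta-v3-base
Datasets	knowledgator/GLINER-multi-task-synthetic-data, knowledgator/biomed_NER
Language	English
Library name	gliner
License	apache-2.0
Evaluation metric	f1
Task type	Token classification
Tags	NER, GLiNER, information extraction, encoder, entity recognition, biomedical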

Benchmarks

We tested the models on 8 complex real-world datasets and compared them with other GLiNER models:

Model	F1 score	Macro-average F1	Macro-median F1	Weighted F1
Large models
NuNER Zero	40.87	21.79	13.94	33.67
NuNER Zero span	40.26	22.51	14.27	32.52
GLiNER bio v0.1	42.34	27.10	24.44	38.32
GLiNER bio v0.2	38.66	25.36	17.02	32.42
GLiNER v1.0	47.77	29.60	21.13	40.78
GLiNER v2.0	37.38	21.42	15.44	33.11
GLiNER v2.1	48.04	29.75	28.20	43.43
GLiNER news v2.1	48.99	31.79	33.77	45.13
GLiNER v2.5	53.81	35.22	35.65	51.57
GLiNER-biomed	59.77	40.67	42.65	58.40
GLiNER-biomed-bi	54.90	35.78	31.66	50.46
Base models
GLiNER v1.0	41.61	24.98	10.27	31.59
GLiNER v2.0	34.33	24.48	22.01	30.58
GLiNER v2.1	40.25	25.26	14.41	32.64
GLiNER news v2.1	41.59	27.16	17.74	34.44
GLiNER v2.5	46.49	30.93	25.26	44.68
GLiNER-biomed	54.37	36.20	41.61	53.05
GLiNER-biomed-bi	58.31	35.22	32.39	54.91
Small models
GLiNER v1.0	40.99	22.81	7.86	31.15
GLiNER v2.0	33.55	21.12	15.76	28.78
GLiNER v2.1	38.45	23.25	10.92	30.67
GLiNER news v2.1	39.15	24.96	14.48	33.10
GLiNER v2.5	38.21	28.53	18.01	36.88
GLiNER-biomed	52.53	34.49	38.17	50.87
GLiNER-biomed-bi	56.93	33.88	33.61	53.12
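
To make the table easier to read, the sketch below computes how many F1 points GLiNER-biomed gains over GLiNER v2.5 at each model size. The numbers are copied from the table above. Choosing GLiNER v2.5 as the baseline is an illustrative assumption, and `SCORES` and `f1_gain` are hypothetical names introduced only for this example.

```python
from typing import Dict, Tuple

# Tuples are (F1 score, macro-average F1, macro-median F1, weighted F1),
# copied from the benchmark table above.
SCORES: Dict[str, Dict[str, Tuple[float, float, float, float]]] = {
    "large": {
        "GLiNER v2.5": (53.81, 35.22, 35.65, 51.57),
        "GLiNER-biomed": (59.77, 40.67, 42.65, 58.40),
    },
    "base": {
        "GLiNER v2.5": (46.49, 30.93, 25.26, 44.68),
        "GLiNER-biomed": (54.37, 36.20, 41.61, 53.05),
    },
    "small": {
        "GLiNER v2.5": (38.21, 28.53, 18.01, 36.88),
        "GLiNER-biomed": (52.53, 34.49, 38.17, 50.87),
    },
}


def f1_gain(size: str, model: str = "GLiNER-biomed", baseline: str = "GLiNER v2.5") -> float:
    """Return the difference in the F1 score column, in points, rounded to 2 decimals."""
    return round(SCORES[size][model][0] - SCORES[size][baseline][0], 2)


for size in SCORES:
    print(f"{size}: +{f1_gain(size)} F1 points over GLiNER v2.5")
```

In this comparison the gap is widest for the small models, which suggests the biomedical training data matters most when model capacity is limited.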

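Join our Discord

Connect with our community on Discord for the latest news, support, and discussion about our models.

📄 License

This project is released under the Apache-2.0 license.

🔗 Citation

This work

If you use GLiNER-biomed models in your work, please cite: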

@misc{yazdani2025glinerbiomedsuiteefficientmodels,
      title={GLiNER-biomed: A Suite of Efficient Models for Open Biomedical Named Entity Recognition},
      author={Anthony Yazdani and Ihor Stepanov and Douglas Teodoro},
      year={2025},
      eprint={2504.00676},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2504.00676},
}

Previous work

@misc{zaratiana2023gliner,
      title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
      author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
      year={2023},
      eprint={2311.08526},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

@misc{stepanov2024gliner,
      title={GLiNER multi-task: Generalist Lightweight Model for Various Information Extraction Tasks},
      author={Ihor Stepanov and Mykhailo Shtopko},
      year={2024},
      eprint={2406.12925},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}