🚀 BioM-Transformers: Building Large Biomedical Language Models with BERT, ALBERT and ELECTRA
BioM-Transformers investigates the impact of different design choices on the performance of biomedical language models and shows empirically that state-of-the-art results can be reached on several biomedical domain tasks at a similar or lower computational cost.
🚀 Quick Start
Model Training
Fine-tune the model on the BioASQ8B training set with the following script:
python3 run_squad.py --model_type electra --model_name_or_path sultan/BioM-ELECTRA-Large-SQuAD2 \
--train_file BioASQ8B/train.json \
--predict_file BioASQ8B/dev.json \
--do_lower_case \
--do_train \
--do_eval \
--threads 20 \
--version_2_with_negative \
--num_train_epochs 3 \
--learning_rate 5e-5 \
--max_seq_length 512 \
--doc_stride 128 \
--per_gpu_train_batch_size 8 \
--gradient_accumulation_steps 2 \
--per_gpu_eval_batch_size 128 \
--logging_steps 50 \
--save_steps 5000 \
--fp16 \
--fp16_opt_level O1 \
--overwrite_output_dir \
--output_dir BioM-ELECTRA-Large-SQuAD-BioASQ \
--overwrite_cache
✨ Key Features
- Fine-tuning setup: The model was first fine-tuned on the SQuAD2.0 dataset and then on the BioASQ8B-Factoid training set. We converted the BioASQ8B-Factoid training set into SQuAD1.1 format and trained and evaluated the model (BioM-ELECTRA-Base-SQuAD2) on it; a hedged conversion sketch is shown after this list.
- Direct inference: You can use the model directly for prediction (inference) without further fine-tuning. Paste a PubMed abstract into the context box on the model card and ask biomedical questions about that context to compare its behavior with the original ELECTRA model. The model is also useful for building pandemic question-answering systems such as a COVID-19 QA system; see the pipeline sketch after this list.
- Version differences: Note that this (PyTorch) version differs from the one we used in our BioASQ9B participation (TensorFlow with layer-wise decay). We merged all five batches of the BioASQ8B test dataset into a single dev.json file.
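The sketch below is a rough illustration of the BioASQ-to-SQuAD1.1 conversion mentioned above; it is not the authors' script, which is not included in this card. The field names (`body`, `type`, `exact_answer`, `snippets`) follow the public BioASQ JSON schema, and de-duplication and edge cases are omitted.

```python
import json

def bioasq_factoid_to_squad(bioasq_path, squad_path):
    """Flatten BioASQ factoid questions into SQuAD1.1-style JSON (illustrative sketch)."""
    with open(bioasq_path) as f:
        questions = json.load(f)["questions"]

    paragraphs = []
    for qi, q in enumerate(questions):
        if q.get("type") != "factoid":
            continue
        # exact_answer may be a list of strings or a list of synonym lists
        answers = []
        for entry in q.get("exact_answer", []):
            answers.extend(entry if isinstance(entry, list) else [entry])
        # every snippet that contains an exact answer becomes one context/question/answer triple
        for si, snippet in enumerate(q.get("snippets", [])):
            context = snippet["text"]
            spans = [{"text": a, "answer_start": context.find(a)}
                     for a in answers if a in context]
            if spans:
                paragraphs.append({
                    "context": context,
                    "qas": [{"id": f"{qi}_{si}", "question": q["body"], "answers": spans}],
                })

    with open(squad_path, "w") as f:
        json.dump({"version": "1.1",
                   "data": [{"title": "BioASQ8B-Factoid", "paragraphs": paragraphs}]}, f)
```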
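As a minimal sketch of the direct-inference use case, the snippet below queries the fine-tuned checkpoint through the Hugging Face `transformers` question-answering pipeline. The model ID and the example context/question are illustrative assumptions, not part of the original card.

```python
from transformers import pipeline

# Assumed Hub model ID; point this to your own output_dir if you fine-tuned the model yourself.
qa = pipeline("question-answering", model="sultan/BioM-ELECTRA-Large-SQuAD2-BioASQ8B")

# Illustrative PubMed-style context and biomedical question.
context = (
    "BCL11A is a major regulator of fetal hemoglobin (HbF) levels, and "
    "down-regulation of BCL11A in adult erythroid cells induces HbF."
)
result = qa(question="Which gene regulates fetal hemoglobin levels?", context=context)
print(result["answer"], result["score"])
```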
📚 Documentation
Model Performance Comparison
Below are unofficial results comparing our models with the original ELECTRA Base and Large models:
| Model | Exact Match (EM) | F1 Score |
|---|---|---|
| ELECTRA-Base-SQuAD2-BioASQ8B | 61.89 | 74.39 |
| BioM-ELECTRA-Base-SQuAD2-BioASQ8B | 70.31 | 80.90 |
| ELECTRA-Large-SQuAD2-BioASQ8B | 67.36 | 78.90 |
| BioM-ELECTRA-Large-SQuAD2-BioASQ8B | 74.31 | 84.72 |
📄 Acknowledgements
We would like to acknowledge the TensorFlow Research Cloud (TFRC) team for providing access to TPUv3 units.
📄 Citation
@inproceedings{alrowili-shanker-2021-biom,
title = "{B}io{M}-Transformers: Building Large Biomedical Language Models with {BERT}, {ALBERT} and {ELECTRA}",
author = "Alrowili, Sultan and
Shanker, Vijay",
booktitle = "Proceedings of the 20th Workshop on Biomedical Language Processing",
month = jun,
year = "2021",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://www.aclweb.org/anthology/2021.bionlp-1.24",
pages = "221--227",
abstract = "The impact of design choices on the performance of biomedical language models recently has been a subject for investigation. In this paper, we empirically study biomedical domain adaptation with large transformer models using different design choices. We evaluate the performance of our pretrained models against other existing biomedical language models in the literature. Our results show that we achieve state-of-the-art results on several biomedical domain tasks despite using similar or less computational cost compared to other models in the literature. Our findings highlight the significant effect of design choices on improving the performance of biomedical language models.",
}