🚀 BioM-Transformers: Building Large Biomedical Language Models with BERT, ALBERT and ELECTRA
BioM-Transformers studies how different design choices affect the performance of biomedical language models, and shows empirically that the resulting models achieve state-of-the-art results on several biomedical domain tasks at similar or lower computational cost.
🚀 Quick Start
Model Training
Fine-tune the model with the following script:
```bash
python3 run_squad.py --model_type electra --model_name_or_path sultan/BioM-ELECTRA-Large-SQuAD2 \
--train_file BioASQ8B/train.json \
--predict_file BioASQ8B/dev.json \
--do_lower_case \
--do_train \
--do_eval \
--threads 20 \
--version_2_with_negative \
--num_train_epochs 3 \
--learning_rate 5e-5 \
--max_seq_length 512 \
--doc_stride 128 \
--per_gpu_train_batch_size 8 \
--gradient_accumulation_steps 2 \
--per_gpu_eval_batch_size 128 \
--logging_steps 50 \
--save_steps 5000 \
--fp16 \
--fp16_opt_level O1 \
--overwrite_output_dir \
--output_dir BioM-ELECTRA-Large-SQuAD-BioASQ \
--overwrite_cache
```
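After training, the fine-tuned checkpoint can be loaded like any other `transformers` checkpoint. A minimal loading sketch, assuming the `--output_dir` used in the command above:

```python
# Minimal sketch: load the fine-tuned checkpoint saved by run_squad.py.
# "BioM-ELECTRA-Large-SQuAD-BioASQ" is the --output_dir from the command above.
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_dir = "BioM-ELECTRA-Large-SQuAD-BioASQ"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForQuestionAnswering.from_pretrained(model_dir)
```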
✨ Key Features
- Fine-tuning setup: The model was first fine-tuned on the SQuAD2.0 dataset and then on the BioASQ8B-Factoid training dataset. We converted the BioASQ8B-Factoid training dataset into SQuAD1.1 format and used it to train and evaluate the model (BioM-ELECTRA-Base-SQuAD2).
- Direct inference: You can use the model directly for prediction (inference) without further fine-tuning. Paste a PubMed abstract into the context box on the model card, ask biomedical questions about the given context, and compare its answers with those of the original ELECTRA model; see the sketch after this list. The model is also useful for building pandemic question-answering systems, such as a COVID-19 QA system.
- Version differences: Note that this (PyTorch) version differs from the version we used in our BioASQ9B participation (TensorFlow with layer-wise decay). We merged all five batches of the BioASQ8B testing dataset into a single dev.json file.
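The snippet below is a minimal sketch of this kind of direct inference using the Hugging Face `transformers` question-answering pipeline with the `sultan/BioM-ELECTRA-Large-SQuAD2` checkpoint referenced in the training command above; the context and question are illustrative placeholders, not taken from BioASQ.

```python
# Minimal direct-inference sketch with the transformers question-answering pipeline.
# The model name comes from the training command above; the context and question
# below are illustrative placeholders only.
from transformers import pipeline

qa = pipeline("question-answering", model="sultan/BioM-ELECTRA-Large-SQuAD2")

context = (
    "Metformin is a first-line medication for the treatment of type 2 diabetes. "
    "It lowers blood glucose mainly by decreasing hepatic glucose production."
)
question = "What is a first-line medication for type 2 diabetes?"

result = qa(question=question, context=context)
print(result["answer"], result["score"])
```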
📚 Documentation
Model Performance Comparison
Below is an unofficial comparison of our models with the original ELECTRA base and large models:
| Model | Exact Match (EM) | F1 |
|-------|------------------|----|
| ELECTRA-Base-SQuAD2-BioASQ8B | 61.89 | 74.39 |
| BioM-ELECTRA-Base-SQuAD2-BioASQ8B | 70.31 | 80.90 |
| ELECTRA-Large-SQuAD2-BioASQ8B | 67.36 | 78.90 |
| BioM-ELECTRA-Large-SQuAD2-BioASQ8B | 74.31 | 84.72 |
📄 Acknowledgment
We would like to acknowledge the support of the TensorFlow Research Cloud (TFRC) team for providing us with access to TPUv3 units.
📄 Citation
```bibtex
@inproceedings{alrowili-shanker-2021-biom,
    title = "{B}io{M}-Transformers: Building Large Biomedical Language Models with {BERT}, {ALBERT} and {ELECTRA}",
    author = "Alrowili, Sultan and
      Shanker, Vijay",
    booktitle = "Proceedings of the 20th Workshop on Biomedical Language Processing",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.bionlp-1.24",
    pages = "221--227",
    abstract = "The impact of design choices on the performance of biomedical language models recently has been a subject for investigation. In this paper, we empirically study biomedical domain adaptation with large transformer models using different design choices. We evaluate the performance of our pretrained models against other existing biomedical language models in the literature. Our results show that we achieve state-of-the-art results on several biomedical domain tasks despite using similar or less computational cost compared to other models in the literature. Our findings highlight the significant effect of design choices on improving the performance of biomedical language models.",
}
```