🚀 BioELECTRA-PICO
BioELECTRA-PICO is a pretrained text encoder for the biomedical domain. It adopts ELECTRA's "replaced token detection" pretraining technique, performs strongly across multiple biomedical NLP benchmarks, and provides solid support for biomedical text-mining tasks.
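The "replaced token detection" objective mentioned above trains a discriminator to decide, for every position, whether the token is original or was swapped in by a small generator. The following toy sketch (plain Python, not the actual ELECTRA implementation; the function name and tiny vocabulary are illustrative) shows how the corruption step produces the per-token 0/1 labels the discriminator learns to predict:

```python
import random

def corrupt_tokens(tokens, vocab, mask_prob=0.3, seed=0):
    """Replace a fraction of tokens with substitutes (the generator's role
    in ELECTRA) and record which positions changed (discriminator labels)."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            # swap in a different token from the vocabulary
            corrupted.append(rng.choice([v for v in vocab if v != tok]))
            labels.append(1)  # replaced -> discriminator should flag it
        else:
            corrupted.append(tok)
            labels.append(0)  # original token kept
    return corrupted, labels

tokens = ["aspirin", "reduced", "headache", "duration", "significantly"]
vocab = tokens + ["placebo", "increased", "nausea"]
corrupted, labels = corrupt_tokens(tokens, vocab)
print(corrupted)
print(labels)  # 1 marks positions the discriminator must detect as replaced
```

Unlike masked language modeling, every position (not only the masked 15%) contributes a training signal, which is the efficiency gain ELECTRA-style pretraining exploits.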
🚀 Quick Start
Citation
If you use our work, please cite our paper with the following BibTeX entry:
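A minimal usage sketch with the 🤗 Transformers library. PICO extraction is a token-classification task, so the model can be wrapped in a standard `token-classification` pipeline; the model identifier `kamalkraj/BioELECTRA-PICO` is an assumption here and should be verified against the actual Hub repository name:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model_id = "kamalkraj/BioELECTRA-PICO"  # assumed Hub id; verify before use
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# aggregation_strategy="simple" merges word-piece predictions into word spans
pico_tagger = pipeline("token-classification", model=model,
                       tokenizer=tokenizer, aggregation_strategy="simple")

sentence = ("Those in the aspirin group experienced reduced duration of "
            "headache compared to those in the placebo arm (P<0.05)")
for entity in pico_tagger(sentence):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```

Each printed line is one predicted PICO span (e.g. population, intervention, comparator, or outcome) with its confidence score.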
```bibtex
@inproceedings{kanakarajan-etal-2021-bioelectra,
    title = "{B}io{ELECTRA}: Pretrained Biomedical text Encoder using Discriminators",
    author = "Kanakarajan, Kamal raj and
      Kundumani, Bhuvana and
      Sankarasubbu, Malaikannan",
    booktitle = "Proceedings of the 20th Workshop on Biomedical Language Processing",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.bionlp-1.16",
    doi = "10.18653/v1/2021.bionlp-1.16",
    pages = "143--154",
    abstract = "Recent advancements in pretraining strategies in NLP have shown a significant improvement in the performance of models on various text mining tasks. We apply {`}replaced token detection{'} pretraining technique proposed by ELECTRA and pretrain a biomedical language model from scratch using biomedical text and vocabulary. We introduce BioELECTRA, a biomedical domain-specific language encoder model that adapts ELECTRA for the Biomedical domain. We evaluate our model on the BLURB and BLUE biomedical NLP benchmarks. BioELECTRA outperforms the previous models and achieves state of the art (SOTA) on all the 13 datasets in BLURB benchmark and on all the 4 Clinical datasets from BLUE Benchmark across 7 different NLP tasks. BioELECTRA pretrained on PubMed and PMC full text articles performs very well on Clinical datasets as well. BioELECTRA achieves new SOTA 86.34{\%}(1.39{\%} accuracy improvement) on MedNLI and 64{\%} (2.98{\%} accuracy improvement) on PubMedQA dataset.",
}
```
Example
From related research: the aspirin group experienced a reduced duration of headache compared with the placebo group (P<0.05).
```yaml
widget:
  - text: "Those in the aspirin group experienced reduced duration of headache compared to those in the placebo arm (P<0.05)"
```