BioGPT
BioGPT is a domain-specific generative Transformer language model pre-trained on large-scale biomedical literature, which shows excellent performance in various biomedical natural language processing tasks.
Quick Start
Official start-up instructions have not been published yet; this section will be updated when they are available.
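In the meantime, here is a minimal sketch of loading the model with the Hugging Face transformers library. It assumes the checkpoint is published on the Hub as microsoft/biogpt and that a transformers version providing the BioGPT classes (4.25+) is installed:

```python
from transformers import BioGptForCausalLM, BioGptTokenizer, pipeline, set_seed

# Assumed Hub ID for the released checkpoint.
model = BioGptForCausalLM.from_pretrained("microsoft/biogpt")
tokenizer = BioGptTokenizer.from_pretrained("microsoft/biogpt")

# Wrap the model and tokenizer in a text-generation pipeline.
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
set_seed(42)  # for reproducible sampling

# Sample three continuations of a biomedical prompt.
for out in generator("BioGPT is", max_length=50, num_return_sequences=3, do_sample=True):
    print(out["generated_text"])
```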
Features
Pre-trained language models have gained increasing attention in the biomedical domain, thanks to their remarkable success in the general natural language domain. In the general language domain, there are two main branches of pre-trained language models: BERT (and its variants) and GPT (and its variants). The former has been extensively studied in the biomedical domain, as in BioBERT and PubMedBERT. Although they have achieved great success in a variety of discriminative downstream biomedical tasks, their lack of generation ability limits their application scope.
BioGPT, a domain-specific generative Transformer language model, is pre-trained on large-scale biomedical literature. It is evaluated on six biomedical natural language processing tasks and outperforms previous models on most tasks. Specifically, it achieves F1 scores of 44.98%, 38.42%, and 40.76% on the BC5CDR, KD-DTI, and DDI end-to-end relation extraction tasks respectively, and an accuracy of 78.2% on PubMedQA, setting a new record. A case study on text generation further demonstrates BioGPT's advantage in generating fluent descriptions for biomedical terms from biomedical literature.
License
This project is licensed under the MIT license.
Documentation
Dataset
Library
Pipeline Tag
Tags
Inference Parameters
| Property | Details |
|----------|---------|
| max_new_tokens | 50 |
Widget
The widget takes the text "COVID-19 is" as an input example.
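The widget's behavior can be approximated locally by passing the same prompt through a text-generation pipeline with the max_new_tokens value from the table above (again assuming the microsoft/biogpt checkpoint):

```python
from transformers import pipeline

# Assumed Hub ID; see the Quick Start sketch above.
generator = pipeline("text-generation", model="microsoft/biogpt")

# "COVID-19 is" is the widget's example input; max_new_tokens=50 mirrors
# the inference parameter listed in the table above.
result = generator("COVID-19 is", max_new_tokens=50, do_sample=False)
print(result[0]["generated_text"])
```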
Citation
If you find BioGPT useful in your research, please cite the following paper:
@article{10.1093/bib/bbac409,
author = {Luo, Renqian and Sun, Liai and Xia, Yingce and Qin, Tao and Zhang, Sheng and Poon, Hoifung and Liu, Tie-Yan},
title = "{BioGPT: generative pre-trained transformer for biomedical text generation and mining}",
journal = {Briefings in Bioinformatics},
volume = {23},
number = {6},
year = {2022},
month = {09},
abstract = "{Pre-trained language models have attracted increasing attention in the biomedical domain, inspired by their great success in the general natural language domain. Among the two main branches of pre-trained language models in the general language domain, i.e. BERT (and its variants) and GPT (and its variants), the first one has been extensively studied in the biomedical domain, such as BioBERT and PubMedBERT. While they have achieved great success on a variety of discriminative downstream biomedical tasks, the lack of generation ability constrains their application scope. In this paper, we propose BioGPT, a domain-specific generative Transformer language model pre-trained on large-scale biomedical literature. We evaluate BioGPT on six biomedical natural language processing tasks and demonstrate that our model outperforms previous models on most tasks. Especially, we get 44.98\%, 38.42\% and 40.76\% F1 score on BC5CDR, KD-DTI and DDI end-to-end relation extraction tasks, respectively, and 78.2\% accuracy on PubMedQA, creating a new record. Our case study on text generation further demonstrates the advantage of BioGPT on biomedical literature to generate fluent descriptions for biomedical terms.}",
issn = {1477-4054},
doi = {10.1093/bib/bbac409},
url = {https://doi.org/10.1093/bib/bbac409},
note = {bbac409},
eprint = {https://academic.oup.com/bib/article-pdf/23/6/bbac409/47144271/bbac409.pdf},
}