BioMedGPT-LM-7B
BioMedGPT-LM-7B is the first large generative language model in the biomedical domain based on Llama2. It addresses the need for specialized language models in biomedicine by leveraging millions of biomedical papers from the S2ORC corpus. After fine-tuning, it performs strongly on several biomedical QA benchmarks, outperforming or matching human performance and much larger general-purpose foundation models.
Features
- Specialized in Biomedicine: Fine-tuned on a vast amount of biomedical data, making it highly relevant to the field.
- High Performance on Benchmarks: Outperforms or is on par with human performance and larger general-purpose models in biomedical QA.
Installation
The original document provides no dedicated installation steps; the model is typically loaded through the Hugging Face transformers library, and full setup instructions are available in the OpenBioMed GitHub repository linked below.
Usage Examples
The original document provides no code examples; BioMedGPT-LM-7B can be used as a standard causal language model through the Hugging Face transformers API.
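Since no official example is given, the following is a minimal usage sketch. It assumes the model is published on the Hugging Face Hub under the ID `PharMolix/BioMedGPT-LM-7B` and behaves as a standard Llama2-style causal LM; the prompt format and helper names are illustrative, not taken from the card.

```python
MODEL_ID = "PharMolix/BioMedGPT-LM-7B"  # assumed Hugging Face model ID

def build_prompt(question: str) -> str:
    """Format a biomedical question as a plain text prompt (illustrative format)."""
    return f"Question: {question.strip()}\nAnswer:"

def generate_answer(question: str, max_new_tokens: int = 128) -> str:
    """Load the model and generate an answer. Note: this downloads ~7B weights."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(build_prompt(question), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Running `generate_answer` on a GPU machine produces a free-text completion of the prompt; for production use, consult the OpenBioMed repository for the officially supported inference path.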
Documentation
Training Details
The model was trained with the following hyperparameters:
- Epochs: 5
- Batch size: 192
- Context length: 2048
- Learning rate: 2e-5
BioMedGPT-LM-7B is fine-tuned on over 26 billion tokens highly relevant to the field of biomedicine. The fine-tuning data are extracted from millions of biomedical papers in the S2ORC corpus, using PubMed Central (PMC) ID and PubMed ID as selection criteria.
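For quick reference, the reported hyperparameters can be collected into a single configuration; the dict layout and the derived tokens-per-step figure below are illustrative, not part of the original card.

```python
# Fine-tuning hyperparameters as reported in the model card.
# The dict field names are illustrative conventions, not official.
FINE_TUNING_CONFIG = {
    "epochs": 5,
    "batch_size": 192,        # sequences per optimizer step
    "context_length": 2048,   # tokens per sequence
    "learning_rate": 2e-5,
}

# Tokens consumed per optimizer step = batch_size * context_length.
tokens_per_step = (
    FINE_TUNING_CONFIG["batch_size"] * FINE_TUNING_CONFIG["context_length"]
)
# 192 * 2048 = 393,216 tokens per step
```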
Model Developers
PharMolix
How to Use
BioMedGPT-LM-7B is the generative language model of BioMedGPT-10B, an open-source version of BioMedGPT. BioMedGPT is an open multimodal generative pre-trained transformer (GPT) for biomedicine, which bridges the natural language modality and diverse biomedical data modalities via large generative language models.

Technical Report
More technical details of BioMedGPT-LM-7B, BioMedGPT-10B, and BioMedGPT can be found in the technical report: "BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine".
GitHub
https://github.com/PharMolix/OpenBioMed
Limitations
This repository holds BioMedGPT-LM-7B, and we emphasize the responsible and ethical use of this model. BioMedGPT-LM-7B should NOT be used to provide services to the general public. Generating content that violates applicable laws and regulations is strictly prohibited; this includes inciting subversion of state power, endangering national security and interests, propagating terrorism, extremism, ethnic hatred and discrimination, violence, pornography, or false and harmful information. The developers of BioMedGPT-LM-7B are not liable for any consequences arising from content, data, or information provided or published by users.
Technical Details
The model's training hyperparameters and data sources are described in the "Training Details" section. It was fine-tuned on a large amount of biomedical data, which gives it an edge in the biomedical domain.
License
This repository is licensed under Apache-2.0. Use of the BioMedGPT-LM-7B model is additionally subject to the Acceptable Use Policy.