BioMistral-7B-mistral7instruct-dare
This project is a merged pre-trained language model created with mergekit. It combines multiple models to improve performance in the medical and biological domains.
🚀 Quick Start
You can use BioMistral with Hugging Face's Transformers library as follows.
Basic Usage
```python
from transformers import AutoModel, AutoTokenizer

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("BioMistral/BioMistral-7B")
model = AutoModel.from_pretrained("BioMistral/BioMistral-7B")
```
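For text generation, loading the checkpoint with a causal-LM head is usually more convenient than the plain AutoModel above. Below is a minimal sketch; the prompt, the generation settings, and device_map="auto" (which needs the accelerate package) are illustrative assumptions, not part of the original card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BioMistral/BioMistral-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # use torch.float16 if your GPU lacks bf16 support
    device_map="auto",           # requires the accelerate package
)

prompt = "What are the main risk factors for type 2 diabetes?"  # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```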
✨ Features
- Model Merging: Utilizes the DARE TIES merge method to combine multiple pre-trained models.
- Multilingual Support: Supports multiple languages including English, French, Dutch, Spanish, Italian, Polish, Romanian, and German.
- Medical Domain Adaptation: Tailored for the medical and biological domains, pre-trained on PubMed Central data.
📦 Installation
Installation mainly comes down to the transformers library, which you can install with:

```bash
pip install transformers
```
📚 Documentation
Merge Details
Merge Method
This model was merged using the DARE TIES merge method, with mistralai/Mistral-7B-Instruct-v0.1 as the base model.
Models Merged
The following models were included in the merge:
- BioMistral/BioMistral-7B
Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
  - model: mistralai/Mistral-7B-Instruct-v0.1
  - model: BioMistral/BioMistral-7B
    parameters:
      density: 0.5
      weight: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-Instruct-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
```
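Assuming the configuration above is saved to a local file (config.yml is a placeholder name), a merge like this one could be reproduced with mergekit's command-line entry point. This is a sketch of the general workflow, not the exact command used to build this release:

```bash
pip install mergekit
mergekit-yaml config.yml ./BioMistral-7B-mistral7instruct-dare --cuda
```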
BioMistral Models
| Property | Details |
|----------|---------|
| Model Type | A suite of Mistral-based, further pre-trained, open-source models for medical domains. |
| Training Data | Textual data from PubMed Central Open Access (CC0, CC BY, CC BY-SA, and CC BY-ND). |
Quantized Models
| Base Model | Method | q_group_size | w_bit | version | VRAM (GB) | Time | Download |
|------------|--------|--------------|-------|---------|-----------|------|----------|
| BioMistral-7B | FP16/BF16 | | | | 15.02 | x1.00 | HuggingFace |
| BioMistral-7B | AWQ | 128 | 4 | GEMM | 4.68 | x1.41 | HuggingFace |
| BioMistral-7B | AWQ | 128 | 4 | GEMV | 4.68 | x10.30 | HuggingFace |
| BioMistral-7B | BnB.4 | | 4 | | 5.03 | x3.25 | HuggingFace |
| BioMistral-7B | BnB.8 | | 8 | | 8.04 | x4.34 | HuggingFace |
| BioMistral-7B-DARE | AWQ | 128 | 4 | GEMM | 4.68 | x1.41 | HuggingFace |
| BioMistral-7B-TIES | AWQ | 128 | 4 | GEMM | 4.68 | x1.41 | HuggingFace |
| BioMistral-7B-SLERP | AWQ | 128 | 4 | GEMM | 4.68 | x1.41 | HuggingFace |
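The BnB rows correspond to on-the-fly bitsandbytes quantization, which can be requested directly through transformers. The sketch below assumes bitsandbytes and accelerate are installed; the compute dtype is an illustrative choice, not a documented setting of the published checkpoints:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit loading (BnB.4 row); set load_in_8bit=True instead for the BnB.8 variant
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # illustrative compute dtype
)

model = AutoModelForCausalLM.from_pretrained(
    "BioMistral/BioMistral-7B",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("BioMistral/BioMistral-7B")
```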
Supervised Fine-tuning Benchmark
| | Clinical KG | Medical Genetics | Anatomy | Pro Medicine | College Biology | College Medicine | MedQA | MedQA 5 opts | PubMedQA | MedMCQA | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| BioMistral 7B | 59.9 | 64.0 | 56.5 | 60.4 | 59.0 | 54.7 | 50.6 | 42.8 | 77.5 | 48.1 | 57.3 |
| Mistral 7B Instruct | 62.9 | 57.0 | 55.6 | 59.4 | 62.5 | 57.2 | 42.0 | 40.9 | 75.7 | 46.1 | 55.9 |
| BioMistral 7B Ensemble | 62.8 | 62.7 | 57.5 | 63.5 | 64.3 | 55.7 | 50.6 | 43.6 | 77.5 | 48.8 | 58.7 |
| BioMistral 7B DARE | 62.3 | 67.0 | 55.8 | 61.4 | 66.9 | 58.0 | 51.1 | 45.2 | 77.7 | 48.7 | 59.4 |
| BioMistral 7B TIES | 60.1 | 65.0 | 58.5 | 60.5 | 60.4 | 56.5 | 49.5 | 43.2 | 77.5 | 48.1 | 57.9 |
| BioMistral 7B SLERP | 62.5 | 64.7 | 55.8 | 62.7 | 64.8 | 56.3 | 50.8 | 44.3 | 77.8 | 48.6 | 58.8 |
| MedAlpaca 7B | 53.1 | 58.0 | 54.1 | 58.8 | 58.1 | 48.6 | 40.1 | 33.7 | 73.6 | 37.0 | 51.5 |
| PMC-LLaMA 7B | 24.5 | 27.7 | 35.3 | 17.4 | 30.3 | 23.3 | 25.5 | 20.2 | 72.9 | 26.6 | 30.4 |
| MediTron-7B | 41.6 | 50.3 | 46.4 | 27.9 | 44.4 | 30.8 | 41.6 | 28.1 | 74.9 | 41.3 | 42.7 |
| BioMedGPT-LM-7B | 51.4 | 52.0 | 49.4 | 53.3 | 50.7 | 49.1 | 42.5 | 33.9 | 76.8 | 37.6 | 49.7 |
| GPT-3.5 Turbo 1106* | 74.71 | 74.00 | 65.92 | 72.79 | 72.91 | 64.73 | 57.71 | 50.82 | 72.66 | 53.79 | 66.0 |
Supervised Fine-Tuning (SFT) performance of BioMistral 7B models compared to baselines, measured by accuracy (↑) and averaged across 3 random seeds of 3-shot. DARE, TIES, and SLERP are model merging strategies that combine BioMistral 7B and Mistral 7B Instruct. Best model in bold, and second-best underlined. *GPT-3.5 Turbo performances are reported from the 3-shot results without SFT.
🔧 Technical Details
BioMistral is an open-source LLM tailored for the biomedical domain, utilizing Mistral as its foundation model and further pre-trained on PubMed Central. The model merging process uses the DARE and TIES methods to combine the strengths of different pre-trained models.
📄 License
This project is licensed under the apache-2.0 license.
📖 Citation
arXiv: https://arxiv.org/abs/2402.10373
```bibtex
@misc{labrak2024biomistral,
      title={BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains},
      author={Yanis Labrak and Adrien Bazoge and Emmanuel Morin and Pierre-Antoine Gourraud and Mickael Rouvier and Richard Dufour},
      year={2024},
      eprint={2402.10373},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
⚠️ Important Note
Although BioMistral is intended to encapsulate medical knowledge sourced from high-quality evidence, it has not been adapted to convey this knowledge effectively, safely, or appropriately in professional settings. We advise against using BioMistral in medical contexts unless it is aligned to a specific use case and further tested, notably in randomized controlled trials in real-world medical environments. BioMistral 7B may carry risks and biases that have not yet been thoroughly assessed, and its performance has not been evaluated in real-world clinical settings. Consequently, we recommend using BioMistral 7B strictly as a research tool and advise against deploying it in production for natural language generation or any professional health or medical purpose.
⚠️ Important Note
Both direct and downstream users need to be informed about the risks, biases, and constraints inherent in the model. While the model can produce natural language text, our exploration of its capabilities and limitations is just beginning. In fields such as medicine, comprehending these limitations is crucial. Hence, we strongly advise against deploying this model for natural language generation in production or for professional tasks in the realm of health and medicine.