# MMedLM
The official model weights for "Towards Building Multilingual Language Model for Medicine".
## Quick Start

The model can be loaded as follows:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Henrychur/MMed-Llama-3-8B-EnIns")
model = AutoModelForCausalLM.from_pretrained("Henrychur/MMed-Llama-3-8B-EnIns", torch_dtype=torch.float16)
```
- The inference format is similar to Llama 3-Instruct; you can check our inference code here.
- For multiple-choice question answering tasks, we suggest using the following instruction.
```python
from model import MedS_Llama3

sdk_api = MedS_Llama3(model_path="Henrychur/MMed-Llama-3-8B-EnIns", gpu_id=0)

INSTRUCTION = "Given a question and a list of options, select the correct answer from the options directly."
input_ = "Question: A mother brings her 3 - week - old infant to the pediatrician's office because she is concerned about his feeding habits. He was born without complications and has not had any medical problems up until this time. However, for the past 4 days, he has been fussy, is regurgitating all of his feeds, and his vomit is yellow in color. On physical exam, the child's abdomen is minimally distended but no other abnormalities are appreciated. Which of the following embryologic errors could account for this presentation?\nOptions: A: Abnormal migration of ventral pancreatic bud\tB: Complete failure of proximal duodenum to recanalize\tC: Abnormal hypertrophy of the pylorus\tD: Failure of lateral body folds to move ventrally and fuse in the midline\t"

results = sdk_api.chat([], input_, INSTRUCTION)
print(results)
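Since the inference format follows Llama 3-Instruct, a prompt can also be assembled by hand. The sketch below illustrates the published Llama 3 instruct special-token layout; the tokenizer's `apply_chat_template` remains the authoritative source of the template, and `build_llama3_prompt` is a hypothetical helper for illustration only.

```python
# Sketch: assemble a Llama 3-Instruct style prompt manually.
# Layout follows the published Llama 3 chat format; in practice, prefer
# tokenizer.apply_chat_template, which encodes the template exactly.
def build_llama3_prompt(instruction: str, user_input: str) -> str:
    return (
        "<|begin_of_text|>"
        f"<|start_header_id|>system<|end_header_id|>\n\n{instruction}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\n{user_input}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "Given a question and a list of options, select the correct answer from the options directly.",
    "Question: ...\nOptions: A: ...\tB: ...\t",
)
```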
## Features

This repo contains MMed-Llama 3-8B-EnIns, which is based on MMed-Llama 3-8B and further fine-tuned on an English instruction fine-tuning dataset (from PMC-LLaMA). We did this to enable a fair comparison with existing models on commonly used English benchmarks.

Note that MMed-Llama 3-8B-EnIns has only been trained on pmc_llama_instructions, an English medical SFT dataset focused on QA tasks, so the model's ability to respond to multilingual input is still limited.
## Documentation

### News
- [2024.2.21] Our preprint is released on arXiv. Dive into our findings here.
- [2024.2.20] We release MMedLM and MMedLM 2. With autoregressive continued training on MMedC, these models achieve performance superior to all other open-source models, even rivaling GPT-4 on MMedBench.
- [2024.2.20] We release MMedC, a multilingual medical corpus containing 25.5B tokens.
- [2024.2.20] We release MMedBench, a new multilingual medical multiple-choice question-answering benchmark with rationales. Check out the leaderboard here.
### Evaluation on Commonly Used English Benchmarks

The further pretrained MMed-Llama 3 also demonstrates strong performance in the medical domain across a range of English benchmarks.
| Property | Details |
|---|---|
| License | llama3 |
| Datasets | Henrychur/MMedC, axiong/pmc_llama_instructions |
| Languages Supported | en, zh, ja, fr, ru, es |
| Tags | medical |
| Base Model | Henrychur/MMed-Llama-3-8B |
| Library Name | transformers |
| Method | Size | Year | MedQA | MedMCQA | PubMedQA | MMLU_CK | MMLU_MG | MMLU_AN | MMLU_PM | MMLU_CB | MMLU_CM | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MedAlpaca | 7B | 2023.3 | 41.7 | 37.5 | 72.8 | 57.4 | 69.0 | 57.0 | 67.3 | 65.3 | 54.3 | 58.03 |
| PMC-LLaMA | 13B | 2023.9 | 56.4 | 56.0 | 77.9 | - | - | - | - | - | - | - |
| MEDITRON | 7B | 2023.11 | 57.2 | 59.2 | 74.4 | 64.6 | 59.9 | 49.3 | 55.4 | 53.8 | 44.8 | 57.62 |
| Mistral | 7B | 2023.12 | 50.8 | 48.2 | 75.4 | 68.7 | 71.0 | 55.6 | 68.4 | 68.1 | 59.5 | 62.97 |
| Gemma | 7B | 2024.2 | 47.2 | 49.0 | 76.2 | 69.8 | 70.0 | 59.3 | 66.2 | 79.9 | 60.1 | 64.19 |
| BioMistral | 7B | 2024.2 | 50.6 | 48.1 | 77.5 | 59.9 | 64.0 | 56.5 | 60.4 | 59.0 | 54.7 | 58.97 |
| Llama 3 | 8B | 2024.4 | 60.9 | 50.7 | 73.0 | 72.1 | 76.0 | 63.0 | 77.2 | 79.9 | 64.2 | 68.56 |
| MMed-Llama 3 (Ours) | 8B | - | 65.4 | 63.5 | 80.1 | 71.3 | 85.0 | 69.6 | 77.6 | 74.3 | 66.5 | 72.59 |
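The Avg. column appears to be the unweighted mean of the nine task scores, rounded to two decimals. Reproducing the MedAlpaca row as a quick check:

```python
# Scores for the MedAlpaca row, in table order:
# MedQA, MedMCQA, PubMedQA, MMLU_CK, MMLU_MG, MMLU_AN, MMLU_PM, MMLU_CB, MMLU_CM
scores = [41.7, 37.5, 72.8, 57.4, 69.0, 57.0, 67.3, 65.3, 54.3]
avg = round(sum(scores) / len(scores), 2)
print(avg)  # 58.03, matching the Avg. column
```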
## License
The license for this project is llama3.
## Contact
If you have any questions, please feel free to contact qiupengcheng@pjlab.org.cn.
## Citation

```bibtex
@misc{qiu2024building,
      title={Towards Building Multilingual Language Model for Medicine},
      author={Pengcheng Qiu and Chaoyi Wu and Xiaoman Zhang and Weixiong Lin and Haicheng Wang and Ya Zhang and Yanfeng Wang and Weidi Xie},
      year={2024},
      eprint={2402.13963},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```