🚀 Aloe: A New Family of Healthcare LLMs
Aloe is a new family of healthcare LLMs. It competes well with previous open models in its range. By using model merging and advanced prompting strategies, it achieves state-of-the-art results at its size. It scores high in ethics and factuality metrics due to red teaming and alignment efforts. Complete training details, model merging configurations, and all training data will be shared, along with the prompting repository for inference. Aloe also comes with a healthcare-specific risk assessment for safe use and deployment.
🚀 Quick Start
Use the code below to get started with the model. You can run conversational inference using the Transformers pipeline abstraction, or by leveraging the Auto classes with the `generate()` function. Let's see examples of both.
💻 Usage Examples
Basic Usage
import transformers
import torch

model_id = "HPAI-BSC/Llama3-Aloe-8B-Alpha"

# Build a text-generation pipeline, loading the model in bfloat16 across available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an expert medical assistant named Aloe, developed by the High Performance Artificial Intelligence Group at Barcelona Supercomputing Center (BSC). You are to be a helpful, respectful, and honest assistant."},
    {"role": "user", "content": "Hello."},
]

# Render the chat messages into the model's prompt format
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# Stop generation at either the standard EOS token or Llama 3's end-of-turn token
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Print only the newly generated text (strip the echoed prompt)
print(outputs[0]["generated_text"][len(prompt):])
Advanced Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "HPAI-BSC/Llama3-Aloe-8B-Alpha"

# Load the tokenizer and the model in bfloat16 across available devices
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an expert medical assistant named Aloe, developed by the High Performance Artificial Intelligence Group at Barcelona Supercomputing Center (BSC). You are to be a helpful, respectful, and honest assistant."},
    {"role": "user", "content": "Hello"},
]

# Tokenize the chat and move the input ids to the model's device
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop generation at either the standard EOS token or Llama 3's end-of-turn token
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Decode only the tokens generated after the prompt
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
✨ Features
- Latest Versions Available: The ALOE BETA 8B and ALOE BETA 70B versions offer better overall performance, more thorough alignment and safety, and a license compatible with more uses.
- High Performance in Healthcare: Aloe is highly competitive with previous open models in its range and reaches state-of-the-art results at its size through model merging and advanced prompting strategies.
- Ethical and Factual: It scores high in metrics measuring ethics and factuality due to combined red teaming and alignment efforts.
- Transparent: Complete training details, model merging configurations, and all training data (including synthetically generated data) will be shared, along with the prompting repository for inference.
- Safe Use: Comes with a healthcare-specific risk assessment for safe use and deployment.
📦 Installation
The usage examples above require only the `transformers` and `torch` libraries, plus `accelerate` for `device_map="auto"` (e.g., `pip install transformers accelerate torch`).
📚 Documentation
Model Details
Model Description
Property | Details |
---|---|
Developed by | HPAI |
Model Type | Causal decoder-only transformer language model |
Language(s) (NLP) | English (mainly) |
License | This model is based on Meta Llama 3 8B and is governed by the Meta Llama 3 License. All modifications are available with a [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) license. |
Finetuned from model | [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) |
Model Sources
- Repository: https://github.com/HPAI-BSC/prompt_engine (more coming soon)
- Paper: https://arxiv.org/abs/2405.01886 (more coming soon)
Model Performance
Aloe has been tested on popular healthcare QA datasets, with and without the Medprompt inference technique. Results show competitive performance, even against bigger models. Results using advanced prompting methods (aka Medprompt) are achieved through a [repo](https://github.com/HPAI-BSC/prompt_engine) made public with this work.
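The full Medprompt-style pipeline (few-shot example selection, chain-of-thought prompting, choice shuffling, and ensembling) lives in the prompt_engine repository linked above. As a rough illustration only, the sketch below shows the self-consistency voting idea at the core of such prompting: sample several chain-of-thought completions and take a majority vote over the extracted answers. The helper names (`build_prompt`, `extract_choice`, `medprompt_vote`) and the prompt wording are hypothetical; the `pipeline` object is the one created in the Basic Usage example.

# Minimal, illustrative sketch of Medprompt-style self-consistency voting.
# The real pipeline lives in https://github.com/HPAI-BSC/prompt_engine.
from collections import Counter
import re

def build_prompt(question, options):
    # Hypothetical helper: ask for step-by-step reasoning ending in "Answer: <letter>"
    letters = ["A", "B", "C", "D"]
    opts = "\n".join(f"{l}. {o}" for l, o in zip(letters, options))
    messages = [
        {"role": "system", "content": "You are an expert medical assistant named Aloe."},
        {"role": "user", "content": f"{question}\n{opts}\nThink step by step, then finish with 'Answer: <letter>'."},
    ]
    return pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

def extract_choice(text):
    # Hypothetical helper: pull the final "Answer: X" letter out of the completion
    matches = re.findall(r"Answer:\s*([ABCD])", text)
    return matches[-1] if matches else None

def medprompt_vote(question, options, n_samples=5):
    prompt = build_prompt(question, options)
    votes = []
    for _ in range(n_samples):
        out = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.6, top_p=0.9)
        choice = extract_choice(out[0]["generated_text"][len(prompt):])
        if choice:
            votes.append(choice)
    # Majority vote over the sampled chain-of-thought answers
    return Counter(votes).most_common(1)[0][0] if votes else None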
Uses
Direct Use
We encourage the use of Aloe for research purposes, as a stepping stone to build better foundational models for healthcare.
Out-of-Scope Use
These models are not to be used for clinical practice, medical diagnosis, or any other form of direct or indirect healthcare advice. Models are prone to error and can produce toxic content. The use of Aloe models for activities harmful to individuals, such as spam, fraud, or impersonation, is prohibited.
Bias, Risks, and Limitations
We consider three risk cases:
- Healthcare professional impersonation: A model like Aloe could be used to increase the efficacy of such deceiving activities. Preventive actions include public literacy on the unreliability of digitised information and the importance of medical registration, and legislation enforcing AI-generated content disclaimers.
- Medical decision-making without professional supervision: Aloe can facilitate self-delusion and generate actionable answers. Public literacy on the dangers of self-diagnosis, along with disclaimers and warnings on the models' outputs, are main defences.
- Access to information on dangerous substances or procedures: LLMs can centralize access to such information. Model alignment can help, but jailbreaking methods still overcome it.
The table below shows the performance of Aloe on several AI safety tasks:
Recommendations
We avoid the use of all personal data in our training. Model safety cannot be guaranteed: Aloe can produce toxic content under the appropriate prompts. Minors should not interact with Aloe without supervision.
Training Details
Supervised fine-tuning on top of Llama 3 8B using medical and general domain datasets, model merging using the DARE-TIES process, and a two-stage DPO process for human preference alignment. More details coming soon.
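The exact merge configurations will be released with the rest of the training material. For intuition only, here is a toy PyTorch sketch of the DARE (Drop And REscale) step that DARE-TIES builds on: compute each fine-tuned checkpoint's task vector (its weights minus the base weights), randomly drop most entries, rescale the survivors, and add the combined delta back to the base model. The function name and the plain averaging of task vectors are simplifications; TIES-style sign-conflict resolution is omitted.

# Toy sketch of the DARE step used in DARE-TIES merging; the released merge
# configurations are authoritative, this only illustrates the idea.
import torch

def dare_merge(base_state, finetuned_states, drop_rate=0.9):
    """base_state / finetuned_states: dicts mapping parameter names to tensors."""
    merged = {}
    for name, base_w in base_state.items():
        delta_sum = torch.zeros_like(base_w, dtype=torch.float32)
        for ft_state in finetuned_states:
            delta = ft_state[name].float() - base_w.float()       # task vector
            mask = (torch.rand_like(delta) > drop_rate).float()   # random drop
            delta_sum += delta * mask / (1.0 - drop_rate)         # rescale survivors
        # Simple averaging of the rescaled task vectors (TIES adds sign-based
        # conflict resolution on top of this, omitted here for brevity)
        merged[name] = base_w + (delta_sum / len(finetuned_states)).to(base_w.dtype)
    return merged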
Training Data
- Medical domain datasets, including synthetic data generated using Mixtral-8x7B and Genstruct
- HPAI-BSC/pubmedqa-cot
- HPAI-BSC/medqa-cot
- HPAI-BSC/medmcqa-cot
- LDJnr/Capybara
- hkust-nlp/deita-10k-v0
- jondurbin/airoboros-3.2
- argilla/dpo-mix-7k
- nvidia/HelpSteer
- Custom preference data with adversarial prompts generated from Anthropic Harmless, Chen et al., and original prompts
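The preference datasets above feed the two-stage DPO alignment mentioned under Training Details. The released alignment code is authoritative; as a reference point only, the sketch below shows the core DPO objective, which rewards the policy for preferring the chosen response over the rejected one relative to a frozen reference model.

# Minimal sketch of the DPO (Direct Preference Optimization) loss; the actual
# two-stage alignment used to train Aloe will be released separately.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """All inputs are per-example summed log-probabilities of the response tokens."""
    # Log-ratio of policy vs. reference for chosen and rejected responses
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected via a logistic loss
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Example with dummy log-probabilities for a batch of two preference pairs
loss = dpo_loss(torch.tensor([-12.0, -30.0]), torch.tensor([-15.0, -28.0]),
                torch.tensor([-13.0, -29.0]), torch.tensor([-14.0, -29.0]))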
Evaluation
Testing Data, Factors & Metrics
Testing Data
- MedQA (USMLE)
- MedMCQA
- PubMedQA
- MMLU-Medical
- [MedQA-4-Option](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options)
- [CareQA](https://huggingface.co/datasets/HPAI-BSC/CareQA)
Metrics
- Accuracy: suited to the evaluation of multiple-choice question-answering tasks (a minimal scoring sketch follows).
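The exact evaluation harness behind the reported numbers may differ, but a common way to score a multiple-choice question with a causal LM is to compare the log-likelihood the model assigns to each answer option and count the question as correct when the gold option scores highest. The helper names below are hypothetical.

# Illustrative multiple-choice scoring by per-option log-likelihood; the exact
# evaluation harness used for the reported results may differ.
import torch

@torch.no_grad()
def option_logprob(model, tokenizer, question, option):
    """Sum of log-probabilities the model assigns to the option tokens given the question."""
    prompt_ids = tokenizer(question, return_tensors="pt").input_ids.to(model.device)
    option_ids = tokenizer(option, add_special_tokens=False, return_tensors="pt").input_ids.to(model.device)
    input_ids = torch.cat([prompt_ids, option_ids], dim=-1)
    logits = model(input_ids).logits
    # Logits at position i predict the token at position i + 1
    option_logits = logits[0, prompt_ids.shape[-1] - 1 : -1]
    logprobs = torch.log_softmax(option_logits, dim=-1)
    return logprobs.gather(1, option_ids[0].unsqueeze(-1)).sum().item()

def is_correct(model, tokenizer, question, options, gold_index):
    scores = [option_logprob(model, tokenizer, question, o) for o in options]
    return max(range(len(options)), key=scores.__getitem__) == gold_index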
Results

Summary
To compare Aloe with competitive open models, we use popular healthcare datasets and CareQA. We calculate the standard MultiMedQA score and the arithmetic mean across all datasets; the Medical MMLU score is calculated by averaging six medical subtasks (a small sketch of this aggregation follows the summary).
Benchmark results show that Aloe outperforms Llama3-8B-Instruct and larger models such as Meditron 70B. With prompting techniques, especially Medprompt, the performance of Llama3-Aloe-8B-Alpha is significantly improved.
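As a rough illustration of the aggregation described above (with hypothetical inputs, not the reported numbers): the Medical MMLU score is the mean over its six medical subtasks, and the overall score is the arithmetic mean across all benchmark datasets.

# Sketch of the score aggregation described in the summary; inputs are hypothetical,
# the actual benchmark numbers are reported in the results table and the paper.
def aggregate_scores(per_dataset_acc, mmlu_medical_subtask_acc):
    # Medical MMLU = mean accuracy over its six medical subtasks
    mmlu_medical = sum(mmlu_medical_subtask_acc.values()) / len(mmlu_medical_subtask_acc)
    all_scores = dict(per_dataset_acc, **{"MMLU-Medical": mmlu_medical})
    # Overall score = arithmetic mean across all datasets
    return sum(all_scores.values()) / len(all_scores)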
Environmental Impact
Property | Details |
---|---|
Hardware Type | 4xH100 |
Hours used | 7,000 |
Hardware Provider | Barcelona Supercomputing Center |
Compute Region | Spain |
Carbon Emitted | 439.25 kg |
Model Card Authors
[Ashwin Kumar Gururajan](https://huggingface.co/G-AshwinKumar)
Model Card Contact
[hpai@bsc.es](mailto:hpai@bsc.es)
Citations
If you use this repository in a published work, please cite the following paper as source:

@misc{gururajan2024aloe,
      title={Aloe: A Family of Fine-tuned Open Healthcare LLMs},
      author={Ashwin Kumar Gururajan and Enrique Lopez-Cuena and Jordi Bayarri-Planas and Adrian Tormos and Daniel Hinjos and Pablo Bernabeu-Perez and Anna Arias-Duart and Pablo Agustin Martin-Torres and Lucia Urcelay-Ganzabal and Marta Gonzalez-Mallo and Sergio Alvarez-Napagao and Eduard Ayguadé-Parra and Ulises Cortés and Dario Garcia-Gasulla},
      year={2024},
      eprint={2405.01886},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
📄 License
This model is based on Meta Llama 3 8B and is governed by the Meta Llama 3 License. All our modifications are available with a [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) license.

