Apollo2 7B GGUF
Apollo2-7B-GGUF is a quantized version of FreedomIntelligence/Apollo2-7B, supporting medical large language model applications in multiple languages.
Model Overview
This model focuses on the medical field and supports 12 major languages, such as English, Chinese, and French, as well as 38 minor languages. It is suitable for Q&A on biology, medicine, and other related topics.
Model Features
Multilingual coverage
Supports 12 major languages and 38 minority languages, covering a wide range of medical field applications.
Optimized for the medical field
Focuses on the medical field and is suitable for Q&A on biology, medicine, and other related issues.
Quantized version
A quantized version created with llama.cpp, aiming to democratize medical LLMs across a wide range of languages.
Model Capabilities
Multilingual medical Q&A
Medical knowledge reasoning
Cross-language medical information processing
Use Cases
Medical education
Medical knowledge Q&A
Answer medical questions, such as those about disease symptoms and treatment methods.
High-accuracy medical knowledge answers
Clinical support
Clinical decision support
Provide reference information for clinical decision-making.
Assist doctors in clinical decision-making
QuantFactory/Apollo2-7B-GGUF
This is a quantized version of FreedomIntelligence/Apollo2-7B created using llama.cpp. It aims to democratize medical LLMs for a wide range of languages.
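For running the GGUF files locally, one convenient route is llama-cpp-python, the Python bindings for llama.cpp. The snippet below is a minimal sketch, not part of the original card: the GGUF filename (and the Q4_K_M quantization it implies) is a placeholder and should be replaced with one of the files actually published in this repository.

```python
from llama_cpp import Llama

# Load a quantized Apollo2-7B GGUF file with llama-cpp-python.
# "Apollo2-7B.Q4_K_M.gguf" is a placeholder filename; use a file that exists
# in the QuantFactory/Apollo2-7B-GGUF repository.
llm = Llama(model_path="./Apollo2-7B.Q4_K_M.gguf", n_ctx=2048)

# Apollo2-7B expects the "User:{query}\nAssistant:{response}" format (see Usage Format below).
prompt = "User:What are common symptoms of iron-deficiency anemia?\nAssistant:"
out = llm(prompt, max_tokens=256, stop=["<|endoftext|>", "User:"])
print(out["choices"][0]["text"])
```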
Documentation
Original Model Card
Democratizing Medical LLMs for Many More Languages
It covers 12 major languages including English, Chinese, French, Hindi, Spanish, Arabic, Russian, Japanese, Korean, German, Italian, Portuguese and 38 minor languages so far.
Paper • Demo • ApolloMoEDataset • ApolloMoEBench • Models • Apollo • ApolloMoE
Update
- [2024.10.15] ApolloMoE repo is published.
Languages Coverage
It covers 12 major languages and 38 minor languages.
Architecture
MoE routing diagram (see the original repository).
Results
Dense
Dense model results (see the original repository).
Post-MoE
Post-MoE model results (see the original repository).
Usage Format
Apollo2
- 0.5B, 1.5B, 7B: User:{query}\nAssistant:{response}<|endoftext|>
- 2B, 9B: User:{query}\nAssistant:{response}<eos>
- 3.8B: <|user|>\n{query}<|end|><|assistant|>\n{response}<|end|>
Apollo-MoE
- 0.5B, 1.5B, 7B: User:{query}\nAssistant:{response}<|endoftext|>
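To make the 0.5B/1.5B/7B format above concrete, here is a small helper that assembles a prompt string. The function name and example query are illustrative additions, not part of the original card.

```python
def build_apollo2_prompt(query: str, response: str = "") -> str:
    # Apollo2 0.5B/1.5B/7B format: User:{query}\nAssistant:{response}<|endoftext|>
    prompt = f"User:{query}\nAssistant:{response}"
    if response:
        prompt += "<|endoftext|>"  # the terminator follows a completed response
    return prompt

# At inference time the response is left empty so the model completes it:
print(build_apollo2_prompt("What is the first-line treatment for type 2 diabetes?"))
```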
Dataset & Evaluation
Dataset
- [Data category](https://huggingface.co/datasets/FreedomIntelligence/ApolloCorpus/tree/main/train)

Evaluation
- EN:
  - [MedQA-USMLE](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options)
  - [MedMCQA](https://huggingface.co/datasets/medmcqa/viewer/default/test)
  - [PubMedQA](https://huggingface.co/datasets/pubmed_qa): not used in the paper because its results fluctuated too much
  - [MMLU-Medical](https://huggingface.co/datasets/cais/mmlu): Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
- ZH:
  - [MedQA-MCMLE](https://huggingface.co/datasets/bigbio/med_qa/viewer/med_qa_zh_4options_bigbio_qa/test)
  - [CMB-single](https://huggingface.co/datasets/FreedomIntelligence/CMB): not used in the paper; 2,000 single-answer multiple-choice questions sampled at random
  - [CMMLU-Medical](https://huggingface.co/datasets/haonan-li/cmmlu): Anatomy, Clinical_knowledge, College_medicine, Genetics, Nutrition, Traditional_chinese_medicine, Virology
  - [CExam](https://github.com/williamliujl/CMExam): not used in the paper; 2,000 multiple-choice questions sampled at random
- ES: [Head_qa](https://huggingface.co/datasets/head_qa)
- FR:
  - [Frenchmedmcqa](https://github.com/qanastek/FrenchMedMCQA)
  - [MMLU_FR]: Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
- HI: [MMLU_HI](https://huggingface.co/datasets/FreedomIntelligence/MMLU_Hindi): Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
- AR: [MMLU_AR](https://huggingface.co/datasets/FreedomIntelligence/MMLU_Arabic): Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
- JA: [IgakuQA](https://github.com/jungokasai/IgakuQA)
- KO: [KorMedMCQA](https://huggingface.co/datasets/sean0042/KorMedMCQA)
- IT:
  - [MedExpQA](https://huggingface.co/datasets/HiTZ/MedExpQA)
  - [MMLU_IT]: Clinical knowledge, Medical genetics, Anatomy, Professional medicine, College biology, College medicine
- DE: [BioInstructQA](https://huggingface.co/datasets/BioMistral/BioInstructQA): German part
- PT: [BioInstructQA](https://huggingface.co/datasets/BioMistral/BioInstructQA): Portuguese part
- RU: [RuMedBench](https://github.com/sb-ai-lab/MedBench)

Model Download and Inference
We take Apollo-MoE-0.5B as an example:
- Log in to Hugging Face
```bash
huggingface-cli login --token $HUGGINGFACE_TOKEN
```
- Download the model to a local directory
```python
from huggingface_hub import snapshot_download
import os

# Download the full model repository into a local directory
local_model_dir = os.path.join('/path/to/models/dir', 'Apollo-MoE-0.5B')
snapshot_download(repo_id="FreedomIntelligence/Apollo-MoE-0.5B", local_dir=local_model_dir)
```
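If you only want one of the quantized GGUF files from this repository rather than a full snapshot, hf_hub_download from huggingface_hub fetches a single file. The filename below is an assumed example; check the repository's file list for the actual names.

```python
from huggingface_hub import hf_hub_download

# Download a single quantized file instead of the whole repository.
# "Apollo2-7B.Q4_K_M.gguf" is a placeholder; pick a real filename from the repo.
gguf_path = hf_hub_download(
    repo_id="QuantFactory/Apollo2-7B-GGUF",
    filename="Apollo2-7B.Q4_K_M.gguf",
    local_dir="/path/to/models/dir",
)
print(gguf_path)
```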
- Inference Example
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
import os

local_model_dir = os.path.join('/path/to/models/dir', 'Apollo-MoE-0.5B')

# Load the model and tokenizer from the local directory
model = AutoModelForCausalLM.from_pretrained(local_model_dir, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(local_model_dir, trust_remote_code=True)

# Greedy decoding with a short generation budget
generation_config = GenerationConfig.from_pretrained(
    local_model_dir, pad_token_id=tokenizer.pad_token_id, num_return_sequences=1,
    max_new_tokens=7, min_new_tokens=2, do_sample=False, temperature=1.0, top_k=50, top_p=1.0)

inputs = tokenizer('Answer directly.\nThe capital of Mongolia is Ulaanbaatar.\nThe capital of Iceland is Reykjavik.\nThe capital of Australia is', return_tensors='pt')
inputs = inputs.to(model.device)
pred = model.generate(**inputs, generation_config=generation_config)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```
Results Reproduction
We take Apollo2-7B or Apollo-MoE-0.5B as an example:
1. Download the dataset for the project:
```bash
bash 0.download_data.sh
```
2. Prepare the test and dev data for the specific model (create test data with the special tokens):
```bash
bash 1.data_process_test&dev.sh
```
3. Prepare the train data for the specific model (create tokenized data in advance); you can adjust the data training order and number of training epochs in this step:
```bash
bash 2.data_process_train.sh
```
4. Train the model; if you want to train on multiple nodes, please refer to ./src/sft/training_config/zero_multi.yaml:
```bash
bash 3.single_node_train.sh
```
5. Evaluate your model (generate scores for the benchmarks):
```bash
bash 4.eval.sh
```

License
The model is released under the Apache 2.0 license.
Citation
Please use the following citation if you intend to use our dataset for training or evaluation:
@misc{zheng2024efficientlydemocratizingmedicalllms,
title={Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts},
author={Guorui Zheng and Xidong Wang and Juhao Liang and Nuo Chen and Yuping Zheng and Benyou Wang},
year={2024},
eprint={2410.10626},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2410.10626},
}
| Property | Details |
|---|---|
| Model Type | Quantized version of FreedomIntelligence/Apollo2-7B |
| Training Data | ApolloMoEDataset |
| Metrics | Accuracy |
| Base Model | Qwen/Qwen2-7B |
| Pipeline Tag | Question-answering |
| Tags | Biology, Medical |
| Languages | Arabic, English, Chinese, Korean, Japanese, Mongolian, Thai, Vietnamese, Lao, Malagasy, German, Portuguese, Spanish, French, Russian, Italian, Croatian, Galician, Czech, Corsican, Latin, Ukrainian, Bosnian, Bulgarian, Esperanto, Albanian, Danish, Sanskrit, Guarani, Serbian, Slovak, Scottish Gaelic, Luxembourgish, Hindi, Kurdish, Maltese, Hebrew, Lingala, Bambara, Swahili, Igbo, Kinyarwanda, Hausa |