🚀 Medra27B Quantized Model
This project provides quantized versions of the Medra27B model, offering a range of quantization options for different use cases in medical AI and text generation.
🚀 Quick Start
If you are new to using GGUF files, refer to TheBloke's READMEs for comprehensive details, including instructions on how to concatenate multi-part files.
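As a hedged sketch (not part of the original document): when a GGUF file is distributed split into parts, the parts are simply concatenated in order to rebuild the single file. The filename pattern below is an assumption about how split parts might be named; check the actual repository listing before use.

```python
# Minimal sketch: rebuild a multi-part GGUF file by concatenating its parts.
# The ".partXofY" suffix is an assumed naming convention, and lexicographic
# sorting is only safe while part counts stay single-digit.
import shutil
from pathlib import Path

parts = sorted(Path(".").glob("Medra27B.i1-Q6_K.gguf.part*"))  # hypothetical filenames
assert parts, "no split GGUF parts found in the current directory"

with open("Medra27B.i1-Q6_K.gguf", "wb") as merged:
    for part in parts:
        with part.open("rb") as src:
            shutil.copyfileobj(src, merged)  # stream each piece, in order

print(f"merged {len(parts)} parts")
```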
✨ Features
- Multiple Quantization Options: Offers a wide range of quantized versions sorted by size, suitable for different performance and quality requirements.
- Medical-AI Focus: Based on a medical-related base model, useful for medical text generation, summarization, and diagnostic reasoning.
- Fine-Tuned: The model is fine-tuned, which improves its performance on specific tasks.
📦 Installation
No specific installation steps are provided in the original document.
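As an assumption on my part (the original document prescribes no tooling), GGUF files of this kind are most commonly run with llama.cpp or its Python bindings, which can typically be installed with `pip install llama-cpp-python huggingface_hub`. The usage sketch in the next section relies on those packages.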
💻 Usage Examples
No code examples are provided in the original document.
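As a hedged sketch only: the following shows one common way to load a quant from the table below with llama-cpp-python. The repo id and filename are assumptions based on typical naming for imatrix-quant repositories; verify them against the actual file listing.

```python
# Sketch: run a quantized Medra27B GGUF with llama-cpp-python.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="mradermacher/Medra27B-i1-GGUF",  # assumed repo id
    filename="Medra27B.i1-Q4_K_M.gguf",       # assumed filename ("fast, recommended" quant)
)

llm = Llama(model_path=model_path, n_ctx=4096)  # context length is a placeholder choice

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the key symptoms of iron-deficiency anemia."}],
    max_tokens=256,
)
print(result["choices"][0]["message"]["content"])
```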
📚 Documentation
About
The model is a weighted/imatrix quantization of https://huggingface.co/nicoboss/Medra27B. Static quants are available at https://huggingface.co/mradermacher/Medra27B-GGUF.
Provided Quants
The provided quantized models are sorted by size (not necessarily quality). IQ-quants are often preferable over similar-sized non-IQ quants. A sketch for programmatically checking the available files follows the table.
| Link | Type | Size/GB | Notes |
|------|------|---------|-------|
| GGUF | i1-IQ1_S | 6.4 | for the desperate |
| GGUF | i1-IQ1_M | 6.9 | mostly desperate |
| GGUF | i1-IQ2_XXS | 7.8 | |
| GGUF | i1-IQ2_XS | 8.5 | |
| GGUF | i1-IQ2_S | 8.9 | |
| GGUF | i1-IQ2_M | 9.6 | |
| GGUF | i1-Q2_K_S | 9.9 | very low quality |
| GGUF | i1-Q2_K | 10.6 | IQ3_XXS probably better |
| GGUF | i1-IQ3_XXS | 10.8 | lower quality |
| GGUF | i1-IQ3_XS | 11.7 | |
| GGUF | i1-IQ3_S | 12.3 | beats Q3_K* |
| GGUF | i1-Q3_K_S | 12.3 | IQ3_XS probably better |
| GGUF | i1-IQ3_M | 12.6 | |
| GGUF | i1-Q3_K_M | 13.5 | IQ3_S probably better |
| GGUF | i1-Q3_K_L | 14.6 | IQ3_M probably better |
| GGUF | i1-IQ4_XS | 14.9 | |
| GGUF | i1-Q4_0 | 15.7 | fast, low quality |
| GGUF | i1-Q4_K_S | 15.8 | optimal size/speed/quality |
| GGUF | i1-Q4_K_M | 16.6 | fast, recommended |
| GGUF | i1-Q4_1 | 17.3 | |
| GGUF | i1-Q5_K_S | 18.9 | |
| GGUF | i1-Q5_K_M | 19.4 | |
| GGUF | i1-Q6_K | 22.3 | practically like static Q6_K |
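Before downloading, the exact filenames can be confirmed against the table above; a minimal sketch, assuming the imatrix quants live in a repo named mradermacher/Medra27B-i1-GGUF:

```python
# Sketch: list the GGUF files actually present in the (assumed) quant repo,
# so the table entries can be matched to real filenames.
from huggingface_hub import list_repo_files

for name in sorted(list_repo_files("mradermacher/Medra27B-i1-GGUF")):  # assumed repo id
    if name.endswith(".gguf"):
        print(name)
```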
Here is a useful graph by ikawrakow comparing some lower-quality quant types (lower is better):

*(graph image not included in this document)*
And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9
FAQ / Model Request
For answers to common questions or if you want other models to be quantized, visit https://huggingface.co/mradermacher/model_requests.
🔧 Technical Details
No technical details are provided in the original document.
📄 License
The model is licensed under the Apache-2.0 license.
Thanks
I am grateful to my company, nethype GmbH, for letting me use its servers and for upgrading my workstation, which enables me to do this work in my free time. I also thank @nicoboss for giving me access to his private supercomputer, allowing me to provide many more imatrix quants, at much higher quality, than I otherwise could.