# ColBERTv2.0 Quantized Model
This project provides static quantizations of the ColBERTv2.0 model, offering various quantization types for different use cases.
## Quick Start
If you are new to using this quantized model, the following sections will guide you through its details and usage.
## Features
- Multiple Quantization Types: Offers a range of quantization types such as Q2_K, Q3_K_S, and IQ4_XS, sorted by size.
- Sentence Transformer: Suitable for tasks such as sentence similarity and feature extraction.
- Based on ColBERT: Built on the ColBERT late-interaction architecture (see the scoring sketch after this list).
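
For context, ColBERT scores a query against a document by late interaction (MaxSim): each query token embedding is matched against its best document token embedding, and the per-token maxima are summed. The sketch below only illustrates that scoring rule with random, L2-normalized toy embeddings; it is not tied to these GGUF files, and the shapes and 128-dim size are illustrative.

```python
import numpy as np

def maxsim_score(q_emb: np.ndarray, d_emb: np.ndarray) -> float:
    """ColBERT-style MaxSim: for each query token embedding, take the
    maximum similarity over all document token embeddings, then sum.
    Rows are assumed L2-normalized, so dot product equals cosine similarity."""
    sim = q_emb @ d_emb.T               # (num_query_tokens, num_doc_tokens)
    return float(sim.max(axis=1).sum())

# Toy example: random token embeddings (shapes and dimension are illustrative).
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 128))
d = rng.normal(size=(12, 128))
q /= np.linalg.norm(q, axis=1, keepdims=True)
d /= np.linalg.norm(d, axis=1, keepdims=True)
print(maxsim_score(q, d))
```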
## Installation
No specific installation steps are provided in the original README. If you need to use the GGUF files, refer to TheBloke's READMEs for more details, including how to concatenate multi-part files.
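
At these sizes (0.2–0.3 GB) the files here are unlikely to be split, but if you ever encounter a multi-part GGUF, concatenation is a plain byte-wise append in part order. A minimal Python sketch; the part names below are hypothetical placeholders, so substitute the actual names from the repo's file listing:

```python
import shutil

# Hypothetical part names; use the actual file names from the repo listing.
parts = ["model.Q8_0.gguf.part1of2", "model.Q8_0.gguf.part2of2"]

with open("model.Q8_0.gguf", "wb") as out:
    for name in parts:                    # order matters: part1, then part2, ...
        with open(name, "rb") as src:
            shutil.copyfileobj(src, out)  # byte-for-byte append
```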
## Documentation
### About
This is a static quantization of https://huggingface.co/lightonai/colbertv2.0. Weighted/imatrix quants do not appear to be available (from me) at this time. If they do not show up within a week or so after the static ones, I have probably not planned them. Feel free to request them by opening a Community Discussion.
### Usage
If you are unsure how to use GGUF files, refer to one of TheBloke's READMEs for more details, including how to concatenate multi-part files.
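
As one concrete (hedged) route, llama-cpp-python can load a GGUF in embedding mode, assuming your llama.cpp build supports this model's BERT-family architecture; the file name below is a placeholder for whichever quant you downloaded:

```python
from llama_cpp import Llama  # pip install llama-cpp-python

# Placeholder file name; point this at the quant you actually downloaded.
llm = Llama(model_path="colbertv2.0.Q4_K_M.gguf", embedding=True)

# Returns the model's embedding(s) for the input text; for token-level
# models the result may be a list of per-token vectors.
emb = llm.embed("ColBERT encodes text into token-level vectors.")
print(len(emb))
```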
### Provided Quants
The provided quantizations are sorted by size, not necessarily quality. IQ-quants are often preferable over similarly sized non-IQ quants.

| Link | Type | Size/GB | Notes |
|------|------|---------|-------|
| GGUF | Q2_K | 0.2 | |
| GGUF | Q3_K_S | 0.2 | |
| GGUF | Q3_K_M | 0.2 | lower quality |
| GGUF | IQ4_XS | 0.2 | |
| GGUF | Q3_K_L | 0.2 | |
| GGUF | Q4_K_S | 0.2 | fast, recommended |
| GGUF | Q4_K_M | 0.2 | fast, recommended |
| GGUF | Q5_K_S | 0.2 | |
| GGUF | Q5_K_M | 0.2 | |
| GGUF | Q6_K | 0.2 | very good quality |
| GGUF | Q8_0 | 0.2 | fast, best quality |
| GGUF | f16 | 0.3 | 16 bpw, overkill |
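To fetch a single quant programmatically, huggingface_hub's hf_hub_download works; note that the repo_id and filename below are assumptions, so match them to this repo's actual file listing (the Link column above):

```python
from huggingface_hub import hf_hub_download  # pip install huggingface_hub

# Assumed repo_id and filename; replace with the actual values from the
# Link column of the table above.
path = hf_hub_download(
    repo_id="mradermacher/colbertv2.0-GGUF",
    filename="colbertv2.0.Q4_K_M.gguf",
)
print(path)  # local path of the cached download
```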
Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9
### FAQ / Model Request
See https://huggingface.co/mradermacher/model_requests for answers to common questions, or if you want another model quantized.
### Thanks
I thank my company, nethype GmbH, for letting me use its servers and providing upgrades to my workstation to enable this work in my free time.
## License
This project is licensed under the MIT license.