Llamacpp Static Quantizations of granite-embedding-107m-multilingual
This project provides llama.cpp static quantizations of the granite-embedding-107m-multilingual model. The files were quantized with llama.cpp and can be run in LM Studio.
Quick Start
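As a starting point outside LM Studio, the sketch below computes embeddings from one of the GGUF files and compares two texts. It assumes the third-party `llama-cpp-python` package (not mentioned in this README) and a file that has already been downloaded; `embed_texts` and `cosine_similarity` are illustrative helper names, and the path in the usage comment is hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; returns 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def embed_texts(model_path: str, texts: List[str]) -> List[List[float]]:
    """Embed each text with a local GGUF file (assumes llama-cpp-python is installed)."""
    # Imported lazily so cosine_similarity stays usable without the package.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, embedding=True, verbose=False)
    response = llm.create_embedding(texts)
    return [item["embedding"] for item in response["data"]]


# Example usage (hypothetical local path):
# vectors = embed_texts("./granite-embedding-107m-multilingual-Q8_0.gguf",
#                       ["How do I reset my password?", "Password reset steps"])
# print(cosine_similarity(vectors[0], vectors[1]))
```

Loading the model once and embedding texts in a batch avoids paying the model load cost for every call.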
Features
- Multilingual Support: The model supports multiple languages including English, Arabic, Czech, German, Spanish, French, Italian, Japanese, Korean, Dutch, Portuguese, and Chinese.
- Multiple Quantization Types: Various quantization types are available, allowing you to choose based on your needs for quality and file size.
Documentation
Prompt format
No prompt format found; check the original model page.
What's new
Fix tokenizer.
Embed/output weights
Some of these quants (Q3_K_XL, Q4_K_L, etc.) use the standard quantization method, but with the embedding and output weights quantized to Q8_0 instead of their usual default.
Installation
Download a single file (not the whole branch) from the table below:
| Filename | Quant type | File Size | Split | Description |
| --- | --- | --- | --- | --- |
| granite-embedding-107m-multilingual-f16.gguf | f16 | 0.22GB | false | Full F16 weights. |
| granite-embedding-107m-multilingual-Q8_0.gguf | Q8_0 | 0.12GB | false | Extremely high quality, generally unneeded but max available quant. |
| granite-embedding-107m-multilingual-Q6_K_L.gguf | Q6_K_L | 0.12GB | false | Uses Q8_0 for embed and output weights. Very high quality, near perfect, recommended. |
| granite-embedding-107m-multilingual-Q6_K.gguf | Q6_K | 0.12GB | false | Very high quality, near perfect, recommended. |
| granite-embedding-107m-multilingual-Q5_K_L.gguf | Q5_K_L | 0.12GB | false | Uses Q8_0 for embed and output weights. High quality, recommended. |
| granite-embedding-107m-multilingual-Q5_K_M.gguf | Q5_K_M | 0.12GB | false | High quality, recommended. |
| granite-embedding-107m-multilingual-Q5_K_S.gguf | Q5_K_S | 0.12GB | false | High quality, recommended. |
| granite-embedding-107m-multilingual-Q4_K_L.gguf | Q4_K_L | 0.12GB | false | Uses Q8_0 for embed and output weights. Good quality, recommended. |
| granite-embedding-107m-multilingual-Q4_K_M.gguf | Q4_K_M | 0.12GB | false | Good quality, default size for most use cases, recommended. |
| granite-embedding-107m-multilingual-Q4_K_S.gguf | Q4_K_S | 0.12GB | false | Slightly lower quality with more space savings, recommended. |
| granite-embedding-107m-multilingual-Q4_0.gguf | Q4_0 | 0.12GB | false | Legacy format, offers online repacking for ARM and AVX CPU inference. |
| granite-embedding-107m-multilingual-IQ4_NL.gguf | IQ4_NL | 0.12GB | false | Similar to IQ4_XS, but slightly larger. Offers online repacking for ARM CPU inference. |
| granite-embedding-107m-multilingual-IQ4_XS.gguf | IQ4_XS | 0.12GB | false | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
| granite-embedding-107m-multilingual-Q3_K_XL.gguf | Q3_K_XL | 0.12GB | false | Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability. |
| granite-embedding-107m-multilingual-Q3_K_L.gguf | Q3_K_L | 0.12GB | false | Lower quality but usable, good for low RAM availability. |
| granite-embedding-107m-multilingual-Q3_K_M.gguf | Q3_K_M | 0.12GB | false | Low quality. |
| granite-embedding-107m-multilingual-IQ3_M.gguf | IQ3_M | 0.12GB | false | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
Downloading using huggingface-cli
First, make sure you have huggingface-cli installed. You can then use it to download individual files rather than the whole repository.
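The same single-file download can also be scripted from Python. The following is a minimal sketch that assumes the `huggingface_hub` package is installed; `gguf_filename` and `download_quant` are illustrative helper names, and the repository ID is not given in this README, so the one in the usage comment is a placeholder you must replace.

```python
MODEL_NAME = "granite-embedding-107m-multilingual"


def gguf_filename(quant_type: str) -> str:
    """Build a file name following the pattern in the table above."""
    return f"{MODEL_NAME}-{quant_type}.gguf"


def download_quant(repo_id: str, quant_type: str, local_dir: str = ".") -> str:
    """Download one quantized file and return its local path."""
    # Imported lazily so gguf_filename works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=repo_id,
        filename=gguf_filename(quant_type),
        local_dir=local_dir,
    )


# Example usage (placeholder repository ID, replace with the real one):
# path = download_quant("<user>/<repo>", "Q4_K_M", local_dir="./models")
# print(path)
```

Passing a single file name means only that quantization is fetched, which matches the advice to download a file rather than the whole branch.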
Technical Details
Model Information
| Property | Details |
| --- | --- |
| Model Type | Llama.cpp static quantizations of granite-embedding-107m-multilingual |
| Training Data | Not provided |
Evaluation Metrics
The model was evaluated on the MIRACL dataset for retrieval tasks. The metrics for each language are listed below:
Miracl (en)
| Metric | Value |
| --- | --- |
| ndcg_at_1 | 0.41176 |
| ndcg_at_10 | 0.46682 |
| ndcg_at_100 | 0.54326 |
| ndcg_at_1000 | 0.56567 |
| ndcg_at_20 | 0.50157 |
| ndcg_at_3 | 0.41197 |
| ndcg_at_5 | 0.42086 |
| recall_at_1 | 0.19322 |
| recall_at_10 | 0.57721 |
| recall_at_100 | 0.83256 |
| recall_at_1000 | 0.95511 |
| recall_at_20 | 0.6757 |
| recall_at_3 | 0.37171 |
| recall_at_5 | 0.44695 |
Miracl (ar)
| Metric | Value |
| --- | --- |
| ndcg_at_1 | 0.55559 |
| ndcg_at_10 | 0.62541 |
| ndcg_at_100 | 0.67101 |
| ndcg_at_1000 | 0.6805 |
| ndcg_at_20 | 0.64739 |
| ndcg_at_3 | 0.56439 |
| ndcg_at_5 | 0.59347 |
| recall_at_1 | 0.37009 |
| recall_at_10 | 0.73317 |
| recall_at_100 | 0.90066 |
| recall_at_1000 | 0.96272 |
| recall_at_20 | 0.80205 |
| recall_at_3 | 0.56903 |
| recall_at_5 | 0.6518 |
... (similar tables for other languages)
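To make the metric names above concrete, here is a minimal sketch of how `recall_at_k` and `ndcg_at_k` are commonly computed for a single query with binary relevance labels. The evaluation harness behind the reported numbers is not stated in this README, so treat this as the textbook definition rather than the exact procedure used; the function names and document IDs are illustrative.

```python
import math
from typing import Iterable, Sequence


def recall_at_k(ranked_ids: Sequence[str], relevant_ids: Iterable[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked_ids: Sequence[str], relevant_ids: Iterable[str], k: int) -> float:
    """Binary-relevance nDCG: DCG of the ranking divided by the best possible DCG."""
    relevant = set(relevant_ids)
    # A relevant hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant
    )
    # The ideal ranking places all relevant documents first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Illustrative query: two relevant documents, one of them retrieved in the top 3.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
print(recall_at_k(ranked, relevant, 3))  # 0.5
print(ndcg_at_k(ranked, relevant, 3))
```

The values reported in the tables would be averages of such per-query scores over all queries for a language.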
License
This project is licensed under the Apache 2.0 license.