Granite-embedding-107m-multilingual-GGUF Open-source Model - Supports Retrieval and Information Extraction in 17 Languages

Granite Embedding 107m Multilingual GGUF

Developed by bartowski

A quantized version of the multilingual embedding model developed by the IBM Granite team. It supports text embedding in 17 languages and is suited to scenarios such as retrieval and information extraction.

Text Embedding, Supports Multiple Languages. Open Source License: Apache-2.0. Tags: #Multilingual embedding #Retrieval optimization #Lightweight model

Downloads 15.19k

Release Time: 12/18/2024

Model Overview

This is a lightweight multilingual embedding model with 107M parameters. Quantized with llama.cpp, it can run efficiently in resource-constrained environments. This release includes a tokenizer fix and is available in multiple quantization formats.

Model Features

Multilingual support

Supports text embedding in 17 languages, including major languages such as English, Chinese, and Arabic

Quantization optimization

Provides 17 files, from f16 down to IQ3_M, so users can trade off quality against file size to suit their device

Lightweight and efficient

With only 107M parameters and quantized files of about 0.12GB, it is suitable for deployment on mobile and edge devices

Retrieval optimization

Performs well on the MIRACL multilingual retrieval benchmark, with its strongest results in Telugu (te) and Thai (th)

Model Capabilities

Multilingual text embedding

Cross-language information retrieval

Semantic similarity calculation

Deployment in low-resource environments
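
Semantic similarity and retrieval with an embedding model usually reduce to comparing vectors. The following is a minimal sketch, assuming the embeddings are already available as plain lists of floats; the function names `cosine_similarity` and `rank_documents` and the toy 3-dimensional vectors are illustrative and are not part of the model or of llama.cpp.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy vectors standing in for real embeddings (illustrative only).
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
ranking = rank_documents(query, docs)
print(ranking)
```

Because cosine similarity ignores vector length, the second document (a scaled copy of the query) ranks first. The same ranking step applies whether the query and documents are in the same language or in different ones.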

Use Cases

Information retrieval

Multilingual document search

Build a document retrieval system supporting 17 languages

Reached ndcg@10 = 0.78175 on the Telugu test set

Cross-language content recommendation

Recommend relevant foreign-language content based on the user's native language

Cross-language retrieval from Chinese to English reached recall@100 = 0.87388

Semantic analysis

Multilingual clustering analysis

Perform semantic clustering on mixed-language content

🚀 Llamacpp Static Quantizations of granite-embedding-107m-multilingual

This project provides llama.cpp static quantizations of the granite-embedding-107m-multilingual model. The quantized files can be run in LM Studio.

🚀 Quick Start

Quantization: We used llama.cpp release b4381 for quantization.
Original Model: You can find the original model at https://huggingface.co/ibm-granite/granite-embedding-107m-multilingual.
Running the Model: You can run the quantized models in LM Studio.
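
Once a quantized file is loaded in LM Studio, embeddings can typically be requested over its local OpenAI-compatible server. The sketch below is hedged accordingly: it assumes the server is running at `http://localhost:1234/v1` and exposes a `/embeddings` route, and the model identifier in the commented example is a placeholder that should be replaced with whatever name LM Studio shows for the loaded file. The helper names are illustrative.

```python
import json
import urllib.request

def build_embedding_request(texts, model, base_url="http://localhost:1234/v1"):
    """Build (but do not send) a POST request in the OpenAI embeddings format."""
    payload = {"model": model, "input": texts}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(texts, model, base_url="http://localhost:1234/v1"):
    """Send the request and return one embedding (list of floats) per input text."""
    req = build_embedding_request(texts, model, base_url)
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return [item["embedding"] for item in body["data"]]

# Example (requires a running local server, so it is left commented out):
# vectors = embed(["Hello world", "Bonjour le monde"],
#                 model="granite-embedding-107m-multilingual")  # placeholder name
```

Keeping request construction separate from sending makes it easy to inspect the payload before any network call is made.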

✨ Features

Multilingual Support: The model supports multiple languages including English, Arabic, Czech, German, Spanish, French, Italian, Japanese, Korean, Dutch, Portuguese, and Chinese.
Multiple Quantization Types: Various quantization types are available, allowing you to choose based on your needs for quality and file size.

📚 Documentation

Prompt format

No prompt format was found; check the original model page.

What's new

Fixed the tokenizer.

Embed/output weights

Some of these quants (Q3_K_XL, Q4_K_L, etc.) use the standard quantization method, but with the embedding and output weights quantized to Q8_0 instead of their usual default.

📦 Installation

Download a single file (not the whole branch) from the table below:

Filename	Quant type	File Size	Split	Description
granite-embedding-107m-multilingual-f16.gguf	f16	0.22GB	false	Full F16 weights.
granite-embedding-107m-multilingual-Q8_0.gguf	Q8_0	0.12GB	false	Extremely high quality, generally unneeded but max available quant.
granite-embedding-107m-multilingual-Q6_K_L.gguf	Q6_K_L	0.12GB	false	Uses Q8_0 for embed and output weights. Very high quality, near perfect, recommended.
granite-embedding-107m-multilingual-Q6_K.gguf	Q6_K	0.12GB	false	Very high quality, near perfect, recommended.
granite-embedding-107m-multilingual-Q5_K_L.gguf	Q5_K_L	0.12GB	false	Uses Q8_0 for embed and output weights. High quality, recommended.
granite-embedding-107m-multilingual-Q5_K_M.gguf	Q5_K_M	0.12GB	false	High quality, recommended.
granite-embedding-107m-multilingual-Q5_K_S.gguf	Q5_K_S	0.12GB	false	High quality, recommended.
granite-embedding-107m-multilingual-Q4_K_L.gguf	Q4_K_L	0.12GB	false	Uses Q8_0 for embed and output weights. Good quality, recommended.
granite-embedding-107m-multilingual-Q4_K_M.gguf	Q4_K_M	0.12GB	false	Good quality, default size for most use cases, recommended.
granite-embedding-107m-multilingual-Q4_K_S.gguf	Q4_K_S	0.12GB	false	Slightly lower quality with more space savings, recommended.
granite-embedding-107m-multilingual-Q4_0.gguf	Q4_0	0.12GB	false	Legacy format, offers online repacking for ARM and AVX CPU inference.
granite-embedding-107m-multilingual-IQ4_NL.gguf	IQ4_NL	0.12GB	false	Similar to IQ4_XS, but slightly larger. Offers online repacking for ARM CPU inference.
granite-embedding-107m-multilingual-IQ4_XS.gguf	IQ4_XS	0.12GB	false	Decent quality, smaller than Q4_K_S with similar performance, recommended.
granite-embedding-107m-multilingual-Q3_K_XL.gguf	Q3_K_XL	0.12GB	false	Uses Q8_0 for embed and output weights. Lower quality but usable, good for low RAM availability.
granite-embedding-107m-multilingual-Q3_K_L.gguf	Q3_K_L	0.12GB	false	Lower quality but usable, good for low RAM availability.
granite-embedding-107m-multilingual-Q3_K_M.gguf	Q3_K_M	0.12GB	false	Low quality.
granite-embedding-107m-multilingual-IQ3_M.gguf	IQ3_M	0.12GB	false	Medium-low quality, new method with decent performance comparable to Q3_K_M.

Downloading using huggingface-cli

First, make sure you have huggingface-cli installed. You can then download individual files from the command line.
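
As a hedged illustration of a single-file download, the sketch below assembles a `huggingface-cli download` command for one file from the table. The repository id `bartowski/granite-embedding-107m-multilingual-GGUF` is an assumption inferred from the developer and model names on this page, and `build_download_command` is a hypothetical helper. The CLI itself is usually installed with `pip install -U "huggingface_hub[cli]"`.

```python
import shlex

# Assumed repository id, inferred from the developer and model name above.
REPO_ID = "bartowski/granite-embedding-107m-multilingual-GGUF"

def build_download_command(filename, repo_id=REPO_ID, local_dir="./"):
    """Return the argv list for downloading one file rather than the whole repo."""
    return [
        "huggingface-cli", "download", repo_id,
        "--include", filename,
        "--local-dir", local_dir,
    ]

cmd = build_download_command("granite-embedding-107m-multilingual-Q4_K_M.gguf")
print(shlex.join(cmd))  # copy-pasteable shell command

# To run it from Python instead of a shell:
# import subprocess; subprocess.run(cmd, check=True)
```

The `--include` pattern restricts the download to the named file, which matches the advice to fetch a single file instead of the whole branch.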

🔧 Technical Details

Model Information

Property	Details
Model Type	Llama.cpp static quantizations of granite-embedding-107m-multilingual
Training Data	Not provided

Evaluation Metrics

The model was evaluated on the MIRACL dataset for retrieval tasks. The tables below list the evaluation metrics per language:

Miracl (en)

Metric	Value
ndcg_at_1	0.41176
ndcg_at_3	0.41197
ndcg_at_5	0.42086
ndcg_at_10	0.46682
ndcg_at_20	0.50157
ndcg_at_100	0.54326
ndcg_at_1000	0.56567
recall_at_1	0.19322
recall_at_3	0.37171
recall_at_5	0.44695
recall_at_10	0.57721
recall_at_20	0.6757
recall_at_100	0.83256
recall_at_1000	0.95511

Miracl (ar)

Metric	Value
ndcg_at_1	0.55559
ndcg_at_3	0.56439
ndcg_at_5	0.59347
ndcg_at_10	0.62541
ndcg_at_20	0.64739
ndcg_at_100	0.67101
ndcg_at_1000	0.6805
recall_at_1	0.37009
recall_at_3	0.56903
recall_at_5	0.6518
recall_at_10	0.73317
recall_at_20	0.80205
recall_at_100	0.90066
recall_at_1000	0.96272

... (similar tables for other languages)
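
For readers unfamiliar with the metric names, the sketch below shows one common way to compute recall@k and nDCG@k for a single query with binary relevance labels. The page does not state which evaluation tooling produced the numbers above, so this is an illustration of the usual definitions under that assumption, not a reproduction of the original evaluation. It assumes each document id appears at most once in the ranked list; the ids are made up.

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant documents that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked_ids, relevant_ids, k):
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    relevant = set(relevant_ids)
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(ranked_ids[:k], start=1)
        if doc_id in relevant
    )
    # The ideal ranking places every relevant document first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0

# Made-up example: two relevant documents, one retrieved at rank 2, one at rank 4.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
print(recall_at_k(ranked, relevant, 3), ndcg_at_k(ranked, relevant, 3))
```

Reported benchmark values are averages of these per-query scores over all queries in a language's test set, which is why recall grows with k while nDCG also rewards placing relevant documents near the top.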

📄 License

This project is licensed under the Apache 2.0 license.
