# 🚀 Felladrin/bge-reranker-v2-m3-Q8_0-GGUF

This model is a conversion of the original BAAI/bge-reranker-v2-m3 to the GGUF format, making it easy to run with llama.cpp and compatible tooling.
## 📦 Installation

### Install llama.cpp via Homebrew

You can install llama.cpp through Homebrew, which works on both macOS and Linux:

```sh
brew install llama.cpp
```
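The formula installs the `llama-cli` and `llama-server` binaries. As a quick sanity check, you can print the build info (the `--version` flag is available in recent llama.cpp releases):

```sh
llama-cli --version
```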
## 💻 Usage Examples

### Use with llama.cpp

#### CLI Usage

```sh
llama-cli --hf-repo Felladrin/bge-reranker-v2-m3-Q8_0-GGUF --hf-file bge-reranker-v2-m3-q8_0.gguf -p "The meaning to life and the universe is"
```
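If you built llama.cpp with GPU support, you can offload model layers with the standard `-ngl` (`--n-gpu-layers`) flag; a sketch, with the layer count adjusted to your hardware:

```sh
llama-cli --hf-repo Felladrin/bge-reranker-v2-m3-Q8_0-GGUF --hf-file bge-reranker-v2-m3-q8_0.gguf -p "The meaning to life and the universe is" -ngl 99
```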
#### Server Usage

```sh
llama-server --hf-repo Felladrin/bge-reranker-v2-m3-Q8_0-GGUF --hf-file bge-reranker-v2-m3-q8_0.gguf -c 2048
```
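Because bge-reranker-v2-m3 is a reranker rather than a generative model, you may prefer llama-server's dedicated reranking mode. A minimal sketch, assuming a recent llama.cpp build that supports the `--reranking` flag and the `/v1/rerank` endpoint (the server listens on port 8080 by default):

```sh
# Start the server in reranking mode
llama-server --hf-repo Felladrin/bge-reranker-v2-m3-Q8_0-GGUF --hf-file bge-reranker-v2-m3-q8_0.gguf --reranking

# Score candidate documents against a query
curl http://localhost:8080/v1/rerank \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is a panda?",
    "top_n": 2,
    "documents": [
      "The giant panda is a bear species endemic to China.",
      "Paris is the capital of France."
    ]
  }'
```

The response contains a `results` array pairing each document index with a relevance score.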
### Alternative Usage Steps from the llama.cpp Repo

#### Step 1: Clone llama.cpp from GitHub

```sh
git clone https://github.com/ggerganov/llama.cpp
```
#### Step 2: Build llama.cpp

Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with any hardware-specific flags (e.g., `LLAMA_CUDA=1` for NVIDIA GPUs on Linux):

```sh
cd llama.cpp && LLAMA_CURL=1 make
```
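For example, a CUDA-enabled build on Linux would combine both flags (this applies to the legacy Makefile build shown above; newer llama.cpp versions use CMake instead):

```sh
cd llama.cpp && LLAMA_CURL=1 LLAMA_CUDA=1 make
```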
#### Step 3: Run Inference

You can run inference through either the CLI binary or the server binary.

##### CLI Inference

```sh
./llama-cli --hf-repo Felladrin/bge-reranker-v2-m3-Q8_0-GGUF --hf-file bge-reranker-v2-m3-q8_0.gguf -p "The meaning to life and the universe is"
```
##### Server Inference

```sh
./llama-server --hf-repo Felladrin/bge-reranker-v2-m3-Q8_0-GGUF --hf-file bge-reranker-v2-m3-q8_0.gguf -c 2048
```
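Once the server is running, you can confirm the model has finished loading before sending requests; llama-server exposes a `/health` endpoint for this (port 8080 is the default):

```sh
curl http://localhost:8080/health
```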
## 📚 Documentation

This model was converted to GGUF format from BAAI/bge-reranker-v2-m3 using llama.cpp via ggml.ai's GGUF-my-repo space. For more details on the model, refer to the original model card.
## 📄 License

This model is released under the Apache 2.0 license.
## 📦 Model Information

| Property | Details |
|----------|---------|
| Base Model | BAAI/bge-reranker-v2-m3 |
| Language | multilingual |
| Pipeline Tag | text-ranking |
| Tags | transformers, sentence-transformers, text-embeddings-inference, llama-cpp, gguf-my-repo |
| Library Name | sentence-transformers |