jina-reranker-v1-tiny-en-GGUF
This model is a GGUF-quantized version of jina-reranker-v1-tiny-en, designed for extremely fast reranking with competitive performance. It is based on the powerful JinaBERT model and can handle long text sequences of up to 8,192 tokens.
Quick Start
Use Jina AI's Reranker API
The simplest way to use the jina-reranker-v1-tiny-en model is through Jina AI's Reranker API. Here is an example cURL request:
curl https://api.jina.ai/v1/rerank \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"model": "jina-reranker-v1-tiny-en",
"query": "Organic skincare products for sensitive skin",
"documents": [
"Eco-friendly kitchenware for modern homes",
"Biodegradable cleaning supplies for eco-conscious consumers",
"Organic cotton baby clothes for sensitive skin",
"Natural organic skincare range for sensitive skin",
"Tech gadgets for smart homes: 2024 edition",
"Sustainable gardening tools and compost solutions",
"Sensitive skin-friendly facial cleansers and toners",
"Organic food wraps and storage solutions",
"All-natural pet food for dogs with allergies",
"Yoga mats made from recycled materials"
],
"top_n": 3
}'
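The same request can be issued from Python with only the standard library. This is a minimal sketch: the endpoint, header names, and body fields are taken from the cURL example above, `YOUR_API_KEY` is a placeholder, and the document list is shortened for brevity.

```python
import json
import urllib.request

# Build the same request body as the cURL example above
payload = {
    "model": "jina-reranker-v1-tiny-en",
    "query": "Organic skincare products for sensitive skin",
    "documents": [
        "Eco-friendly kitchenware for modern homes",
        "Natural organic skincare range for sensitive skin",
        "Sensitive skin-friendly facial cleansers and toners",
    ],
    "top_n": 3,
}
req = urllib.request.Request(
    "https://api.jina.ai/v1/rerank",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_API_KEY",  # replace with a real key
    },
)
# Uncomment once a valid API key is set:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```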
Use the sentence-transformers Library
You can also use the sentence-transformers library (>=0.27.0). First, install it via pip:
pip install -U sentence-transformers
Then, use the following Python code to interact with the model:
from sentence_transformers import CrossEncoder
# Load the model; here we use the tiny-sized variant
model = CrossEncoder("jinaai/jina-reranker-v1-tiny-en", trust_remote_code=True)
# Example query and documents
query = "Organic skincare products for sensitive skin"
documents = [
"Eco-friendly kitchenware for modern homes",
"Biodegradable cleaning supplies for eco-conscious consumers",
"Organic cotton baby clothes for sensitive skin",
"Natural organic skincare range for sensitive skin",
"Tech gadgets for smart homes: 2024 edition",
"Sustainable gardening tools and compost solutions",
"Sensitive skin-friendly facial cleansers and toners",
"Organic food wraps and storage solutions",
"All-natural pet food for dogs with allergies",
"Yoga mats made from recycled materials"
]
results = model.rank(query, documents, return_documents=True, top_k=3)
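`model.rank` returns a list of dicts sorted by descending score. The sketch below is model-free: the scores are made up for illustration, and the entry shape (keys `corpus_id`, `score`, and, with `return_documents=True`, `text`) mirrors what CrossEncoder's `rank` produces.

```python
# Hypothetical output shaped like model.rank(..., return_documents=True, top_k=3);
# the score values are invented for illustration.
results = [
    {"corpus_id": 3, "score": 0.91, "text": "Natural organic skincare range for sensitive skin"},
    {"corpus_id": 6, "score": 0.87, "text": "Sensitive skin-friendly facial cleansers and toners"},
    {"corpus_id": 2, "score": 0.55, "text": "Organic cotton baby clothes for sensitive skin"},
]
for hit in results:
    print(f"{hit['score']:.2f}  {hit['text']}")
best = results[0]  # entries arrive sorted best-first
```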
Use the transformers Library
You can use the transformers library to interact with the model programmatically. Install it first:
pip install transformers
Then, use the following code:
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained(
'jinaai/jina-reranker-v1-tiny-en', num_labels=1, trust_remote_code=True
)
# Example query and documents
query = "Organic skincare products for sensitive skin"
documents = [
"Eco-friendly kitchenware for modern homes",
"Biodegradable cleaning supplies for eco-conscious consumers",
"Organic cotton baby clothes for sensitive skin",
"Natural organic skincare range for sensitive skin",
"Tech gadgets for smart homes: 2024 edition",
"Sustainable gardening tools and compost solutions",
"Sensitive skin-friendly facial cleansers and toners",
"Organic food wraps and storage solutions",
"All-natural pet food for dogs with allergies",
"Yoga mats made from recycled materials"
]
# construct sentence pairs
sentence_pairs = [[query, doc] for doc in documents]
scores = model.compute_score(sentence_pairs)
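`compute_score` returns one raw relevance score per (query, document) pair. Here is a minimal, model-free sketch of turning such scores into a top-k ranking; the score values are made up for illustration.

```python
# Hypothetical per-document scores, one per sentence pair,
# standing in for the output of compute_score
scores = [0.12, 0.08, 0.55, 0.91, 0.03, 0.05, 0.87, 0.10, 0.07, 0.04]
top_k = 3
# Indices of the highest-scoring documents, best first
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
print(ranking)
```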
Use the transformers.js
Library
You can run the model directly in JavaScript (in - browser, Node.js, Deno, etc.) using the transformers.js
library. Install it from NPM:
npm i @xenova/transformers
Then, use the following code:
import { AutoTokenizer, AutoModelForSequenceClassification } from '@xenova/transformers';
const model_id = 'jinaai/jina-reranker-v1-tiny-en';
const model = await AutoModelForSequenceClassification.from_pretrained(model_id, { quantized: false });
const tokenizer = await AutoTokenizer.from_pretrained(model_id);
/**
* Performs ranking with the CrossEncoder on the given query and documents. Returns a sorted list with the document indices and scores.
* @param {string} query A single query
* @param {string[]} documents A list of documents
* @param {Object} options Options for ranking
* @param {number} [options.top_k=undefined] Return the top-k documents. If undefined, all documents are returned.
* @param {number} [options.return_documents=false] If true, also returns the documents. If false, only returns the indices and scores.
*/
async function rank(query, documents, {
top_k = undefined,
return_documents = false,
} = {}) {
const inputs = tokenizer(
new Array(documents.length).fill(query),
{ text_pair: documents, padding: true, truncation: true }
)
const { logits } = await model(inputs);
return logits.sigmoid().tolist()
.map(([score], i) => ({
corpus_id: i,
score,
...(return_documents ? { text: documents[i] } : {})
})).sort((a, b) => b.score - a.score).slice(0, top_k);
}
// Example usage:
const query = "Organic skincare products for sensitive skin"
const documents = [
"Eco-friendly kitchenware for modern homes",
"Biodegradable cleaning supplies for eco-conscious consumers",
"Organic cotton baby clothes for sensitive skin",
"Natural organic skincare range for sensitive skin",
"Tech gadgets for smart homes: 2024 edition",
"Sustainable gardening tools and compost solutions",
"Sensitive skin-friendly facial cleansers and toners",
"Organic food wraps and storage solutions",
"All-natural pet food for dogs with allergies",
"Yoga mats made from recycled materials",
]
const results = await rank(query, documents, { return_documents: true, top_k: 3 });
console.log(results);
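The JavaScript helper above reduces to a sigmoid over the model's logits followed by a descending sort. A minimal Python sketch of the same logic, with hypothetical logits standing in for the model output:

```python
import math

def rank(logits, documents, top_k=None):
    """Mirror of the JS rank(): sigmoid each logit, sort descending, keep top_k."""
    scored = [
        {"corpus_id": i, "score": 1.0 / (1.0 + math.exp(-z)), "text": documents[i]}
        for i, z in enumerate(logits)
    ]
    scored.sort(key=lambda r: r["score"], reverse=True)
    return scored[:top_k]

# Hypothetical logits for three documents
hits = rank([2.0, -1.0, 0.5], ["doc a", "doc b", "doc c"], top_k=2)
print([h["corpus_id"] for h in hits])
```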
Features
- Blazing-fast Reranking: uses knowledge distillation to achieve high-speed reranking.
- Competitive Performance: stays within a few NDCG@10 points of the larger base model (see Evaluation Results).
- Long-text Handling: based on JinaBERT, it can process text sequences of up to 8,192 tokens.
Installation
- sentence-transformers: pip install -U sentence-transformers
- transformers: pip install transformers
- transformers.js: npm i @xenova/transformers
Documentation
Model Information
Property | Details |
---|---|
Model Creator | Jina AI |
Original Model | jina-reranker-v1-tiny-en |
GGUF Quantization | Based on llama.cpp release f4d2b |
Reranker Models Comparison
Model Name | Layers | Hidden Size | Parameters (Millions) |
---|---|---|---|
jina-reranker-v1-base-en | 12 | 768 | 137.0 |
jina-reranker-v1-turbo-en | 6 | 384 | 37.8 |
jina-reranker-v1-tiny-en | 4 | 384 | 33.0 |
Evaluation Results
Model Name | NDCG@10 (17 BEIR datasets) | NDCG@10 (5 LoCo datasets) | Hit Rate (LlamaIndex RAG) |
---|---|---|---|
jina-reranker-v1-base-en | 52.45 | 87.31 | 85.53 |
jina-reranker-v1-turbo-en | 49.60 | 69.21 | 85.13 |
jina-reranker-v1-tiny-en (you are here) | 48.54 | 70.29 | 85.00 |
mxbai-rerank-base-v1 | 49.19 | - | 82.50 |
mxbai-rerank-xsmall-v1 | 48.80 | - | 83.69 |
ms-marco-MiniLM-L-6-v2 | 48.64 | - | 82.63 |
ms-marco-MiniLM-L-4-v2 | 47.81 | - | 83.82 |
bge-reranker-base | 47.89 | - | 83.03 |
Important Note
- NDCG@10 measures ranking quality; higher scores indicate better search results.
- Hit Rate measures the percentage of relevant documents that appear in the top 10 search results.
- LoCo results are not available for the other models because they do not support documents longer than 512 tokens.
For more details, please refer to our benchmarking sheets.
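For reference, NDCG@10 is the discounted cumulative gain of the top 10 results normalized by the gain of the ideal ordering, where rel_i is the graded relevance of the result at rank i:

```
DCG@10  = \sum_{i=1}^{10} \frac{rel_i}{\log_2(i+1)}
NDCG@10 = \frac{DCG@10}{IDCG@10}
```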
License
This model is licensed under the apache-2.0 license.
Contact
Join our Discord community to chat with other community members and share ideas.