Bloomz-3b Reranking
This reranking model measures the semantic correspondence between a question and a context in both French and English. It helps filter and reorder retrieved results in open-domain question answering (ODQA) pipelines, at the price of a high computational cost.
Features
- Built from the cmarkea/bloomz-3b-dpo-chat model.
- Language-agnostic, supporting both French and English.
- Can effectively score in a cross-language context (e.g. a French query against English contexts).
- Helps filter and reorder results in an ODQA context.
Installation
The usage example below only requires the Hugging Face transformers library and a deep-learning backend.
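The original card gives no installation command; a minimal setup, assuming a PyTorch backend for the pipeline (versions are not pinned on the card), would be:

```shell
# Assumed setup: transformers for the pipeline API, torch as its backend.
pip install transformers torch
```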
Usage Examples
Basic Usage
from typing import List

from transformers import pipeline

# Load the reranker as a text-classification pipeline.
# top_k=None returns the scores of both labels instead of only the best one.
reranker = pipeline(
    task='text-classification',
    model='cmarkea/bloomz-3b-reranking',
    top_k=None
)

query: str  # the user question
contexts: List[str]  # candidate contexts returned by the retriever

# Score each (context, query) pair.
similarities = reranker(
    [
        dict(
            text=context,
            text_pair=query
        )
        for context in contexts
    ]
)

# Keep the score of LABEL_1 (the "relevant" label) for each context.
score_label_1 = [
    next(item['score'] for item in entry if item['label'] == 'LABEL_1')
    for entry in similarities
]

# Rerank the contexts from most to least relevant.
contexts_reranked = sorted(
    zip(score_label_1, contexts),
    key=lambda x: x[0],
    reverse=True
)

# Optionally drop contexts scored below 0.8 to reduce noise.
score, contexts_cleaned = zip(
    *filter(
        lambda x: x[0] >= 0.8,
        contexts_reranked
    )
)
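The post-processing steps can be exercised without downloading the 3B model by mocking the pipeline output: with `top_k=None`, the pipeline returns one list of label/score dicts per input pair. The scores below are made up for illustration.

```python
# Mocked pipeline output for three candidate contexts (illustrative scores).
similarities = [
    [{'label': 'LABEL_1', 'score': 0.95}, {'label': 'LABEL_0', 'score': 0.05}],
    [{'label': 'LABEL_0', 'score': 0.70}, {'label': 'LABEL_1', 'score': 0.30}],
    [{'label': 'LABEL_1', 'score': 0.85}, {'label': 'LABEL_0', 'score': 0.15}],
]
contexts = ['ctx_a', 'ctx_b', 'ctx_c']

# Same post-processing as in the usage example above.
score_label_1 = [
    next(item['score'] for item in entry if item['label'] == 'LABEL_1')
    for entry in similarities
]
contexts_reranked = sorted(
    zip(score_label_1, contexts), key=lambda x: x[0], reverse=True
)
score, contexts_cleaned = zip(
    *filter(lambda x: x[0] >= 0.8, contexts_reranked)
)

print(contexts_cleaned)  # ('ctx_a', 'ctx_c')
```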
Documentation
Dataset
The training dataset combines the mMARCO dataset, which consists of query/positive/hard-negative triplets, with the "train" split of SQuAD, reshaped into the same query/positive/hard-negative form. To generate hard negatives for SQuAD, we took contexts from the same theme as the query but associated with a different set of queries: these negatives belong to the same themes as the queries but presumably do not contain the answer to the question.
Finally, the triplets are flattened into query/context pairs, labeled 1 for query/positive pairs and 0 for query/negative pairs. For each element of a pair (query and context), the language, French or English, is chosen uniformly at random.
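The flattening step could be sketched as follows; the triplet field names (`query`, `positive`, `hard_negative`, each with `fr`/`en` versions) are hypothetical, as the card does not give the dataset schema:

```python
import random

def flatten_triplets(triplets):
    """Flatten query/positive/hard-negative triplets into labeled pairs.

    Each triplet element is assumed to exist in a French ('fr') and an
    English ('en') version; the language of each side of a pair is
    drawn uniformly at random, as described on the card.
    """
    pairs = []
    for t in triplets:
        for key, label in (('positive', 1), ('hard_negative', 0)):
            pairs.append({
                'text': t[key][random.choice(('fr', 'en'))],
                'text_pair': t['query'][random.choice(('fr', 'en'))],
                'label': label,  # 1: query/positive, 0: query/negative
            })
    return pairs
```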
Evaluation
To assess the performance of the reranker, we use the "validation" split of the SQuAD dataset. We select the first question of each paragraph, with that paragraph as the context an Oracle model should rank Top-1. Since the number of themes is limited, each context from the same theme that does not match the query is treated as a hard negative (contexts outside the theme are simple negatives).
The evaluation corpus consists of 1204 pairs of query/context to be ranked.
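The construction of the evaluation pairs could be sketched as below; the paragraph structure (`theme`, `context`, `questions` fields) is a hypothetical simplification of the SQuAD layout:

```python
def build_eval_pairs(paragraphs):
    """Build (query, context, label) pairs as described above.

    paragraphs: list of {'theme': str, 'context': str, 'questions': [str, ...]}
    label 1: the paragraph an Oracle model should rank Top-1;
    label 0: hard negative (same theme, different context).
    Contexts outside the theme (simple negatives) are skipped here.
    """
    pairs = []
    for p in paragraphs:
        query = p['questions'][0]  # first question of each paragraph
        for other in paragraphs:
            if other['context'] == p['context']:
                pairs.append((query, other['context'], 1))
            elif other['theme'] == p['theme']:
                pairs.append((query, other['context'], 0))
    return pairs
```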
Evaluation in the same language (French/French)
Evaluation in cross-language context (French/English)
As observed, the cross-language setting does not significantly impact the behavior of our models. If the model is used to rerank and filter the Top-K results of a search, a threshold of 0.8 can be applied to filter the contexts returned by the retriever, reducing the noise present in the contexts for RAG-type applications.
License
The model is licensed under the bigscience-bloom-rail-1.0 license.
Citation
@online{DeBloomzReranking,
  AUTHOR = {Cyrile Delestre},
  ORGANIZATION = {Cr{\'e}dit Mutuel Ark{\'e}a},
  URL = {https://huggingface.co/cmarkea/bloomz-3b-reranking},
  YEAR = {2024},
  KEYWORDS = {NLP ; Transformers ; LLM ; Bloomz},
}