Bloomz 560m Retriever V2

Developed by cmarkea
A dual encoder based on the Bloomz-560m-dpo-chat model, designed to map articles and queries into the same vector space, supporting cross-language retrieval in French and English.
Downloads 17
Release Time: 5/26/2024

Model Overview

This model is a dual encoder designed for Open-Domain Question Answering (ODQA). It maps queries and articles into the same vector space so that each query lies close to the articles relevant to it. It supports cross-language retrieval in French and English.
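To illustrate what proximity in a shared vector space means for retrieval, here is a minimal sketch that ranks articles by cosine distance to each query. The 3-dimensional vectors are toy stand-ins for real encoder outputs, and the helper names `cosine_distance_matrix` and `rank_articles` are hypothetical, not part of the model's API.

```python
import numpy as np

def cosine_distance_matrix(queries: np.ndarray, articles: np.ndarray) -> np.ndarray:
    """Return pairwise cosine distances, shape (n_queries, n_articles)."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    a = articles / np.linalg.norm(articles, axis=1, keepdims=True)
    return 1.0 - q @ a.T

def rank_articles(queries: np.ndarray, articles: np.ndarray) -> np.ndarray:
    """For each query, return article indices sorted from nearest to farthest."""
    return np.argsort(cosine_distance_matrix(queries, articles), axis=1)

# Toy embeddings (assumption): in practice these come from the dual encoder.
article_embeddings = np.array([[1.0, 0.0, 0.0],
                               [0.0, 1.0, 0.0],
                               [0.7, 0.7, 0.0]])
query_embeddings = np.array([[0.9, 0.1, 0.0],   # points mostly toward article 0
                             [0.1, 0.9, 0.0]])  # points mostly toward article 1

ranking = rank_articles(query_embeddings, article_embeddings)
print(ranking[:, 0])  # index of the nearest article for each query
```

Because both sides are L2-normalized first, the ranking depends only on the angle between vectors, not on their length.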

Model Features

Cross-language retrieval
Supports cross-language retrieval in French and English, allowing queries in either language to find relevant articles regardless of the article's language.
Efficient retrieval
Ranks articles by cosine distance between query and article embeddings, a metric that is inexpensive to compute at retrieval time.
Contrastive learning training
Trained with contrastive learning on an improved version of the mMARCO dataset, with false-negative samples filtered out and a hard negative sampling strategy applied.
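The page does not state which contrastive objective was used. As a rough sketch only, the following assumes an InfoNCE-style loss over cosine similarities to show why hard negatives matter: a negative that sits close to the query produces a larger loss, and so a stronger training signal, than an easy one. The function name `contrastive_loss`, the `temperature` value, and the 2-dimensional vectors are all illustrative assumptions.

```python
import numpy as np

def contrastive_loss(query, positive, negatives, temperature=0.05):
    """InfoNCE-style loss for one query (assumed objective, not confirmed by the source).

    Cross-entropy over cosine similarities, with the positive article at index 0.
    """
    candidates = np.vstack([positive[None, :], negatives])
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    logits = (c @ q) / temperature
    logits = logits - logits.max()  # shift for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[0]

# Toy vectors (assumption): real inputs would be encoder embeddings.
query = np.array([1.0, 0.0])
positive = np.array([0.9, 0.1])
easy_negative = np.array([[-1.0, 0.0]])  # points away from the query
hard_negative = np.array([[0.8, 0.6]])   # points in a similar direction

loss_easy = contrastive_loss(query, positive, easy_negative)
loss_hard = contrastive_loss(query, positive, hard_negative)
print(loss_hard > loss_easy)
```

Filtering false negatives matters for the same reason: an article labeled negative that is actually relevant would sit near the query and be penalized as if it were a hard negative.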

Model Capabilities

Feature extraction
Cross-language retrieval
Open-domain question answering

Use Cases

Information retrieval
Open-domain question answering
Used in open-domain question answering systems to quickly retrieve relevant articles for answering questions.
Reaches Top-1 accuracy of 68% (Fr/Fr) and 66.6% (En/Fr) on the SQuAD test set.
Cross-language document retrieval
Supports cross-language document retrieval between French and English.
Outperforms baselines such as BM25 and CamemBERT in cross-language retrieval tasks.
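For readers unfamiliar with the Top-1 metric quoted above, here is a minimal sketch of how Top-k accuracy can be computed from a query-by-article distance matrix. The distances and gold labels are made-up toy values, and `top_k_accuracy` is a hypothetical helper, not the evaluation code behind the reported figures.

```python
import numpy as np

def top_k_accuracy(distances: np.ndarray, gold_indices, k: int = 1) -> float:
    """Fraction of queries whose gold article is among the k nearest articles."""
    nearest = np.argsort(distances, axis=1)[:, :k]
    hits = (nearest == np.asarray(gold_indices)[:, None]).any(axis=1)
    return float(hits.mean())

# Toy distance matrix (assumption): rows are queries, columns are articles.
distances = np.array([[0.1, 0.8, 0.5],
                      [0.7, 0.2, 0.9],
                      [0.3, 0.4, 0.6]])
gold = [0, 1, 2]  # index of the correct article for each query

top1 = top_k_accuracy(distances, gold, k=1)
print(round(top1, 3))
```

In this toy case the third query's nearest article is not its gold article, so two of three queries count as Top-1 hits.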