🚀 gte-multilingual-reranker-base
The gte-multilingual-reranker-base model is the first reranker model in the GTE family. It offers high-performance multilingual retrieval and has several key advantages:
- High Performance: Achieves state-of-the-art (SOTA) results in multilingual retrieval tasks and multi-task representation evaluations compared to reranker models of similar size.
- Training Architecture: Trained with an encoder-only transformer architecture, giving it a smaller model size. Compared to previous models based on decoder-only LLM architectures (e.g., gte-qwen2-1.5b-instruct), it has lower hardware requirements for inference and roughly a 10x increase in inference speed.
- Long Context: Supports text lengths of up to 8192 tokens.
- Multilingual Capability: Supports over 70 languages.
📦 Installation
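The model runs with PyTorch and Hugging Face transformers. A minimal install sketch, assuming a standard pip environment (the version pin follows the transformers>=4.36.0 requirement in the usage example below; xformers is optional, per the note that follows):

```bash
# Core dependencies for the usage examples below
pip install torch "transformers>=4.36.0"

# Optional: xformers for unpadding/acceleration (see the note below)
pip install xformers
```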
⚠️ Important Note
It is recommended to install xformers and enable unpadding for acceleration. Refer to [enable-unpadding-and-xformers](https://huggingface.co/Alibaba-NLP/new-impl#recommendation-enable-unpadding-and-acceleration-with-xformers).
💡 Usage Tip
For offline usage, refer to [new-impl/discussions/2](https://huggingface.co/Alibaba-NLP/new-impl/discussions/2#662b08d04d8c3d0a09c88fa3).
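One common offline pattern (a sketch only; the linked discussion is the authoritative reference) is to pre-download the model repository on a machine with network access and then load from the local path:

```python
from huggingface_hub import snapshot_download

# Download the full model repository once, while online
local_dir = snapshot_download("Alibaba-NLP/gte-multilingual-reranker-base")

# Later (offline), load from the local path instead of the Hub ID
# tokenizer = AutoTokenizer.from_pretrained(local_dir)
# model = AutoModelForSequenceClassification.from_pretrained(local_dir, trust_remote_code=True)

# Note: with trust_remote_code=True the custom modeling code must also be
# available locally; see the linked discussion for the exact steps.
```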
💻 Usage Examples
Basic Usage
Using Hugging Face transformers (transformers>=4.36.0):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name_or_path = "Alibaba-NLP/gte-multilingual-reranker-base"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name_or_path, trust_remote_code=True,
    torch_dtype=torch.float16
)
model.eval()

# Query-document pairs to score (Chinese and English examples)
pairs = [
    ["中国的首都在哪儿", "北京"],
    ["what is the capital of China?", "北京"],
    ["how to implement quick sort in python?", "Introduction of quick sort"],
]

with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors='pt', max_length=512)
    # One relevance score per pair; higher means more relevant
    scores = model(**inputs, return_dict=True).logits.view(-1).float()

print(scores)
```
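Building on the snippet above, a reranker is typically used to order candidate documents for a single query. A minimal sketch (the query and candidates are illustrative; `model` and `tokenizer` are the objects created above):

```python
query = "what is the capital of China?"
candidates = ["北京", "Introduction of quick sort", "Paris is the capital of France."]

# Score every (query, candidate) pair and sort candidates by descending score
pairs = [[query, doc] for doc in candidates]
with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors='pt', max_length=512)
    scores = model(**inputs, return_dict=True).logits.view(-1).float()

ranked = sorted(zip(candidates, scores.tolist()), key=lambda x: x[1], reverse=True)
for doc, score in ranked:
    print(f"{score:.3f}\t{doc}")
```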
Advanced Usage
Usage with Infinity:
[Infinity](https://github.com/michaelfeil/infinity) is an MIT-licensed inference REST API server.

```bash
docker run --gpus all -v $PWD/data:/app/.cache -p "7997":"7997" \
  michaelf34/infinity:0.0.68 \
  v2 --model-id Alibaba-NLP/gte-multilingual-reranker-base --revision "main" --dtype bfloat16 --batch-size 32 --device cuda --engine torch --port 7997
```
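Once the container is up, reranking is done over HTTP. The route and payload below are assumptions based on the Cohere-style rerank schema that Infinity exposes; verify the exact API against the Infinity documentation for your version:

```python
import requests

# Assumed endpoint and payload shape; check the Infinity docs for your version
resp = requests.post(
    "http://localhost:7997/rerank",
    json={
        "model": "Alibaba-NLP/gte-multilingual-reranker-base",
        "query": "what is the capital of China?",
        "documents": ["北京", "Introduction of quick sort"],
    },
)
print(resp.json())
```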
📚 Documentation
Model Information
| Property | Details |
|----------|---------|
| Model Size | 306M |
| Max Input Tokens | 8192 |
Evaluation
Reranking results on multiple text retrieval datasets:

More detailed experimental results can be found in the paper.
Cloud API Services
In addition to the open-source [GTE](https://huggingface.co/collections/Alibaba-NLP/gte-models-6680f0b13f885cb431e6d469) series models, the GTE models are also offered as commercial API services on Alibaba Cloud.
- [Embedding Models](https://help.aliyun.com/zh/model-studio/developer-reference/general-text-embedding/): Three versions of the text embedding model are available: text-embedding-v1/v2/v3, with v3 being the latest API service.
- [ReRank Models](https://help.aliyun.com/zh/model-studio/developer-reference/general-text-sorting-model/): The gte-rerank model service is available.
Note that the models behind the commercial APIs are not entirely identical to the open-source models.
📄 License
This model is released under the Apache 2.0 license.
🔧 Technical Details
The gte-multilingual-reranker-base model is trained with an encoder-only transformer architecture. This choice results in a smaller model size than decoder-only alternatives and lower hardware requirements for inference, which simplifies deployment across environments. The model supports text lengths of up to 8192 tokens, making it suitable for long-context inputs, and covers over 70 languages for multilingual text retrieval.
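To make use of the 8192-token limit in the transformers example above, the only change needed is the tokenizer's max_length. A small sketch, assuming the `tokenizer` and `model` from the basic usage example and a GPU with enough memory for sequences of this length:

```python
long_document = "..."  # placeholder for a long passage, up to roughly 8192 tokens

with torch.no_grad():
    inputs = tokenizer(
        [["what is the capital of China?", long_document]],
        padding=True, truncation=True, return_tensors="pt",
        max_length=8192,  # matches the model's stated maximum input length
    )
    score = model(**inputs, return_dict=True).logits.view(-1).float()
print(score)
```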
📖 Citation
If you find our paper or models helpful, please consider citing:
```bibtex
@inproceedings{zhang2024mgte,
  title={mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval},
  author={Zhang, Xin and Zhang, Yanzhao and Long, Dingkun and Xie, Wen and Dai, Ziqi and Tang, Jialong and Lin, Huan and Yang, Baosong and Xie, Pengjun and Huang, Fei and others},
  booktitle={Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track},
  pages={1393--1412},
  year={2024}
}
```