# Cross-language retrieval

Multilingual E5 Large Pooled Q8_0 GGUF

Multilingual E5 large pooled model supporting sentence similarity calculation and feature extraction tasks in multiple languages.

Text Embedding Supports Multiple Languages

GIST Embedding V0

GIST-Embedding-v0 is a sentence embedding model based on sentence-transformers, mainly used for sentence similarity calculation and feature extraction tasks.

Text Embedding English

Colqwen2 2b V1.0

A visual retrieval model based on Qwen2-VL-2B-Instruct and the ColBERT strategy, capable of generating multi-vector representations of text and images.

Text-to-Image Supports Multiple Languages
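
To illustrate what "multi-vector" scoring means in a ColBERT-style model, here is a minimal sketch of late-interaction (MaxSim) scoring. The vectors are hand-written toy values, not output from Colqwen2, and the names `maxsim_score`, `page_a`, and `page_b` are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for every query vector, keep its best
    match among the document vectors, then sum those best matches."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Assumption: toy 2-dimensional vectors standing in for per-token
# (query) and per-patch (page image) embeddings from a real model.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
page_b = [[0.1, 0.1], [0.3, 0.2]]

scores = {
    "page_a": maxsim_score(query, page_a),
    "page_b": maxsim_score(query, page_b),
}
best = max(scores, key=scores.get)
print(best, scores)
```

Unlike a single-vector embedding, each query token is matched against its own best document vector, so a page can score well by covering different parts of the query with different regions.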

Gte Qwen2 1.5B Instruct GGUF

A 1.5B-parameter sentence embedding model based on the Qwen2 architecture, specializing in sentence similarity tasks with outstanding performance on the MTEB benchmark.

Large Language Model

Vectorizer.guava

A vectorization tool developed by Sinequa that generates embedding vectors from input paragraphs or queries for sentence similarity calculation and retrieval tasks.

Text Embedding Supports Multiple Languages

Bge Reranker V2 M3 En Ru

This is a streamlined version of BAAI/bge-reranker-v2-m3, retaining only the English and Russian vocabulary, making the model 1.5 times smaller while still generating identical embedding vectors.

Transformers Supports Multiple Languages

Gte Multilingual Mlm Base

A multilingual text encoder from the mGTE series, supporting 75 languages with a maximum context length of 8192 tokens. It is based on a BERT+RoPE+GLU architecture and performs well on the GLUE and XTREME-R benchmarks.

Large Language Model

Bloomz 560m Retriever V2

A dual encoder based on the Bloomz-560m-dpo-chat model, designed to map articles and queries into the same vector space, supporting cross-language retrieval in French and English.

Transformers Supports Multiple Languages
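
A dual encoder places queries and articles in one vector space, so retrieval reduces to ranking articles by similarity to the query vector. The sketch below shows that ranking step with cosine similarity. The vectors are hand-written placeholders rather than real Bloomz 560m Retriever V2 output, and `rank_articles` is a hypothetical helper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def rank_articles(query_vec, article_vecs):
    """Return article names sorted from most to least similar."""
    scored = [(cosine(query_vec, vec), name) for name, vec in article_vecs.items()]
    return [name for _, name in sorted(scored, reverse=True)]


# Assumption: toy 3-dimensional vectors. In practice the query could be
# French and the articles English; a cross-lingual encoder would still
# map related texts to nearby vectors.
query_vec = [0.9, 0.1, 0.0]
article_vecs = {
    "paris_article": [0.8, 0.2, 0.1],
    "cooking_article": [0.0, 0.1, 0.9],
    "travel_article": [0.5, 0.5, 0.2],
}

ranking = rank_articles(query_vec, article_vecs)
print(ranking)
```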

All Indo E5 Small V4

This is an Indonesian text embedding model based on sentence-transformers, capable of mapping sentences and paragraphs into a 384-dimensional dense vector space, suitable for tasks such as clustering and semantic search.

Sentence Transformers Multilingual E5 Large

This is a multilingual sentence embedding model based on sentence-transformers, capable of mapping text to a 1024-dimensional vector space, suitable for semantic search and clustering tasks.

LEALLA is a collection of lightweight language-agnostic sentence embedding models supporting 109 languages, distilled from LaBSE. Suitable for obtaining multilingual sentence embeddings and bilingual text retrieval.

Text Embedding Supports Multiple Languages

Paraphrase Spanish Distilroberta

A Spanish-English bilingual model based on sentence-transformers that maps text to a 768-dimensional vector space, suitable for semantic search and clustering tasks.

Transformers Spanish

somosnlp-hackathon-2022

Cross En It Roberta Sentence Transformer

A sentence embedding model supporting English and Italian, used for generating vector representations of sentences.

Transformers Supports Multiple Languages

T-Systems-onsite

Msmarco MiniLM L12 En De V1

An English-German cross-lingual cross-encoder model trained on the MS Marco passage ranking task, suitable for passage re-ranking in information retrieval scenarios.

Transformers Supports Multiple Languages
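
Passage re-ranking with a cross-encoder usually runs as a second stage: a fast retriever proposes candidates, then every (query, passage) pair is scored jointly and the list is re-sorted. The sketch below shows only that control flow. `toy_pair_score` is a hypothetical word-overlap stand-in for the model, so unlike a real English-German cross-encoder it only works when query and passage share a language.

```python
def toy_pair_score(query, passage):
    """Stand-in for a cross-encoder (assumption): the fraction of
    query words that also appear in the passage."""
    q_words = set(query.lower().split())
    p_words = set(passage.lower().split())
    return len(q_words & p_words) / len(q_words)


def rerank(query, candidates, score_fn, top_k=2):
    """Score each (query, passage) pair and keep the top_k passages."""
    ordered = sorted(candidates, key=lambda p: score_fn(query, p), reverse=True)
    return ordered[:top_k]


query = "how to brew green tea"
# Assumption: these candidates came from an earlier retrieval stage.
candidates = [
    "Green tea history in Japan",
    "How to brew green tea at home",
    "Coffee brewing guide",
]

top = rerank(query, candidates, toy_pair_score)
print(top)
```

Because the scorer sees the query and the passage together, it can be slower but more precise than comparing precomputed embeddings, which is why it is typically applied only to a short candidate list.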

Sentence Transformers Multilingual Snli V2 500k

This is a multilingual sentence embedding model based on sentence-transformers, capable of mapping sentences and paragraphs into a 768-dimensional vector space, suitable for tasks such as clustering and semantic search.

© 2025 AIbase