Text2vec Base Multilingual

Developed by shibing624
A multilingual sentence embedding model supporting Chinese, English, German, French, and other languages, focusing on sentence similarity calculation and feature extraction tasks.
Downloads 128.13k
Release Time: 6/22/2023

Model Overview

This model is built on the Sentence-Transformers framework and trained on multilingual natural language inference datasets. It maps text to dense vector representations that can be compared directly, which makes it suitable for cross-language semantic similarity calculation and information retrieval tasks.
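As a minimal sketch of the encode-then-compare workflow described above, the following assumes the model is published on the Hugging Face Hub as `shibing624/text2vec-base-multilingual` (an ID inferred from the developer and model names on this page, not confirmed by it) and that the `sentence-transformers` package is installed. The helper names `embed`, `cosine_similarity`, and `demo` are illustrative only.

```python
import math

# Assumed Hub ID, inferred from the developer and model names on this page.
MODEL_ID = "shibing624/text2vec-base-multilingual"


def embed(sentences, model_id=MODEL_ID):
    """Encode sentences into L2-normalized vectors.

    The import is local so the rest of this sketch works without the
    sentence-transformers package; the model is downloaded on first use.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_id)
    return model.encode(sentences, normalize_embeddings=True)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def demo():
    """Compare an English and a German sentence (requires network access)."""
    sentences = ["The weather is nice today.", "Das Wetter ist heute schön."]
    vectors = embed(sentences)
    print(cosine_similarity(vectors[0], vectors[1]))
```

Calling `demo()` prints a single similarity score; values closer to 1 indicate sentences the model considers semantically closer.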

Model Features

Multilingual Support
Supports text embedding for multiple languages including Chinese, English, German, and French.
High-Performance Sentence Similarity Calculation
Evaluated on MTEB benchmark tasks; computes semantic similarity between sentences from their embeddings.
Pre-trained Model
Pre-trained on large-scale multilingual datasets, ready to use out of the box.

Model Capabilities

Sentence similarity calculation
Text feature extraction
Cross-language semantic retrieval
Text classification
Clustering analysis

Use Cases

Information Retrieval
Cross-Language Document Retrieval
Retrieves similar documents across languages by embedding them in a shared vector space.
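Retrieval in a shared vector space reduces to ranking documents by their similarity to the query vector. The sketch below illustrates that ranking step with toy 3-dimensional vectors standing in for real embeddings; the function names `normalize` and `top_k` are hypothetical, and a production system would typically use a vector index rather than a linear scan.

```python
import math


def normalize(v):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def top_k(query_vec, doc_vecs, k=3):
    """Return (document index, score) pairs for the k most similar documents."""
    q = normalize(query_vec)
    scored = []
    for idx, doc in enumerate(doc_vecs):
        score = sum(x * y for x, y in zip(q, normalize(doc)))
        scored.append((score, idx))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Toy vectors standing in for embeddings of documents in different languages.
docs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.7, 0.6, 0.1]]
query = [1.0, 0.0, 0.0]
results = top_k(query, docs, k=2)
print(results)
```

Because documents in any supported language are embedded into the same space, the same ranking function serves queries and documents in different languages without translation.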
Text Classification
Multilingual Sentiment Analysis
Classifies sentiment in multilingual text using sentence embeddings as input features.
Achieves 43.35% accuracy on MTEB EmotionClassification.
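One simple way to classify on top of sentence embeddings is a nearest-centroid rule: average the embeddings of each class, then assign a new sentence to the class whose centroid is most similar. The sketch below is an illustrative stand-in with toy 2-dimensional vectors and hypothetical function names; it is not the procedure used to produce the MTEB score above.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def fit_centroids(embeddings, labels):
    """Average the embeddings of each label into one centroid per class."""
    groups = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        groups[label].append(vec)
    return {
        label: [sum(column) / len(vecs) for column in zip(*vecs)]
        for label, vecs in groups.items()
    }


def predict(centroids, embedding):
    """Return the label whose centroid is most similar to the embedding."""
    return max(centroids, key=lambda label: cosine(centroids[label], embedding))


# Toy vectors standing in for sentence embeddings of labeled training texts.
train_vecs = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
train_labels = ["joy", "joy", "anger", "anger"]
centroids = fit_centroids(train_vecs, train_labels)
print(predict(centroids, [0.8, 0.1]))
```

In practice the toy vectors would be replaced by embeddings from the model, and a trained classifier such as logistic regression may perform better than centroids.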
Clustering Analysis
Academic Paper Clustering
Performs topic clustering on arXiv papers.
Achieves 32.32 v_measure score on MTEB ArxivClusteringP2P.
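For readers unfamiliar with the metric, V-measure is the harmonic mean of homogeneity (each cluster contains a single class) and completeness (each class falls in a single cluster), and ranges from 0 to 1. The score above appears to be reported on a 0 to 100 scale, though the page does not say so. The following is a minimal pure-Python sketch of the standard definition, not the MTEB evaluation code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given) in nats."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * math.log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(true_labels, cluster_labels):
    """Harmonic mean of homogeneity and completeness."""
    h_classes = entropy(true_labels)
    h_clusters = entropy(cluster_labels)
    homogeneity = (
        1.0 if h_classes == 0
        else 1.0 - conditional_entropy(true_labels, cluster_labels) / h_classes
    )
    completeness = (
        1.0 if h_clusters == 0
        else 1.0 - conditional_entropy(cluster_labels, true_labels) / h_clusters
    )
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


# Cluster IDs are arbitrary, so a relabeled perfect clustering still scores 1.
print(v_measure([0, 0, 1, 1], [1, 1, 0, 0]))
```

A clustering that places every paper in one cluster is complete but not homogeneous, so its V-measure is 0.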