Text2vec Base Multilingual
Developed by shibing624
A multilingual sentence embedding model supporting Chinese, English, German, French, and other languages, focusing on sentence similarity calculation and feature extraction tasks.
Downloads 128.13k
Release Time: 6/22/2023
Model Overview
This model is built on the Sentence-Transformers framework and trained on multilingual natural language inference datasets. It maps text to dense vector representations that can be compared directly, which makes it suitable for cross-language semantic similarity calculation and information retrieval tasks.
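A minimal sketch of how such a model is typically used through the Sentence-Transformers library, followed by a plain-Python cosine similarity. The model identifier `shibing624/text2vec-base-multilingual` is an assumption inferred from the developer and model name on this page, and `embed` is a hypothetical helper name, not part of any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def embed(sentences, model_id="shibing624/text2vec-base-multilingual"):
    """Encode sentences into vectors.

    Requires `pip install sentence-transformers`. The default model_id is an
    assumption based on the developer and model name, not confirmed by this page.
    """
    from sentence_transformers import SentenceTransformer  # imported lazily

    model = SentenceTransformer(model_id)
    return model.encode(sentences)  # one vector per input sentence


# Example usage (downloads the model on first run):
# vectors = embed(["How is the weather today?", "今天天气怎么样？"])
# print(cosine_similarity(vectors[0], vectors[1]))
```

Sentences with similar meaning should score closer to 1.0 than unrelated ones, regardless of the language each is written in.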
Model Features
Multilingual Support
Supports text embedding for multiple languages including Chinese, English, German, and French.
High-Performance Sentence Similarity Calculation
Scores sentence pairs by semantic similarity. Results on MTEB benchmark tasks are listed under Use Cases.
Pre-trained Model
Pre-trained on large-scale multilingual datasets, ready to use out of the box.
Model Capabilities
Sentence similarity calculation
Text feature extraction
Cross-language semantic retrieval
Text classification
Clustering analysis
Use Cases
Information Retrieval
Cross-Language Document Retrieval
Retrieves similar documents across languages by embedding them all in a single shared vector space.
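The idea of a shared vector space can be sketched as follows: the query and every document become vectors, and retrieval reduces to ranking documents by cosine similarity to the query. The three-dimensional vectors below are made up for illustration and stand in for real embeddings; `rank_documents` is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def rank_documents(query_vec, doc_vecs, top_k=2):
    """Return (doc_index, score) pairs for the top_k documents, best match first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors, invented for this sketch; real embeddings have hundreds of dimensions.
docs = {
    "en: The cat sleeps on the sofa.": [0.9, 0.1, 0.0],
    "de: Die Katze schläft auf dem Sofa.": [0.8, 0.2, 0.1],
    "fr: Le train part à huit heures.": [0.0, 0.2, 0.9],
}
query_vec = [0.85, 0.15, 0.05]  # pretend embedding of a Chinese query about a cat

titles = list(docs)
results = rank_documents(query_vec, list(docs.values()))
for index, score in results:
    print(f"{score:.3f}  {titles[index]}")
```

Because the language of a document never enters the ranking, the English and German sentences about the cat both outrank the unrelated French one.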
Text Classification
Multilingual Sentiment Analysis
Classifies the sentiment of multilingual texts using sentence embeddings as input features.
Achieves 43.35% accuracy on MTEB EmotionClassification.
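As a rough illustration of classification on top of embeddings, the sketch below uses a nearest-centroid classifier and computes accuracy on a held-out set. Nearest-centroid is a stand-in chosen for brevity, not the method behind the reported score, and the two-dimensional vectors and the labels `joy` and `sadness` are invented for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def fit_centroids(vectors, labels):
    """Average the training vectors of each label into one centroid per label."""
    grouped = {}
    for vec, label in zip(vectors, labels):
        grouped.setdefault(label, []).append(vec)
    return {
        label: [sum(column) / len(vecs) for column in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def predict(vec, centroids):
    """Pick the label whose centroid is most similar to vec."""
    return max(centroids, key=lambda label: cosine_similarity(vec, centroids[label]))


def accuracy(predicted, expected):
    """Fraction of predictions that match the expected labels."""
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


# Toy embeddings and labels, invented for this sketch.
train_vecs = [[0.9, 0.1], [0.8, 0.3], [0.1, 0.9], [0.2, 0.7]]
train_labels = ["joy", "joy", "sadness", "sadness"]
test_vecs = [[0.7, 0.2], [0.3, 0.8], [0.6, 0.5]]
test_labels = ["joy", "sadness", "sadness"]

centroids = fit_centroids(train_vecs, train_labels)
predictions = [predict(v, centroids) for v in test_vecs]
print(predictions, f"accuracy = {accuracy(predictions, test_labels):.2%}")
```

The third test vector lies closer to the `joy` centroid and is misclassified, so the toy accuracy is two out of three.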
Clustering Analysis
Academic Paper Clustering
Groups arXiv papers into topic clusters based on their embeddings.
Achieves a V-measure of 32.32 on MTEB ArxivClusteringP2P.
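For readers unfamiliar with the metric, V-measure is the harmonic mean of homogeneity (each cluster contains a single class) and completeness (each class falls into a single cluster). The sketch below computes it from scratch on a 0-to-1 scale; the reported 32.32 presumably corresponds to 0.3232 on that scale, which is an assumption about how the score is displayed. The category labels in the example are invented.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(targets, given):
    """H(targets | given) for two parallel label lists."""
    n = len(targets)
    joint = Counter(zip(given, targets))
    given_counts = Counter(given)
    return -sum(
        (count / n) * math.log(count / given_counts[g])
        for (g, _), count in joint.items()
    )


def v_measure(true_classes, cluster_ids):
    """Harmonic mean of homogeneity and completeness, in [0, 1]."""
    h_classes = entropy(true_classes)
    h_clusters = entropy(cluster_ids)
    homogeneity = (
        1.0 if h_classes == 0
        else 1.0 - conditional_entropy(true_classes, cluster_ids) / h_classes
    )
    completeness = (
        1.0 if h_clusters == 0
        else 1.0 - conditional_entropy(cluster_ids, true_classes) / h_clusters
    )
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


# Invented example: four papers, two true categories, two predicted clusters.
true_classes = ["cs", "cs", "math", "math"]
perfect = v_measure(true_classes, [1, 1, 0, 0])    # clusters match classes
unrelated = v_measure(true_classes, [0, 1, 0, 1])  # clusters ignore classes
print(perfect, unrelated)
```

Cluster identifiers are arbitrary, so a clustering that matches the true categories under any renaming still scores as perfect.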