Araeurobert 210M

Developed by Omartificial-Intelligence-Space
Arabic semantic embedding model fine-tuned from EuroBERT-210m, with support for Matryoshka embeddings
Downloads 304
Release Date: 3/11/2025

Model Overview

A sentence embedding model optimized for Arabic text. It maps sentences to a 768-dimensional vector space and also supports smaller embedding dimensions to meet different efficiency needs.
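As a rough illustration of what a smaller embedding dimension means in practice, Matryoshka-style embeddings are usually shortened by keeping the leading components and rescaling the result to unit length. The sketch below uses only the Python standard library. The helper name `truncate_embedding` and the 8-dimensional placeholder vector are hypothetical, standing in for a real 768-dimensional model output; this is not the model's own API.

```python
import math

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to rescale for an all-zero prefix
    return [x / norm for x in head]

# Placeholder vector (assumption): stands in for a full-size model embedding.
full = [0.5, -0.5, 0.5, 0.5, 0.1, -0.2, 0.05, 0.0]
small = truncate_embedding(full, 4)
print(len(small), small)
```

Re-normalizing after truncation keeps cosine similarity and dot product interchangeable on the shortened vectors.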

Model Features

Matryoshka Embedding Technology
Embeddings can be truncated to 768, 512, 256, 128, or 64 dimensions, trading accuracy for efficiency without retraining
Long Text Support
Maximum sequence length of 8,192 tokens, suitable for processing long documents
Arabic Language Optimization
Specifically optimized for Arabic language characteristics, showing significant improvement in STS tasks compared to the base model
Multi-Loss Function Training
Combines MatryoshkaLoss with MultipleNegativesRankingLoss for training
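To make the training objective more concrete, here is a minimal pure-Python sketch of the general idea: an in-batch ranking loss (each anchor's paired positive is the correct answer, and the other positives in the batch act as negatives) evaluated at several truncated dimensions and summed. It is a simplified illustration, not the sentence-transformers implementation. The function names, the toy 4-dimensional vectors, and the equal weights are assumptions; the similarity scale of 20 matches the library's documented default.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for
    anchors[i] and every other positive in the batch is a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        m = max(logits)  # subtract the max for numerical stability
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights=None):
    """Weighted sum of the ranking loss over truncated embedding prefixes."""
    weights = weights or [1.0] * len(dims)
    return sum(
        w * in_batch_ranking_loss([a[:d] for a in anchors],
                                  [p[:d] for p in positives])
        for d, w in zip(dims, weights)
    )

# Toy batch (assumption): two anchor/positive pairs in 4 dimensions.
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.1, 0.2], [0.1, 0.9, 0.3, 0.1]]
loss = matryoshka_loss(anchors, positives, dims=[4, 2])
print(loss)
```

Because every prefix length contributes to the loss, the leading components are pushed to carry most of the semantic signal, which is why truncated embeddings remain usable.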

Model Capabilities

Semantic Textual Similarity
Semantic Search
Information Retrieval
Document Clustering
Question Answering Systems
Paraphrase Detection
Zero-Shot Classification

Use Cases

Information Retrieval
Arabic Search Engine
Used to build semantic search engines for Arabic content
Improves relevance and accuracy of search results
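A semantic search engine of this kind typically embeds the query and the documents, then ranks documents by cosine similarity to the query. The sketch below shows only the ranking step in standard-library Python. The `search` helper, the document IDs, and the 3-dimensional placeholder vectors are hypothetical; in a real system the vectors would come from the embedding model.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def search(query_vec, doc_vecs, top_k=3):
    """Return the top_k (doc_id, score) pairs, highest cosine score first."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Placeholder embeddings (assumption): real ones would be model outputs.
docs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.7, 0.1],
}
query = [1.0, 0.0, 0.0]
results = search(query, docs, top_k=2)
print(results)
```

For large collections, the linear scan here would normally be replaced by an approximate nearest-neighbor index, but the scoring logic stays the same.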
Text Analysis
Document Similarity Analysis
Analyzes semantic similarity between Arabic documents
73.5% relative improvement on the STS17 task