🚀 SentenceTransformer based on NAMAA-Space/AraModernBert-Base-V1.0
This SentenceTransformer is fine-tuned from [NAMAA-Space/AraModernBert-Base-V1.0](https://huggingface.co/NAMAA-Space/AraModernBert-Base-V1.0), offering powerful Arabic embeddings suitable for various use cases.
This SentenceTransformer provides 768-dimensional dense vectors. It excels in semantic similarity, search, paraphrase mining, clustering, text classification, and more. It is optimized for speed and efficiency without sacrificing performance. Whether you're building intelligent search engines, chatbots, or AI-powered knowledge graphs, this model can deliver precise and in-depth representations of Arabic text. Try it out to take Arabic NLP to the next level! 🔥✨
🚀 Quick Start
This SentenceTransformer is fine-tuned from [NAMAA-Space/AraModernBert-Base-V1.0](https://huggingface.co/NAMAA-Space/AraModernBert-Base-V1.0), offering strong Arabic embeddings useful for multiple use cases.
✨ Features
- 🔹 768-dimensional dense vectors 🎯
- 🔹 Excels in: Semantic Similarity, Search, Paraphrase Mining, Clustering, Text Classification & More!
- 🔹 Optimized for speed & efficiency without sacrificing performance
📦 Installation
First, install the Sentence Transformers library:
pip install -U sentence-transformers
💻 Usage Examples
Basic Usage
After installing the library, you can load the model and run inference.
from sentence_transformers import SentenceTransformer

# Download the model from the Hugging Face Hub
model = SentenceTransformer("NAMAA-Space/AraModernBert-Base-STS")

# Sentences to embed
sentences = [
    'الذكاء الاصطناعي يغير طريقة تفاعلنا مع التكنولوجيا.',
    'التكنولوجيا تتطور بسرعة بفضل الذكاء الاصطناعي.',
    'الذكاء الاصطناعي يسهم في تطوير التطبيقات الذكية.',
]

# One 768-dimensional vector per sentence
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 768)

# Pairwise similarity scores between all embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # torch.Size([3, 3])
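To make the similarity matrix above more concrete, here is a minimal pure-Python sketch of pairwise cosine similarity. It assumes the model uses cosine similarity as its similarity function, which is the Sentence Transformers default unless a model is configured otherwise. The helper names `cosine_similarity` and `similarity_matrix` and the toy vectors are illustrative only and are not part of the library.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise cosine similarities, returned as a list of rows."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


# Toy 3-dimensional vectors standing in for real 768-dimensional embeddings.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 2.0],
]

matrix = similarity_matrix(toy_embeddings)
for row in matrix:
    print([round(x, 4) for x in row])
```

The result is symmetric with ones on the diagonal, so for three sentences only the three off-diagonal pairs carry information.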
📚 Documentation
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
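The pooling layer above has only `pooling_mode_mean_tokens` enabled, meaning the sentence vector is an average of the token vectors. The following is a rough pure-Python sketch of masked mean pooling, not the library's actual implementation. The function name `mean_pool` and the 2-dimensional toy token vectors are assumptions made for illustration.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1.

    token_embeddings: one vector (list of floats) per token position.
    attention_mask:   1 for real tokens, 0 for padding.
    """
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    if count == 0:
        return summed  # every position masked: this sketch returns zeros
    return [s / count for s in summed]


# Two real tokens followed by one padding position that must be ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 3.0]
```

Because padding positions are excluded, the pooled vector should not depend on how much padding a batch adds to a sentence.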
Evaluation
Metrics
Semantic Similarity
| Metric | STS17 | STS22.v2 |
|---|---|---|
| pearson_cosine | 0.8249 | 0.5259 |
| spearman_cosine | 0.831 | 0.6169 |
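For readers unfamiliar with these metric names: in Sentence Transformers' similarity evaluation, `pearson_cosine` and `spearman_cosine` are, as far as we understand, the Pearson and Spearman correlations between the cosine similarity predicted for each sentence pair and its human-annotated score. The sketch below shows that computation in pure Python. The scores and labels are invented for illustration and are not taken from STS17 or STS22.v2.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: model cosine scores and gold similarity labels.
predicted = [0.10, 0.40, 0.35, 0.80]
gold = [0.0, 2.0, 3.0, 5.0]

print("pearson_cosine :", round(pearson(predicted, gold), 4))
print("spearman_cosine:", round(spearman(predicted, gold), 4))
```

Spearman looks only at the ordering of the pairs, so it is unaffected by any monotonic rescaling of the cosine scores, while Pearson measures how linear the relationship is.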
Framework Versions
| Property | Details |
|---|---|
| Python | 3.10.12 |
| Sentence Transformers | 3.4.1 |
| Transformers | 4.49.0 |
| PyTorch | 2.1.0+cu118 |
| Accelerate | 1.4.0 |
| Datasets | 2.21.0 |
| Tokenizers | 0.21.0 |
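To compare a local environment against the versions listed above, something like the following standard-library sketch may help. The helper `installed_versions` is a hypothetical name introduced here. The package names are the PyPI distribution names, with `torch` being the distribution for PyTorch.

```python
from importlib.metadata import PackageNotFoundError, version

# PyPI distribution names for the frameworks listed in the table.
PACKAGES = [
    "sentence-transformers",
    "transformers",
    "torch",
    "accelerate",
    "datasets",
    "tokenizers",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions(PACKAGES)
for name, ver in versions.items():
    print(f"{name}: {ver or 'not installed'}")
```

Matching these versions exactly is not necessarily required to run the model. The table records the environment the model was trained in.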
📄 License
This project is licensed under the Apache-2.0 license.
📄 Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}