🚀 MPNet base trained on AllNLI-turkish triplets
This is a sentence-transformers model fine-tuned from microsoft/mpnet-base on the AllNLI-turkish triplets dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space, which can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
🚀 Quick Start
To try the model, install the Sentence Transformers library (see Installation), then load the model and encode a few sentences (see Usage Examples).
✨ Features
- Maps sentences and paragraphs to a 768-dimensional dense vector space.
- Applicable for multiple NLP tasks including semantic similarity, search, and classification.
📦 Installation
First, install the Sentence Transformers library:
pip install -U sentence-transformers
💻 Usage Examples
Basic Usage
from sentence_transformers import SentenceTransformer

# Download the model from the Hugging Face Hub
model = SentenceTransformer("mertcobanov/mpnet-base-all-nli-triplet-turkish-v3")

# Sentences to embed
sentences = [
    'Ağaçlarla çevrili bulvar denize üç bloktan daha az uzanıyor.',
    'Deniz üç sokak bile uzakta değil.',
    'Denize ulaşmak için caddeden iki mil yol almanız gerekiyor.',
]

# Encode the sentences: one 768-dimensional vector per sentence
embeddings = model.encode(sentences)
print(embeddings.shape)  # (3, 768)

# Pairwise similarity scores between all embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # torch.Size([3, 3])
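The similarity scores above are what semantic search builds on: embed a query, score it against every corpus embedding, and keep the best matches. The following is a minimal sketch of that ranking step in plain Python, assuming cosine similarity as listed in the model details. The helper names `cosine_similarity` and `rank_by_similarity` and the toy 3-dimensional vectors are hypothetical, chosen for illustration; they are not part of the library.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, corpus, top_k=3):
    """Return (index, score) pairs for the top_k corpus vectors most similar to query."""
    scored = [(i, cosine_similarity(query, vec)) for i, vec in enumerate(corpus)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 3-dimensional vectors standing in for real 768-dimensional embeddings.
toy_corpus = [[1.0, 0.0, 0.0], [0.8, 0.6, 0.0], [0.0, 0.0, 1.0]]
toy_query = [1.0, 0.1, 0.0]

ranking = rank_by_similarity(toy_query, toy_corpus, top_k=2)
print(ranking)
```

With real data, the rows returned by `model.encode` could be passed in place of the toy vectors. For anything beyond a handful of sentences, the vectorized `model.similarity` call shown earlier is the more practical choice.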
📚 Documentation
Model Details
Model Description
| Property | Details |
|---|---|
| Model Type | Sentence Transformer |
| Base model | microsoft/mpnet-base |
| Maximum Sequence Length | 512 tokens |
| Output Dimensionality | 768 dimensions |
| Similarity Function | Cosine Similarity |
| Training Dataset | all-nli-triplets-turkish |
| Language | en |
| License | apache-2.0 |
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
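The Pooling module above has only `pooling_mode_mean_tokens` enabled, so the sentence embedding is the average of the token vectors produced by the Transformer module. The sketch below illustrates that idea in plain Python for a single sentence. It assumes padding positions are excluded through an attention mask; the function name `mean_pool` and the 2-dimensional toy vectors are hypothetical, and the library itself operates on batched tensors rather than lists.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1; padding (mask 0) is ignored."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for j, value in enumerate(vector):
                summed[j] += value
    count = max(count, 1)  # guard against division by zero on an all-padding input
    return [s / count for s in summed]


# Two real tokens followed by one padding token (toy 2-dimensional vectors).
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The padding vector `[9.0, 9.0]` does not affect the result, which is the point of masking: sentences of different lengths in the same batch get embeddings that depend only on their own tokens.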
Evaluation
Metrics
Triplet
- Datasets: all-nli-dev-turkish and all-nli-test-turkish
- Evaluated with TripletEvaluator
| Metric | all-nli-dev-turkish | all-nli-test-turkish |
|---|---|---|
| cosine_accuracy | 0.7423 | 0.7503 |
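As a rough guide to reading `cosine_accuracy`: it can be understood as the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative under cosine similarity. The sketch below computes that fraction on toy vectors. It assumes a strict comparison with no margin; the function names and data are hypothetical, and this is not the TripletEvaluator implementation.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_cosine_accuracy(anchors, positives, negatives):
    """Fraction of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = 0
    for a, p, n in zip(anchors, positives, negatives):
        if cosine_similarity(a, p) > cosine_similarity(a, n):
            correct += 1
    return correct / len(anchors)


# Toy 2-dimensional embeddings: the first triplet is ranked correctly,
# the second is not (its "positive" is orthogonal to the anchor).
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [1.0, 0.0]]
negatives = [[0.0, 1.0], [0.1, 0.9]]

accuracy = triplet_cosine_accuracy(anchors, positives, negatives)
print(accuracy)  # 0.5
```

Read this way, a score of 0.7423 on the dev split would mean that roughly three out of four triplets are ordered correctly.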
Training Details
Training Dataset
all-nli-triplets-turkish
Evaluation Dataset
all-nli-triplets-turkish
- Dataset: all-nli-triplets-turkish at bff203b
- Size: 6,584 evaluation samples
- Columns: anchor_translated, positive_translated, and negative_translated
- Approximate statistics based on the first 1000 samples:
| | anchor_translated | positive_translated | negative_translated |
|---|---|---|---|
| type | string | string | string |
| details | min: 5 tokens, mean: 42.62 tokens, max: 192 tokens | min: 5 tokens, mean: 22.58 tokens, max: 77 tokens | min: 5 tokens, mean: 22.07 tokens, max: 65 tokens |
- Samples:
| anchor_translated | positive_translated | negative_translated |
|---|---|---|
| Ayrıca, bu özel tüketim vergileri, diğer vergiler gibi, hükümetin ödeme zorunluluğunu sağlama yetkisini kullanarak belirlenir. | Hükümetin ödeme zorlaması, özel tüketim vergilerinin nasıl hesaplandığını belirler. | Özel tüketim vergileri genel kuralın bir istisnasıdır ve aslında GSYİH payına dayalı olarak belirlenir. |
| Gri bir sweatshirt giymiş bir sanatçı, canlı renklerde bir kasaba tablosu üzerinde çalışıyor. | Bir ressam gri giysiler içinde bir kasabanın resmini yapıyor. | Bir kişi bir beyzbol sopası tutuyor ve gelen bir atış için planda bekliyor. |
| İmkansız. | Yapılamaz. | Tamamen mümkün. |
📄 License
This project is licensed under the Apache-2.0 license.