Tooka SBERT
Tooka SBERT is a Persian sentence embedding model built on TookaBERT-Large. It maps text to 1024-dimensional dense vectors for tasks such as semantic similarity.
Downloads 2,847
Release Date: 12/3/2024
Model Overview
This model is a sentence transformer designed specifically for Persian. It converts sentences and paragraphs into dense vector representations that can be used for semantic textual similarity, semantic search, text classification, and clustering.
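As a rough illustration of the sentence-to-vector step, the sketch below uses the sentence-transformers library. The model ID `PartAI/Tooka-SBERT` is an assumption (this page does not state where the model is hosted), and `load_model`, `embed`, and `main` are hypothetical helper names chosen for this example.

```python
import numpy as np

# ASSUMPTION: the Hugging Face model ID is not given on this page.
MODEL_ID = "PartAI/Tooka-SBERT"


def load_model(model_id: str = MODEL_ID):
    """Load the model via sentence-transformers (imported lazily)."""
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_id)


def embed(model, sentences: list[str]) -> np.ndarray:
    """Encode sentences into an array of shape (n_sentences, dim)."""
    vectors = model.encode(sentences)
    return np.asarray(vectors, dtype=np.float32)


def main() -> None:
    """Download the model and embed two Persian sentences."""
    model = load_model()
    sentences = [
        "هوا امروز آفتابی است.",
        "امروز هوا صاف و آفتابی است.",
    ]
    # The description on this page gives 1024 as the embedding dimension.
    print(embed(model, sentences).shape)
```

Calling `main()` downloads the model on first use. `embed` accepts any object with an `encode` method, so it can be exercised with a stub when the model is not available.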
Model Features
Persian Optimization
Optimized specifically for Persian text, so that the embeddings capture Persian semantics.
Efficient Similarity Calculation
Semantic similarity between sentences is computed as the cosine similarity of their embeddings, which is fast to evaluate.
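Cosine similarity is the dot product of two vectors divided by the product of their lengths, giving a score between -1 and 1. The following is a minimal NumPy sketch. The vectors are toy stand-ins for real 1024-dimensional embeddings, and `cosine_similarity` is a helper name chosen for this example.

```python
import numpy as np


def cosine_similarity(a, b) -> float:
    """Return dot(a, b) / (|a| * |b|) for two 1-D vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)


# Toy vectors standing in for sentence embeddings.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Because the score depends only on direction, scaling an embedding does not change its similarity to other embeddings.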
Large-scale Pretraining
Built on the pretrained TookaBERT-Large model, which provides the underlying semantic representations.
Model Capabilities
Semantic text similarity calculation
Semantic search
Paraphrase mining
Text classification
Text clustering
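Of the capabilities listed above, paraphrase mining may be the least self-explanatory: it means finding the sentence pairs in a collection whose embeddings are most similar. A minimal sketch under stated assumptions follows. The embeddings are assumed to be precomputed and non-zero, the 0.8 threshold is an arbitrary illustrative value, and `mine_paraphrases` is a hypothetical helper name.

```python
import numpy as np


def mine_paraphrases(embeddings, threshold: float = 0.8):
    """Return (score, i, j) for every pair i < j with cosine score >= threshold,
    sorted from most to least similar. Rows must be non-zero vectors."""
    x = np.asarray(embeddings, dtype=np.float64)
    # Normalize rows so that a plain dot product equals cosine similarity.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = x @ x.T
    # Upper triangle (k=1) visits each unordered pair once and skips self-pairs.
    rows, cols = np.triu_indices(len(x), k=1)
    pairs = [
        (float(sims[i, j]), int(i), int(j))
        for i, j in zip(rows, cols)
        if sims[i, j] >= threshold
    ]
    pairs.sort(reverse=True)
    return pairs


# Toy 2-D vectors standing in for sentence embeddings.
toy_pairs = mine_paraphrases([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]], threshold=0.9)
```

This brute-force version compares every pair, so its cost grows quadratically with the number of sentences. Larger collections would likely need chunking or an approximate nearest-neighbor index.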
Use Cases
Information Retrieval
Similar Document Retrieval
Find semantically similar documents in a Persian document repository.
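One simple way to sketch this kind of retrieval is to rank stored document embeddings by cosine similarity to a query embedding and keep the best k. The example below assumes the embeddings were already computed and are non-zero. `top_k` is a hypothetical helper name, and the 2-D vectors are toy data.

```python
import numpy as np


def top_k(query_vec, doc_vecs, k: int = 3):
    """Return [(doc_index, cosine_score), ...] for the k most similar documents."""
    q = np.asarray(query_vec, dtype=np.float64)
    d = np.asarray(doc_vecs, dtype=np.float64)
    # Normalize so the matrix-vector product yields cosine scores.
    q = q / np.linalg.norm(q)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    scores = d @ q
    # Sort by descending score and keep the first k indices.
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]


# Toy document embeddings and a query that points mostly along the first axis.
docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hits = top_k([1.0, 0.1], docs, k=2)
```

In practice the document embeddings would be computed once and stored, so only the query needs to be encoded at search time.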
Content Recommendation
Related Content Recommendation
Recommend semantically similar Persian content based on user browsing history.
Text Analysis
Text Clustering Analysis
Perform automatic clustering analysis on Persian text.
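Clustering operates on the embeddings rather than on the raw text. The sketch below uses plain k-means written in NumPy, which is one common choice and not something this page prescribes. The number of clusters, the iteration count, and the `kmeans` helper name are all assumptions made for illustration.

```python
import numpy as np


def kmeans(vectors, k: int, n_iter: int = 20, seed: int = 0):
    """Plain k-means. Returns (labels, centers) for the given vectors."""
    x = np.asarray(vectors, dtype=np.float64)
    rng = np.random.default_rng(seed)
    # Start from k distinct data points (fancy indexing returns a copy).
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center: shape (n, k).
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = x[labels == c]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[c] = members.mean(axis=0)
    # Final assignment against the updated centers.
    dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), centers


# Two well-separated toy groups standing in for embeddings of two topics.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centers = kmeans(points, k=2)
```

Cluster indices are arbitrary, so only the grouping is meaningful. Normalizing embeddings to unit length first makes Euclidean distance behave consistently with cosine similarity.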