Tooka-SBERT-V2-Small

Developed by PartAI
Tooka-SBERT-V2-Small is a sentence-transformer model trained for semantic textual similarity and embedding tasks. It maps sentences and paragraphs to a dense vector space in which semantically similar texts lie close to one another.
Downloads: 110
Released: May 13, 2025

Model Overview

The model is designed for semantic similarity and embedding tasks on Persian text, and its performance is optimized through two-stage training (pretraining followed by fine-tuning).
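As a rough usage sketch (assuming the Hugging Face model ID PartAI/Tooka-SBERT-V2-Small and the sentence-transformers library), embeddings and pairwise similarities can be computed as follows:

```python
# Minimal sketch: embed Persian sentences and compare them.
# The model ID "PartAI/Tooka-SBERT-V2-Small" is assumed, not confirmed by the card.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("PartAI/Tooka-SBERT-V2-Small")

sentences = [
    "هوا امروز آفتابی است.",   # "The weather is sunny today."
    "امروز آسمان صاف است.",    # "The sky is clear today."
    "قیمت نفت افزایش یافت.",   # "The price of oil rose."
]

# Map each sentence to a dense vector.
embeddings = model.encode(sentences)

# Pairwise cosine similarities (sentence-transformers >= 3.0);
# the first two sentences should score higher with each other.
print(model.similarity(embeddings, embeddings))
```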

Model Features

Two-stage training
The model is first pretrained on the Targoman News dataset and then fine-tuned on several synthetic datasets.
Asymmetric input processing
Specific prefixes ('سوال:' for questions/queries and 'متن:' for passages) can be prepended to inputs to distinguish text types and improve semantic matching; see the sketch after this list.
Strong benchmark performance
On the PTEB benchmark, its average score exceeds that of the mE5-Base model.
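A sketch of the asymmetric query/passage setup described above; the prefix strings come from the card, while the model ID and example texts are illustrative assumptions:

```python
# Asymmetric encoding sketch: prepend 'سوال:' to queries and 'متن:' to
# passages, as the card describes. Model ID and texts are assumptions.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("PartAI/Tooka-SBERT-V2-Small")

queries = ["سوال: پایتخت ایران کجاست؟"]      # "Question: What is the capital of Iran?"
passages = [
    "متن: تهران پایتخت ایران است.",          # "Text: Tehran is the capital of Iran."
    "متن: قهوه حاوی کافئین است.",            # "Text: Coffee contains caffeine."
]

q_emb = model.encode(queries)
p_emb = model.encode(passages)

# The matching passage should receive the higher similarity score.
print(model.similarity(q_emb, p_emb))
```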

Model Capabilities

Semantic text similarity calculation
Text embedding generation
Persian text processing

Use Cases

Information retrieval
Document retrieval
Use the embeddings generated by the model for document similarity search (see the retrieval sketch after this section)
Performs well on retrieval datasets such as MIRACLRetrieval
Text classification
Sentiment analysis
Use text embeddings as features for sentiment classification
Effective on tasks such as PersianFoodSentimentClassification
Re-ranking
Search result optimization
Semantically re-rank initial retrieval results
Performs strongly on tasks such as WikipediaRerankingMultilingual
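A minimal retrieval sketch under the same assumptions (model ID, query, and corpus are illustrative), using the semantic-search helper from sentence-transformers:

```python
# Retrieval sketch: rank a small corpus against a query by cosine similarity.
# Corpus, query, and model ID are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("PartAI/Tooka-SBERT-V2-Small")

corpus = [
    "متن: تهران بزرگ‌ترین شهر ایران است.",   # "Tehran is Iran's largest city."
    "متن: قهوه حاوی کافئین است.",            # "Coffee contains caffeine."
    "متن: برج میلاد در تهران قرار دارد.",     # "The Milad Tower is in Tehran."
]
corpus_emb = model.encode(corpus, convert_to_tensor=True)

query_emb = model.encode("سوال: برج میلاد کجاست؟", convert_to_tensor=True)

# Print the top-2 most similar passages by cosine similarity.
for hit in util.semantic_search(query_emb, corpus_emb, top_k=2)[0]:
    print(corpus[hit["corpus_id"]], round(hit["score"], 3))
```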