Rubert Base Cased Conversational Paraphrase V1
A model for evaluating the semantic similarity of Russian sentences, based on ruBERT-conversational and trained on multiple paraphrase detection datasets
Downloads 281
Release Time: 7/3/2022
Model Overview
This model is designed for evaluating the semantic similarity of Russian sentences and supports tasks such as paraphrase detection and text classification. It is based on ruBERT-conversational and fine-tuned on three Russian paraphrase detection datasets.
Model Features
Multi-dataset training
Jointly trained on three Russian paraphrase detection datasets: ru_paraphraser, RuPAWS, and a detoxification dataset
Semantic similarity evaluation
Scores the semantic similarity of Russian sentence pairs, with a reported ROC AUC between 0.761 and 0.848 depending on the dataset
Optimized training
Trained for 3 epochs with the Adam optimizer, a learning rate of 1e-5, and a batch size of 32
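The training settings listed above can be collected in a small configuration object. This is a minimal sketch, not the original training script: the `TrainingConfig` class, the `total_steps` helper, and the dataset size of 10,000 pairs are hypothetical and only illustrate how the batch size and epoch count translate into optimizer steps.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as listed in the model description."""
    optimizer: str = "Adam"
    learning_rate: float = 1e-5
    batch_size: int = 32
    epochs: int = 3


def total_steps(num_examples: int, config: TrainingConfig) -> int:
    """Number of optimizer steps, assuming one step per batch and that the
    last, possibly smaller, batch of each epoch is kept."""
    steps_per_epoch = math.ceil(num_examples / config.batch_size)
    return steps_per_epoch * config.epochs


config = TrainingConfig()
# 10_000 is a made-up dataset size used only for illustration.
print(total_steps(10_000, config))
```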
Model Capabilities
Russian sentence similarity calculation
Paraphrase detection
Text classification
Semantic analysis
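One possible way to score a sentence pair with the Hugging Face `transformers` library is sketched below. Several things here are assumptions rather than facts from this page: the repository id in `MODEL_NAME`, the convention that class index 1 means "paraphrase", and the helper names `load_model`, `paraphrase_probability`, and `symmetric_score`. The sketch also assumes that `transformers` and `torch` are installed.

```python
import math

# Assumption: the Hugging Face repository id; verify it before use.
MODEL_NAME = "s-nlp/rubert-base-cased-conversational-paraphrase-v1"


def softmax(logits):
    """Convert raw classifier logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def load_model(model_name=MODEL_NAME):
    """Load the tokenizer and classifier (downloads weights on first call)."""
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()
    return tokenizer, model


def paraphrase_probability(text_a, text_b, tokenizer, model, positive_index=1):
    """Probability that the two sentences are paraphrases.

    Assumption: class `positive_index` (1 by default) means "paraphrase".
    """
    import torch

    # Encode both sentences as a single pair input for the classifier.
    inputs = tokenizer(text_a, text_b, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return softmax(logits)[positive_index]


def symmetric_score(text_a, text_b, tokenizer, model):
    """Average both argument orders, since a pair classifier is not
    guaranteed to give the same score for (a, b) and (b, a)."""
    forward = paraphrase_probability(text_a, text_b, tokenizer, model)
    backward = paraphrase_probability(text_b, text_a, tokenizer, model)
    return (forward + backward) / 2
```

To use it, call `tokenizer, model = load_model()` once and then `symmetric_score(sentence_1, sentence_2, tokenizer, model)` for each pair. Averaging the two orders is a design choice of this sketch, not something the page prescribes.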
Use Cases
Natural Language Processing
Paraphrase detection
Identifies Russian sentences that are worded differently but have the same meaning
Achieves a ROC AUC of 0.848 on the ru_paraphraser dataset
Q&A systems
Evaluates semantic similarity between user questions and knowledge base questions
Achieves a ROC AUC of 0.761 on the RuPAWS QQP dataset
Content moderation
Detects inappropriate content that is worded differently but carries the same meaning
Achieves a ROC AUC of 0.822 on the detoxification dataset
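The ROC AUC figures quoted for these use cases can be read as the probability that a randomly chosen paraphrase pair receives a higher score than a randomly chosen non-paraphrase pair. The sketch below computes the metric directly from that definition. The `roc_auc` helper and the toy labels and scores are made up for illustration and are not the evaluation code or data behind the reported numbers.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, made up for illustration: 1 = paraphrase, 0 = not a paraphrase.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75
```

This pairwise form is quadratic in the number of examples, which is fine for small development sets. For larger ones, a rank-based implementation such as scikit-learn's `roc_auc_score` is the usual choice.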