đ Model Card for Fine-Tuned BERT for Paraphrase Detection
This is a fine-tuned BERT-base model for paraphrase detection. It's trained on four benchmark datasets (MRPC, QQP, PAWS-X, and PIT). The model is useful for applications like duplicate content detection, question answering, and semantic similarity analysis, with strong recall capabilities for identifying paraphrases in complex sentence structures.
đ Quick Start
To use the model, install transformers and load the fine-tuned model as follows:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
model_path = "viswadarshan06/pd-bert"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSequenceClassification.from_pretrained(model_path)
inputs = tokenizer("The car is fast.", "The vehicle moves quickly.", return_tensors="pt", padding=True, truncation=True)
outputs = model(**inputs)
logits = outputs.logits
predicted_class = logits.argmax().item()
print("Paraphrase" if predicted_class == 1 else "Not a Paraphrase")
⨠Features
- Direct Use:
- Identify duplicate questions in customer support and FAQs.
- Improve semantic search in retrieval-based systems.
- Enhance document deduplication and text similarity applications.
- Downstream Use: Can be further fine-tuned on domain - specific paraphrase datasets for industries like healthcare, legal, and finance.
đĻ Installation
To use the model, you need to install the transformers
library. You can install it via pip:
pip install transformers
đ Documentation
Model Description
- Developed by: Viswadarshan R R
- Model Type: Transformer-based Sentence Pair Classifier
- Language: English
- Finetuned from:
bert-base-cased
Model Sources
- Repository: Hugging Face Model Hub
- Research Paper: Comparative Insights into Modern Architectures for Paraphrase Detection (Accepted at ICCIDS 2025)
- Demo: (To be added upon deployment)
Uses
Direct Use
- Identifying duplicate questions in customer support and FAQs.
- Improving semantic search in retrieval-based systems.
- Enhancing document deduplication and text similarity applications.
Downstream Use
This model can be further fine-tuned on domain-specific paraphrase datasets for industries such as healthcare, legal, and finance.
Out-of-Scope Use
- The model is monolingual and trained only on English datasets, requiring additional fine-tuning for multilingual tasks.
- May struggle with idiomatic expressions or complex figurative language.
Bias, Risks, and Limitations
Known Limitations
- Higher recall but lower precision: The model tends to over-identify paraphrases, leading to increased false positives.
- Contextual ambiguity: May misinterpret sentences that require deep contextual reasoning.
Recommendations
Users can mitigate the false positive rate by applying post-processing techniques or confidence threshold tuning.
đ§ Technical Details
Training Details
This model was trained using a combination of four datasets:
- MRPC: News-based paraphrases.
- QQP: Duplicate question detection.
- PAWS-X: Adversarial paraphrases for robustness testing.
- PIT: Short-text paraphrase dataset.
Training Procedure
- Tokenizer: BERT Tokenizer
- Batch Size: 16
- Optimizer: AdamW
- Loss Function: Cross-entropy
Training Hyperparameters
- Learning Rate: 2e-5
- Sequence Length:
- MRPC: 256
- QQP: 336
- PIT: 64
- PAWS-X: 256
Speeds, Sizes, Times
- GPU Used: NVIDIA A100
- Total Training Time: ~6 hours
- Compute Units Used: 80
Testing Data, Factors & Metrics
Testing Data
The model was tested on combined test sets and evaluated using:
- Accuracy
- Precision
- Recall
- F1-Score
- Runtime
Results
BERT Model Evaluation Metrics
Model |
Dataset |
Accuracy (%) |
Precision (%) |
Recall (%) |
F1-Score (%) |
Runtime (sec) |
BERT |
MRPC Validation |
88.24 |
88.37 |
95.34 |
91.72 |
1.41 |
BERT |
MRPC Test |
84.87 |
85.84 |
92.50 |
89.04 |
5.77 |
BERT |
QQP Validation |
87.92 |
81.44 |
86.86 |
84.06 |
43.24 |
BERT |
QQP Test |
88.14 |
82.49 |
86.56 |
84.47 |
43.51 |
BERT |
PAWS-X Validation |
91.90 |
87.57 |
94.67 |
90.98 |
6.73 |
BERT |
PAWS-X Test |
92.60 |
88.69 |
95.92 |
92.16 |
6.82 |
BERT |
PIT Validation |
77.38 |
72.41 |
58.57 |
64.76 |
4.34 |
BERT |
PIT Test |
86.16 |
64.11 |
76.57 |
69.79 |
0.98 |
Summary
This BERT-based Paraphrase Detection Model demonstrates strong recall capabilities, making it highly effective at identifying paraphrases across varied linguistic structures. While it tends to overpredict paraphrases, it remains a strong baseline for semantic similarity tasks and can be fine-tuned further for domain-specific applications.
Citation
If you use this model, please cite:
@inproceedings{viswadarshan2025paraphrase,
title={Comparative Insights into Modern Architectures for Paraphrase Detection},
author={Viswadarshan R R, Viswaa Selvam S, Felcia Lilian J, Mahalakshmi S},
booktitle={International Conference on Computational Intelligence, Data Science, and Security (ICCIDS)},
year={2025},
publisher={IFIP AICT Series by Springer}
}
đ License
This project is licensed under the MIT license.
Model Card Contact
đ§ Email: viswadarshanrramiya@gmail.com
đ GitHub: Viswadarshan R R