langcache-crossencoder-v1-ms-marco-MiniLM-L6-v2
A Cross Encoder model for text pair classification, fine-tuned on the Quora Question Pairs dataset and suited to semantic similarity judgment and semantic search scenarios.
Downloads: 338 | Released: 2025-06-19
Model Overview
Built on the Cross Encoder architecture, this model handles text pair classification effectively. It was fine-tuned on the Quora Question Pairs dataset and is mainly used to compute similarity scores for text pairs.
Model Features
- Efficient text pair processing: built on the Cross Encoder architecture and optimized for text pair classification tasks.
- Fine-tuned on the Quora dataset: trained on the Quora Question Pairs dataset, making it well suited to semantic similarity judgment in Q&A scenarios.
- Multi-metric evaluation: reports accuracy, F1, and other metrics for a comprehensive assessment of model performance.
Model Capabilities
- Text pair similarity calculation
- Semantic similarity judgment
- Q&A pair matching
- Semantic search
Use Cases
Q&A system
- Similar question identification: determine whether a user's question matches an existing one (accuracy: 69.56%, F1: 59.47%).
Semantic search
- Search result sorting: rank search results by the semantic similarity between the query and each document.
🚀 Redis semantic caching CrossEncoder model fine-tuned on Quora Question Pairs
This is a Cross Encoder model that computes scores for pairs of texts, which can be used for sentence pair classification. It is fine-tuned from cross-encoder/ms-marco-MiniLM-L6-v2 on the Quora Question Pairs LangCache Train Set dataset using the sentence-transformers library.
✨ Features
- Cross-Encoder Classification: Capable of performing cross-encoder classification tasks.
- Multiple Metrics Evaluation: Evaluated with metrics such as accuracy, F1, precision, recall, etc.
- Fine-tuned on Quora Data: Trained on the Quora Question Pairs dataset for better performance in relevant scenarios.
📦 Installation
First, install the Sentence Transformers library:
pip install -U sentence-transformers
💻 Usage Examples
Basic Usage
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("aditeyabaral-redis/langcache-crossencoder-v1-ms-marco-MiniLM-L6-v2")
# Get scores for pairs of texts
pairs = [
    ['How can I get a list of my Gmail accounts?', 'How can I find all my old Gmail accounts?'],
    ['How can I stop Quora from modifying and editing other people’s questions on Quora?', 'Can I prevent a Quora user from editing my question on Quora?'],
    ['How much does it cost to design a logo in india?', 'How much does it cost to design a logo?'],
    ['What is screenedrenters.com?', 'What is allmyapps.com?'],
    ['What are the best colleges for an MBA in Australia?', 'What are the top MBA schools in Australia?'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
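Note that the decision thresholds reported under Evaluation below (around 3-4) indicate these scores are raw logits rather than probabilities, consistent with the Identity activation listed in the Training Details. If probability-like values are preferred, a sigmoid can be applied; a minimal sketch continuing the snippet above:
import torch
# The evaluation thresholds (~3-4) and the Identity activation suggest raw logits;
# a sigmoid maps them into (0, 1) when probability-like scores are preferred.
probabilities = torch.sigmoid(torch.as_tensor(scores))
print(probabilities)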
Advanced Usage
# Rank different texts based on similarity to a single text
ranks = model.rank(
    'How can I get a list of my Gmail accounts?',
    [
        'How can I find all my old Gmail accounts?',
        'Can I prevent a Quora user from editing my question on Quora?',
        'How much does it cost to design a logo?',
        'What is allmyapps.com?',
        'What are the top MBA schools in Australia?',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
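Because this model targets Redis semantic caching, a common pattern is to check whether an incoming query duplicates a previously cached one before reusing its stored answer. Below is a minimal sketch under stated assumptions: the plain dict cache and the example answer are hypothetical, and the cutoff reuses the f1_threshold (3.3412) reported in the Evaluation section.
from sentence_transformers import CrossEncoder

model = CrossEncoder("aditeyabaral-redis/langcache-crossencoder-v1-ms-marco-MiniLM-L6-v2")

# Hypothetical in-memory cache mapping previously answered questions to answers.
cache = {
    "How can I find all my old Gmail accounts?": "cached answer ...",
}

def lookup(query: str, threshold: float = 3.3412):
    """Return a cached answer if some cached question scores above the threshold."""
    if not cache:
        return None
    questions = list(cache)
    scores = model.predict([(query, q) for q in questions])  # raw logits
    best = int(scores.argmax())
    if scores[best] >= threshold:
        return cache[questions[best]]
    return None  # cache miss: answer fresh, then store the new pair

print(lookup("How can I get a list of my Gmail accounts?"))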
📚 Documentation
Model Details
Model Description
Property | Details |
---|---|
Model Type | Cross Encoder |
Base model | cross-encoder/ms-marco-MiniLM-L6-v2 |
Maximum Sequence Length | 512 tokens |
Number of Output Labels | 1 label |
Training Dataset | Quora Question Pairs LangCache Train Set |
Language | en |
License | apache-2.0 |
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Evaluation
Metrics
Metric | Value |
---|---|
accuracy | 0.6956 |
accuracy_threshold | 4.1688 |
f1 | 0.5947 |
f1_threshold | 3.3412 |
precision | 0.4834 |
recall | 0.7727 |
average_precision | 0.6229 |
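accuracy_threshold and f1_threshold are the score cutoffs that maximized accuracy and F1 on the validation set, so scores can be binarized by comparing against them. A minimal sketch with illustrative score values:
import numpy as np

F1_THRESHOLD = 3.3412  # f1_threshold from the table above

scores = np.array([8.1, 3.9, -2.4])  # illustrative raw logits from model.predict()
labels = (scores >= F1_THRESHOLD).astype(int)  # 1 = duplicate, 0 = not duplicate
print(labels)  # [1 1 0]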
Training Details
Training Dataset
- Dataset: Quora Question Pairs LangCache Train Set
- Size: 363,861 training samples
- Columns: sentence1, sentence2, and label
- Approximate statistics based on the first 1000 samples:

Statistic | sentence1 | sentence2 | label |
---|---|---|---|
type | string | string | int |
details | min: 15 characters, mean: 60.22 characters, max: 229 characters | min: 14 characters, mean: 60.0 characters, max: 274 characters | 0: ~63.50%, 1: ~36.50% |

- Samples:

sentence1 | sentence2 | label |
---|---|---|
Why do people believe in God and how can they say he/she exists? | Why do we kill each other in the name of God? | 0 |
What are the chances of a bee sting when a bee buzzes around you? | How can I tell if my bees are agitated/likely to sting? | 0 |
If a man from Syro Malankara church marries a Syro-Malabar girl, can they join a Syro-Malabar parish? | Is Malabar Hills of Mumbai anyhow related to Malabar of Kerala? | 0 |

- Loss: BinaryCrossEntropyLoss with these parameters:
{ "activation_fn": "torch.nn.modules.linear.Identity", "pos_weight": null }
Evaluation Dataset
- Dataset: Quora Question Pairs LangCache Validation Set
- Size: 40,429 evaluation samples
- Columns: sentence1, sentence2, and label
- Approximate statistics based on the first 1000 samples:

Statistic | sentence1 | sentence2 | label |
---|---|---|---|
type | string | string | int |
details | min: 13 characters, mean: 59.91 characters, max: 266 characters | min: 13 characters, mean: 59.51 characters, max: 293 characters | 0: ~63.80%, 1: ~36.20% |

- Samples:

sentence1 | sentence2 | label |
---|---|---|
How can I get a list of my Gmail accounts? | How can I find all my old Gmail accounts? | 1 |
How can I stop Quora from modifying and editing other people’s questions on Quora? | Can I prevent a Quora user from editing my question on Quora? | 1 |
How much does it cost to design a logo in india? | How much does it cost to design a logo? | 0 |

- Loss: BinaryCrossEntropyLoss with these parameters:
{ "activation_fn": "torch.nn.modules.linear.Identity", "pos_weight": null }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- learning_rate: 0.0002
- num_train_epochs: 15
- load_best_model_at_end: True
- push_to_hub: True
- hub_model_id: aditeyabaral-redis/langcache-crossencoder-v1-ms-marco-MiniLM-L6-v2
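For reference, a minimal sketch of wiring these non-default hyperparameters into a training run with the sentence-transformers v4 CrossEncoderTrainer API; the single-row dataset is a placeholder for the actual Quora Question Pairs LangCache splits, which this card does not link directly:
from datasets import Dataset
from sentence_transformers.cross_encoder import (
    CrossEncoder,
    CrossEncoderTrainer,
    CrossEncoderTrainingArguments,
)
from sentence_transformers.cross_encoder.losses import BinaryCrossEntropyLoss

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2", num_labels=1)

# Placeholder data; substitute the Quora Question Pairs LangCache train/validation splits.
train_dataset = Dataset.from_dict({
    "sentence1": ["How do I learn Python?"],
    "sentence2": ["What is the best way to learn Python?"],
    "label": [1],
})

args = CrossEncoderTrainingArguments(
    output_dir="langcache-crossencoder",  # placeholder output path
    eval_strategy="steps",
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    learning_rate=2e-4,
    num_train_epochs=15,
    load_best_model_at_end=True,
    push_to_hub=True,
    hub_model_id="aditeyabaral-redis/langcache-crossencoder-v1-ms-marco-MiniLM-L6-v2",
)

trainer = CrossEncoderTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=train_dataset,  # placeholder; use the real validation split
    loss=BinaryCrossEntropyLoss(model),
)
trainer.train()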
All Hyperparameters
Click to expand
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 0.0002
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 15
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: True
- resume_from_checkpoint: None
- hub_model_id: aditeyabaral-redis/langcache-crossencoder-v1-ms-marco-MiniLM-L6-v2
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Click to expand
Epoch | Step | Training Loss | Validation Loss | quora-eval_average_precision |
---|---|---|---|---|
0.0879 | 500 | 0.3913 | 0.3302 | 0.5603 |
0.1759 | 1000 | 0.3408 | 0.3220 | 0.5932 |
0.2638 | 1500 | 0.3318 | 0.3249 | 0.6144 |
0.3517 | 2000 | 0.3235 | 0.3027 | 0.6280 |
0.4397 | 2500 | 0.3173 | 0.2944 | 0.6233 |
0.5276 | 3000 | 0.3049 | 0.3009 | 0.6685 |
0.6155 | 3500 | 0.3071 | 0.2908 | 0.6221 |
0.7035 | 4000 | 0.3015 | 0.2854 | 0.6143 |
0.7914 | 4500 | 0.2944 | 0.2759 | 0.6361 |
0.8794 | 5000 | 0.2984 | 0.2854 | 0.6616 |
0.9673 | 5500 | 0.2898 | 0.3002 | 0.6109 |
1.0552 | 6000 | 0.2552 | 0.2800 | 0.6466 |
1.1432 | 6500 | 0.2352 | 0.2821 | 0.6305 |
1.2311 | 7000 | 0.2366 | 0.2778 | 0.5699 |
1.3190 | 7500 | 0.2332 | 0.2831 | 0.6076 |
1.4070 | 8000 | 0.2366 | 0.2783 | 0.6003 |
1.4949 | 8500 | 0.2391 | 0.2716 | 0.6195 |
1.5828 | 9000 | 0.241 | 0.2685 | 0.6229 |
1.6708 | 9500 | 0.2359 | 0.2804 | 0.6410 |
1.7587 | 10000 | 0.2374 | 0.2819 | 0.6448 |
1.8466 | 10500 | 0.2387 | 0.2750 | 0.6479 |
1.9346 | 11000 | 0.2343 | 0.2734 | 0.6034 |
2.0225 | 11500 | 0.2193 | 0.3168 | 0.6384 |
2.1104 | 12000 | 0.1741 | 0.3011 | 0.6189 |
2.1984 | 12500 | 0.1732 | 0.2988 | 0.6412 |
2.2863 | 13000 | 0.1814 | 0.2839 | 0.6156 |
2.3743 | 13500 | 0.1815 | 0.2930 | 0.5520 |
2.4622 | 14000 | 0.1774 | 0.3461 | 0.6195 |
2.5501 | 14500 | 0.1886 | 0.3033 | 0.6113 |
2.6381 | 15000 | 0.1831 | 0.2925 | 0.5815 |
2.7260 | 15500 | 0.1889 | 0.2801 | 0.5701 |
2.8139 | 16000 | 0.1869 | 0.2893 | 0.6090 |
2.9019 | 16500 | 0.1896 | 0.3038 | 0.6142 |
2.9898 | 17000 | 0.1967 | 0.2791 | 0.5967 |
3.0777 | 17500 | 0.1395 | 0.3119 | 0.5672 |
3.1657 | 18000 | 0.1392 | 0.3052 | 0.5876 |
3.2536 | 18500 | 0.1411 | 0.3030 | 0.6064 |
3.3415 | 19000 | 0.1356 | 0.3064 | 0.5535 |
3.4295 | 19500 | 0.14 | 0.3144 | 0.5978 |
3.5174 | 20000 | 0.1461 | 0.3332 | 0.5961 |
3.6053 | 20500 | 0.1468 | 0.3179 | 0.5975 |
3.6933 | 21000 | 0.1487 | 0.3327 | 0.5932 |
3.7812 | 21500 | 0.1479 | 0.3340 | 0.5888 |
3.8692 | 22000 | 0.1458 | 0.3172 | 0.5478 |
3.9571 | 22500 | 0.1566 | 0.3036 | 0.5926 |
4.0450 | 23000 | 0.1257 | 0.3552 | 0.5941 |
4.1330 | 23500 | 0.1004 | 0.3886 | 0.5067 |
4.2209 | 24000 | 0.1061 | 0.3682 | 0.5654 |
4.3088 | 24500 | 0.1087 | 0.3212 | 0.5556 |
4.3968 | 25000 | 0.11 | 0.3348 | 0.5628 |
4.4847 | 25500 | 0.1108 | 0.3740 | 0.5046 |
4.5726 | 26000 | 0.1169 | 0.3092 | 0.5882 |
4.6606 | 26500 | 0.1156 | 0.3498 | 0.4988 |
4.7485 | 27000 | 0.1232 | 0.3042 | 0.5801 |
4.8364 | 27500 | 0.1253 | 0.3042 | 0.5801 |
4.9243 | 28000 | 0.1274 | 0.3042 | 0.5801 |
5.0123 | 28500 | 0.1295 | 0.3042 | 0.5801 |
5.1002 | 29000 | 0.1316 | 0.3042 | 0.5801 |
5.1881 | 29500 | 0.1337 | 0.3042 | 0.5801 |
5.2761 | 30000 | 0.1358 | 0.3042 | 0.5801 |
5.3640 | 30500 | 0.1379 | 0.3042 | 0.5801 |
5.4519 | 31000 | 0.14 | 0.3042 | 0.5801 |
5.5399 | 31500 | 0.1421 | 0.3042 | 0.5801 |
5.6278 | 32000 | 0.1442 | 0.3042 | 0.5801 |
5.7157 | 32500 | 0.1463 | 0.3042 | 0.5801 |
5.8037 | 33000 | 0.1484 | 0.3042 | 0.5801 |
5.8916 | 33500 | 0.1505 | 0.3042 | 0.5801 |
5.9795 | 34000 | 0.1526 | 0.3042 | 0.5801 |
6.0675 | 34500 | 0.1547 | 0.3042 | 0.5801 |
6.1554 | 35000 | 0.1568 | 0.3042 | 0.5801 |
6.2433 | 35500 | 0.1589 | 0.3042 | 0.5801 |
6.3313 | 36000 | 0.161 | 0.3042 | 0.5801 |
6.4192 | 36500 | 0.1631 | 0.3042 | 0.5801 |
6.5071 | 37000 | 0.1652 | 0.3042 | 0.5801 |
6.5951 | 37500 | 0.1673 | 0.3042 | 0.5801 |
6.6830 | 38000 | 0.1694 | 0.3042 | 0.5801 |
6.7709 | 38500 | 0.1715 | 0.3042 | 0.5801 |
6.8589 | 39000 | 0.1736 | 0.3042 | 0.5801 |
6.9468 | 39500 | 0.1757 | 0.3042 | 0.5801 |
7.0347 | 40000 | 0.1778 | 0.3042 | 0.5801 |
7.1227 | 40500 | 0.1799 | 0.3042 | 0.5801 |
7.2106 | 41000 | 0.182 | 0.3042 | 0.5801 |
7.2985 | 41500 | 0.1841 | 0.3042 | 0.5801 |
7.3865 | 42000 | 0.1862 | 0.3042 | 0.5801 |
7.4744 | 42500 | 0.1883 | 0.3042 | 0.5801 |
7.5623 | 43000 | 0.1904 | 0.3042 | 0.5801 |
7.6503 | 43500 | 0.1925 | 0.3042 | 0.5801 |
7.7382 | 44000 | 0.1946 | 0.3042 | 0.5801 |
7.8261 | 44500 | 0.1967 | 0.3042 | 0.5801 |
7.9141 | 45000 | 0.1988 | 0.3042 | 0.5801 |
8.0020 | 45500 | 0.2009 | 0.3042 | 0.5801 |
8.0899 | 46000 | 0.203 | 0.3042 | 0.5801 |
8.1779 | 46500 | 0.2051 | 0.3042 | 0.5801 |
8.2658 | 47000 | 0.2072 | 0.3042 | 0.5801 |
8.3537 | 47500 | 0.2093 | 0.3042 | 0.5801 |
8.4417 | 48000 | 0.2114 | 0.3042 | 0.5801 |
8.5296 | 48500 | 0.2135 | 0.3042 | 0.5801 |
8.6175 | 49000 | 0.2156 | 0.3042 | 0.5801 |
8.7055 | 49500 | 0.2177 | 0.3042 | 0.5801 |
8.7934 | 50000 | 0.2198 | 0.3042 | 0.5801 |
📄 License
This project is licensed under the Apache-2.0 license.