Reranker BERT-tiny GooAQ BCE
This is a cross-encoder model fine-tuned from prajjwal1/bert-tiny that computes similarity scores for text pairs, suitable for tasks such as semantic textual similarity and semantic search.
Model Overview
This model is based on the BERT-tiny architecture and developed using the sentence-transformers library. It is primarily used to calculate similarity scores between text pairs, applicable to tasks like semantic textual similarity, semantic search, paraphrase mining, text classification, and clustering.
Model Features
Efficient and Lightweight
Based on the BERT-tiny architecture, the model is compact and computationally efficient.
Versatile for Multiple Tasks
Applicable to various tasks such as semantic textual similarity, semantic search, paraphrase mining, and text classification.
High Performance
Performs well on multiple evaluation datasets, notably achieving a mean average precision (MAP) of 0.5677 on the GooAQ-dev dataset.
Model Capabilities
Calculate Text Similarity
Semantic Search
Text Classification
Text Clustering
Paraphrase Mining
Use Cases
Information Retrieval
Answer Reranking in QA Systems
Rerank candidate answers by relevance to improve answer quality in QA systems; the model achieves a MAP of 0.5677 on the GooAQ-dev dataset (see the retrieve-then-rerank sketch below).
Content Recommendation
Related Content Recommendation
Recommend related content based on user queries
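In practice, answer reranking follows a retrieve-then-rerank pattern: a fast bi-encoder fetches candidates and this cross-encoder reorders them. A minimal sketch of that pattern; the bi-encoder choice and the toy corpus are illustrative assumptions, not part of this model card:

```python
from sentence_transformers import CrossEncoder, SentenceTransformer, util

# Assumption: any fast bi-encoder works as the first-stage retriever;
# all-MiniLM-L6-v2 is a common choice, not prescribed by this card.
retriever = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
reranker = CrossEncoder("cross-encoder-testing/reranker-bert-tiny-gooaq-bce")

corpus = [
    "JavaScript is the most in-demand IT skill in 2020 according to DevSkiller.",
    "AngularJS is a framework written in JavaScript and uses MVC architecture.",
    "Java code is compiled, whereas JavaScript runs as text in a web browser.",
]
query = "are javascript developers in demand?"

# Stage 1: cheap approximate retrieval with the bi-encoder
corpus_emb = retriever.encode(corpus, convert_to_tensor=True)
query_emb = retriever.encode(query, convert_to_tensor=True)
hits = util.semantic_search(query_emb, corpus_emb, top_k=3)[0]

# Stage 2: precise reranking of the candidates with the cross-encoder
candidates = [corpus[hit["corpus_id"]] for hit in hits]
scores = reranker.predict([(query, passage) for passage in candidates])
for score, passage in sorted(zip(scores, candidates), reverse=True):
    print(f"{score:.2f}  {passage}")
```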
🚀 BERT-tiny trained on GooAQ
This is a Cross Encoder model that computes scores for text pairs. It can be used for semantic textual similarity, search, paraphrase mining, classification, clustering, etc. Finetuned from prajjwal1/bert-tiny using the sentence-transformers library.
🚀 Quick Start
This model was trained using train_script.py.
✨ Features
- Versatile Applications: Suitable for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
- Finetuned Model: Based on the prajjwal1/bert-tiny base model, finetuned to better handle specific tasks.
📦 Installation
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
💻 Usage Examples
Basic Usage
```python
from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("cross-encoder-testing/reranker-bert-tiny-gooaq-bce")

# Get scores for pairs of texts
pairs = [
    ['are javascript developers in demand?', "JavaScript is the skill that is most in-demand for IT in 2020, according to a report from developer skills tester DevSkiller. The report, “Top IT Skills report 2020: Demand and Hiring Trends,” has JavaScript switching places with Java when compared to last year's report, with Java in third place this year, behind SQL."],
    ['are javascript developers in demand?', 'In one line difference between the two is: JavaScript is the programming language where as AngularJS is a framework based on JavaScript. ... It is also the basic for all java script based technologies like jquery, angular JS, bootstrap JS and so on. Angular JS is a framework written in javascript and uses MVC architecture.'],
    ['are javascript developers in demand?', 'Java applications are run in a virtual machine or web browser while JavaScript is run on a web browser. Java code is compiled whereas while JavaScript code is in text and in a web page. JavaScript is an OOP scripting language, whereas Java is an OOP programming language.'],
    ['are javascript developers in demand?', 'Things in the body tag are the things that should be displayed: the actual content. Javascript in the body is executed as it is read and as the page is rendered. Javascript in the head is interpreted before anything is rendered.'],
    ['are javascript developers in demand?', 'Web apps tend to be built using JavaScript, CSS and HTML5. Unlike mobile apps, there is no standard software development kit for building web apps. However, developers do have access to templates. Compared to mobile apps, web apps are usually quicker and easier to build — but they are much simpler in terms of features.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'are javascript developers in demand?',
    [
        "JavaScript is the skill that is most in-demand for IT in 2020, according to a report from developer skills tester DevSkiller. The report, “Top IT Skills report 2020: Demand and Hiring Trends,” has JavaScript switching places with Java when compared to last year's report, with Java in third place this year, behind SQL.",
        'In one line difference between the two is: JavaScript is the programming language where as AngularJS is a framework based on JavaScript. ... It is also the basic for all java script based technologies like jquery, angular JS, bootstrap JS and so on. Angular JS is a framework written in javascript and uses MVC architecture.',
        'Java applications are run in a virtual machine or web browser while JavaScript is run on a web browser. Java code is compiled whereas while JavaScript code is in text and in a web page. JavaScript is an OOP scripting language, whereas Java is an OOP programming language.',
        'Things in the body tag are the things that should be displayed: the actual content. Javascript in the body is executed as it is read and as the page is rendered. Javascript in the head is interpreted before anything is rendered.',
        'Web apps tend to be built using JavaScript, CSS and HTML5. Unlike mobile apps, there is no standard software development kit for building web apps. However, developers do have access to templates. Compared to mobile apps, web apps are usually quicker and easier to build — but they are much simpler in terms of features.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
```
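The `rank` output is sorted best-first; to inspect it against the input passages, a short follow-up sketch:

```python
# Each entry pairs an index into the passage list with its relevance score
for rank in ranks:
    print(f"{rank['score']:.2f}\t{rank['corpus_id']}")
```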
📚 Documentation
Model Details
Model Description
| Property | Details |
|---|---|
| Model Type | Cross Encoder |
| Base model | prajjwal1/bert-tiny |
| Maximum Sequence Length | 512 tokens |
| Number of Output Labels | 1 label |
| Language | en |
| License | apache-2.0 |
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Evaluation
Metrics
Cross Encoder Reranking
- Datasets: `gooaq-dev`, `NanoMSMARCO`, `NanoNFCorpus` and `NanoNQ`
- Evaluated with `CrossEncoderRerankingEvaluator`

| Metric | gooaq-dev | NanoMSMARCO | NanoNFCorpus | NanoNQ |
|---|---|---|---|---|
| map | 0.5677 (+0.0366) | 0.4280 (-0.0616) | 0.3397 (+0.0787) | 0.4149 (-0.0047) |
| mrr@10 | 0.5558 (+0.0318) | 0.4129 (-0.0646) | 0.5196 (+0.0198) | 0.4132 (-0.0135) |
| ndcg@10 | 0.6157 (+0.0245) | 0.4772 (-0.0632) | 0.3308 (+0.0058) | 0.4859 (-0.0147) |
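To run the same kind of evaluation on your own data, the evaluator can be called directly. A minimal sketch, assuming the query/positive/negative sample format accepted by `CrossEncoderRerankingEvaluator`; the toy sample below is hypothetical:

```python
from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CrossEncoderRerankingEvaluator

model = CrossEncoder("cross-encoder-testing/reranker-bert-tiny-gooaq-bce")

# Hypothetical toy sample; real runs use held-out dev queries such as GooAQ-dev
samples = [
    {
        "query": "are javascript developers in demand?",
        "positive": ["JavaScript is the skill that is most in-demand for IT in 2020."],
        "negative": ["Things in the body tag are the things that should be displayed."],
    },
]

evaluator = CrossEncoderRerankingEvaluator(samples, name="toy-dev", at_k=10)
print(evaluator(model))  # reports map, mrr@10 and ndcg@10 style metrics
```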
Cross Encoder Nano BEIR
- Dataset: `NanoBEIR_R100_mean`
- Evaluated with `CrossEncoderNanoBEIREvaluator`

| Metric | Value |
|---|---|
| map | 0.3942 (+0.0041) |
| mrr@10 | 0.4486 (-0.0194) |
| ndcg@10 | 0.4313 (-0.0241) |
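The NanoBEIR mean above is produced by the corresponding evaluator over the three Nano datasets; a minimal sketch, assuming the lowercase dataset names the evaluator conventionally expects:

```python
from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CrossEncoderNanoBEIREvaluator

model = CrossEncoder("cross-encoder-testing/reranker-bert-tiny-gooaq-bce")

# Reranks prefetched candidates on small BEIR subsets and reports the mean
evaluator = CrossEncoderNanoBEIREvaluator(dataset_names=["msmarco", "nfcorpus", "nq"])
print(evaluator(model))
```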
Training Details
Training Dataset
Unnamed Dataset
- Size: 578,402 training samples
- Columns: `question`, `answer`, and `label`
- Approximate statistics based on the first 1000 samples:

| | question | answer | label |
|---|---|---|---|
| type | string | string | int |
| details | min: 21 characters, mean: 43.81 characters, max: 96 characters | min: 51 characters, mean: 252.46 characters, max: 405 characters | 0: ~82.90%, 1: ~17.10% |

- Samples:

| question | answer | label |
|---|---|---|
| are javascript developers in demand? | JavaScript is the skill that is most in-demand for IT in 2020, according to a report from developer skills tester DevSkiller. The report, “Top IT Skills report 2020: Demand and Hiring Trends,” has JavaScript switching places with Java when compared to last year's report, with Java in third place this year, behind SQL. | 1 |
| are javascript developers in demand? | In one line difference between the two is: JavaScript is the programming language where as AngularJS is a framework based on JavaScript. ... It is also the basic for all java script based technologies like jquery, angular JS, bootstrap JS and so on. Angular JS is a framework written in javascript and uses MVC architecture. | 0 |
| are javascript developers in demand? | Java applications are run in a virtual machine or web browser while JavaScript is run on a web browser. Java code is compiled whereas while JavaScript code is in text and in a web page. JavaScript is an OOP scripting language, whereas Java is an OOP programming language. | 0 |

- Loss: `BinaryCrossEntropyLoss` with these parameters:

```json
{
    "activation_fct": "torch.nn.modules.linear.Identity",
    "pos_weight": 5
}
```
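In code, this loss configuration corresponds roughly to the sketch below; `pos_weight` is passed as a tensor and up-weights the rarer positive class (about 17% of samples, per the statistics above):

```python
import torch
from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.losses import BinaryCrossEntropyLoss

model = CrossEncoder("prajjwal1/bert-tiny", num_labels=1)

# pos_weight=5 counteracts the roughly 83/17 negative/positive imbalance
loss = BinaryCrossEntropyLoss(model, pos_weight=torch.tensor(5.0))
```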
Training Hyperparameters
Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 2048
- `per_device_eval_batch_size`: 2048
- `learning_rate`: 0.0005
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `seed`: 12
- `bf16`: True
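These values map directly onto the trainer's argument object; a sketch using the cross-encoder training API, where `output_dir` is an assumed placeholder:

```python
from sentence_transformers.cross_encoder import CrossEncoderTrainingArguments

args = CrossEncoderTrainingArguments(
    output_dir="models/reranker-bert-tiny-gooaq-bce",  # assumed placeholder
    eval_strategy="steps",
    per_device_train_batch_size=2048,
    per_device_eval_batch_size=2048,
    learning_rate=0.0005,
    num_train_epochs=1,
    warmup_ratio=0.1,
    seed=12,
    bf16=True,
)
```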
All Hyperparameters
Click to expand
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 2048
- `per_device_eval_batch_size`: 2048
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 0.0005
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 12
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
Training Logs
Epoch | Step | Training Loss | gooaq-dev_ndcg@10 | NanoMSMARCO_ndcg@10 | NanoNFCorpus_ndcg@10 | NanoNQ_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
---|---|---|---|---|---|---|---|
-1 | -1 | - | 0.0887 (-0.5025) | 0.0063 (-0.5341) | 0.3262 (+0.0012) | 0.0000 (-0.5006) | 0.1108 (-0.3445) |
0.0035 | 1 | 1.1945 | - | - | - | - | - |
0.0707 | 20 | 1.1664 | 0.4082 (-0.1830) | 0.1805 (-0.3600) | 0.3168 (-0.0083) | 0.2243 (-0.2763) | 0.2405 (-0.2149) |
0.1413 | 40 | 1.1107 | 0.5260 (-0.0652) | 0.3453 (-0.1951) | 0.3335 (+0.0085) | 0.3430 (-0.1576) | 0.3406 (-0.1147) |
0.2120 | 60 | 1.022 | 0.5623 (-0.0289) | 0.3929 (-0.1475) | 0.3512 (+0.0262) | 0.3472 (-0.1535) | 0.3638 (-0.0916) |
0.2827 | 80 | 0.973 | 0.5691 (-0.0221) | 0.4048 (-0.1356) | 0.3530 (+0.0280) | 0.3833 (-0.1174) | 0.3804 (-0.0750) |
0.3534 | 100 | 0.963 | 0.5814 (-0.0098) | 0.4385 (-0.1019) | 0.3471 (+0.0221) | 0.4227 (-0.0779) | 0.4028 (-0.0526) |
0.4240 | 120 | 0.9419 | 0.5963 (+0.0050) | 0.4106 (-0.1298) | 0.3540 (+0.0289) | 0.4843 (-0.0163) | 0.4163 (-0.0391) |
0.4947 | 140 | 0.9331 | 0.5953 (+0.0041) | 0.4310 (-0.1094) | 0.3367 (+0.0117) | 0.4163 (-0.0843) | 0.3947 (-0.0607) |
0.5654 | 160 | 0.9263 | 0.6070 (+0.0158) | 0.4626 (-0.0778) | 0.3443 (+0.0193) | 0.4823 (-0.0184) | 0.4297 (-0.0256) |
0.6360 | 180 | 0.9212 | 0.6069 (+0.0156) | 0.4602 (-0.0802) | 0.3391 (+0.0141) | 0.4782 (-0.0224) | 0.4258 (-0.0295) |
0.7067 | 200 | 0.901 | 0.6126 (+0.0214) | 0.4602 (-0.0803) | 0.3413 (+0.0162) | 0.4780 (-0.0227) | 0.4265 (-0.0289) |
0.7774 | 220 | 0.8997 | 0.6136 (+0.0224) | 0.4801 (-0.0604) | 0.3349 (+0.0098) | 0.4903 (-0.0103) | 0.4351 (-0.0203) |
0.8481 | 240 | 0.9021 | 0.6132 (+0.0220) | 0.4850 (-0.0554) | 0.3438 (+0.0188) | 0.4855 (-0.0151) | 0.4381 (-0.0173) |
0.9187 | 260 | 0.9013 | 0.6188 (+0.0276) | 0.4820 (-0.0584) | 0.3387 (+0.0137) | 0.4851 (-0.0156) | 0.4353 (-0.0201) |
0.9894 | 280 | 0.8996 | 0.6157 (+0.0245) | 0.4772 (-0.0632) | 0.3305 (+0.0054) | 0.4859 (-0.0147) | 0.4312 (-0.0242) |
-1 | -1 | - | 0.6157 (+0.0245) | 0.4772 (-0.0632) | 0.3308 (+0.0058) | 0.4859 (-0.0147) | 0.4313 (-0.0241) |
Environmental Impact
Carbon emissions were measured using CodeCarbon.
- Energy Consumed: 0.019 kWh
- Carbon Emitted: 0.007 kg of CO2
- Hours Used: 0.099 hours
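Figures like these can be collected with CodeCarbon's tracker; a minimal sketch of how such measurement is typically wired up, not the exact instrumentation used for this run:

```python
from codecarbon import EmissionsTracker

tracker = EmissionsTracker()
tracker.start()
# ... training runs here ...
emissions_kg = tracker.stop()  # emitted CO2 in kilograms
print(f"{emissions_kg:.3f} kg CO2")
```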
Training Hardware
- On Cloud: No
- GPU Model: 1 x NVIDIA GeForce RTX 3090
- CPU Model: 13th Gen Intel(R) Core(TM) i7-13700K
- RAM Size: 31.78 GB
Framework Versions
- Python: 3.11.6
- Sentence Transformers: 3.5.0.dev0
- Transformers: 4.48.3
- PyTorch: 2.5.0+cu121
- Accelerate: 1.3.0
- Datasets: 2.20.0
- Tokenizers: 0.21.0
📄 License
This project is licensed under the Apache 2.0 license.
📄 Citation
BibTeX
Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
Featured Recommended AI Models

- Jina Embeddings V3 (jinaai): a multilingual sentence embedding model supporting over 100 languages, specializing in sentence similarity and feature extraction tasks.
- Ms Marco MiniLM L6 V2 (cross-encoder, Apache-2.0): a cross-encoder model trained on the MS MARCO passage ranking task for query-passage relevance scoring in information retrieval.
- Opensearch Neural Sparse Encoding Doc V2 Distill (opensearch-project, Apache-2.0): a distillation-based sparse retrieval model optimized for OpenSearch, supporting inference-free document encoding with improved search relevance and efficiency over V1.
- Sapbert From PubMedBERT Fulltext (cambridgeltl, Apache-2.0): a biomedical entity representation model based on PubMedBERT, optimized to capture semantic relations through self-aligned pre-training.
- Gte Large (thenlper, MIT): a powerful sentence-transformer model focused on sentence similarity and text embedding tasks, performing strongly on multiple benchmarks.
- Gte Base En V1.5 (Alibaba-NLP, Apache-2.0): an English sentence-transformer model focused on sentence similarity tasks, performing strongly on multiple text embedding benchmarks.
- Gte Multilingual Base (Alibaba-NLP, Apache-2.0): a multilingual sentence embedding model supporting over 50 languages, suitable for tasks such as sentence similarity calculation.
- Polybert (kuelumbus): a chemical language model for fully machine-driven, ultrafast polymer informatics; it maps PSMILES strings to 600-dimensional dense fingerprints that numerically represent polymer chemical structures.
- Bert Base Turkish Cased Mean Nli Stsb Tr (emrecan, Apache-2.0): a sentence embedding model based on Turkish BERT, optimized for semantic similarity tasks.
- GIST Small Embedding V0 (avsolatorio, MIT): a text embedding model fine-tuned from BAAI/bge-small-en-v1.5, trained on the MEDI dataset and MTEB classification datasets, optimized for query encoding in retrieval tasks.