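🚀 MiniLM-L12-H384-uncased trained on GooAQ
This is a Cross Encoder model finetuned from microsoft/MiniLM-L12-H384-uncased using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.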
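🚀 Quick Start
The model scores pairs of texts and is suited to text reranking and semantic search. The basic steps are:
- Install the Sentence Transformers library.
- Load the model and run inference.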
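✨ Key Features
- Built on the Cross Encoder architecture, which scores the relevance of a text pair.
- Finetuned from microsoft/MiniLM-L12-H384-uncased, reaching an NDCG@10 of 0.7314 on gooaq-dev.
- Supports a maximum sequence length of 512 tokens.
- Outputs a single label score, usable for text reranking and semantic search.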
📦 Installation
First, install the Sentence Transformers library:
pip install -U sentence-transformers
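To confirm that the install step worked, one option is to query the package metadata from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of Sentence Transformers.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("sentence-transformers")
    print(found if found else "sentence-transformers is not installed")
```

If this prints a version number, the library is visible to the interpreter that ran the script.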
💻 Usage Examples
Basic Usage
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("tomaarsen/reranker-MiniLM-L12-gooaq-bce")
# Get scores for the pairs of texts
pairs = [
['what is the remote desktop connection broker?', 'A remote desktop connection broker is software that allows clients to access various types of server-hosted desktops and applications. ... Load balancing the servers that host the desktops. Managing desktop images. Redirecting multimedia processing to the client.'],
['what is the remote desktop connection broker?', 'Remote Desktop Connection (RDC, also called Remote Desktop, formerly Microsoft Terminal Services Client, mstsc or tsclient) is the client application for RDS. It allows a user to remotely log into a networked computer running the terminal services server.'],
['what is the remote desktop connection broker?', "['Click the Start menu on your PC and search for Remote Desktop Connection.', 'Launch Remote Desktop Connection and click on Show Options.', 'Select the Local Resources tab and click More.', 'Under Drives, check the box for your C: drive or the drives that contain the files you will transfer and click OK.']"],
['what is the remote desktop connection broker?', "['Press the MENU button on your remote.', 'Select Parental Favs & Setup > System Setup > Remote or Remote Control > Program Remote.', 'Choose the device (TV, audio, DVD) you want to program. ... ', 'Follow the on-screen instructions to finish programming your remote.']"],
['what is the remote desktop connection broker?', "['Press MENU on your remote.', 'Select Parental Favs & Setup > System Setup > Remote or Remote Control > Program Remote.', 'Choose the device (TV, audio, DVD) you wish to program. ... ', 'Follow the on-screen prompts to complete the programming.']"],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
'what is the remote desktop connection broker?',
[
'A remote desktop connection broker is software that allows clients to access various types of server-hosted desktops and applications. ... Load balancing the servers that host the desktops. Managing desktop images. Redirecting multimedia processing to the client.',
'Remote Desktop Connection (RDC, also called Remote Desktop, formerly Microsoft Terminal Services Client, mstsc or tsclient) is the client application for RDS. It allows a user to remotely log into a networked computer running the terminal services server.',
"['Click the Start menu on your PC and search for Remote Desktop Connection.', 'Launch Remote Desktop Connection and click on Show Options.', 'Select the Local Resources tab and click More.', 'Under Drives, check the box for your C: drive or the drives that contain the files you will transfer and click OK.']",
"['Press the MENU button on your remote.', 'Select Parental Favs & Setup > System Setup > Remote or Remote Control > Program Remote.', 'Choose the device (TV, audio, DVD) you want to program. ... ', 'Follow the on-screen instructions to finish programming your remote.']",
"['Press MENU on your remote.', 'Select Parental Favs & Setup > System Setup > Remote or Remote Control > Program Remote.', 'Choose the device (TV, audio, DVD) you wish to program. ... ', 'Follow the on-screen prompts to complete the programming.']",
]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
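The `rank` call returns dictionaries with `corpus_id` and `score` keys, as the closing comment shows. A minimal sketch of mapping such results back to passage text follows. The helper name `top_passages`, the shortened passages, and the scores are made-up placeholders rather than real model output, and the sketch sorts defensively instead of relying on the returned order.

```python
from typing import Dict, List, Tuple, Union

RankResult = Dict[str, Union[int, float]]


def top_passages(
    ranks: List[RankResult], passages: List[str], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k highest-scoring passages as (text, score) tuples."""
    ordered = sorted(ranks, key=lambda r: r["score"], reverse=True)
    return [(passages[int(r["corpus_id"])], float(r["score"])) for r in ordered[:k]]


# Placeholder inputs: shortened passages and invented scores.
passages = ["broker definition", "RDC client description", "file transfer steps"]
placeholder_ranks = [
    {"corpus_id": 2, "score": 0.01},
    {"corpus_id": 0, "score": 0.93},
    {"corpus_id": 1, "score": 0.12},
]

best = top_passages(placeholder_ranks, passages, k=2)
print(best)
```

Here `corpus_id` is treated as an index into the list of passages that was passed to `rank`.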
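📚 Documentation
Model Details
Model Description
Property | Details |
---|---|
Model Type | Cross Encoder |
Base Model | microsoft/MiniLM-L12-H384-uncased |
Maximum Sequence Length | 512 tokens |
Number of Output Labels | 1 label |
Language | English |
License | apache-2.0 |
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face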
Evaluation
Cross Encoder Reranking (gooaq-dev dataset)
Evaluated with CrossEncoderRerankingEvaluator using these parameters:
{
"at_k": 10,
"always_rerank_positives": false
}
Metric | Value |
---|---|
map | 0.6856 (+0.1545) |
mrr@10 | 0.6830 (+0.1591) |
ndcg@10 | 0.7314 (+0.1402) |
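For readers unfamiliar with these metrics, here is a minimal sketch of how MRR@10 and NDCG@10 can be computed for a single query from binary relevance labels listed in ranked order. It uses the standard textbook definitions; the evaluator's own implementation may differ in details, and the reported values are averages over many queries. The function names and the example list are illustrative only.

```python
import math
from typing import Sequence


def mrr_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """Reciprocal rank of the first relevant item within the top k, else 0."""
    for position, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            return 1.0 / position
    return 0.0


def dcg_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """Discounted cumulative gain over the top k positions."""
    return sum(
        rel / math.log2(position + 1)
        for position, rel in enumerate(relevance[:k], start=1)
    )


def ndcg_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """DCG divided by the DCG of the ideal (best possible) ordering."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    return dcg_at_k(relevance, k) / ideal if ideal > 0 else 0.0


# Illustrative ranking: the only relevant passage sits at position 2.
example = [0, 1, 0, 0, 0]
print(mrr_at_k(example))   # 0.5
print(ndcg_at_k(example))  # 1 / log2(3)
```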
Cross Encoder Reranking (NanoMSMARCO_R100, NanoNFCorpus_R100 and NanoNQ_R100 datasets)
Evaluated with CrossEncoderRerankingEvaluator using these parameters:
{
"at_k": 10,
"always_rerank_positives": true
}
Metric | NanoMSMARCO_R100 | NanoNFCorpus_R100 | NanoNQ_R100 |
---|---|---|---|
map | 0.4320 (-0.0576) | 0.3503 (+0.0894) | 0.5234 (+0.1038) |
mrr@10 | 0.4205 (-0.0570) | 0.5706 (+0.0708) | 0.5284 (+0.1018) |
ndcg@10 | 0.5022 (-0.0382) | 0.3846 (+0.0596) | 0.5854 (+0.0847) |
Cross Encoder Nano BEIR (NanoBEIR_R100_mean dataset)
Evaluated with CrossEncoderNanoBEIREvaluator using these parameters:
{
"dataset_names": [
"msmarco",
"nfcorpus",
"nq"
],
"rerank_k": 100,
"at_k": 10,
"always_rerank_positives": true
}
Metric | Value |
---|---|
map | 0.4353 (+0.0452) |
mrr@10 | 0.5065 (+0.0385) |
ndcg@10 | 0.4907 (+0.0354) |
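The NanoBEIR_R100_mean values appear to be the arithmetic mean of the three per-dataset scores reported for NanoMSMARCO_R100, NanoNFCorpus_R100 and NanoNQ_R100. The sketch below recomputes that mean from the rounded table values; it is a consistency check under that assumption, not the evaluator's code, so small differences in the last digit are expected.

```python
from statistics import mean

# Per-dataset scores in the order NanoMSMARCO_R100, NanoNFCorpus_R100, NanoNQ_R100.
per_dataset = {
    "map": [0.4320, 0.3503, 0.5234],
    "mrr@10": [0.4205, 0.5706, 0.5284],
    "ndcg@10": [0.5022, 0.3846, 0.5854],
}
reported_mean = {"map": 0.4353, "mrr@10": 0.5065, "ndcg@10": 0.4907}

for name, values in per_dataset.items():
    recomputed = mean(values)
    print(f"{name}: recomputed {recomputed:.4f}, reported {reported_mean[name]:.4f}")
```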
Training Details
Training Dataset
Unnamed Dataset
- Size: 578,402 training samples
- Columns: question, answer, and label
- Approximate statistics based on the first 1000 samples:

| | question | answer | label |
|---|---|---|---|
| type | string | string | int |
| details | min: 18 characters, mean: 42.66 characters, max: 73 characters | min: 51 characters, mean: 252.61 characters, max: 368 characters | 0: ~82.90%, 1: ~17.10% |

- Samples:

| question | answer | label |
|---|---|---|
| what is the remote desktop connection broker? | A remote desktop connection broker is software that allows clients to access various types of server-hosted desktops and applications. ... Load balancing the servers that host the desktops. Managing desktop images. Redirecting multimedia processing to the client. | 1 |
| what is the remote desktop connection broker? | Remote Desktop Connection (RDC, also called Remote Desktop, formerly Microsoft Terminal Services Client, mstsc or tsclient) is the client application for RDS. It allows a user to remotely log into a networked computer running the terminal services server. | 0 |
| what is the remote desktop connection broker? | ['Click the Start menu on your PC and search for Remote Desktop Connection.', 'Launch Remote Desktop Connection and click on Show Options.', 'Select the Local Resources tab and click More.', 'Under Drives, check the box for your C: drive or the drives that contain the files you will transfer and click OK.'] | 0 |

- Loss: BinaryCrossEntropyLoss with these parameters:
{
"activation_fn": "torch.nn.modules.linear.Identity",
"pos_weight": 5
}
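To illustrate what `pos_weight` does, the sketch below implements binary cross-entropy on a raw logit with a weight on the positive term, following the usual definition used by PyTorch's `BCEWithLogitsLoss`. It is a plain-Python illustration, not the library's code, and it is not numerically hardened for extreme logits. The label ratio is taken from the sample statistics reported for this dataset (about 82.90% negatives and 17.10% positives); that this ratio motivated the value 5 is an assumption, since the source does not say so.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def weighted_bce_with_logits(logit: float, label: int, pos_weight: float = 5.0) -> float:
    """Binary cross-entropy on a raw logit, with positives scaled by pos_weight."""
    p = sigmoid(logit)
    return -(pos_weight * label * math.log(p) + (1 - label) * math.log(1.0 - p))


# Negatives per positive in the first 1000 samples, from the reported percentages.
negative_to_positive_ratio = 0.8290 / 0.1710
print(round(negative_to_positive_ratio, 2))

# At logit 0 the model is undecided (p = 0.5): a missed positive costs 5x a missed negative.
print(weighted_bce_with_logits(0.0, 1))
print(weighted_bce_with_logits(0.0, 0))
```

With the Identity activation listed above, the loss receives the raw logit, which is why the sigmoid is applied inside the loss function in this sketch.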
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- seed: 12
- bf16: True
- dataloader_num_workers: 4
- load_best_model_at_end: True
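These settings imply a concrete learning-rate curve. The sketch below derives the step counts from the dataset size (578,402 samples) and batch size, then evaluates a linear warmup followed by linear decay, which is how the `linear` scheduler type with a warmup ratio is commonly implemented. It assumes a single GPU, no gradient accumulation, and that warmup steps are rounded up; the function name `linear_lr` is illustrative.

```python
import math

NUM_SAMPLES = 578_402   # training set size from the model card
BATCH_SIZE = 64         # per_device_train_batch_size, single GPU assumed
PEAK_LR = 2e-5          # learning_rate
WARMUP_RATIO = 0.1      # warmup_ratio

# One epoch without gradient accumulation; the last partial batch is kept.
total_steps = math.ceil(NUM_SAMPLES / BATCH_SIZE)
# Assumption: warmup steps are the warmup ratio times total steps, rounded up.
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup to PEAK_LR, then linear decay to 0."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / max(1, total_steps - warmup_steps)


print(total_steps, warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, linear_lr(step))
```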
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 12
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 4
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | gooaq-dev_ndcg@10 | NanoMSMARCO_R100_ndcg@10 | NanoNFCorpus_R100_ndcg@10 | NanoNQ_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
---|---|---|---|---|---|---|---|
-1 | -1 | - | 0.1548 (-0.4365) | 0.0475 (-0.4929) | 0.2762 (-0.0489) | 0.0485 (-0.4521) | 0.1241 (-0.3313) |
0.0001 | 1 | 1.0439 | - | - | - | - | - |
0.0221 | 200 | 1.1645 | - | - | - | - | - |
0.0443 | 400 | 1.0837 | - | - | - | - | - |
0.0664 | 600 | 0.8732 | - | - | - | - | - |
0.0885 | 800 | 0.7901 | - | - | - | - | - |
0.1106 | 1000 | 0.755 | 0.6710 (+0.0798) | 0.5150 (-0.0254) | 0.3164 (-0.0086) | 0.6085 (+0.1079) | 0.4800 (+0.0246) |
0.1328 | 1200 | 0.7095 | - | - | - | - | - |
0.1549 | 1400 | 0.7094 | - | - | - | - | - |
0.1770 | 1600 | 0.6715 | - | - | - | - | - |
0.1992 | 1800 | 0.6583 | - | - | - | - | - |
0.2213 | 2000 | 0.6865 | 0.6994 (+0.1082) | 0.5033 (-0.0372) | 0.3608 (+0.0357) | 0.6058 (+0.1052) | 0.4900 (+0.0346) |
0.2434 | 2200 | 0.6392 | - | - | - | - | - |
0.2655 | 2400 | 0.6403 | - | - | - | - | - |
0.2877 | 2600 | 0.6538 | - | - | - | - | - |
0.3098 | 2800 | 0.6273 | - | - | - | - | - |
0.3319 | 3000 | 0.6091 | 0.7033 (+0.1121) | 0.4779 (-0.0625) | 0.3369 (+0.0119) | 0.5859 (+0.0852) | 0.4669 (+0.0115) |
0.3541 | 3200 | 0.6244 | - | - | - | - | - |
0.3762 | 3400 | 0.6246 | - | - | - | - | - |
0.3983 | 3600 | 0.6222 | - | - | - | - | - |
0.4204 | 3800 | 0.5986 | - | - | - | - | - |
0.4426 | 4000 | 0.622 | 0.7252 (+0.1339) | 0.5538 (+0.0133) | 0.3718 (+0.0468) | 0.5965 (+0.0959) | 0.5074 (+0.0520) |
0.4647 | 4200 | 0.5742 | - | - | - | - | - |
0.4868 | 4400 | 0.6171 | - | - | - | - | - |
0.5090 | 4600 | 0.6023 | - | - | - | - | - |
0.5311 | 4800 | 0.5988 | - | - | - | - | - |
0.5532 | 5000 | 0.5693 | 0.7248 (+0.1336) | 0.5174 (-0.0231) | 0.3631 (+0.0381) | 0.5575 (+0.0569) | 0.4793 (+0.0240) |
0.5753 | 5200 | 0.5783 | - | - | - | - | - |
0.5975 | 5400 | 0.5866 | - | - | - | - | - |
0.6196 | 5600 | 0.543 | - | - | - | - | - |
0.6417 | 5800 | 0.57 | - | - | - | - | - |
0.6639 | 6000 | 0.5662 | 0.7273 (+0.1361) | 0.5148 (-0.0256) | 0.3644 (+0.0393) | 0.5754 (+0.0748) | 0.4849 (+0.0295) |
0.6860 | 6200 | 0.5605 | - | - | - | - | - |
0.7081 | 6400 | 0.5836 | - | - | - | - | - |
0.7303 | 6600 | 0.5703 | - | - | - | - | - |
0.7524 | 6800 | 0.5732 | - | - | - | - | - |
0.7745 | 7000 | 0.5679 | 0.7306 (+0.1394) | 0.5185 (-0.0219) | 0.3767 (+0.0517) | 0.5826 (+0.0820) | 0.4926 (+0.0372) |
0.7966 | 7200 | 0.5454 | - | - | - | - | - |
0.8188 | 7400 | 0.5471 | - | - | - | - | - |
0.8409 | 7600 | 0.5592 | - | - | - | - | - |
0.8630 | 7800 | 0.5545 | - | - | - | - | - |
**0.8852** | **8000** | **0.5477** | **0.7314 (+0.1402)** | **0.5022 (-0.0382)** | **0.3846 (+0.0596)** | **0.5854 (+0.0847)** | **0.4907 (+0.0354)** |
0.9073 | 8200 | 0.5411 | - | - | - | - | - |
0.9294 | 8400 | 0.5299 | - | - | - | - | - |
0.9515 | 8600 | 0.5677 | - | - | - | - | - |
0.9737 | 8800 | 0.5202 | - | - | - | - | - |
0.9958 | 9000 | 0.5211 | 0.7311 (+0.1399) | 0.5090 (-0.0315) | 0.3735 (+0.0484) | 0.5923 (+0.0916) | 0.4916 (+0.0362) |
-1 | -1 | - | 0.7314 (+0.1402) | 0.5022 (-0.0382) | 0.3846 (+0.0596) | 0.5854 (+0.0847) | 0.4907 (+0.0354) |
The bold row denotes the saved checkpoint.
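The final row of the log repeats the metrics from step 8000, which is consistent with `load_best_model_at_end: True` picking the checkpoint with the highest gooaq-dev NDCG@10. The sketch below reproduces that selection from the logged values. Which metric the trainer actually used for selection is an assumption here, since the source does not state it.

```python
# gooaq-dev_ndcg@10 at each evaluation step, copied from the training log.
gooaq_dev_ndcg_at_10 = {
    1000: 0.6710,
    2000: 0.6994,
    3000: 0.7033,
    4000: 0.7252,
    5000: 0.7248,
    6000: 0.7273,
    7000: 0.7306,
    8000: 0.7314,
    9000: 0.7311,
}

# Pick the evaluation step with the highest score.
best_step = max(gooaq_dev_ndcg_at_10, key=gooaq_dev_ndcg_at_10.get)
print(best_step, gooaq_dev_ndcg_at_10[best_step])
```

Note that selecting on NanoBEIR_R100_mean_ndcg@10 instead would favour step 4000 (0.5074), so the choice of metric matters.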
Environmental Impact
Carbon emissions were measured using CodeCarbon:
- Energy Consumed: 0.143 kWh
- Carbon Emitted: 0.056 kg of CO2
- Hours Used: 0.391 hours
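Two derived quantities follow directly from these figures: the average power draw during the run and the carbon intensity implied by the measurement. The short calculation below is plain arithmetic on the reported numbers; the variable names are illustrative, and the results inherit the rounding of the inputs.

```python
energy_kwh = 0.143   # Energy Consumed
co2_kg = 0.056       # Carbon Emitted
hours = 0.391        # Hours Used

# Average power over the run, in watts.
average_power_w = energy_kwh / hours * 1000
# Implied carbon intensity of the electricity, in kg CO2 per kWh.
carbon_intensity = co2_kg / energy_kwh

print(f"average power: {average_power_w:.0f} W")
print(f"implied carbon intensity: {carbon_intensity:.3f} kg CO2/kWh")
```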
Training Hardware
- On Cloud: No
- GPU Model: 1 x NVIDIA GeForce RTX 3090
- CPU Model: 13th Gen Intel(R) Core(TM) i7-13700K
- RAM Size: 31.78 GB
Framework Versions
- Python: 3.11.6
- Sentence Transformers: 3.5.0.dev0
- Transformers: 4.49.0
- PyTorch: 2.6.0+cu124
- Accelerate: 1.5.1
- Datasets: 3.3.2
- Tokenizers: 0.21.0
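🔧 Technical Details
The model follows the Cross Encoder architecture and was obtained by finetuning microsoft/MiniLM-L12-H384-uncased. Training used BinaryCrossEntropyLoss with a pos_weight of 5, together with the values listed in the training hyperparameters section. The input is a pair of texts, and the output is a single score indicating the relevance of the pair.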
📄 License
This model is released under the apache-2.0 license.
📚 Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}







