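🚀 MiniLM-L12-H384-uncased trained on GooAQ
This is a Cross Encoder model fine-tuned from microsoft/MiniLM-L12-H384-uncased using the sentence-transformers library. It computes a score for a pair of texts, which can be used for text reranking and semantic search.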
🚀 Quick Start
To score text pairs for reranking or semantic search with this model:
- Install the Sentence Transformers library.
- Load the model and run inference.
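✨ Key Features
- Based on the Cross Encoder architecture, which scores the relevance of a text pair.
- Fine-tuned from microsoft/MiniLM-L12-H384-uncased; measured results are listed in the Evaluation section.
- Supports a maximum sequence length of 512 tokens.
- Outputs a single label score, usable for text reranking and semantic search.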
📦 Installation
First, install the Sentence Transformers library:
pip install -U sentence-transformers
💻 Usage Examples
Basic Usage
from sentence_transformers import CrossEncoder
# Download the model from the 🤗 Hub
model = CrossEncoder("tomaarsen/reranker-MiniLM-L12-gooaq-bce")
# Get scores for pairs of texts
pairs = [
['what is the remote desktop connection broker?', 'A remote desktop connection broker is software that allows clients to access various types of server-hosted desktops and applications. ... Load balancing the servers that host the desktops. Managing desktop images. Redirecting multimedia processing to the client.'],
['what is the remote desktop connection broker?', 'Remote Desktop Connection (RDC, also called Remote Desktop, formerly Microsoft Terminal Services Client, mstsc or tsclient) is the client application for RDS. It allows a user to remotely log into a networked computer running the terminal services server.'],
['what is the remote desktop connection broker?', "['Click the Start menu on your PC and search for Remote Desktop Connection.', 'Launch Remote Desktop Connection and click on Show Options.', 'Select the Local Resources tab and click More.', 'Under Drives, check the box for your C: drive or the drives that contain the files you will transfer and click OK.']"],
['what is the remote desktop connection broker?', "['Press the MENU button on your remote.', 'Select Parental Favs & Setup > System Setup > Remote or Remote Control > Program Remote.', 'Choose the device (TV, audio, DVD) you want to program. ... ', 'Follow the on-screen instructions to finish programming your remote.']"],
['what is the remote desktop connection broker?', "['Press MENU on your remote.', 'Select Parental Favs & Setup > System Setup > Remote or Remote Control > Program Remote.', 'Choose the device (TV, audio, DVD) you wish to program. ... ', 'Follow the on-screen prompts to complete the programming.']"],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
'what is the remote desktop connection broker?',
[
'A remote desktop connection broker is software that allows clients to access various types of server-hosted desktops and applications. ... Load balancing the servers that host the desktops. Managing desktop images. Redirecting multimedia processing to the client.',
'Remote Desktop Connection (RDC, also called Remote Desktop, formerly Microsoft Terminal Services Client, mstsc or tsclient) is the client application for RDS. It allows a user to remotely log into a networked computer running the terminal services server.',
"['Click the Start menu on your PC and search for Remote Desktop Connection.', 'Launch Remote Desktop Connection and click on Show Options.', 'Select the Local Resources tab and click More.', 'Under Drives, check the box for your C: drive or the drives that contain the files you will transfer and click OK.']",
"['Press the MENU button on your remote.', 'Select Parental Favs & Setup > System Setup > Remote or Remote Control > Program Remote.', 'Choose the device (TV, audio, DVD) you want to program. ... ', 'Follow the on-screen instructions to finish programming your remote.']",
"['Press MENU on your remote.', 'Select Parental Favs & Setup > System Setup > Remote or Remote Control > Program Remote.', 'Choose the device (TV, audio, DVD) you wish to program. ... ', 'Follow the on-screen prompts to complete the programming.']",
]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
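The list returned by `model.rank` above is ordered by score. As a rough illustration of that ordering step only, the sketch below sorts a list of made-up scores in plain Python. The score values are hypothetical (they are not outputs of this model), and the helper name `rank_by_score` is an assumption introduced for illustration, not part of the library.

```python
def rank_by_score(scores):
    """Return [{'corpus_id': i, 'score': s}, ...] sorted by score, highest first."""
    ranked = [{"corpus_id": i, "score": float(s)} for i, s in enumerate(scores)]
    ranked.sort(key=lambda item: item["score"], reverse=True)
    return ranked


# Hypothetical scores for five candidate passages (not real model output).
example_scores = [0.91, 0.40, 0.05, 0.01, 0.02]
for item in rank_by_score(example_scores):
    print(item["corpus_id"], item["score"])
```

Each `corpus_id` is the index of the passage in the input list, so the ranked result can be mapped back to the original texts.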
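📚 Detailed Documentation
Model Details
Model Description
Property | Details |
---|---|
Model Type | Cross Encoder |
Base Model | microsoft/MiniLM-L12-H384-uncased |
Maximum Sequence Length | 512 tokens |
Number of Output Labels | 1 label |
Language | English |
License | apache-2.0 |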
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Evaluation
Cross Encoder Reranking (gooaq-dev dataset)
Evaluated with CrossEncoderRerankingEvaluator using the following parameters:
{
"at_k": 10,
"always_rerank_positives": false
}
Metric | Value |
---|---|
map | 0.6856 (+0.1545) |
mrr@10 | 0.6830 (+0.1591) |
ndcg@10 | 0.7314 (+0.1402) |
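For readers unfamiliar with the metric names in the table, the sketch below computes average precision (the per-query quantity behind map), MRR@10 and NDCG@10 for one query with binary relevance labels. These are the textbook definitions and may differ in detail from what CrossEncoderRerankingEvaluator computes internally; the function names and the example ranking are assumptions for illustration.

```python
import math


def average_precision(relevance):
    """Average precision for one ranked list of binary relevance labels."""
    hits, total = 0, 0.0
    for position, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / position
    return total / hits if hits else 0.0


def mrr_at_k(relevance, k=10):
    """Reciprocal rank of the first relevant item within the top k, else 0."""
    for position, rel in enumerate(relevance[:k], start=1):
        if rel:
            return 1.0 / position
    return 0.0


def ndcg_at_k(relevance, k=10):
    """NDCG@k: discounted gain of the ranking divided by that of the ideal ranking."""
    dcg = sum(rel / math.log2(pos + 1) for pos, rel in enumerate(relevance[:k], start=1))
    ideal = sorted(relevance, reverse=True)[:k]
    idcg = sum(rel / math.log2(pos + 1) for pos, rel in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical ranking: relevant documents at positions 2 and 4.
ranked_labels = [0, 1, 0, 1]
print(average_precision(ranked_labels), mrr_at_k(ranked_labels), ndcg_at_k(ranked_labels))
```

Averaging each per-query value over all evaluation queries gives dataset-level numbers of the kind reported in the table.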
Cross Encoder Reranking (NanoMSMARCO_R100, NanoNFCorpus_R100 and NanoNQ_R100 datasets)
Evaluated with CrossEncoderRerankingEvaluator using the following parameters:
{
"at_k": 10,
"always_rerank_positives": true
}
Metric | NanoMSMARCO_R100 | NanoNFCorpus_R100 | NanoNQ_R100 |
---|---|---|---|
map | 0.4320 (-0.0576) | 0.3503 (+0.0894) | 0.5234 (+0.1038) |
mrr@10 | 0.4205 (-0.0570) | 0.5706 (+0.0708) | 0.5284 (+0.1018) |
ndcg@10 | 0.5022 (-0.0382) | 0.3846 (+0.0596) | 0.5854 (+0.0847) |
Cross Encoder Nano BEIR (NanoBEIR_R100_mean dataset)
Evaluated with CrossEncoderNanoBEIREvaluator using the following parameters:
{
"dataset_names": [
"msmarco",
"nfcorpus",
"nq"
],
"rerank_k": 100,
"at_k": 10,
"always_rerank_positives": true
}
Metric | Value |
---|---|
map | 0.4353 (+0.0452) |
mrr@10 | 0.5065 (+0.0385) |
ndcg@10 | 0.4907 (+0.0354) |
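The name NanoBEIR_R100_mean suggests an average over the three Nano datasets. The sketch below checks that reading against the numbers reported in this card by taking the unweighted mean of the per-dataset values. Treat the unweighted mean as an assumption: the card does not state how the mean is computed.

```python
# Per-dataset values copied from the reranking table in this card.
per_dataset = {
    "map": {"NanoMSMARCO_R100": 0.4320, "NanoNFCorpus_R100": 0.3503, "NanoNQ_R100": 0.5234},
    "mrr@10": {"NanoMSMARCO_R100": 0.4205, "NanoNFCorpus_R100": 0.5706, "NanoNQ_R100": 0.5284},
    "ndcg@10": {"NanoMSMARCO_R100": 0.5022, "NanoNFCorpus_R100": 0.3846, "NanoNQ_R100": 0.5854},
}

# Mean values copied from the NanoBEIR_R100_mean table in this card.
reported_mean = {"map": 0.4353, "mrr@10": 0.5065, "ndcg@10": 0.4907}

# Unweighted mean over the three datasets (assumed aggregation).
computed_mean = {
    metric: sum(values.values()) / len(values) for metric, values in per_dataset.items()
}

for metric, value in computed_mean.items():
    print(f"{metric}: computed {value:.4f}, reported {reported_mean[metric]:.4f}")
```

The computed and reported values agree to within rounding of the four-decimal inputs, which is consistent with a simple average.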
Training Details
Training Dataset
Unnamed Dataset
- Size: 578,402 training samples
- Columns: question, answer and label
- Approximate statistics based on the first 1000 samples:

| | question | answer | label |
|------|------|------|------|
| Type | string | string | int |
| Details | min: 18 characters; mean: 42.66 characters; max: 73 characters | min: 51 characters; mean: 252.61 characters; max: 368 characters | 0: ~82.90%; 1: ~17.10% |

- Samples:

| question | answer | label |
|------|------|------|
| what is the remote desktop connection broker? | A remote desktop connection broker is software that allows clients to access various types of server-hosted desktops and applications. ... Load balancing the servers that host the desktops. Managing desktop images. Redirecting multimedia processing to the client. | 1 |
| what is the remote desktop connection broker? | Remote Desktop Connection (RDC, also called Remote Desktop, formerly Microsoft Terminal Services Client, mstsc or tsclient) is the client application for RDS. It allows a user to remotely log into a networked computer running the terminal services server. | 0 |
| what is the remote desktop connection broker? | ['Click the Start menu on your PC and search for Remote Desktop Connection.', 'Launch Remote Desktop Connection and click on Show Options.', 'Select the Local Resources tab and click More.', 'Under Drives, check the box for your C: drive or the drives that contain the files you will transfer and click OK.'] | 0 |

- Loss: BinaryCrossEntropyLoss with the following parameters:
{
"activation_fn": "torch.nn.modules.linear.Identity",
"pos_weight": 5
}
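To make the two loss parameters concrete, here is a minimal pure-Python sketch of binary cross-entropy on a raw logit with a positive-class weight. It follows the usual definition of PyTorch's `BCEWithLogitsLoss` and is not the library's own code. It assumes that the Identity `activation_fn` means the loss is applied to the unmodified model output, and that `pos_weight` scales only the positive-label term. The function name `bce_with_logits` is introduced here for illustration.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def bce_with_logits(logit, label, pos_weight=1.0):
    """Binary cross-entropy on a raw logit; the positive term is scaled by pos_weight.

    -log(sigmoid(x)) equals softplus(-x), and -log(1 - sigmoid(x)) equals softplus(x).
    """
    return pos_weight * label * softplus(-logit) + (1.0 - label) * softplus(logit)


# With pos_weight = 5, a misjudged positive costs five times more than
# a misjudged negative at the same distance from the decision boundary.
print(bce_with_logits(0.0, 1.0, pos_weight=5.0))
print(bce_with_logits(0.0, 0.0, pos_weight=5.0))
```

A weight of 5 is close to the negative-to-positive ratio in the label statistics above (about 83% negatives to 17% positives), so it roughly balances the contribution of the two classes. That motivation is an inference from the numbers, not something the card states.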
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- seed: 12
- bf16: True
- dataloader_num_workers: 4
- load_best_model_at_end: True
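The batch size, learning rate and warmup ratio together determine the learning-rate curve. The sketch below works this out in plain Python, assuming a single device, no gradient accumulation, and a linear scheduler with warmup (the card lists `lr_scheduler_type: linear` among all hyperparameters). Rounding the step counts up with `math.ceil` is also an assumption, and `linear_schedule_lr` is a hypothetical helper, not a library function.

```python
import math

TRAIN_SAMPLES = 578_402  # size of the training dataset
BATCH_SIZE = 64          # per_device_train_batch_size, assuming one device
BASE_LR = 2e-05          # learning_rate
WARMUP_RATIO = 0.1       # warmup_ratio

# One epoch, keeping the last partial batch.
total_steps = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_schedule_lr(step):
    """Linear warmup from 0 to BASE_LR, then linear decay back to 0."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    return BASE_LR * max(0.0, (total_steps - step) / (total_steps - warmup_steps))


print(total_steps, warmup_steps)
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, linear_schedule_lr(step))
```

The resulting step count is consistent with the training log, where step 9000 corresponds to epoch 0.9958.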
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 12
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 4
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | gooaq-dev_ndcg@10 | NanoMSMARCO_R100_ndcg@10 | NanoNFCorpus_R100_ndcg@10 | NanoNQ_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
---|---|---|---|---|---|---|---|
-1 | -1 | - | 0.1548 (-0.4365) | 0.0475 (-0.4929) | 0.2762 (-0.0489) | 0.0485 (-0.4521) | 0.1241 (-0.3313) |
0.0001 | 1 | 1.0439 | - | - | - | - | - |
0.0221 | 200 | 1.1645 | - | - | - | - | - |
0.0443 | 400 | 1.0837 | - | - | - | - | - |
0.0664 | 600 | 0.8732 | - | - | - | - | - |
0.0885 | 800 | 0.7901 | - | - | - | - | - |
0.1106 | 1000 | 0.755 | 0.6710 (+0.0798) | 0.5150 (-0.0254) | 0.3164 (-0.0086) | 0.6085 (+0.1079) | 0.4800 (+0.0246) |
0.1328 | 1200 | 0.7095 | - | - | - | - | - |
0.1549 | 1400 | 0.7094 | - | - | - | - | - |
0.1770 | 1600 | 0.6715 | - | - | - | - | - |
0.1992 | 1800 | 0.6583 | - | - | - | - | - |
0.2213 | 2000 | 0.6865 | 0.6994 (+0.1082) | 0.5033 (-0.0372) | 0.3608 (+0.0357) | 0.6058 (+0.1052) | 0.4900 (+0.0346) |
0.2434 | 2200 | 0.6392 | - | - | - | - | - |
0.2655 | 2400 | 0.6403 | - | - | - | - | - |
0.2877 | 2600 | 0.6538 | - | - | - | - | - |
0.3098 | 2800 | 0.6273 | - | - | - | - | - |
0.3319 | 3000 | 0.6091 | 0.7033 (+0.1121) | 0.4779 (-0.0625) | 0.3369 (+0.0119) | 0.5859 (+0.0852) | 0.4669 (+0.0115) |
0.3541 | 3200 | 0.6244 | - | - | - | - | - |
0.3762 | 3400 | 0.6246 | - | - | - | - | - |
0.3983 | 3600 | 0.6222 | - | - | - | - | - |
0.4204 | 3800 | 0.5986 | - | - | - | - | - |
0.4426 | 4000 | 0.622 | 0.7252 (+0.1339) | 0.5538 (+0.0133) | 0.3718 (+0.0468) | 0.5965 (+0.0959) | 0.5074 (+0.0520) |
0.4647 | 4200 | 0.5742 | - | - | - | - | - |
0.4868 | 4400 | 0.6171 | - | - | - | - | - |
0.5090 | 4600 | 0.6023 | - | - | - | - | - |
0.5311 | 4800 | 0.5988 | - | - | - | - | - |
0.5532 | 5000 | 0.5693 | 0.7248 (+0.1336) | 0.5174 (-0.0231) | 0.3631 (+0.0381) | 0.5575 (+0.0569) | 0.4793 (+0.0240) |
0.5753 | 5200 | 0.5783 | - | - | - | - | - |
0.5975 | 5400 | 0.5866 | - | - | - | - | - |
0.6196 | 5600 | 0.543 | - | - | - | - | - |
0.6417 | 5800 | 0.57 | - | - | - | - | - |
0.6639 | 6000 | 0.5662 | 0.7273 (+0.1361) | 0.5148 (-0.0256) | 0.3644 (+0.0393) | 0.5754 (+0.0748) | 0.4849 (+0.0295) |
0.6860 | 6200 | 0.5605 | - | - | - | - | - |
0.7081 | 6400 | 0.5836 | - | - | - | - | - |
0.7303 | 6600 | 0.5703 | - | - | - | - | - |
0.7524 | 6800 | 0.5732 | - | - | - | - | - |
0.7745 | 7000 | 0.5679 | 0.7306 (+0.1394) | 0.5185 (-0.0219) | 0.3767 (+0.0517) | 0.5826 (+0.0820) | 0.4926 (+0.0372) |
0.7966 | 7200 | 0.5454 | - | - | - | - | - |
0.8188 | 7400 | 0.5471 | - | - | - | - | - |
0.8409 | 7600 | 0.5592 | - | - | - | - | - |
0.8630 | 7800 | 0.5545 | - | - | - | - | - |
**0.8852** | **8000** | **0.5477** | **0.7314 (+0.1402)** | **0.5022 (-0.0382)** | **0.3846 (+0.0596)** | **0.5854 (+0.0847)** | **0.4907 (+0.0354)** |
0.9073 | 8200 | 0.5411 | - | - | - | - | - |
0.9294 | 8400 | 0.5299 | - | - | - | - | - |
0.9515 | 8600 | 0.5677 | - | - | - | - | - |
0.9737 | 8800 | 0.5202 | - | - | - | - | - |
0.9958 | 9000 | 0.5211 | 0.7311 (+0.1399) | 0.5090 (-0.0315) | 0.3735 (+0.0484) | 0.5923 (+0.0916) | 0.4916 (+0.0362) |
-1 | -1 | - | 0.7314 (+0.1402) | 0.5022 (-0.0382) | 0.3846 (+0.0596) | 0.5854 (+0.0847) | 0.4907 (+0.0354) |
The bold row denotes the saved checkpoint.
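The final row of the log (epoch -1, step -1) reports the same metrics as step 8000, which is consistent with `load_best_model_at_end: True` restoring that checkpoint. The sketch below shows how such a selection could be made from the logged evaluations. It assumes the selection metric is gooaq-dev ndcg@10; the card does not state which metric was actually used.

```python
# (step, gooaq-dev ndcg@10) pairs copied from the log rows that include evaluation results.
eval_log = [
    (1000, 0.6710),
    (2000, 0.6994),
    (3000, 0.7033),
    (4000, 0.7252),
    (5000, 0.7248),
    (6000, 0.7273),
    (7000, 0.7306),
    (8000, 0.7314),
    (9000, 0.7311),
]

# Pick the evaluation step with the highest score (assumed selection rule).
best_step, best_score = max(eval_log, key=lambda row: row[1])
print(best_step, best_score)
```

Under this assumption the highest gooaq-dev score falls at step 8000, slightly ahead of the last evaluation at step 9000.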
Environmental Impact
Carbon emissions were measured using CodeCarbon:
- Energy Consumed: 0.143 kWh
- Carbon Emitted: 0.056 kg of CO2
- Hours Used: 0.391 hours
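Two derived quantities follow from the three figures above by simple division: the average power draw during training and the implied carbon intensity of the electricity. The sketch below does that arithmetic; the variable names are introduced here, and the derived values are approximations that CodeCarbon itself did not report in this card.

```python
energy_kwh = 0.143    # energy consumed
emissions_kg = 0.056  # carbon emitted, kg of CO2
hours = 0.391         # hours used

# Average power over the run, converted from kW to W.
avg_power_watts = energy_kwh / hours * 1000

# Emissions per unit of energy, in kg of CO2 per kWh.
carbon_intensity = emissions_kg / energy_kwh

print(f"average power: {avg_power_watts:.0f} W")
print(f"carbon intensity: {carbon_intensity:.3f} kg CO2 per kWh")
```

This works out to roughly 366 W on average and about 0.39 kg of CO2 per kWh.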
Training Hardware
- On Cloud: No
- GPU Model: 1 x NVIDIA GeForce RTX 3090
- CPU Model: 13th Gen Intel(R) Core(TM) i7-13700K
- RAM Size: 31.78 GB
Framework Versions
- Python: 3.11.6
- Sentence Transformers: 3.5.0.dev0
- Transformers: 4.49.0
- PyTorch: 2.6.0+cu124
- Accelerate: 1.5.1
- Datasets: 3.3.2
- Tokenizers: 0.21.0
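🔧 Technical Details
This model uses the Cross Encoder architecture and was obtained by fine-tuning microsoft/MiniLM-L12-H384-uncased. Training used BinaryCrossEntropyLoss as the loss function with the hyperparameters listed under Training Hyperparameters. The model takes a text pair as input and outputs a single score indicating the relevance of the pair.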
📄 License
This model is released under the apache-2.0 license.
📚 Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}







