Reason ModernColBERT
A late-interaction model trained on the ReasonIR data. It performs strongly on the BRIGHT benchmark, outperforming several much larger models.
Downloads 798
Release Time : 5/22/2025
Model Overview
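This is a PyLate model fine-tuned from lightonai/GTE-ModernColBERT-v1 on the reasonir-hq dataset. It maps sentences and paragraphs to sequences of 128-dimensional dense vectors, which can be used to compute semantic textual similarity.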
Model Features
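Late interaction
Uses a late-interaction mechanism, which outperforms dense (single-vector) retrieval models on reasoning-intensive retrieval tasks
Strong performance for its size
Outperforms several larger models on the BRIGHT benchmark, including models 45 times its size
Multi-vector representation
Maps text to a sequence of 128-dimensional dense vectors rather than a single vector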
Model Capabilities
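Semantic textual similarity
Information retrieval
Document reranking
Use Cases
Information retrieval
Specialized-domain retrieval
Retrieval in specialized domains such as biology and earth science
Performs well across multiple domains of the BRIGHT benchmark
Technical Q&A retrieval
Content retrieval for technical Q&A platforms such as Stack Overflow
Performs especially well on the Stack Exchange splits
Document processing
Document reranking
Fine-grained reranking of first-stage retrieval results
Produces a more relevant document ordering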
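🚀 Reason-ModernColBERT
Reason-ModernColBERT is a late-interaction model trained on the reasonir-hq dataset. It performs strongly on the BRIGHT benchmark, which is designed to evaluate reasoning-intensive retrieval. Reason-ModernColBERT outperforms all existing models up to 7B parameters (more than 45 times its size) and, on the Stack Exchange splits, even beats ReasonIR-8B (an 8B-parameter model trained on the same data) by more than 2.5 NDCG@10 on average. We attribute these results to late interaction; see the evaluation section for details.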
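✨ Key Features
- High performance: on the BRIGHT benchmark, it outperforms all existing models up to 7B parameters, and on the Stack Exchange splits it even beats ReasonIR-8B by more than 2.5 NDCG@10 on average.
- Late interaction: the late-interaction mechanism improves performance on reasoning-intensive retrieval.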
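Late interaction scores a query against a document token by token instead of comparing two pooled vectors. The sketch below illustrates the MaxSim operator listed as this model's similarity function, using toy 2-dimensional NumPy arrays in place of real embeddings. The function name `maxsim_score` and the example vectors are illustrative assumptions, not part of PyLate.

```python
import numpy as np


def maxsim_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """Late-interaction (MaxSim) score between one query and one document.

    query_emb: (num_query_tokens, dim) array of token embeddings.
    doc_emb:   (num_doc_tokens, dim) array of token embeddings.
    """
    # Token-to-token similarity matrix, shape (num_query_tokens, num_doc_tokens).
    sim = query_emb @ doc_emb.T
    # For each query token keep its best-matching document token, then sum.
    return float(sim.max(axis=1).sum())


# Toy 2-dimensional embeddings (the real model uses 128 dimensions).
query = np.array([[1.0, 0.0], [0.0, 1.0]])
doc_a = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])  # matches both query tokens
doc_b = np.array([[1.0, 0.0], [1.0, 0.0]])              # matches only the first

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
print(score_a, score_b)  # 2.0 1.0
```

Because every query token picks its own best match, a document can score well by covering each part of a query in a different passage.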
📦 Installation
First, install the PyLate library:
pip install -U pylate
💻 Usage Examples
Basic Usage
Indexing Documents
from pylate import indexes, models, retrieve

# Hub ID of the model to load (assumed to be this model's Hugging Face ID; adjust if yours differs)
pylate_model_id = "lightonai/Reason-ModernColBERT"

# Step 1: Load the ColBERT model
model = models.ColBERT(
    model_name_or_path=pylate_model_id,
)

# Step 2: Initialize the Voyager index
index = indexes.Voyager(
    index_folder="pylate-index",
    index_name="index",
    override=True,  # This overwrites the existing index, if any
)

# Step 3: Encode the documents
documents_ids = ["1", "2", "3"]
documents = ["document 1 text", "document 2 text", "document 3 text"]
documents_embeddings = model.encode(
    documents,
    batch_size=32,
    is_query=False,  # Set to False to indicate that these are documents, not queries
    show_progress_bar=True,
)

# Step 4: Add document embeddings to the index by providing embeddings and corresponding ids
index.add_documents(
    documents_ids=documents_ids,
    documents_embeddings=documents_embeddings,
)
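Note that you do not have to recreate the index and re-encode the documents every time. Once you have created an index and added the documents, you can reuse it by loading it:
# To load an index, simply instantiate it with the correct folder/name and without overriding it
index = indexes.Voyager(
    index_folder="pylate-index",
    index_name="index",
)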
Retrieving the Top-k Documents
# Step 1: Initialize the ColBERT retriever
retriever = retrieve.ColBERT(index=index)

# Step 2: Encode the queries
queries_embeddings = model.encode(
    ["query for document 3", "query for document 1"],
    batch_size=32,
    is_query=True,  # Set to True to indicate that these are queries
    show_progress_bar=True,
)

# Step 3: Retrieve top-k documents
scores = retriever.retrieve(
    queries_embeddings=queries_embeddings,
    k=10,  # Retrieve the top 10 matches for each query
)
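To turn retrieval results back into document texts, the ids have to be mapped to the original documents. The sketch below assumes the retriever returns one list per query, each entry a dict with `id` and `score` keys. That shape is an assumption to check against your PyLate version. `mock_results` is hypothetical data and `top_texts` is an illustrative helper, not a PyLate function.

```python
# Hypothetical results in the assumed shape: one list of hits per query.
mock_results = [
    [{"id": "3", "score": 11.3}, {"id": "1", "score": 10.5}],
    [{"id": "1", "score": 9.1}, {"id": "2", "score": 8.6}],
]

documents_ids = ["1", "2", "3"]
documents = ["document 1 text", "document 2 text", "document 3 text"]
id_to_text = dict(zip(documents_ids, documents))


def top_texts(query_results, id_to_text, k=1):
    """Return the texts of the k highest-scoring hits for one query."""
    ranked = sorted(query_results, key=lambda hit: hit["score"], reverse=True)
    return [id_to_text[hit["id"]] for hit in ranked[:k]]


best_per_query = [top_texts(hits, id_to_text) for hits in mock_results]
print(best_per_query)  # [['document 3 text'], ['document 1 text']]
```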
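Advanced Usage
Reranking
If you only want to use the ColBERT model to rerank the output of your first-stage retrieval pipeline, without building an index, you can call the rank function and pass it the queries and the documents to rerank:
from pylate import rank, models

# Hub ID of the model to load (assumed to be this model's Hugging Face ID; adjust if yours differs)
pylate_model_id = "lightonai/Reason-ModernColBERT"

queries = [
    "query A",
    "query B",
]

# One list of candidate documents per query
documents = [
    ["document A", "document B"],
    ["document 1", "document C", "document B"],
]

# Ids of the candidates, in the same nested layout as `documents`
documents_ids = [
    [1, 2],
    [1, 3, 2],
]

model = models.ColBERT(
    model_name_or_path=pylate_model_id,
)

queries_embeddings = model.encode(
    queries,
    is_query=True,
)

documents_embeddings = model.encode(
    documents,
    is_query=False,
)

reranked_documents = rank.rerank(
    documents_ids=documents_ids,
    queries_embeddings=queries_embeddings,
    documents_embeddings=documents_embeddings,
)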
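Conceptually, reranking scores each candidate against its query with MaxSim and sorts the candidates by that score. The NumPy sketch below shows the idea on toy 2-dimensional embeddings. It is not PyLate's implementation, and `maxsim` and `rerank_one` are hypothetical helper names.

```python
import numpy as np


def maxsim(query_emb, doc_emb):
    """Sum over query tokens of the best dot product with any document token."""
    return float((query_emb @ doc_emb.T).max(axis=1).sum())


def rerank_one(query_emb, candidate_ids, candidate_embs):
    """Sort one query's candidates by descending MaxSim score."""
    scored = [
        {"id": doc_id, "score": maxsim(query_emb, emb)}
        for doc_id, emb in zip(candidate_ids, candidate_embs)
    ]
    return sorted(scored, key=lambda hit: hit["score"], reverse=True)


query_emb = np.array([[1.0, 0.0], [0.0, 1.0]])
candidate_ids = [1, 2]
candidate_embs = [
    np.array([[1.0, 0.0]]),              # covers one query token
    np.array([[1.0, 0.0], [0.0, 1.0]]),  # covers both query tokens
]

reranked = rerank_one(query_emb, candidate_ids, candidate_embs)
print([hit["id"] for hit in reranked])  # [2, 1]
```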
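📚 Documentation
Model Details
Model Description
Property | Details |
---|---|
Model Type | PyLate model |
Base Model | lightonai/GTE-ModernColBERT-v1 |
Document Length | 8192 tokens |
Query Length | 128 tokens |
Output Dimensionality | 128 dimensions |
Similarity Function | MaxSim |
Training Dataset | reasonir-hq |
Language | English |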
Model Sources
- Documentation: PyLate documentation
- Repository: PyLate on GitHub
- Hugging Face: PyLate models on Hugging Face
Full Model Architecture
ColBERT(
  (0): Transformer({'max_seq_length': 127, 'do_lower_case': False}) with Transformer model: ModernBertModel
  (1): Dense({'in_features': 768, 'out_features': 128, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
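The architecture above is an encoder followed by a bias-free linear projection from 768 to 128 features, applied to every token. The NumPy sketch below mimics only the shapes involved, using random stand-ins for the encoder output and the projection weight. The L2 normalization step is an assumption about how the embeddings are post-processed, not something the printed architecture states.

```python
import numpy as np

rng = np.random.default_rng(0)
num_tokens, hidden_size, out_dim = 5, 768, 128

# Random stand-ins for the encoder's per-token hidden states and the Dense weight.
hidden_states = rng.normal(size=(num_tokens, hidden_size))
projection = rng.normal(size=(hidden_size, out_dim))  # bias=False: weight only

# Identity activation: the layer is a plain linear map applied to every token.
token_embeddings = hidden_states @ projection

# Assumption: embeddings are L2-normalized so dot products act as cosine similarities.
token_embeddings /= np.linalg.norm(token_embeddings, axis=1, keepdims=True)

print(token_embeddings.shape)  # (5, 128)
```

Each text therefore becomes a matrix with one 128-dimensional row per token rather than a single vector.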
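Training Details
Training Dataset
reasonir-hq
- Dataset: train at 0275f82
- Size: 100,521 training samples
- Columns: query, pos, and neg
- Approximate statistics based on the first 1000 samples:
| | Query | Positive | Negative |
| ---- | ---- | ---- | ---- |
| Type | string | string | string |
| Details | min: 38 tokens, mean: 97.84 tokens, max: 128 tokens | min: 85 tokens, mean: 127.63 tokens, max: 128 tokens | min: 81 tokens, mean: 127.77 tokens, max: 128 tokens |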
- Samples (each sample has a query, a positive, and a negative; long texts are truncated with "..." in the source):

Sample 1

Query:
Given this reasoning-intensive query, find relevant documents that could help answer the question. A researcher is analyzing a sound signal represented by the equation f(t) = 2sin(3πt) + sin(5πt) + 0.5sin(7πt). Using the Fourier transform, what are the frequencies, amplitudes, and phases of the individual sinusoidal components in the signal?

Positive:
A sound signal is given by the equation f(t) = sin(2πt) + sin(4πt) + sin(6πt) where t is time in seconds. Use Fourier transform to find the frequencies, amplitudes, and phases of the individual sinusoidal components in the signal.
To find the frequencies, amplitudes, and phases of the individual sinusoidal components in the signal f(t) = sin(2πt) + sin(4πt) + sin(6πt), we can use the Fourier transform. The Fourier transform of a continuous function f(t) is given by:
F(ω) = ∫[f(t) * e^(-jωt)] dt
where F(ω) is the Fourier transform of f(t), ω is the angular frequency, and j is the imaginary unit (j^2 = -1). In this case, f(t) is already given as a sum of sinusoidal functions, so we can directly identify the frequencies, amplitudes, and phases of the individual components.
1. First component: sin(2πt)
- Frequency: The angular frequency is 2π, so the frequency is ω/(2π) = 1 Hz.
- Amplitude: The coefficient of the sine function is 1, so the amplitude is 1.
- Phase: There is no phase shi...

Negative:
The Fourier transform is widely used in various fields, including engineering, physics, and data analysis. It is a powerful tool for decomposing a signal into its constituent frequencies. In music, for example, the Fourier transform can be used to analyze the frequency components of a sound wave. By applying the Fourier transform to a sound signal, one can identify the different frequencies present in the signal, as well as their relative amplitudes. This information can be useful in a variety of applications, such as sound filtering and audio processing. The Fourier transform can also be used to analyze images and other types of data. In image processing, the Fourier transform can be used to filter out noise and other unwanted features from an image. It can also be used to compress images by representing them in the frequency domain. In addition to its many practical applications, the Fourier transform also has a number of interesting theoretical properties. For example, it has been ...

Sample 2

Query:
Given this reasoning-intensive query, find relevant documents that could help answer the question. A manufacturer is designing a cone-shaped container with a fixed volume of 200π cubic centimeters. The container's height is 12 centimeters, and the radius of the base is unknown. If the manufacturer wants to minimize the surface area of the container while maintaining its volume, what should be the radius of the base?

Positive:
A right circular cone has a radius of 6cm and a slant height of 10cm. Determine the surface area of the cone.
To find the surface area of a right circular cone, we need to calculate the area of the base and the lateral surface area, and then add them together.
The base of the cone is a circle with radius r = 6 cm. The area of the base (A_base) can be found using the formula for the area of a circle:
A_base = πr^2
A_base = π(6 cm)^2
A_base = 36π cm^2
The lateral surface area (A_lateral) can be found using the formula for the lateral surface area of a cone:
A_lateral = πrs, where r is the radius and s is the slant height.
Given that the slant height s = 10 cm, we can calculate the lateral surface area:
A_lateral = π(6 cm)(10 cm)
A_lateral = 60π cm^2
Now, we can find the total surface area (A_total) by adding the base area and the lateral surface area:
A_total = A_base + A_lateral
A_total = 36π cm^2 + 60π cm^2
A_total = 96π cm^2
The surface area of the cone is 96π cm^2.

Negative:
Torus-Shaped Containers in Chemical Engineering - New Designs and Applications
Torus-shaped containers are commonly used in chemical engineering for storing and transporting fluids. These containers have a distinctive doughnut shape, with a central hole and a circular cross-section. In this article, we will explore the design and applications of torus-shaped containers in chemical engineering. One of the main advantages of torus-shaped containers is their high volume-to-surface-area ratio. This makes them ideal for storing large quantities of fluids while minimizing the amount of material needed for construction. Additionally, the curved shape of the container provides added strength and stability, making it less prone to rupture or leakage. The design of torus-shaped containers typically involves the use of computer-aided design (CAD) software to create detailed models of the container's geometry. Engineers can then use these models to simulate various scenarios, such as fluid flow and ...

Sample 3

Query:
Given this reasoning-intensive query, find relevant documents that could help answer the question. On the xy-coordinate plane, points A and B are given as A(2, 4) and B(8, -3). Determine the coordinates of the point on line segment AB that is three times as far from A as it is from B.

Positive:
On the xy co-ordinate plane, point C is (5,-2) and point D is (-1,1.5). The point on line segment CD that is twice as far from C as from D is:
Answer Choices: (A) (1,-1) (B) (1,1) (C) (2,0.25) (D) (3,0.5) (E) (3,1)
Let's think about the multi-choice question step by step.
We want the point on the line that is twice as far from C as it is from D. We can examine the x and y coordinates separately since they are independent.
*It should be noted that there are two solutions to this problem, one point between C and D, and another point with D in the middle of C and the point. We can quickly look at the answer choices and see that all the points are between C and D, therefore we can search for that point using the following method:
Taking the x-coordinate first, the distance between C and D is |(x-coordinate ofC - (x-coordinate ofD|= |5 - (-1)| = 6
The x-coordinate that is twice as far from C as it is from D (and in between C andD will be 4 units from C and 2 units from D. So the ...

Negative:
The concept of midpoint is often useful in various mathematical problems, but sometimes we need to find other points that divide a line segment in a particular ratio. One common scenario is when we need to find the point that divides the line segment in the ratio of the other two points. Let's consider an example to understand this better. Suppose we have two points E(3, 4) and F(7, -2) on the xy-coordinate plane, and we want to find the point G on the line segment EF such that EG:GF = 2:5. To solve this problem, we can use the concept of section formula, which states that if a point P(x, y) divides the line segment joining the points A(x1, y1) and B(x2, y2) in the ratio m:n, then the coordinates of P are ((mx2+nx1)/(m+n), (my2+ny1)/(m+n)). Using this formula, we can find the coordinates of point G. First, we need to find the difference in x-coordinates and y-coordinates of points E and F. The difference in x-coordinates is 7 - 3 = 4, and the difference in y-coordinates is -2 - 4 = -6...
- Loss function: pylate.losses.cached_contrastive.CachedContrastive
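As its name suggests, the loss is a contrastive objective: the positive document should score higher than the negatives for the same query. The NumPy sketch below shows only that generic objective, a cross-entropy over relevance scores with the positive in column 0. It is not PyLate's implementation and omits the caching that gives the loss its name. `contrastive_loss` and the example scores are illustrative assumptions.

```python
import numpy as np


def contrastive_loss(scores: np.ndarray) -> float:
    """Mean cross-entropy where column 0 of each row is the positive's score.

    scores: (batch, 1 + num_negatives) relevance scores, e.g. MaxSim values.
    """
    # Numerically stable log-softmax over each row.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[:, 0].mean())


good = np.array([[10.0, 1.0, 0.5]])  # positive clearly outranks the negatives
bad = np.array([[1.0, 10.0, 0.5]])   # a negative outranks the positive

loss_good = contrastive_loss(good)
loss_bad = contrastive_loss(bad)
print(loss_good < loss_bad)  # True
```

Minimizing this quantity pushes the positive's score above the negatives', which is the behavior the falling training loss in the logs reflects.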
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 256
- per_device_eval_batch_size: 256
- learning_rate: 1e-05
- bf16: True
- dataloader_num_workers: 8
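The batch size and dataset size determine how many optimizer steps make up one epoch, which helps when reading the training logs. The sketch below is a back-of-the-envelope calculation that assumes the effective global batch equals the per-device batch of 256 (a single device with no gradient accumulation) and that the last partial batch is kept. The variable names are illustrative.

```python
import math

num_samples = 100_521  # training set size reported for reasonir-hq
batch_size = 256       # per_device_train_batch_size
num_epochs = 3         # num_train_epochs from the full hyperparameter list

# Assumption: one device, gradient_accumulation_steps = 1, partial last batch kept.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
epoch_per_step = 1 / steps_per_epoch

print(steps_per_epoch, total_steps, round(epoch_per_step, 4))  # 393 1179 0.0025
```

Under these assumptions one epoch is 393 steps, which agrees with the training logs, where step 393 corresponds to epoch 1.0.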
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 256
- per_device_eval_batch_size: 256
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 1e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 8
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss |
---|---|---|
0.0025 | 1 | 4.9684 |
0.0051 | 2 | 4.6956 |
0.0076 | 3 | 4.5076 |
0.0102 | 4 | 4.3723 |
0.0127 | 5 | 4.3305 |
0.0153 | 6 | 4.0355 |
0.0178 | 7 | 3.7886 |
0.0204 | 8 | 3.6133 |
0.0229 | 9 | 3.2395 |
0.0254 | 10 | 3.1481 |
0.0280 | 11 | 2.7444 |
0.0305 | 12 | 2.4946 |
0.0331 | 13 | 2.333 |
0.0356 | 14 | 2.2471 |
0.0382 | 15 | 1.9117 |
0.0407 | 16 | 1.6753 |
0.0433 | 17 | 1.2413 |
0.0458 | 18 | 1.1201 |
0.0483 | 19 | 1.0335 |
0.0509 | 20 | 1.0583 |
0.0534 | 21 | 1.067 |
0.0560 | 22 | 0.7056 |
0.0585 | 23 | 0.761 |
0.0611 | 24 | 0.5501 |
0.0636 | 25 | 0.6486 |
0.0662 | 26 | 0.4639 |
0.0687 | 27 | 0.3885 |
0.0712 | 28 | 0.4982 |
0.0738 | 29 | 0.4784 |
0.0763 | 30 | 0.5189 |
0.0789 | 31 | 0.4824 |
0.0814 | 32 | 0.4183 |
0.0840 | 33 | 0.4945 |
0.0865 | 34 | 0.2579 |
0.0891 | 35 | 0.3312 |
0.0916 | 36 | 0.4035 |
0.0941 | 37 | 0.305 |
0.0967 | 38 | 0.2898 |
0.0992 | 39 | 0.2899 |
0.1018 | 40 | 0.2713 |
0.1043 | 41 | 0.3017 |
0.1069 | 42 | 0.2395 |
0.1094 | 43 | 0.1548 |
0.1120 | 44 | 0.2468 |
0.1145 | 45 | 0.1876 |
0.1170 | 46 | 0.2322 |
0.1196 | 47 | 0.2823 |
0.1221 | 48 | 0.2158 |
0.1247 | 49 | 0.2679 |
0.1272 | 50 | 0.273 |
0.1298 | 51 | 0.2876 |
0.1323 | 52 | 0.197 |
0.1349 | 53 | 0.1282 |
0.1374 | 54 | 0.3355 |
0.1399 | 55 | 0.1941 |
0.1425 | 56 | 0.1873 |
0.1450 | 57 | 0.2288 |
0.1476 | 58 | 0.2802 |
0.1501 | 59 | 0.2087 |
0.1527 | 60 | 0.2239 |
0.1552 | 61 | 0.225 |
0.1578 | 62 | 0.1582 |
0.1603 | 63 | 0.1972 |
0.1628 | 64 | 0.1632 |
0.1654 | 65 | 0.2101 |
0.1679 | 66 | 0.2084 |
0.1705 | 67 | 0.1499 |
0.1730 | 68 | 0.1467 |
0.1756 | 69 | 0.1428 |
0.1781 | 70 | 0.2298 |
0.1807 | 71 | 0.1883 |
0.1832 | 72 | 0.22 |
0.1858 | 73 | 0.1988 |
0.1883 | 74 | 0.2091 |
0.1908 | 75 | 0.1948 |
0.1934 | 76 | 0.1348 |
0.1959 | 77 | 0.112 |
0.1985 | 78 | 0.1474 |
0.2010 | 79 | 0.1949 |
0.2036 | 80 | 0.1664 |
0.2061 | 81 | 0.1807 |
0.2087 | 82 | 0.1403 |
0.2112 | 83 | 0.1225 |
0.2137 | 84 | 0.1919 |
0.2163 | 85 | 0.1403 |
0.2188 | 86 | 0.1402 |
0.2214 | 87 | 0.0981 |
0.2239 | 88 | 0.1214 |
0.2265 | 89 | 0.1755 |
0.2290 | 90 | 0.1509 |
0.2316 | 91 | 0.1551 |
0.2341 | 92 | 0.176 |
0.2366 | 93 | 0.1648 |
0.2392 | 94 | 0.1622 |
0.2417 | 95 | 0.1372 |
0.2443 | 96 | 0.1016 |
0.2468 | 97 | 0.1134 |
0.2494 | 98 | 0.1436 |
0.2519 | 99 | 0.1478 |
0.2545 | 100 | 0.2065 |
0.2570 | 101 | 0.1901 |
0.2595 | 102 | 0.1859 |
0.2621 | 103 | 0.212 |
0.2646 | 104 | 0.2179 |
0.2672 | 105 | 0.2471 |
0.2697 | 106 | 0.1769 |
0.2723 | 107 | 0.1593 |
0.2748 | 108 | 0.204 |
0.2774 | 109 | 0.1496 |
0.2799 | 110 | 0.1212 |
0.2824 | 111 | 0.1282 |
0.2850 | 112 | 0.1126 |
0.2875 | 113 | 0.1254 |
0.2901 | 114 | 0.1422 |
0.2926 | 115 | 0.1266 |
0.2952 | 116 | 0.1305 |
0.2977 | 117 | 0.1283 |
0.3003 | 118 | 0.0737 |
0.3028 | 119 | 0.1237 |
0.3053 | 120 | 0.1185 |
0.3079 | 121 | 0.0891 |
0.3104 | 122 | 0.2312 |
0.3130 | 123 | 0.2384 |
0.3155 | 124 | 0.155 |
0.3181 | 125 | 0.1118 |
0.3206 | 126 | 0.1575 |
0.3232 | 127 | 0.2115 |
0.3257 | 128 | 0.098 |
0.3282 | 129 | 0.1811 |
0.3308 | 130 | 0.1704 |
0.3333 | 131 | 0.1494 |
0.3359 | 132 | 0.1531 |
0.3384 | 133 | 0.1032 |
0.3410 | 134 | 0.1137 |
0.3435 | 135 | 0.1271 |
0.3461 | 136 | 0.1591 |
0.3486 | 137 | 0.1586 |
0.3511 | 138 | 0.1292 |
0.3537 | 139 | 0.1115 |
0.3562 | 140 | 0.1337 |
0.3588 | 141 | 0.1298 |
0.3613 | 142 | 0.1649 |
0.3639 | 143 | 0.0855 |
0.3664 | 144 | 0.1124 |
0.3690 | 145 | 0.0764 |
0.3715 | 146 | 0.1402 |
0.3740 | 147 | 0.137 |
0.3766 | 148 | 0.0736 |
0.3791 | 149 | 0.0772 |
0.3817 | 150 | 0.1689 |
0.3842 | 151 | 0.1371 |
0.3868 | 152 | 0.1195 |
0.3893 | 153 | 0.1536 |
0.3919 | 154 | 0.1421 |
0.3944 | 155 | 0.1222 |
0.3969 | 156 | 0.1121 |
0.3995 | 157 | 0.0892 |
0.4020 | 158 | 0.1516 |
0.4046 | 159 | 0.1071 |
0.4071 | 160 | 0.1593 |
0.4097 | 161 | 0.1078 |
0.4122 | 162 | 0.1112 |
0.4148 | 163 | 0.2101 |
0.4173 | 164 | 0.2096 |
0.4198 | 165 | 0.1337 |
0.4224 | 166 | 0.1501 |
0.4249 | 167 | 0.0989 |
0.4275 | 168 | 0.0992 |
0.4300 | 169 | 0.0926 |
0.4326 | 170 | 0.0692 |
0.4351 | 171 | 0.1235 |
0.4377 | 172 | 0.1029 |
0.4402 | 173 | 0.1351 |
0.4427 | 174 | 0.0899 |
0.4453 | 175 | 0.0844 |
0.4478 | 176 | 0.1167 |
0.4504 | 177 | 0.1355 |
0.4529 | 178 | 0.092 |
0.4555 | 179 | 0.1005 |
0.4580 | 180 | 0.0891 |
0.4606 | 181 | 0.1396 |
0.4631 | 182 | 0.1024 |
0.4656 | 183 | 0.1325 |
0.4682 | 184 | 0.1061 |
0.4707 | 185 | 0.1657 |
0.4733 | 186 | 0.1141 |
0.4758 | 187 | 0.149 |
0.4784 | 188 | 0.1125 |
0.4809 | 189 | 0.1524 |
0.4835 | 190 | 0.1129 |
0.4860 | 191 | 0.1089 |
0.4885 | 192 | 0.1333 |
0.4911 | 193 | 0.1377 |
0.4936 | 194 | 0.0547 |
0.4962 | 195 | 0.1057 |
0.4987 | 196 | 0.1321 |
0.5013 | 197 | 0.0979 |
0.5038 | 198 | 0.1706 |
0.5064 | 199 | 0.1559 |
0.5089 | 200 | 0.1111 |
0.5115 | 201 | 0.1258 |
0.5140 | 202 | 0.0816 |
0.5165 | 203 | 0.1362 |
0.5191 | 204 | 0.1604 |
0.5216 | 205 | 0.1104 |
0.5242 | 206 | 0.1494 |
0.5267 | 207 | 0.1402 |
0.5293 | 208 | 0.1282 |
0.5318 | 209 | 0.1543 |
0.5344 | 210 | 0.1576 |
0.5369 | 211 | 0.2071 |
0.5394 | 212 | 0.1248 |
0.5420 | 213 | 0.1237 |
0.5445 | 214 | 0.0592 |
0.5471 | 215 | 0.1769 |
0.5496 | 216 | 0.1118 |
0.5522 | 217 | 0.1608 |
0.5547 | 218 | 0.1192 |
0.5573 | 219 | 0.0551 |
0.5598 | 220 | 0.1401 |
0.5623 | 221 | 0.2046 |
0.5649 | 222 | 0.1273 |
0.5674 | 223 | 0.1319 |
0.5700 | 224 | 0.1518 |
0.5725 | 225 | 0.0929 |
0.5751 | 226 | 0.1262 |
0.5776 | 227 | 0.1566 |
0.5802 | 228 | 0.1128 |
0.5827 | 229 | 0.1467 |
0.5852 | 230 | 0.1513 |
0.5878 | 231 | 0.1989 |
0.5903 | 232 | 0.0594 |
0.5929 | 233 | 0.0838 |
0.5954 | 234 | 0.0711 |
0.5980 | 235 | 0.0854 |
0.6005 | 236 | 0.1775 |
0.6031 | 237 | 0.118 |
0.6056 | 238 | 0.1297 |
0.6081 | 239 | 0.1092 |
0.6107 | 240 | 0.1469 |
0.6132 | 241 | 0.1203 |
0.6158 | 242 | 0.0901 |
0.6183 | 243 | 0.1179 |
0.6209 | 244 | 0.0864 |
0.6234 | 245 | 0.1277 |
0.6260 | 246 | 0.1313 |
0.6285 | 247 | 0.089 |
0.6310 | 248 | 0.0727 |
0.6336 | 249 | 0.0556 |
0.6361 | 250 | 0.0782 |
0.6387 | 251 | 0.0869 |
0.6412 | 252 | 0.0988 |
0.6438 | 253 | 0.0818 |
0.6463 | 254 | 0.1013 |
0.6489 | 255 | 0.096 |
0.6514 | 256 | 0.0622 |
0.6539 | 257 | 0.1561 |
0.6565 | 258 | 0.1282 |
0.6590 | 259 | 0.1087 |
0.6616 | 260 | 0.1312 |
0.6641 | 261 | 0.1343 |
0.6667 | 262 | 0.0955 |
0.6692 | 263 | 0.0844 |
0.6718 | 264 | 0.1209 |
0.6743 | 265 | 0.0858 |
0.6768 | 266 | 0.0714 |
0.6794 | 267 | 0.1431 |
0.6819 | 268 | 0.0632 |
0.6845 | 269 | 0.115 |
0.6870 | 270 | 0.1115 |
0.6896 | 271 | 0.1239 |
0.6921 | 272 | 0.1206 |
0.6947 | 273 | 0.1894 |
0.6972 | 274 | 0.0755 |
0.6997 | 275 | 0.0709 |
0.7023 | 276 | 0.1304 |
0.7048 | 277 | 0.1476 |
0.7074 | 278 | 0.1497 |
0.7099 | 279 | 0.113 |
0.7125 | 280 | 0.1676 |
0.7150 | 281 | 0.0999 |
0.7176 | 282 | 0.2044 |
0.7201 | 283 | 0.1125 |
0.7226 | 284 | 0.0956 |
0.7252 | 285 | 0.0956 |
0.7277 | 286 | 0.0771 |
0.7303 | 287 | 0.0712 |
0.7328 | 288 | 0.0525 |
0.7354 | 289 | 0.0689 |
0.7379 | 290 | 0.0964 |
0.7405 | 291 | 0.1068 |
0.7430 | 292 | 0.0536 |
0.7455 | 293 | 0.0861 |
0.7481 | 294 | 0.0813 |
0.7506 | 295 | 0.0885 |
0.7532 | 296 | 0.1083 |
0.7557 | 297 | 0.1124 |
0.7583 | 298 | 0.1095 |
0.7608 | 299 | 0.08 |
0.7634 | 300 | 0.1081 |
0.7659 | 301 | 0.0719 |
0.7684 | 302 | 0.0933 |
0.7710 | 303 | 0.1143 |
0.7735 | 304 | 0.065 |
0.7761 | 305 | 0.1276 |
0.7786 | 306 | 0.102 |
0.7812 | 307 | 0.186 |
0.7837 | 308 | 0.0778 |
0.7863 | 309 | 0.1419 |
0.7888 | 310 | 0.0895 |
0.7913 | 311 | 0.1154 |
0.7939 | 312 | 0.1037 |
0.7964 | 313 | 0.0711 |
0.7990 | 314 | 0.1559 |
0.8015 | 315 | 0.0755 |
0.8041 | 316 | 0.0799 |
0.8066 | 317 | 0.1137 |
0.8092 | 318 | 0.0837 |
0.8117 | 319 | 0.1052 |
0.8142 | 320 | 0.0846 |
0.8168 | 321 | 0.0715 |
0.8193 | 322 | 0.0923 |
0.8219 | 323 | 0.1397 |
0.8244 | 324 | 0.0899 |
0.8270 | 325 | 0.1414 |
0.8295 | 326 | 0.0422 |
0.8321 | 327 | 0.0748 |
0.8346 | 328 | 0.0739 |
0.8372 | 329 | 0.0855 |
0.8397 | 330 | 0.071 |
0.8422 | 331 | 0.0557 |
0.8448 | 332 | 0.1055 |
0.8473 | 333 | 0.096 |
0.8499 | 334 | 0.1083 |
0.8524 | 335 | 0.133 |
0.8550 | 336 | 0.1308 |
0.8575 | 337 | 0.0661 |
0.8601 | 338 | 0.0974 |
0.8626 | 339 | 0.1027 |
0.8651 | 340 | 0.1068 |
0.8677 | 341 | 0.1653 |
0.8702 | 342 | 0.097 |
0.8728 | 343 | 0.0845 |
0.8753 | 344 | 0.0546 |
0.8779 | 345 | 0.1273 |
0.8804 | 346 | 0.0982 |
0.8830 | 347 | 0.0893 |
0.8855 | 348 | 0.1222 |
0.8880 | 349 | 0.1072 |
0.8906 | 350 | 0.1254 |
0.8931 | 351 | 0.0679 |
0.8957 | 352 | 0.0995 |
0.8982 | 353 | 0.0878 |
0.9008 | 354 | 0.0564 |
0.9033 | 355 | 0.113 |
0.9059 | 356 | 0.0567 |
0.9084 | 357 | 0.0968 |
0.9109 | 358 | 0.1023 |
0.9135 | 359 | 0.1106 |
0.9160 | 360 | 0.091 |
0.9186 | 361 | 0.0988 |
0.9211 | 362 | 0.1374 |
0.9237 | 363 | 0.0855 |
0.9262 | 364 | 0.0824 |
0.9288 | 365 | 0.058 |
0.9313 | 366 | 0.0776 |
0.9338 | 367 | 0.1195 |
0.9364 | 368 | 0.0506 |
0.9389 | 369 | 0.0893 |
0.9415 | 370 | 0.1145 |
0.9440 | 371 | 0.0695 |
0.9466 | 372 | 0.0805 |
0.9491 | 373 | 0.0824 |
0.9517 | 374 | 0.0841 |
0.9542 | 375 | 0.0919 |
0.9567 | 376 | 0.064 |
0.9593 | 377 | 0.2194 |
0.9618 | 378 | 0.1165 |
0.9644 | 379 | 0.0888 |
0.9669 | 380 | 0.0826 |
0.9695 | 381 | 0.0687 |
0.9720 | 382 | 0.0933 |
0.9746 | 383 | 0.1337 |
0.9771 | 384 | 0.0738 |
0.9796 | 385 | 0.0749 |
0.9822 | 386 | 0.0742 |
0.9847 | 387 | 0.1111 |
0.9873 | 388 | 0.093 |
0.9898 | 389 | 0.0877 |
0.9924 | 390 | 0.0637 |
0.9949 | 391 | 0.0897 |
0.9975 | 392 | 0.0818 |
1.0 | 393 | 0.0362 |
1.0025 | 394 | 0.0561 |
1.0051 | 395 | 0.0847 |
1.0076 | 396 | 0.0752 |
1.0102 | 397 | 0.0951 |
1.0127 | 398 | 0.1069 |
1.0153 | 399 | 0.0553 |
1.0178 | 400 | 0.0929 |
1.0204 | 401 | 0.0876 |
1.0229 | 402 | 0.0381 |
1.0254 | 403 | 0.1074 |
1.0280 | 404 | 0.0763 |
1.0305 | 405 | 0.0881 |
1.0331 | 406 | 0.0481 |
1.0356 | 407 | 0.1398 |
1.0382 | 408 | 0.09 |
1.0407 | 409 | 0.1045 |
1.0433 | 410 | 0.088 |
1.0458 | 411 | 0.0751 |
1.0483 | 412 | 0.0781 |
1.0509 | 413 | 0.0844 |
1.0534 | 414 | 0.0949 |
1.0560 | 415 | 0.0467 |
1.0585 | 416 | 0.1159 |
1.0611 | 417 | 0.0511 |
1.0636 | 418 | 0.0659 |
1.0662 | 419 | 0.043 |
1.0687 | 420 | 0.0468 |
1.0712 | 421 | 0.068 |
1.0738 | 422 | 0.1022 |
1.0763 | 423 | 0.1096 |
1.0789 | 424 | 0.1113 |
1.0814 | 425 | 0.1219 |
1.0840 | 426 | 0.0852 |
1.0865 | 427 | 0.0413 |
1.0891 | 428 | 0.0797 |
1.0916 | 429 | 0.1048 |
1.0941 | 430 | 0.0494 |
1.0967 | 431 | 0.079 |
1.0992 | 432 | 0.0698 |
1.1018 | 433 | 0.0908 |
1.1043 | 434 | 0.0993 |
1.1069 | 435 | 0.0397 |
1.1094 | 436 | 0.0312 |
1.1120 | 437 | 0.089 |
1.1145 | 438 | 0.0318 |
1.1170 | 439 | 0.0356 |
1.1196 | 440 | 0.0588 |
1.1221 | 441 | 0.0311 |
1.1247 | 442 | 0.0578 |
1.1272 | 443 | 0.1313 |
1.1298 | 444 | 0.0897 |
1.1323 | 445 | 0.0798 |
1.1349 | 446 | 0.0326 |
1.1374 | 447 | 0.143 |
1.1399 | 448 | 0.0661 |
1.1425 | 449 | 0.0433 |
1.1450 | 450 | 0.0782 |
1.1476 | 451 | 0.08 |
1.1501 | 452 | 0.0505 |
1.1527 | 453 | 0.0542 |
1.1552 | 454 | 0.0755 |
1.1578 | 455 | 0.0315 |
1.1603 | 456 | 0.0667 |
1.1628 | 457 | 0.0329 |
1.1654 | 458 | 0.0791 |
1.1679 | 459 | 0.0698 |
1.1705 | 460 | 0.0194 |
1.1730 | 461 | 0.0501 |
1.1756 | 462 | 0.0449 |
1.1781 | 463 | 0.0903 |
1.1807 | 464 | 0.0503 |
1.1832 | 465 | 0.0664 |
1.1858 | 466 | 0.0457 |
1.1883 | 467 | 0.0568 |
1.1908 | 468 | 0.064 |
1.1934 | 469 | 0.0253 |
1.1959 | 470 | 0.046 |
1.1985 | 471 | 0.0279 |
1.2010 | 472 | 0.0733 |
1.2036 | 473 | 0.0463 |
1.2061 | 474 | 0.07 |
1.2087 | 475 | 0.0281 |
1.2112 | 476 | 0.0373 |
1.2137 | 477 | 0.0738 |
1.2163 | 478 | 0.0412 |
1.2188 | 479 | 0.0545 |
1.2214 | 480 | 0.0247 |
1.223 |