Reason ModernColBERT
A late-interaction model trained on ReasonIR data that performs strongly on the BRIGHT benchmark, outperforming several much larger models.
Model Overview
This is a PyLate model fine-tuned from lightonai/GTE-ModernColBERT-v1 on the reasonir-hq dataset. It maps sentences and paragraphs to sequences of 128-dimensional dense vectors and can be used for semantic textual similarity.
Model Highlights
Late interaction
Uses a late-interaction mechanism, which outperforms dense retrieval models on reasoning-intensive retrieval tasks.
Strong performance
Outperforms several larger models on the BRIGHT benchmark, including models 45 times its size.
Multi-vector representation
Maps text to a sequence of 128-dimensional dense vectors rather than a single vector.
Model Capabilities
Semantic textual similarity
Information retrieval
Document reranking
Use Cases
Information retrieval
Domain-specific retrieval
Efficient retrieval in specialized domains such as biology and earth science.
Performs strongly across multiple domains of the BRIGHT benchmark.
Technical Q&A retrieval
Content retrieval for technical Q&A platforms such as Stack Overflow.
Stands out on the Stack Exchange splits of the benchmark.
Document processing
Document reranking
Fine-grained reranking of first-stage retrieval results.
Produces a more relevant document ordering.
🚀 Reason-ModernColBERT
Reason-ModernColBERT is a late-interaction model trained on the reasonir-hq dataset. It achieves outstanding performance on BRIGHT, a benchmark designed to evaluate reasoning-intensive retrieval. Reason-ModernColBERT outperforms all existing models with up to 7B parameters (more than 45 times its own size), and even beats ReasonIR-8B (an 8B-parameter model trained on the same data) by more than 2.5 NDCG@10 on average on the Stack Exchange splits. We attribute these strong results to late interaction; see the evaluation section for details.
✨ Key Features
- High performance: On the BRIGHT benchmark, it outperforms all existing models with up to 7B parameters, and even beats ReasonIR-8B by more than 2.5 NDCG@10 on average on the Stack Exchange splits.
- Late interaction: The late-interaction mechanism drives the gains on reasoning-intensive retrieval; see the scoring sketch below.
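To make the late-interaction idea concrete, here is a minimal, illustrative sketch of MaxSim scoring between one query and one document. The tensors are random placeholders (not outputs of this model), and the function follows the standard ColBERT formulation rather than PyLate's internal code:

import torch

def maxsim_score(query_embeddings: torch.Tensor, doc_embeddings: torch.Tensor) -> torch.Tensor:
    # query_embeddings: (num_query_tokens, dim), doc_embeddings: (num_doc_tokens, dim)
    # Pairwise similarity between every query token and every document token.
    similarity = query_embeddings @ doc_embeddings.T
    # Each query token keeps only its best-matching document token; the score is the sum.
    return similarity.max(dim=1).values.sum()

# Toy example with random 128-dimensional token vectors.
query_tokens = torch.randn(8, 128)
doc_tokens = torch.randn(100, 128)
print(maxsim_score(query_tokens, doc_tokens))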
📦 Installation
First, install the PyLate library:
pip install -U pylate
💻 Usage Examples
Basic Usage
Indexing documents
from pylate import indexes, models, retrieve

# Step 1: Load the ColBERT model
# pylate_model_id should point to this model's Hugging Face repository
# (assumed here to be "lightonai/Reason-ModernColBERT")
pylate_model_id = "lightonai/Reason-ModernColBERT"
model = models.ColBERT(
    model_name_or_path=pylate_model_id,
)

# Step 2: Initialize the Voyager index
index = indexes.Voyager(
    index_folder="pylate-index",
    index_name="index",
    override=True,  # This overwrites the existing index if any
)

# Step 3: Encode the documents
documents_ids = ["1", "2", "3"]
documents = ["document 1 text", "document 2 text", "document 3 text"]
documents_embeddings = model.encode(
    documents,
    batch_size=32,
    is_query=False,  # Ensure that it is set to False to indicate that these are documents, not queries
    show_progress_bar=True,
)

# Step 4: Add document embeddings to the index by providing embeddings and corresponding ids
index.add_documents(
    documents_ids=documents_ids,
    documents_embeddings=documents_embeddings,
)
Note that you do not have to recreate the index and re-encode the documents every time. Once an index has been created and documents added, you can reuse it later by simply loading it:
# To load an index, simply instantiate it with the correct folder/name and without overriding it
index = indexes.Voyager(
    index_folder="pylate-index",
    index_name="index",
)
Retrieving top-k documents
# Step 1: Initialize the ColBERT retriever
retriever = retrieve.ColBERT(index=index)

# Step 2: Encode the queries
queries_embeddings = model.encode(
    ["query for document 3", "query for document 1"],
    batch_size=32,
    is_query=True,  # Ensure that it is set to True to indicate that these are queries
    show_progress_bar=True,
)

# Step 3: Retrieve top-k documents
scores = retriever.retrieve(
    queries_embeddings=queries_embeddings,
    k=10,  # Retrieve the top 10 matches for each query
)
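For each query, retriever.retrieve returns its ranked matches. The snippet below is a hypothetical way to inspect them; the assumption that each match is a dict exposing "id" and "score" keys follows PyLate's documented examples rather than anything stated in this card:

# Hypothetical inspection of the results (assumes each match is a dict with "id" and "score")
for query_index, query_matches in enumerate(scores):
    print(f"Query {query_index}:")
    for match in query_matches:
        print(f"  document {match['id']} -> MaxSim score {match['score']:.2f}")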
Advanced Usage
Reranking
If you only want to use the ColBERT model to rerank documents on top of a first-stage retrieval pipeline, without building an index, you can simply use the rank function and pass in the queries and the documents to rerank:
from pylate import rank, models

queries = [
    "query A",
    "query B",
]

documents = [
    ["document A", "document B"],
    ["document 1", "document C", "document B"],
]

documents_ids = [
    [1, 2],
    [1, 3, 2],
]

# pylate_model_id is this model's Hugging Face repository
# (assumed here to be "lightonai/Reason-ModernColBERT")
pylate_model_id = "lightonai/Reason-ModernColBERT"
model = models.ColBERT(
    model_name_or_path=pylate_model_id,
)

queries_embeddings = model.encode(
    queries,
    is_query=True,
)

documents_embeddings = model.encode(
    documents,
    is_query=False,
)

reranked_documents = rank.rerank(
    documents_ids=documents_ids,
    queries_embeddings=queries_embeddings,
    documents_embeddings=documents_embeddings,
)
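As a usage note, the reranker produces one list per query, covering only that query's candidate documents, reordered by their MaxSim score against the query. The inspection below is hypothetical: the "id"/"score" key names follow PyLate's documented examples and are an assumption, not something stated in this card:

# Hypothetical inspection of the reranked output (assumes "id" and "score" keys)
for query, ranked in zip(queries, reranked_documents):
    best = ranked[0]
    print(f"{query!r}: best candidate id {best['id']} (score {best['score']:.2f})")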
📚 Documentation
Model Details
Model Description
Attribute | Details |
---|---|
Model type | PyLate model |
Base model | lightonai/GTE-ModernColBERT-v1 |
Document length | 8192 tokens |
Query length | 128 tokens |
Output dimensionality | 128 dimensions |
Similarity function | MaxSim |
Training dataset | reasonir-hq |
Language | English |
Model Sources
- Documentation: PyLate Documentation
- Repository: PyLate on GitHub
- Hugging Face: PyLate models on Hugging Face
Full Model Architecture
ColBERT(
(0): Transformer({'max_seq_length': 127, 'do_lower_case': False}) with Transformer model: ModernBertModel
(1): Dense({'in_features': 768, 'out_features': 128, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
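The Dense head projects ModernBERT's 768-dimensional token states down to 128 dimensions, so each input text becomes a matrix of per-token 128-dimensional vectors. The following minimal sketch checks that shape; it assumes the PyLate API shown above and that the model's repository id is lightonai/Reason-ModernColBERT:

from pylate import models

# Repository id is an assumption based on this card's naming
model = models.ColBERT(model_name_or_path="lightonai/Reason-ModernColBERT")
embeddings = model.encode(["a single short document"], is_query=False)
# One entry per input text; each entry has shape (num_tokens, 128)
print(embeddings[0].shape)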
Training Details
Training Dataset
reasonir-hq
- Dataset: train at 0275f82
- Size: 100,521 training samples
- Columns: query, pos, and neg
- Approximate statistics based on the first 1000 samples:

| | query | pos | neg |
| ---- | ---- | ---- | ---- |
| Type | string | string | string |
| Details | min: 38 tokens, mean: 97.84 tokens, max: 128 tokens | min: 85 tokens, mean: 127.63 tokens, max: 128 tokens | min: 81 tokens, mean: 127.77 tokens, max: 128 tokens |
- Samples:

Sample 1
Query:
Given this reasoning-intensive query, find relevant documents that could help answer the question. A researcher is analyzing a sound signal represented by the equation f(t) = 2sin(3πt) + sin(5πt) + 0.5sin(7πt). Using the Fourier transform, what are the frequencies, amplitudes, and phases of the individual sinusoidal components in the signal?
Positive (pos):
A sound signal is given by the equation f(t) = sin(2πt) + sin(4πt) + sin(6πt) where t is time in seconds. Use Fourier transform to find the frequencies, amplitudes, and phases of the individual sinusoidal components in the signal.
To find the frequencies, amplitudes, and phases of the individual sinusoidal components in the signal f(t) = sin(2πt) + sin(4πt) + sin(6πt), we can use the Fourier transform. The Fourier transform of a continuous function f(t) is given by:
F(ω) = ∫[f(t) * e^(-jωt)] dt
where F(ω) is the Fourier transform of f(t), ω is the angular frequency, and j is the imaginary unit (j^2 = -1). In this case, f(t) is already given as a sum of sinusoidal functions, so we can directly identify the frequencies, amplitudes, and phases of the individual components.
1. First component: sin(2πt)
- Frequency: The angular frequency is 2π, so the frequency is ω/(2π) = 1 Hz.
- Amplitude: The coefficient of the sine function is 1, so the amplitude is 1.
- Phase: There is no phase shi...
Negative (neg):
The Fourier transform is widely used in various fields, including engineering, physics, and data analysis. It is a powerful tool for decomposing a signal into its constituent frequencies. In music, for example, the Fourier transform can be used to analyze the frequency components of a sound wave. By applying the Fourier transform to a sound signal, one can identify the different frequencies present in the signal, as well as their relative amplitudes. This information can be useful in a variety of applications, such as sound filtering and audio processing. The Fourier transform can also be used to analyze images and other types of data. In image processing, the Fourier transform can be used to filter out noise and other unwanted features from an image. It can also be used to compress images by representing them in the frequency domain. In addition to its many practical applications, the Fourier transform also has a number of interesting theoretical properties. For example, it has been ...

Sample 2
Query:
Given this reasoning-intensive query, find relevant documents that could help answer the question. A manufacturer is designing a cone-shaped container with a fixed volume of 200π cubic centimeters. The container's height is 12 centimeters, and the radius of the base is unknown. If the manufacturer wants to minimize the surface area of the container while maintaining its volume, what should be the radius of the base?
Positive (pos):
A right circular cone has a radius of 6cm and a slant height of 10cm. Determine the surface area of the cone.
To find the surface area of a right circular cone, we need to calculate the area of the base and the lateral surface area, and then add them together.
The base of the cone is a circle with radius r = 6 cm. The area of the base (A_base) can be found using the formula for the area of a circle:
A_base = πr^2
A_base = π(6 cm)^2
A_base = 36π cm^2
The lateral surface area (A_lateral) can be found using the formula for the lateral surface area of a cone:
A_lateral = πrs, where r is the radius and s is the slant height.
Given that the slant height s = 10 cm, we can calculate the lateral surface area:
A_lateral = π(6 cm)(10 cm)
A_lateral = 60π cm^2
Now, we can find the total surface area (A_total) by adding the base area and the lateral surface area:
A_total = A_base + A_lateral
A_total = 36π cm^2 + 60π cm^2
A_total = 96π cm^2
The surface area of the cone is 96π cm^2.
Negative (neg):
Torus-Shaped Containers in Chemical Engineering - New Designs and Applications
Torus-shaped containers are commonly used in chemical engineering for storing and transporting fluids. These containers have a distinctive doughnut shape, with a central hole and a circular cross-section. In this article, we will explore the design and applications of torus-shaped containers in chemical engineering. One of the main advantages of torus-shaped containers is their high volume-to-surface-area ratio. This makes them ideal for storing large quantities of fluids while minimizing the amount of material needed for construction. Additionally, the curved shape of the container provides added strength and stability, making it less prone to rupture or leakage. The design of torus-shaped containers typically involves the use of computer-aided design (CAD) software to create detailed models of the container's geometry. Engineers can then use these models to simulate various scenarios, such as fluid flow and ...

Sample 3
Query:
Given this reasoning-intensive query, find relevant documents that could help answer the question. On the xy-coordinate plane, points A and B are given as A(2, 4) and B(8, -3). Determine the coordinates of the point on line segment AB that is three times as far from A as it is from B.
Positive (pos):
On the xy co-ordinate plane, point C is (5,-2) and point D is (-1,1.5). The point on line segment CD that is twice as far from C as from D is:
Answer Choices: (A) (1,-1) (B) (1,1) (C) (2,0.25) (D) (3,0.5) (E) (3,1)
Let's think about the multi-choice question step by step.
We want the point on the line that is twice as far from C as it is from D. We can examine the x and y coordinates separately since they are independent.
*It should be noted that there are two solutions to this problem, one point between C and D, and another point with D in the middle of C and the point. We can quickly look at the answer choices and see that all the points are between C and D, therefore we can search for that point using the following method:
Taking the x-coordinate first, the distance between C and D is |(x-coordinate ofC - (x-coordinate ofD|= |5 - (-1)| = 6
The x-coordinate that is twice as far from C as it is from D (and in between C andD will be 4 units from C and 2 units from D. So the ...
Negative (neg):
The concept of midpoint is often useful in various mathematical problems, but sometimes we need to find other points that divide a line segment in a particular ratio. One common scenario is when we need to find the point that divides the line segment in the ratio of the other two points. Let's consider an example to understand this better. Suppose we have two points E(3, 4) and F(7, -2) on the xy-coordinate plane, and we want to find the point G on the line segment EF such that EG:GF = 2:5. To solve this problem, we can use the concept of section formula, which states that if a point P(x, y) divides the line segment joining the points A(x1, y1) and B(x2, y2) in the ratio m:n, then the coordinates of P are ((mx2+nx1)/(m+n), (my2+ny1)/(m+n)). Using this formula, we can find the coordinates of point G. First, we need to find the difference in x-coordinates and y-coordinates of points E and F. The difference in x-coordinates is 7 - 3 = 4, and the difference in y-coordinates is -2 - 4 = -6...
- Loss function: pylate.losses.cached_contrastive.CachedContrastive
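At its core, a cached contrastive objective of this kind typically combines an in-batch contrastive (InfoNCE-style) loss, where each query must score its own positive above every other document in the batch, with activation caching so that large batches (256 here) remain feasible. The sketch below illustrates only the uncached, generic form of such a loss over a query-document similarity matrix; it is not PyLate's actual implementation:

import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(scores: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    # scores[i, j] is the similarity (e.g. MaxSim) of query i with document j.
    # Document j == i is query i's positive; every other column acts as a negative.
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores / temperature, labels)

# Toy example: 4 queries scored against 6 documents (4 positives + 2 extra hard negatives).
print(in_batch_contrastive_loss(torch.randn(4, 6)))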
Training Hyperparameters
Non-default hyperparameters
- per_device_train_batch_size: 256
- per_device_eval_batch_size: 256
- learning_rate: 1e-05
- bf16: True
- dataloader_num_workers: 8
All hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 256
- per_device_eval_batch_size: 256
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 1e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 8
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss |
---|---|---|
0.0025 | 1 | 4.9684 |
0.0051 | 2 | 4.6956 |
0.0076 | 3 | 4.5076 |
0.0102 | 4 | 4.3723 |
0.0127 | 5 | 4.3305 |
0.0153 | 6 | 4.0355 |
0.0178 | 7 | 3.7886 |
0.0204 | 8 | 3.6133 |
0.0229 | 9 | 3.2395 |
0.0254 | 10 | 3.1481 |
0.0280 | 11 | 2.7444 |
0.0305 | 12 | 2.4946 |
0.0331 | 13 | 2.333 |
0.0356 | 14 | 2.2471 |
0.0382 | 15 | 1.9117 |
0.0407 | 16 | 1.6753 |
0.0433 | 17 | 1.2413 |
0.0458 | 18 | 1.1201 |
0.0483 | 19 | 1.0335 |
0.0509 | 20 | 1.0583 |
0.0534 | 21 | 1.067 |
0.0560 | 22 | 0.7056 |
0.0585 | 23 | 0.761 |
0.0611 | 24 | 0.5501 |
0.0636 | 25 | 0.6486 |
0.0662 | 26 | 0.4639 |
0.0687 | 27 | 0.3885 |
0.0712 | 28 | 0.4982 |
0.0738 | 29 | 0.4784 |
0.0763 | 30 | 0.5189 |
0.0789 | 31 | 0.4824 |
0.0814 | 32 | 0.4183 |
0.0840 | 33 | 0.4945 |
0.0865 | 34 | 0.2579 |
0.0891 | 35 | 0.3312 |
0.0916 | 36 | 0.4035 |
0.0941 | 37 | 0.305 |
0.0967 | 38 | 0.2898 |
0.0992 | 39 | 0.2899 |
0.1018 | 40 | 0.2713 |
0.1043 | 41 | 0.3017 |
0.1069 | 42 | 0.2395 |
0.1094 | 43 | 0.1548 |
0.1120 | 44 | 0.2468 |
0.1145 | 45 | 0.1876 |
0.1170 | 46 | 0.2322 |
0.1196 | 47 | 0.2823 |
0.1221 | 48 | 0.2158 |
0.1247 | 49 | 0.2679 |
0.1272 | 50 | 0.273 |
0.1298 | 51 | 0.2876 |
0.1323 | 52 | 0.197 |
0.1349 | 53 | 0.1282 |
0.1374 | 54 | 0.3355 |
0.1399 | 55 | 0.1941 |
0.1425 | 56 | 0.1873 |
0.1450 | 57 | 0.2288 |
0.1476 | 58 | 0.2802 |
0.1501 | 59 | 0.2087 |
0.1527 | 60 | 0.2239 |
0.1552 | 61 | 0.225 |
0.1578 | 62 | 0.1582 |
0.1603 | 63 | 0.1972 |
0.1628 | 64 | 0.1632 |
0.1654 | 65 | 0.2101 |
0.1679 | 66 | 0.2084 |
0.1705 | 67 | 0.1499 |
0.1730 | 68 | 0.1467 |
0.1756 | 69 | 0.1428 |
0.1781 | 70 | 0.2298 |
0.1807 | 71 | 0.1883 |
0.1832 | 72 | 0.22 |
0.1858 | 73 | 0.1988 |
0.1883 | 74 | 0.2091 |
0.1908 | 75 | 0.1948 |
0.1934 | 76 | 0.1348 |
0.1959 | 77 | 0.112 |
0.1985 | 78 | 0.1474 |
0.2010 | 79 | 0.1949 |
0.2036 | 80 | 0.1664 |
0.2061 | 81 | 0.1807 |
0.2087 | 82 | 0.1403 |
0.2112 | 83 | 0.1225 |
0.2137 | 84 | 0.1919 |
0.2163 | 85 | 0.1403 |
0.2188 | 86 | 0.1402 |
0.2214 | 87 | 0.0981 |
0.2239 | 88 | 0.1214 |
0.2265 | 89 | 0.1755 |
0.2290 | 90 | 0.1509 |
0.2316 | 91 | 0.1551 |
0.2341 | 92 | 0.176 |
0.2366 | 93 | 0.1648 |
0.2392 | 94 | 0.1622 |
0.2417 | 95 | 0.1372 |
0.2443 | 96 | 0.1016 |
0.2468 | 97 | 0.1134 |
0.2494 | 98 | 0.1436 |
0.2519 | 99 | 0.1478 |
0.2545 | 100 | 0.2065 |
0.2570 | 101 | 0.1901 |
0.2595 | 102 | 0.1859 |
0.2621 | 103 | 0.212 |
0.2646 | 104 | 0.2179 |
0.2672 | 105 | 0.2471 |
0.2697 | 106 | 0.1769 |
0.2723 | 107 | 0.1593 |
0.2748 | 108 | 0.204 |
0.2774 | 109 | 0.1496 |
0.2799 | 110 | 0.1212 |
0.2824 | 111 | 0.1282 |
0.2850 | 112 | 0.1126 |
0.2875 | 113 | 0.1254 |
0.2901 | 114 | 0.1422 |
0.2926 | 115 | 0.1266 |
0.2952 | 116 | 0.1305 |
0.2977 | 117 | 0.1283 |
0.3003 | 118 | 0.0737 |
0.3028 | 119 | 0.1237 |
0.3053 | 120 | 0.1185 |
0.3079 | 121 | 0.0891 |
0.3104 | 122 | 0.2312 |
0.3130 | 123 | 0.2384 |
0.3155 | 124 | 0.155 |
0.3181 | 125 | 0.1118 |
0.3206 | 126 | 0.1575 |
0.3232 | 127 | 0.2115 |
0.3257 | 128 | 0.098 |
0.3282 | 129 | 0.1811 |
0.3308 | 130 | 0.1704 |
0.3333 | 131 | 0.1494 |
0.3359 | 132 | 0.1531 |
0.3384 | 133 | 0.1032 |
0.3410 | 134 | 0.1137 |
0.3435 | 135 | 0.1271 |
0.3461 | 136 | 0.1591 |
0.3486 | 137 | 0.1586 |
0.3511 | 138 | 0.1292 |
0.3537 | 139 | 0.1115 |
0.3562 | 140 | 0.1337 |
0.3588 | 141 | 0.1298 |
0.3613 | 142 | 0.1649 |
0.3639 | 143 | 0.0855 |
0.3664 | 144 | 0.1124 |
0.3690 | 145 | 0.0764 |
0.3715 | 146 | 0.1402 |
0.3740 | 147 | 0.137 |
0.3766 | 148 | 0.0736 |
0.3791 | 149 | 0.0772 |
0.3817 | 150 | 0.1689 |
0.3842 | 151 | 0.1371 |
0.3868 | 152 | 0.1195 |
0.3893 | 153 | 0.1536 |
0.3919 | 154 | 0.1421 |
0.3944 | 155 | 0.1222 |
0.3969 | 156 | 0.1121 |
0.3995 | 157 | 0.0892 |
0.4020 | 158 | 0.1516 |
0.4046 | 159 | 0.1071 |
0.4071 | 160 | 0.1593 |
0.4097 | 161 | 0.1078 |
0.4122 | 162 | 0.1112 |
0.4148 | 163 | 0.2101 |
0.4173 | 164 | 0.2096 |
0.4198 | 165 | 0.1337 |
0.4224 | 166 | 0.1501 |
0.4249 | 167 | 0.0989 |
0.4275 | 168 | 0.0992 |
0.4300 | 169 | 0.0926 |
0.4326 | 170 | 0.0692 |
0.4351 | 171 | 0.1235 |
0.4377 | 172 | 0.1029 |
0.4402 | 173 | 0.1351 |
0.4427 | 174 | 0.0899 |
0.4453 | 175 | 0.0844 |
0.4478 | 176 | 0.1167 |
0.4504 | 177 | 0.1355 |
0.4529 | 178 | 0.092 |
0.4555 | 179 | 0.1005 |
0.4580 | 180 | 0.0891 |
0.4606 | 181 | 0.1396 |
0.4631 | 182 | 0.1024 |
0.4656 | 183 | 0.1325 |
0.4682 | 184 | 0.1061 |
0.4707 | 185 | 0.1657 |
0.4733 | 186 | 0.1141 |
0.4758 | 187 | 0.149 |
0.4784 | 188 | 0.1125 |
0.4809 | 189 | 0.1524 |
0.4835 | 190 | 0.1129 |
0.4860 | 191 | 0.1089 |
0.4885 | 192 | 0.1333 |
0.4911 | 193 | 0.1377 |
0.4936 | 194 | 0.0547 |
0.4962 | 195 | 0.1057 |
0.4987 | 196 | 0.1321 |
0.5013 | 197 | 0.0979 |
0.5038 | 198 | 0.1706 |
0.5064 | 199 | 0.1559 |
0.5089 | 200 | 0.1111 |
0.5115 | 201 | 0.1258 |
0.5140 | 202 | 0.0816 |
0.5165 | 203 | 0.1362 |
0.5191 | 204 | 0.1604 |
0.5216 | 205 | 0.1104 |
0.5242 | 206 | 0.1494 |
0.5267 | 207 | 0.1402 |
0.5293 | 208 | 0.1282 |
0.5318 | 209 | 0.1543 |
0.5344 | 210 | 0.1576 |
0.5369 | 211 | 0.2071 |
0.5394 | 212 | 0.1248 |
0.5420 | 213 | 0.1237 |
0.5445 | 214 | 0.0592 |
0.5471 | 215 | 0.1769 |
0.5496 | 216 | 0.1118 |
0.5522 | 217 | 0.1608 |
0.5547 | 218 | 0.1192 |
0.5573 | 219 | 0.0551 |
0.5598 | 220 | 0.1401 |
0.5623 | 221 | 0.2046 |
0.5649 | 222 | 0.1273 |
0.5674 | 223 | 0.1319 |
0.5700 | 224 | 0.1518 |
0.5725 | 225 | 0.0929 |
0.5751 | 226 | 0.1262 |
0.5776 | 227 | 0.1566 |
0.5802 | 228 | 0.1128 |
0.5827 | 229 | 0.1467 |
0.5852 | 230 | 0.1513 |
0.5878 | 231 | 0.1989 |
0.5903 | 232 | 0.0594 |
0.5929 | 233 | 0.0838 |
0.5954 | 234 | 0.0711 |
0.5980 | 235 | 0.0854 |
0.6005 | 236 | 0.1775 |
0.6031 | 237 | 0.118 |
0.6056 | 238 | 0.1297 |
0.6081 | 239 | 0.1092 |
0.6107 | 240 | 0.1469 |
0.6132 | 241 | 0.1203 |
0.6158 | 242 | 0.0901 |
0.6183 | 243 | 0.1179 |
0.6209 | 244 | 0.0864 |
0.6234 | 245 | 0.1277 |
0.6260 | 246 | 0.1313 |
0.6285 | 247 | 0.089 |
0.6310 | 248 | 0.0727 |
0.6336 | 249 | 0.0556 |
0.6361 | 250 | 0.0782 |
0.6387 | 251 | 0.0869 |
0.6412 | 252 | 0.0988 |
0.6438 | 253 | 0.0818 |
0.6463 | 254 | 0.1013 |
0.6489 | 255 | 0.096 |
0.6514 | 256 | 0.0622 |
0.6539 | 257 | 0.1561 |
0.6565 | 258 | 0.1282 |
0.6590 | 259 | 0.1087 |
0.6616 | 260 | 0.1312 |
0.6641 | 261 | 0.1343 |
0.6667 | 262 | 0.0955 |
0.6692 | 263 | 0.0844 |
0.6718 | 264 | 0.1209 |
0.6743 | 265 | 0.0858 |
0.6768 | 266 | 0.0714 |
0.6794 | 267 | 0.1431 |
0.6819 | 268 | 0.0632 |
0.6845 | 269 | 0.115 |
0.6870 | 270 | 0.1115 |
0.6896 | 271 | 0.1239 |
0.6921 | 272 | 0.1206 |
0.6947 | 273 | 0.1894 |
0.6972 | 274 | 0.0755 |
0.6997 | 275 | 0.0709 |
0.7023 | 276 | 0.1304 |
0.7048 | 277 | 0.1476 |
0.7074 | 278 | 0.1497 |
0.7099 | 279 | 0.113 |
0.7125 | 280 | 0.1676 |
0.7150 | 281 | 0.0999 |
0.7176 | 282 | 0.2044 |
0.7201 | 283 | 0.1125 |
0.7226 | 284 | 0.0956 |
0.7252 | 285 | 0.0956 |
0.7277 | 286 | 0.0771 |
0.7303 | 287 | 0.0712 |
0.7328 | 288 | 0.0525 |
0.7354 | 289 | 0.0689 |
0.7379 | 290 | 0.0964 |
0.7405 | 291 | 0.1068 |
0.7430 | 292 | 0.0536 |
0.7455 | 293 | 0.0861 |
0.7481 | 294 | 0.0813 |
0.7506 | 295 | 0.0885 |
0.7532 | 296 | 0.1083 |
0.7557 | 297 | 0.1124 |
0.7583 | 298 | 0.1095 |
0.7608 | 299 | 0.08 |
0.7634 | 300 | 0.1081 |
0.7659 | 301 | 0.0719 |
0.7684 | 302 | 0.0933 |
0.7710 | 303 | 0.1143 |
0.7735 | 304 | 0.065 |
0.7761 | 305 | 0.1276 |
0.7786 | 306 | 0.102 |
0.7812 | 307 | 0.186 |
0.7837 | 308 | 0.0778 |
0.7863 | 309 | 0.1419 |
0.7888 | 310 | 0.0895 |
0.7913 | 311 | 0.1154 |
0.7939 | 312 | 0.1037 |
0.7964 | 313 | 0.0711 |
0.7990 | 314 | 0.1559 |
0.8015 | 315 | 0.0755 |
0.8041 | 316 | 0.0799 |
0.8066 | 317 | 0.1137 |
0.8092 | 318 | 0.0837 |
0.8117 | 319 | 0.1052 |
0.8142 | 320 | 0.0846 |
0.8168 | 321 | 0.0715 |
0.8193 | 322 | 0.0923 |
0.8219 | 323 | 0.1397 |
0.8244 | 324 | 0.0899 |
0.8270 | 325 | 0.1414 |
0.8295 | 326 | 0.0422 |
0.8321 | 327 | 0.0748 |
0.8346 | 328 | 0.0739 |
0.8372 | 329 | 0.0855 |
0.8397 | 330 | 0.071 |
0.8422 | 331 | 0.0557 |
0.8448 | 332 | 0.1055 |
0.8473 | 333 | 0.096 |
0.8499 | 334 | 0.1083 |
0.8524 | 335 | 0.133 |
0.8550 | 336 | 0.1308 |
0.8575 | 337 | 0.0661 |
0.8601 | 338 | 0.0974 |
0.8626 | 339 | 0.1027 |
0.8651 | 340 | 0.1068 |
0.8677 | 341 | 0.1653 |
0.8702 | 342 | 0.097 |
0.8728 | 343 | 0.0845 |
0.8753 | 344 | 0.0546 |
0.8779 | 345 | 0.1273 |
0.8804 | 346 | 0.0982 |
0.8830 | 347 | 0.0893 |
0.8855 | 348 | 0.1222 |
0.8880 | 349 | 0.1072 |
0.8906 | 350 | 0.1254 |
0.8931 | 351 | 0.0679 |
0.8957 | 352 | 0.0995 |
0.8982 | 353 | 0.0878 |
0.9008 | 354 | 0.0564 |
0.9033 | 355 | 0.113 |
0.9059 | 356 | 0.0567 |
0.9084 | 357 | 0.0968 |
0.9109 | 358 | 0.1023 |
0.9135 | 359 | 0.1106 |
0.9160 | 360 | 0.091 |
0.9186 | 361 | 0.0988 |
0.9211 | 362 | 0.1374 |
0.9237 | 363 | 0.0855 |
0.9262 | 364 | 0.0824 |
0.9288 | 365 | 0.058 |
0.9313 | 366 | 0.0776 |
0.9338 | 367 | 0.1195 |
0.9364 | 368 | 0.0506 |
0.9389 | 369 | 0.0893 |
0.9415 | 370 | 0.1145 |
0.9440 | 371 | 0.0695 |
0.9466 | 372 | 0.0805 |
0.9491 | 373 | 0.0824 |
0.9517 | 374 | 0.0841 |
0.9542 | 375 | 0.0919 |
0.9567 | 376 | 0.064 |
0.9593 | 377 | 0.2194 |
0.9618 | 378 | 0.1165 |
0.9644 | 379 | 0.0888 |
0.9669 | 380 | 0.0826 |
0.9695 | 381 | 0.0687 |
0.9720 | 382 | 0.0933 |
0.9746 | 383 | 0.1337 |
0.9771 | 384 | 0.0738 |
0.9796 | 385 | 0.0749 |
0.9822 | 386 | 0.0742 |
0.9847 | 387 | 0.1111 |
0.9873 | 388 | 0.093 |
0.9898 | 389 | 0.0877 |
0.9924 | 390 | 0.0637 |
0.9949 | 391 | 0.0897 |
0.9975 | 392 | 0.0818 |
1.0 | 393 | 0.0362 |
1.0025 | 394 | 0.0561 |
1.0051 | 395 | 0.0847 |
1.0076 | 396 | 0.0752 |
1.0102 | 397 | 0.0951 |
1.0127 | 398 | 0.1069 |
1.0153 | 399 | 0.0553 |
1.0178 | 400 | 0.0929 |
1.0204 | 401 | 0.0876 |
1.0229 | 402 | 0.0381 |
1.0254 | 403 | 0.1074 |
1.0280 | 404 | 0.0763 |
1.0305 | 405 | 0.0881 |
1.0331 | 406 | 0.0481 |
1.0356 | 407 | 0.1398 |
1.0382 | 408 | 0.09 |
1.0407 | 409 | 0.1045 |
1.0433 | 410 | 0.088 |
1.0458 | 411 | 0.0751 |
1.0483 | 412 | 0.0781 |
1.0509 | 413 | 0.0844 |
1.0534 | 414 | 0.0949 |
1.0560 | 415 | 0.0467 |
1.0585 | 416 | 0.1159 |
1.0611 | 417 | 0.0511 |
1.0636 | 418 | 0.0659 |
1.0662 | 419 | 0.043 |
1.0687 | 420 | 0.0468 |
1.0712 | 421 | 0.068 |
1.0738 | 422 | 0.1022 |
1.0763 | 423 | 0.1096 |
1.0789 | 424 | 0.1113 |
1.0814 | 425 | 0.1219 |
1.0840 | 426 | 0.0852 |
1.0865 | 427 | 0.0413 |
1.0891 | 428 | 0.0797 |
1.0916 | 429 | 0.1048 |
1.0941 | 430 | 0.0494 |
1.0967 | 431 | 0.079 |
1.0992 | 432 | 0.0698 |
1.1018 | 433 | 0.0908 |
1.1043 | 434 | 0.0993 |
1.1069 | 435 | 0.0397 |
1.1094 | 436 | 0.0312 |
1.1120 | 437 | 0.089 |
1.1145 | 438 | 0.0318 |
1.1170 | 439 | 0.0356 |
1.1196 | 440 | 0.0588 |
1.1221 | 441 | 0.0311 |
1.1247 | 442 | 0.0578 |
1.1272 | 443 | 0.1313 |
1.1298 | 444 | 0.0897 |
1.1323 | 445 | 0.0798 |
1.1349 | 446 | 0.0326 |
1.1374 | 447 | 0.143 |
1.1399 | 448 | 0.0661 |
1.1425 | 449 | 0.0433 |
1.1450 | 450 | 0.0782 |
1.1476 | 451 | 0.08 |
1.1501 | 452 | 0.0505 |
1.1527 | 453 | 0.0542 |
1.1552 | 454 | 0.0755 |
1.1578 | 455 | 0.0315 |
1.1603 | 456 | 0.0667 |
1.1628 | 457 | 0.0329 |
1.1654 | 458 | 0.0791 |
1.1679 | 459 | 0.0698 |
1.1705 | 460 | 0.0194 |
1.1730 | 461 | 0.0501 |
1.1756 | 462 | 0.0449 |
1.1781 | 463 | 0.0903 |
1.1807 | 464 | 0.0503 |
1.1832 | 465 | 0.0664 |
1.1858 | 466 | 0.0457 |
1.1883 | 467 | 0.0568 |
1.1908 | 468 | 0.064 |
1.1934 | 469 | 0.0253 |
1.1959 | 470 | 0.046 |
1.1985 | 471 | 0.0279 |
1.2010 | 472 | 0.0733 |
1.2036 | 473 | 0.0463 |
1.2061 | 474 | 0.07 |
1.2087 | 475 | 0.0281 |
1.2112 | 476 | 0.0373 |
1.2137 | 477 | 0.0738 |
1.2163 | 478 | 0.0412 |
1.2188 | 479 | 0.0545 |
1.2214 | 480 | 0.0247 |
1.223 |