Snoweu V2
Developed by fjavigv
Sentence embedding model based on Snowflake Arctic architecture, focusing on sentence similarity calculation and feature extraction
Downloads 604
Release Time: 3/19/2025
Model Overview
This model is a sentence transformer designed for computing sentence similarity and extracting sentence features. It was trained with MatryoshkaLoss wrapping MultipleNegativesRankingLoss, and suits tasks such as information retrieval and semantic search.
Model Features
Efficient sentence embedding
Converts sentences into 768-dimensional dense vectors for similarity calculation and semantic analysis
Multiple loss functions
Trained with MatryoshkaLoss (nested embedding sizes of 768, 512, 256, 128, and 64 dimensions) over MultipleNegativesRankingLoss
Large-scale training data
Fine-tuned on 29,911 sentence pairs; reaches a cosine_ndcg@10 of 0.9141 in the reported retrieval evaluation
Model Capabilities
Sentence similarity calculation
Semantic feature extraction
Information retrieval
Semantic search
Text matching
Use Cases
Information retrieval
Document similarity search
Finding the most similar documents to a query sentence within a large corpus
Achieved a cosine_accuracy@10 of 0.9873 in the reported evaluation
Business analysis
Business strategy matching
Identifying document passages relevant to specific business strategies
🚀 SentenceTransformer based on Snowflake/snowflake-arctic-embed-m-v1.5
This model is a fine-tuned sentence-transformers model derived from [Snowflake/snowflake-arctic-embed-m-v1.5](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5). It maps sentences and paragraphs to a 768-dimensional dense vector space, and can be applied to semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
✨ Features
- Maps sentences and paragraphs to a 768-dimensional dense vector space.
- Suitable for various natural language processing tasks such as semantic textual similarity, semantic search, and more.
📦 Installation
First, install the Sentence Transformers library:
pip install -U sentence-transformers
💻 Usage Examples
Basic Usage
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'What is the definition of a preliminary economic assessment in the context of evaluating projects for the recovery of critical raw materials?',
    '(39)\n\n‘preliminary economic assessment’ means an early-stage, conceptual assessment of the potential economic viability of a project for the recovery of critical raw materials from extractive waste;\n\n(40)\n\n‘magnetic resonance imaging device’ means a non-invasive medical device that uses magnetic fields to make anatomical images or any other device that uses magnetic fields to make images of the inside of object;\n\n(41)\n\n‘wind energy generator’ means the part of an onshore or offshore wind turbine that converts the mechanical energy of the rotor into electrical energy;\n\n(42)',
    'For the purposes of the first subparagraph of this paragraph, insurance undertakings referred to in point (a) of the first subparagraph of Article 1(3) of this Directive that are part of a group, on the basis of financial relationships referred to in point (c)(ii) of Article 212(1) of Directive 2009/138/EC, and which are subject to group supervision in accordance with points (a) to (c) of Article 213(2) of that Directive shall be treated as subsidiary undertakings of the parent undertaking of that group.\n\n9.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
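For semantic search, the similarity scores are typically used to rank candidate passages against a query. The following is a minimal sketch of that ranking step, not part of the original card: it uses small made-up vectors in place of real `model.encode` outputs so that it runs without downloading the model, and the helper name `rank_by_cosine` is hypothetical.

```python
import numpy as np


def rank_by_cosine(query_vec, corpus_vecs, top_k=3):
    """Return (index, score) pairs for the top_k corpus rows most similar to query_vec."""
    # Normalize both sides so the dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    scores = c @ q
    order = np.argsort(-scores)[:top_k]  # highest score first
    return [(int(i), float(scores[i])) for i in order]


# Toy 4-dimensional vectors standing in for 768-dimensional embeddings.
query = np.array([1.0, 0.0, 1.0, 0.0])
corpus = np.array([
    [0.9, 0.1, 1.1, 0.0],  # nearly parallel to the query
    [0.0, 1.0, 0.0, 1.0],  # orthogonal to the query
    [1.0, 0.0, 0.0, 0.0],  # partially aligned with the query
])

ranking = rank_by_cosine(query, corpus, top_k=2)
print(ranking)
```

With real embeddings, `query` would be the encoded query sentence and `corpus` the encoded passages.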
📚 Documentation
Model Details
Model Description
Property | Details |
---|---|
Model Type | Sentence Transformer |
Base model | [Snowflake/snowflake-arctic-embed-m-v1.5](https://huggingface.co/Snowflake/snowflake-arctic-embed-m-v1.5) |
Maximum Sequence Length | 512 tokens |
Output Dimensionality | 768 dimensions |
Similarity Function | Cosine Similarity |
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
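The architecture above pools with `pooling_mode_cls_token: True` and then applies `Normalize()`. As a rough illustration of what those two stages do to the transformer's per-token outputs, here is a NumPy sketch under the assumption that the CLS token sits at position 0 of each sequence; the function name `cls_pool_and_normalize` is hypothetical and random numbers stand in for real BertModel outputs.

```python
import numpy as np


def cls_pool_and_normalize(token_embeddings):
    """token_embeddings: array of shape (batch, seq_len, hidden)."""
    # CLS pooling: keep only the first token's vector for each sequence.
    cls = token_embeddings[:, 0, :]
    # L2 normalization: scale each sentence vector to unit length.
    norms = np.linalg.norm(cls, axis=1, keepdims=True)
    return cls / np.clip(norms, 1e-12, None)


rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 5, 768))  # stand-in for transformer outputs
sentence_vecs = cls_pool_and_normalize(tokens)
print(sentence_vecs.shape)
```

Because the outputs are unit-length, cosine similarity between two sentence vectors is simply their dot product.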
Evaluation
Metrics
Information Retrieval
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.8225 |
cosine_accuracy@3 | 0.9526 |
cosine_accuracy@5 | 0.9725 |
cosine_accuracy@10 | 0.9873 |
cosine_precision@1 | 0.8225 |
cosine_precision@3 | 0.3175 |
cosine_precision@5 | 0.1945 |
cosine_precision@10 | 0.0987 |
cosine_recall@1 | 0.8225 |
cosine_recall@3 | 0.9526 |
cosine_recall@5 | 0.9725 |
cosine_recall@10 | 0.9873 |
cosine_ndcg@10 | 0.9141 |
cosine_mrr@10 | 0.8896 |
cosine_map@100 | 0.8903 |
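In the table above, recall@k equals accuracy@k and precision@k is close to accuracy@k divided by k (for example 0.9526 / 3 ≈ 0.3175). That pattern is what one would expect if each query has exactly one relevant passage, although the card does not state this explicitly. The sketch below shows how such metrics can be computed on toy rankings; it is a simplified illustration, not the InformationRetrievalEvaluator implementation, and all function names and data are hypothetical.

```python
def accuracy_at_k(rankings, relevant, k):
    """Fraction of queries with at least one relevant document in the top k."""
    hits = [any(doc in rel for doc in ranked[:k])
            for ranked, rel in zip(rankings, relevant)]
    return sum(hits) / len(hits)


def precision_at_k(rankings, relevant, k):
    """Mean share of the top-k slots occupied by relevant documents."""
    per_query = [sum(doc in rel for doc in ranked[:k]) / k
                 for ranked, rel in zip(rankings, relevant)]
    return sum(per_query) / len(per_query)


def mrr_at_k(rankings, relevant, k):
    """Mean reciprocal rank of the first relevant document in the top k (0 if absent)."""
    total = 0.0
    for ranked, rel in zip(rankings, relevant):
        for position, doc in enumerate(ranked[:k], start=1):
            if doc in rel:
                total += 1.0 / position
                break
    return total / len(rankings)


# Four toy queries, each with a single relevant document "a".
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"], ["c", "b", "d"]]
relevant = [{"a"}, {"a"}, {"a"}, {"a"}]

acc1 = accuracy_at_k(rankings, relevant, 1)
acc3 = accuracy_at_k(rankings, relevant, 3)
prec3 = precision_at_k(rankings, relevant, 3)
mrr3 = mrr_at_k(rankings, relevant, 3)
print(acc1, acc3, prec3, mrr3)
```

With one relevant document per query, `prec3` equals `acc3 / 3`, mirroring the relationship visible in the reported values.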
Training Details
Training Dataset
Unnamed Dataset
- Size: 29,911 training samples
- Columns: sentence_0 and sentence_1
- Approximate statistics based on the first 1000 samples:
| sentence_0 | sentence_1 |
---|---|---|
type | string | string |
details | min: 13 tokens; mean: 41.63 tokens; max: 252 tokens | min: 4 tokens; mean: 233.72 tokens; max: 512 tokens |
- Samples:

sentence_0 | sentence_1 |
---|---|
What measures must Member States take to ensure that workers who believe they have been discriminated against in terms of equal pay can establish their case before a competent authority or national court? | Article 18 Shift of burden of proof 1. Member States shall take the appropriate measures, in accordance with their national judicial systems, to ensure that, when workers who consider themselves wronged because the principle of equal pay has not been applied to them establish before a competent authority or national court facts from which it may be presumed that there has been direct or indirect discrimination, it shall be for the respondent to prove that there has been no direct or indirect discrimination in relation to pay. 2. Member States shall ensure that, in administrative procedures or court proceedings regarding alleged direct or indirect discrimination in relation to pay, where an employer has not implemented the pay transparency obligations set out in Articles 5, 6, 7, 9 and 10, it is for the employer to prove that there has been no such discrimination. The first subparagraph of this paragraph shall not apply where the employer proves that the infringement of the obligati... |
What are the key considerations for recognizing and addressing discrimination in the context of compensation and penalties, particularly in relation to the gender pay gap? | discrimination, in particular for substantive and procedural purposes, including to recognise the existence of discrimination, to decide on the appropriate comparator, to assess the proportionality, and to determine, where relevant, the level of compensation awarded or penalties imposed. An intersectional approach is important for understanding and addressing the gender pay gap. This clarification should not change the scope of employers’ obligations in regard to the pay transparency measures under this Directive. In particular, employers should not be required to gather data related to protected grounds other than sex. |
What is the process for aircraft operators and shipping companies regarding the surrendering of allowances in relation to their total emissions from the previous calendar year? | (b) each aircraft operator surrenders a number of allowances that is equal to its total emissions during the preceding calendar year, as verified in accordance with Article 15; (c) each shipping company surrenders a number of allowances that is equal to its total emissions during the preceding calendar year, as verified in accordance with Article 3ge. Member States, administering Member States and administering authorities in respect of a shipping company shall ensure that allowances surrendered in accordance with the first subparagraph are subsequently cancelled. ▼M15 3-e. |

- Loss: MatryoshkaLoss with these parameters: { "loss": "MultipleNegativesRankingLoss", "matryoshka_dims": [ 768, 512, 256, 128, 64 ], "matryoshka_weights": [ 1, 1, 1, 1, 1 ], "n_dims_per_step": -1 }
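The loss configuration above combines two ideas: MultipleNegativesRankingLoss treats the other positives in a batch as negatives for each anchor, and MatryoshkaLoss applies that inner loss to the first 768, 512, 256, 128, and 64 dimensions of each embedding and sums the results with the given weights. The NumPy sketch below illustrates that computation under stated assumptions; it is not the sentence-transformers implementation. The similarity scale of 20.0 is assumed (the card does not list it), the function names are hypothetical, and random vectors stand in for real embeddings.

```python
import numpy as np


def mnr_loss(anchors, positives, scale=20.0):
    """In-batch ranking loss: row i of `positives` is the match for row i of `anchors`."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                      # scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy where the correct "class" for anchor i is positive i.
    return float(-np.mean(np.diag(log_probs)))


def matryoshka_loss(anchors, positives,
                    dims=(768, 512, 256, 128, 64),
                    weights=(1, 1, 1, 1, 1)):
    """Weighted sum of the inner loss over truncated embedding prefixes."""
    return sum(w * mnr_loss(anchors[:, :d], positives[:, :d])
               for d, w in zip(dims, weights))


rng = np.random.default_rng(0)
anchors = rng.normal(size=(6, 768))                  # batch of 6, as in training
positives = anchors + 0.1 * rng.normal(size=(6, 768))  # noisy copies: true matches
random_pairs = rng.normal(size=(6, 768))             # unrelated vectors

loss_matched = matryoshka_loss(anchors, positives)
loss_random = matryoshka_loss(anchors, random_pairs)
print(loss_matched, loss_random)
```

A model trained this way is intended to remain useful when embeddings are truncated to one of the listed prefix lengths and re-normalized, trading some accuracy for smaller vectors.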
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 6
- per_device_eval_batch_size: 6
- num_train_epochs: 4
- multi_dataset_batch_sampler: round_robin
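These settings determine the step counts that appear in the training logs. As a quick check, assuming a single training device and no gradient accumulation, 29,911 samples at a batch size of 6 give 4,986 optimizer steps per epoch, which agrees with the log row for epoch 1.0 at step 4986:

```python
import math

train_samples = 29_911   # training set size from the card
batch_size = 6           # per_device_train_batch_size
epochs = 4               # num_train_epochs

# The last batch is partial and still counts as a step (no drop_last).
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs


def epoch_at_step(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch, total_steps)   # 4986 19944
print(epoch_at_step(100))             # 0.0201
```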
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 6
- per_device_eval_batch_size: 6
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
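The combination of `learning_rate: 5e-05`, `lr_scheduler_type: linear`, and `warmup_steps: 0` implies a learning rate that decays linearly from its peak to zero over the run. A minimal sketch of that schedule follows; the function name `linear_lr` is hypothetical, and the total of 19,944 steps is an assumption derived from ceil(29,911 / 6) × 4 epochs on a single device rather than a figure stated in the card.

```python
def linear_lr(step, base_lr=5e-05, warmup_steps=0, total_steps=19_944):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr (unused here because warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 4_986, 9_972, 19_944):
    print(step, linear_lr(step))
```

Under these assumptions the rate is at its peak at step 0, half the peak at the end of epoch 2, and zero at the final step.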
Training Logs
Epoch | Step | Training Loss | cosine_ndcg@10 |
---|---|---|---|
0.0201 | 100 | - | 0.6629 |
0.0401 | 200 | - | 0.7746 |
0.0602 | 300 | - | 0.8233 |
0.0802 | 400 | - | 0.8515 |
0.1003 | 500 | 0.4694 | 0.8621 |
0.1203 | 600 | - | 0.8680 |
0.1404 | 700 | - | 0.8733 |
0.1604 | 800 | - | 0.8774 |
0.1805 | 900 | - | 0.8757 |
0.2006 | 1000 | 0.1568 | 0.8795 |
0.2206 | 1100 | - | 0.8808 |
0.2407 | 1200 | - | 0.8789 |
0.2607 | 1300 | - | 0.8796 |
0.2808 | 1400 | - | 0.8822 |
0.3008 | 1500 | 0.1015 | 0.8821 |
0.3209 | 1600 | - | 0.8814 |
0.3410 | 1700 | - | 0.8756 |
0.3610 | 1800 | - | 0.8822 |
0.3811 | 1900 | - | 0.8848 |
0.4011 | 2000 | 0.0836 | 0.8843 |
0.4212 | 2100 | - | 0.8841 |
0.4412 | 2200 | - | 0.8803 |
0.4613 | 2300 | - | 0.8851 |
0.4813 | 2400 | - | 0.8818 |
0.5014 | 2500 | 0.0865 | 0.8849 |
0.5215 | 2600 | - | 0.8877 |
0.5415 | 2700 | - | 0.8806 |
0.5616 | 2800 | - | 0.8832 |
0.5816 | 2900 | - | 0.8930 |
0.6017 | 3000 | 0.0842 | 0.8928 |
0.6217 | 3100 | - | 0.8882 |
0.6418 | 3200 | - | 0.8858 |
0.6619 | 3300 | - | 0.8863 |
0.6819 | 3400 | - | 0.8828 |
0.7020 | 3500 | 0.0669 | 0.8839 |
0.7220 | 3600 | - | 0.8835 |
0.7421 | 3700 | - | 0.8854 |
0.7621 | 3800 | - | 0.8839 |
0.7822 | 3900 | - | 0.8882 |
0.8022 | 4000 | 0.0695 | 0.8871 |
0.8223 | 4100 | - | 0.8854 |
0.8424 | 4200 | - | 0.8822 |
0.8624 | 4300 | - | 0.8847 |
0.8825 | 4400 | - | 0.8863 |
0.9025 | 4500 | 0.0575 | 0.8819 |
0.9226 | 4600 | - | 0.8815 |
0.9426 | 4700 | - | 0.8836 |
0.9627 | 4800 | - | 0.8862 |
0.9828 | 4900 | - | 0.8889 |
1.0 | 4986 | - | 0.8927 |
1.0028 | 5000 | 0.0712 | 0.8935 |
1.0229 | 5100 | - | 0.8890 |
1.0429 | 5200 | - | 0.8919 |
1.0630 | 5300 | - | 0.8949 |
1.0830 | 5400 | - | 0.8950 |
1.1031 | 5500 | 0.0485 | 0.8934 |
1.1231 | 5600 | - | 0.8964 |
1.1432 | 5700 | - | 0.8953 |
1.1633 | 5800 | - | 0.8942 |
1.1833 | 5900 | - | 0.8929 |
1.2034 | 6000 | 0.0465 | 0.8912 |
1.2234 | 6100 | - | 0.8890 |
1.2435 | 6200 | - | 0.8914 |
1.2635 | 6300 | - | 0.8847 |
1.2836 | 6400 | - | 0.8873 |
1.3037 | 6500 | 0.0324 | 0.8912 |
1.3237 | 6600 | - | 0.8956 |
1.3438 | 6700 | - | 0.8954 |
1.3638 | 6800 | - | 0.8946 |
1.3839 | 6900 | - | 0.8931 |
1.4039 | 7000 | 0.0205 | 0.8951 |
1.4240 | 7100 | - | 0.8967 |
1.4440 | 7200 | - | 0.8960 |
1.4641 | 7300 | - | 0.8943 |
1.4842 | 7400 | - | 0.9003 |
1.5042 | 7500 | 0.0489 | 0.8946 |
1.5243 | 7600 | - | 0.8986 |
1.5443 | 7700 | - | 0.8945 |
1.5644 | 7800 | - | 0.8960 |
1.5844 | 7900 | - | 0.8987 |
1.6045 | 8000 | 0.039 | 0.8991 |
1.6245 | 8100 | - | 0.8959 |
1.6446 | 8200 | - | 0.8948 |
1.6647 | 8300 | - | 0.8933 |
1.6847 | 8400 | - | 0.8926 |
1.7048 | 8500 | 0.0297 | 0.8937 |
1.7248 | 8600 | - | 0.8974 |
1.7449 | 8700 | - | 0.8977 |
1.7649 | 8800 | - | 0.8973 |
1.7850 | 8900 | - | 0.8989 |
1.8051 | 9000 | 0.0248 | 0.8974 |
1.8251 | 9100 | - | 0.8980 |
1.8452 | 9200 | - | 0.8970 |
1.8652 | 9300 | - | 0.8997 |
1.8853 | 9400 | - | 0.9007 |
1.9053 | 9500 | 0.0534 | 0.9009 |
1.9254 | 9600 | - | 0.9015 |
1.9454 | 9700 | - | 0.9014 |
1.9655 | 9800 | - | 0.9008 |
1.9856 | 9900 | - | 0.9024 |
2.0 | 9972 | - | 0.9052 |
2.0056 | 10000 | 0.0295 | 0.9041 |
2.0257 | 10100 | - | 0.9009 |
2.0457 | 10200 | - | 0.9030 |
2.0658 | 10300 | - | 0.9028 |
2.0858 | 10400 | - | 0.9051 |
2.1059 | 10500 | 0.027 | 0.9063 |
2.1260 | 10600 | - | 0.9059 |
2.1460 | 10700 | - | 0.9044 |
2.1661 | 10800 | - | 0.9024 |
2.1861 | 10900 | - | 0.9005 |
2.2062 | 11000 | 0.0201 | 0.8996 |
2.2262 | 11100 | - | 0.9037 |
2.2463 | 11200 | - | 0.9029 |
2.2663 | 11300 | - | 0.9047 |
2.2864 | 11400 | - | 0.9030 |
2.3065 | 11500 | 0.0097 | 0.9041 |
2.3265 | 11600 | - | 0.9011 |
2.3466 | 11700 | - | 0.9000 |
2.3666 | 11800 | - | 0.8972 |
2.3867 | 11900 | - | 0.8985 |
2.4067 | 12000 | 0.0165 | 0.8979 |
2.4268 | 12100 | - | 0.8996 |
2.4469 | 12200 | - | 0.9026 |
2.4669 | 12300 | - | 0.9034 |
2.4870 | 12400 | - | 0.9054 |
2.5070 | 12500 | 0.0165 | 0.9029 |
2.5271 | 12600 | - | 0.9052 |
2.5471 | 12700 | - | 0.9057 |
2.5672 | 12800 | - | 0.9059 |
2.5872 | 12900 | - | 0.9092 |
2.6073 | 13000 | 0.0144 | 0.9081 |
2.6274 | 13100 | - | 0.9095 |
2.6474 | 13200 | - | 0.9102 |
2.6675 | 13300 | - | 0.9113 |
2.6875 | 13400 | - | 0.9103 |
2.7076 | 13500 | 0.0159 | 0.9105 |
2.7276 | 13600 | - | 0.9073 |
2.7477 | 13700 | - | 0.9084 |
2.7677 | 13800 | - | 0.9080 |
2.7878 | 13900 | - | 0.9083 |
2.8079 | 14000 | 0.0183 | 0.9083 |
2.8279 | 14100 | - | 0.9070 |
2.8480 | 14200 | - | 0.9085 |
2.8680 | 14300 | - | 0.9078 |
2.8881 | 14400 | - | 0.9075 |
2.9081 | 14500 | 0.0257 | 0.9073 |
2.9282 | 14600 | - | 0.9098 |
2.9483 | 14700 | - | 0.9089 |
2.9683 | 14800 | - | 0.9097 |
2.9884 | 14900 | - | 0.9079 |
3.0 | 14958 | - | 0.9081 |
3.0084 | 15000 | 0.0144 | 0.9084 |
3.0285 | 15100 | - | 0.9083 |
3.0485 | 15200 | - | 0.9078 |
3.0686 | 15300 | - | 0.9079 |
3.0886 | 15400 | - | 0.9089 |
3.1087 | 15500 | 0.0082 | 0.9093 |
3.1288 | 15600 | - | 0.9098 |
3.1488 | 15700 | - | 0.9106 |
3.1689 | 15800 | - | 0.9103 |
3.1889 | 15900 | - | 0.9110 |
3.2090 | 16000 | 0.0185 | 0.9117 |
3.2290 | 16100 | - | 0.9116 |
3.2491 | 16200 | - | 0.9125 |
3.2692 | 16300 | - | 0.9111 |
3.2892 | 16400 | - | 0.9109 |
3.3093 | 16500 | 0.0105 | 0.9125 |
3.3293 | 16600 | - | 0.9117 |
3.3494 | 16700 | - | 0.9118 |
3.3694 | 16800 | - | 0.9117 |
3.3895 | 16900 | - | 0.9137 |
3.4095 | 17000 | 0.019 | 0.9134 |
3.4296 | 17100 | - | 0.9129 |
3.4497 | 17200 | - | 0.9126 |
3.4697 | 17300 | - | 0.9133 |
3.4898 | 17400 | - | 0.9136 |
3.5098 | 17500 | 0.0109 | 0.9120 |
3.5299 | 17600 | - | 0.9124 |
3.5499 | 17700 | - | 0.9122 |
3.5700 | 17800 | - | 0.9129 |
3.5901 | 17900 | - | 0.9132 |
3.6101 | 18000 | 0.0207 | 0.9139 |
3.6302 | 18100 | - | 0.9134 |
3.6502 | 18200 | - | 0.9135 |
3.6703 | 18300 | - | 0.9139 |
3.6903 | 18400 | - | 0.9141 |
3.7104 | 18500 | 0.0105 | 0.9139 |
3.7304 | 18600 | - | 0.9138 |
3.7505 | 18700 | - | 0.9136 |
3.7706 | 18800 | - | 0.9141 |
Framework Versions
- Python: 3.10.11
- Sentence Transformers: 3.4.1
- Transformers: 4.48.1
- PyTorch: 2.4.0+cu121
- Accelerate: 1.4.0
- Datasets: 3.3.2
- Tokenizers: 0.21.0
📄 License
The model is based on the Sentence Transformers framework. For the license information of Sentence Transformers, please refer to its official documentation.
📖 Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}