sentence-t5-large-quora-text-similarity Open Source Model - Free Implementation of Sentence and Paragraph Clustering and Semantic Search

Sentence T5 Large Quora Text Similarity

Developed by DrishtiSharma

This is a model based on sentence-transformers that maps sentences and paragraphs into a 768-dimensional dense vector space, suitable for tasks such as clustering and semantic search.

Text Embedding

PyTorch

#Sentence Vectorization #Semantic Search #768-Dimensional Embedding

Downloads 103

Release Time: 9/3/2023

Model Overview

The model converts sentences and paragraphs into semantic embedding vectors for natural language processing tasks such as information retrieval and text similarity calculation.

Model Features

High-Quality Sentence Embeddings

Generates 768-dimensional sentence embeddings that capture the semantic content of a sentence.

Semantic Similarity Calculation

Fine-tuned with a contrastive loss (OnlineContrastiveLoss, see Training) to score semantic similarity between sentence pairs.

Easy Integration

Can be easily integrated into existing systems through the sentence-transformers library.

Model Capabilities

Sentence vectorization

Semantic similarity calculation

Text feature extraction

Information retrieval

Text clustering

Use Cases

Information Retrieval

Semantic Search

Uses sentence embeddings to match queries and documents by meaning rather than by exact keywords.

Enhances the relevance of search results.
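A minimal sketch of how semantic search over embeddings can work, assuming the vectors have already been produced by the model. The toy 3-dimensional vectors stand in for the real 768-dimensional embeddings, and the function names `cosine_similarity` and `semantic_search` are hypothetical helpers, not part of any library.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def semantic_search(query_vec, corpus_vecs, top_k=2):
    """Return (index, score) pairs for the top_k corpus vectors closest to the query."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(corpus_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy stand-ins for embeddings; real vectors would come from model.encode(...)
corpus = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.1],
]
query = [1.0, 0.0, 0.0]

hits = semantic_search(query, corpus, top_k=2)
print(hits)  # corpus index 0 ranks first, then index 2
```

In practice the corpus would be encoded once and stored, so only the query has to be encoded at search time.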

Text Analysis

Document Clustering

Automatically groups documents based on semantic similarity.

Achieves unsupervised document classification.
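One simple way to group documents by embedding similarity is greedy threshold clustering, sketched below. This is an illustrative stand-in, not a method named in the source: the `greedy_cluster` helper, the 0.8 threshold, and the toy 2-dimensional vectors are all assumptions. A production setup would more likely pass real embeddings to an established clustering algorithm.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def greedy_cluster(vectors, threshold=0.8):
    """Put each vector into the first cluster whose seed (first member) is at
    least `threshold` similar to it; otherwise start a new cluster."""
    clusters = []  # each cluster is a list of indices; index 0 of the list is the seed
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

# Toy stand-ins for document embeddings
docs = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
]

clusters = greedy_cluster(docs, threshold=0.8)
print(clusters)  # [[0, 1], [2, 3]]
```

The result depends on input order and on the threshold, which is the usual trade-off for this kind of single-pass grouping.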

🚀 sentence-t5-large-quora-text-similarity

This is a sentence-transformers model that maps sentences and paragraphs to a 768-dimensional dense vector space. It can be used for tasks such as clustering or semantic search.

🚀 Quick Start

With sentence-transformers installed, the model takes only a few lines to use. First, install the library:

pip install -U sentence-transformers

Then you can use the model like this:

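from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]

# '{MODEL_NAME}' is an unfilled template placeholder in the source;
# replace it with this model's repository ID before running.
model = SentenceTransformer('{MODEL_NAME}')
embeddings = model.encode(sentences)  # one 768-dimensional vector per sentence
print(embeddings)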
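Because the architecture ends with a Normalize() layer (see Full Model Architecture), the embeddings should have unit length, so a plain dot product gives the cosine similarity. The sketch below illustrates this with toy 2-dimensional vectors. The `is_duplicate` helper and its 0.8 threshold are hypothetical; a real threshold would need to be tuned on labeled pairs.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def dot(a, b):
    """Dot product; equals cosine similarity when both inputs have unit length."""
    return sum(x * y for x, y in zip(a, b))

def is_duplicate(vec_a, vec_b, threshold=0.8):
    """Hypothetical decision rule: treat a pair as duplicates above the threshold."""
    return dot(vec_a, vec_b) >= threshold

# Toy stand-ins for normalized sentence embeddings
a = l2_normalize([3.0, 4.0])
b = l2_normalize([4.0, 3.0])
c = l2_normalize([-4.0, 3.0])

print(dot(a, b), is_duplicate(a, b))  # similarity close to 0.96, above the threshold
print(dot(a, c), is_duplicate(a, c))  # similarity close to 0, below the threshold
```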

✨ Features

Maps sentences and paragraphs to a 768-dimensional dense vector space.
Can be used for tasks like clustering or semantic search.

📚 Documentation

Evaluation Results

For an automated evaluation of this model, see the Sentence Embeddings Benchmark: https://seb.sbert.net

Training

The model was trained with the parameters:

DataLoader: torch.utils.data.dataloader.DataLoader of length 35381 with parameters:

{'batch_size': 8, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
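The DataLoader length and batch size bound the size of the training set, which the source does not state directly. The arithmetic below is a rough sketch, assuming `drop_last` was left at the PyTorch default of False, so that only the final batch can be partially filled.

```python
# Values reported in the Training section
num_batches = 35381  # DataLoader length
batch_size = 8

# Assumption: drop_last=False, so every batch is full except possibly the last,
# which holds between 1 and batch_size examples.
max_examples = num_batches * batch_size
min_examples = (num_batches - 1) * batch_size + 1

print(min_examples, max_examples)  # 283041 283048
```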

Loss: sentence_transformers.losses.OnlineContrastiveLoss.OnlineContrastiveLoss
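The idea behind an online contrastive loss is to train only on the hard pairs in each batch: positives that are farther apart than the closest negative, and negatives that are closer than the farthest positive. The following is a simplified sketch of that idea on precomputed pair distances, not the library's implementation. The margin of 0.5 and the example distances are assumptions; the source does not report the margin or the distance metric used.

```python
def online_contrastive_loss(distances, labels, margin=0.5):
    """Simplified contrastive loss over the hard pairs in a batch.

    distances: distance between the two sentences of each pair
    labels:    1 for similar (positive) pairs, 0 for dissimilar (negative) pairs
    """
    pos = [d for d, y in zip(distances, labels) if y == 1]
    neg = [d for d, y in zip(distances, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("this sketch expects at least one positive and one negative pair")

    # Hard positives: similar pairs that lie farther apart than the closest negative
    hard_pos = [d for d in pos if d > min(neg)]
    # Hard negatives: dissimilar pairs that lie closer than the farthest positive
    hard_neg = [d for d in neg if d < max(pos)]

    # Pull hard positives together, push hard negatives beyond the margin
    pos_loss = sum(d ** 2 for d in hard_pos)
    neg_loss = sum(max(0.0, margin - d) ** 2 for d in hard_neg)
    return pos_loss + neg_loss

# Hypothetical batch of four pairs
distances = [0.1, 0.6, 0.3, 0.9]
labels = [1, 1, 0, 0]
loss = online_contrastive_loss(distances, labels)
print(loss)
```

In this batch only the positive pair at distance 0.6 and the negative pair at distance 0.3 contribute; the two easy pairs are ignored.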

Parameters of the fit()-Method:

{
    "epochs": 3,
    "evaluation_steps": 0,
    "evaluator": "sentence_transformers.evaluation.BinaryClassificationEvaluator.BinaryClassificationEvaluator",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
    "optimizer_params": {
        "lr": 2e-05
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 20,
    "weight_decay": 0.01
}
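To make the "WarmupLinear" entry concrete, the sketch below computes the learning rate at a given step under a linear warmup followed by linear decay to zero. It assumes that, with "steps_per_epoch" set to null, the total number of steps is the DataLoader length times the number of epochs (35381 × 3). The function name `warmup_linear_lr` is hypothetical.

```python
def warmup_linear_lr(step, base_lr=2e-5, warmup_steps=20, total_steps=35381 * 3):
    """Learning rate at `step`: linear ramp from 0 to base_lr over the warmup,
    then linear decay from base_lr to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Inspect the schedule at a few points
for step in (0, 10, 20, 35381 * 3):
    print(step, warmup_linear_lr(step))
```

With only 20 warmup steps out of roughly 106,000, the schedule is effectively a linear decay from the peak rate of 2e-05.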

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 40, 'do_lower_case': False}) with Transformer model: T5EncoderModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
  (2): Dense({'in_features': 1024, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Normalize()
)
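The last three modules can be read as a small pipeline: average the token vectors, project them with a bias-free linear layer, and scale the result to unit length. The sketch below walks through those steps in plain Python with toy sizes (4 in place of 1024, 3 in place of 768) and made-up weights. It omits the T5 encoder and the attention mask that real mean pooling uses to skip padding tokens.

```python
import math

def mean_pool(token_vectors):
    """Average the token vectors column by column (mean pooling)."""
    n = len(token_vectors)
    return [sum(column) / n for column in zip(*token_vectors)]

def dense_no_bias(vector, weight):
    """Linear projection without bias; weight has shape (out_features, in_features)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weight]

def l2_normalize(vector):
    """Scale a vector to unit length, as a Normalize() step would."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Two token vectors of size 4 (stand-in for encoder outputs of size 1024)
tokens = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 2.0, 1.0, 0.0],
]
# Made-up 3x4 weight matrix (stand-in for the 768x1024 projection)
weight = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]

pooled = mean_pool(tokens)                  # [2.0, 2.0, 2.0, 2.0]
projected = dense_no_bias(pooled, weight)   # [2.0, 2.0, 4.0]
embedding = l2_normalize(projected)         # unit-length vector of size 3
print(embedding)
```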

📄 License

No license information provided in the original document.

🔧 Technical Details

The model chains four modules: a T5 encoder (T5EncoderModel) with a maximum sequence length of 40, mean pooling over the token embeddings, a bias-free linear layer that projects from 1024 to 768 dimensions, and a final normalization step. It was fine-tuned for 3 epochs with OnlineContrastiveLoss, using AdamW (learning rate 2e-05, weight decay 0.01) and a WarmupLinear schedule with 20 warmup steps, as listed in the Training section.
