Dense Encoder Msmarco Distilbert Word2vec256k

Developed by vocab-transformers
A sentence encoder based on msmarco-word2vec256000-distilbert-base-uncased, which uses a 256k vocabulary initialized with word2vec. It is designed for sentence similarity tasks.
Downloads 38
Release Time: 3/2/2022

Model Overview

This model is a sentence transformer primarily used for feature extraction and sentence similarity calculation. It was trained on the MS MARCO dataset using MarginMSELoss and is suitable for scenarios like information retrieval.

Model Features

Word2vec-initialized vocabulary
Uses a 256k vocabulary whose word embeddings are initialized with word2vec, which may provide better word vector representations
Frozen word embeddings training
The word embedding matrix is frozen during training to preserve the characteristics of pre-trained word vectors
MarginMSELoss training
Trained using MarginMSELoss, which fits the score margin between a query's relevant and non-relevant passages to the margin produced by a teacher model
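
The idea behind a margin-based MSE objective can be illustrated with a small sketch. This is not the model's training code: the toy vectors, the teacher margins, and the function names are hypothetical, and the dot product is assumed as the scoring function.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def margin_mse_loss(queries, positives, negatives, teacher_margins):
    """Mean squared error between student and teacher score margins.

    For each (query, positive, negative) triplet, the student margin is
    score(query, positive) - score(query, negative). The loss penalizes
    the squared difference from the teacher's margin for that triplet.
    """
    total = 0.0
    for q, p, n, t in zip(queries, positives, negatives, teacher_margins):
        student_margin = dot(q, p) - dot(q, n)
        total += (student_margin - t) ** 2
    return total / len(teacher_margins)


# Toy embeddings (hypothetical values, not produced by the model).
queries = [[1.0, 0.0], [0.0, 1.0]]
positives = [[2.0, 0.0], [0.0, 3.0]]
negatives = [[1.0, 1.0], [1.0, 1.0]]
teacher_margins = [1.0, 4.0]  # assumed margins from a teacher model

loss = margin_mse_loss(queries, positives, negatives, teacher_margins)
print(loss)
```

In this toy case the first triplet matches the teacher margin exactly, so only the second triplet contributes to the loss.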

Model Capabilities

Sentence feature extraction
Sentence similarity calculation
Information retrieval

Use Cases

Information retrieval
Document retrieval
Can be used to build search engines that return relevant results based on semantic similarity between queries and documents
Question answering systems
Can be used to match user questions with candidate answers in a knowledge base
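
The retrieval use cases above come down to ranking candidates by the similarity of their embeddings to the query embedding. The following is a minimal sketch, assuming the embeddings have already been computed by the encoder. The vectors, document ids, and the rank_documents helper are hypothetical, and cosine similarity is one reasonable scoring choice rather than a documented requirement of this model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_documents(query_vec, doc_vecs, top_k=3):
    """Return the top_k (doc_id, score) pairs, highest score first."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:top_k]]


# Hypothetical precomputed embeddings; real ones would come from the encoder.
doc_vecs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.6, 0.8, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
query_vec = [1.0, 0.1, 0.0]

results = rank_documents(query_vec, doc_vecs, top_k=2)
for doc_id, score in results:
    print(doc_id, round(score, 3))
```

The same ranking applies to question answering: the candidate answers in the knowledge base take the place of the documents.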
Semantic matching
Duplicate question detection
Identify differently phrased but semantically similar questions
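
Duplicate question detection can be sketched as a pairwise comparison of question embeddings against a similarity threshold. The embeddings, question ids, the find_duplicates helper, and the 0.9 threshold below are all assumptions for illustration; a suitable threshold would need to be tuned on real data.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def find_duplicates(question_vecs, threshold=0.9):
    """Return (id_a, id_b, score) for every pair scoring at or above threshold."""
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(question_vecs.items(), 2):
        score = cosine(vec_a, vec_b)
        if score >= threshold:
            pairs.append((id_a, id_b, score))
    return pairs


# Hypothetical embeddings: q1 and q2 stand for two phrasings of one question.
question_vecs = {
    "q1": [0.9, 0.1, 0.0],
    "q2": [0.8, 0.2, 0.0],
    "q3": [0.0, 0.1, 0.9],
}

duplicates = find_duplicates(question_vecs, threshold=0.9)
for id_a, id_b, score in duplicates:
    print(id_a, id_b, round(score, 3))
```

Comparing every pair is quadratic in the number of questions, so this form only suits small collections.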