
DistilBERT Dot Margin-MSE T2 MSMARCO

Developed by sebastian-hofstaetter
DistilBERT-based dense retrieval model trained with knowledge distillation, suitable for passage re-ranking and direct retrieval tasks
Downloads 99
Release Time: 3/2/2022

Model Overview

This model uses a 6-layer DistilBERT architecture trained on the MSMARCO-Passage dataset with the Margin-MSE method. Queries and passages share the same encoder, and the CLS vector serves as the pooled dense representation.
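The snippet below is a minimal sketch of how the shared encoder and CLS pooling could be used with the Hugging Face transformers library; the model id is assumed from the model name and should be verified before use.

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Assumed Hugging Face Hub id based on the model name; verify before use.
MODEL_ID = "sebastian-hofstaetter/distilbert-dot-margin_mse-T2-msmarco"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
encoder = AutoModel.from_pretrained(MODEL_ID)  # one encoder serves both queries and passages

def encode(texts):
    # Tokenize and take the CLS vector (first token) as the dense representation
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = encoder(**batch)
    return out.last_hidden_state[:, 0, :]  # [batch, hidden] CLS pooling

query_vec = encode(["what is knowledge distillation"])
passage_vecs = encode([
    "Knowledge distillation transfers knowledge from a large model to a smaller one.",
    "BM25 is a lexical ranking function.",
])

# Relevance is the dot product between query and passage CLS vectors
scores = query_vec @ passage_vecs.T
print(scores)
```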

Model Features

Knowledge Distillation Training
Uses an ensemble of 3 BERT_Cat teacher models for efficient knowledge distillation via the Margin-MSE method (a minimal loss sketch follows this list)
Shared Encoding Architecture
Queries and passages share the same BERT layers, improving performance while reducing memory requirements
Lightweight Design
Based on 6-layer DistilBERT, suitable for deployment on consumer-grade GPUs
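As a rough illustration of the Margin-MSE objective mentioned above: the student is trained so that its score margin between a positive and a negative passage matches the (ensembled) teacher margin. This is a minimal PyTorch sketch with toy score tensors, not the authors' training code.

```python
import torch
import torch.nn.functional as F

def margin_mse_loss(student_pos, student_neg, teacher_pos, teacher_neg):
    # Match the student's score margin (pos - neg) to the teacher ensemble's margin
    return F.mse_loss(student_pos - student_neg, teacher_pos - teacher_neg)

# Toy dot-product scores for (query, positive) and (query, negative) pairs
s_pos, s_neg = torch.tensor([12.1, 9.8]), torch.tensor([7.3, 8.0])
t_pos, t_neg = torch.tensor([14.0, 10.5]), torch.tensor([6.9, 7.7])  # averaged teacher scores
print(margin_mse_loss(s_pos, s_neg, t_pos, t_neg))
```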

Model Capabilities

Passage Retrieval
Candidate Set Re-ranking
Semantic Similarity Calculation

Use Cases

Information Retrieval
Search Engine Result Re-ranking
Re-ranks top-1000 results from traditional retrieval methods like BM25
Achieves MRR@10 of 0.332 on MSMARCO-DEV
Direct Dense Retrieval
Direct passage retrieval based on vector indexing (see the indexing sketch after this list)
Achieves Recall@1K of 0.957 on MSMARCO-DEV
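For the dense-retrieval use case, a minimal sketch of indexing passage vectors with FAISS is shown below. The model card does not prescribe a specific index; an exact inner-product index is assumed here to match the dot-product training objective, and random vectors stand in for encoded passages.

```python
import faiss  # pip install faiss-cpu
import numpy as np

d = 768  # DistilBERT hidden size; real vectors would come from the CLS-pooling encoder above
passage_vecs = np.random.rand(1000, d).astype("float32")  # stand-in for an encoded corpus
query_vec = np.random.rand(1, d).astype("float32")        # stand-in for an encoded query

index = faiss.IndexFlatIP(d)  # exact inner-product (dot) search, matching the training objective
index.add(passage_vecs)

scores, ids = index.search(query_vec, k=10)  # top-10 passages by dot-product score
print(ids[0], scores[0])
```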