MedCPT Article Encoder

Developed by NCBI
MedCPT is a model capable of generating biomedical text embeddings, particularly suitable for semantic search (dense retrieval) tasks.
Downloads 14.37k
Release Date: 10/24/2023

Model Overview

MedCPT includes two encoders: a query encoder and an article encoder. This model is the article encoder, used to compute embeddings for biomedical articles (such as PubMed titles and abstracts).
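A minimal sketch of computing article embeddings is shown below. It assumes the model is published on the Hugging Face Hub under the identifier ncbi/MedCPT-Article-Encoder and that the embedding is read from the first ([CLS]) token of the last hidden state; neither detail is stated on this page, so treat both as assumptions to verify against the official model card. The helper names and the sample record are hypothetical.

```python
def build_inputs(records):
    """Turn {'title', 'abstract'} dicts into [title, abstract] pairs.

    A missing abstract becomes an empty string (an illustrative choice).
    """
    return [[r["title"], r.get("abstract", "")] for r in records]


def embed_articles(pairs, model_name="ncbi/MedCPT-Article-Encoder", max_length=512):
    """Return one embedding per [title, abstract] pair.

    Assumptions: the Hub identifier above, and [CLS]-token pooling.
    Imports are local so the helper above works without torch installed.
    """
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    with torch.no_grad():
        # Each [title, abstract] pair is tokenized as a two-segment input.
        encoded = tokenizer(
            pairs,
            truncation=True,
            padding=True,
            return_tensors="pt",
            max_length=max_length,
        )
        # Take the first token's hidden state as the article embedding.
        return model(**encoded).last_hidden_state[:, 0, :]


# Hypothetical record, for illustration only.
records = [{"title": "Example article title", "abstract": "Example abstract text."}]
pairs = build_inputs(records)

if __name__ == "__main__":
    # Downloads the model weights on first use.
    embeddings = embed_articles(pairs)
    print(embeddings.shape)
```

Queries would be embedded the same way with the separate query encoder, so that both sides land in a shared vector space.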

Model Features

Large-scale Pre-training
Pre-trained on 255 million query-article pairs from PubMed search logs
Excellent Zero-shot Performance
Achieves state-of-the-art performance on multiple zero-shot biomedical information retrieval datasets
Dual-encoder Architecture
Provides separate query and article encoders, each suited to its own input type
Pre-computed Embeddings Available
Pre-computed embeddings for all PubMed articles are publicly available

Model Capabilities

Biomedical Text Embedding Generation
Semantic Similarity Calculation
Zero-shot Information Retrieval
Text Clustering

Use Cases

Information Retrieval
PubMed Article Search
Uses the query encoder and the article encoder together for query-to-article search
Performs well on biomedical information retrieval tasks
Text Analysis
Article Clustering
Uses embeddings from the article encoder to cluster similar articles
Query Analysis
Uses embeddings from the query encoder to analyze query intent
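The search and clustering use cases above can be sketched with plain vector arithmetic once embeddings exist. The example below assumes inner-product scoring for query-to-article ranking and uses a simple greedy cosine-threshold grouping as a stand-in for a real clustering algorithm; the grouping is an illustrative choice, not a method described on this page. The function names, the threshold, and the 2-D toy vectors are all hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_articles(query_vec, article_vecs, top_k=3):
    """Return (article_index, score) pairs, highest inner-product score first."""
    scored = [(idx, dot(query_vec, vec)) for idx, vec in enumerate(article_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def cosine(u, v):
    """Cosine similarity; assumes neither vector is all zeros."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def group_similar(article_vecs, threshold=0.9):
    """Greedy grouping: an article joins the first group whose seed article
    has cosine similarity >= threshold, otherwise it starts a new group."""
    groups = []  # each group is a list of article indices; index 0 is the seed
    for idx, vec in enumerate(article_vecs):
        for group in groups:
            if cosine(vec, article_vecs[group[0]]) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups


# Toy 2-D vectors standing in for real embeddings (illustrative only).
query = [1.0, 0.0]
articles = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5], [0.9, 0.1]]

ranking = rank_articles(query, articles, top_k=2)
groups = group_similar(articles, threshold=0.95)
```

For a collection the size of PubMed, an approximate nearest-neighbor index and a library clustering routine would likely replace these loops; the sketch only shows the shape of the computation.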