
SciNCL

Developed by malteos
SciNCL is a pre-trained BERT-based language model that generates document-level embeddings of research papers. It is trained with contrastive learning, using neighborhood relationships from the citation graph to select training samples.
Downloads 6,744
Release Time: 3/2/2022

Model Overview

This model is designed to generate embeddings of scientific literature. By optimizing document-level semantic representations through contrastive learning, it is well suited to academic paper similarity computation and recommendation systems.

Model Features

Citation Graph Enhanced Training
Utilizes neighborhood relationships from the S2ORC citation graph to generate contrastive learning samples, improving document representation quality.
Scientific Domain Optimization
Designed specifically for scientific literature, with strong performance on the SciDocs evaluation benchmark.
Dual Text Encoding
Supports joint encoding of titles and abstracts, joined with the [SEP] token (see the encoding sketch below).
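
A minimal sketch of this encoding pattern with the Hugging Face transformers library, assuming the checkpoint is available on the Hub as malteos/scincl; the paper titles and abstracts below are placeholders:

```python
from transformers import AutoTokenizer, AutoModel
import torch

# Load the pre-trained SciNCL checkpoint (assumed Hub ID: malteos/scincl).
tokenizer = AutoTokenizer.from_pretrained("malteos/scincl")
model = AutoModel.from_pretrained("malteos/scincl")

# Placeholder papers: each document is encoded as "title [SEP] abstract".
papers = [
    {"title": "Example paper A", "abstract": "Abstract text of paper A..."},
    {"title": "Example paper B", "abstract": "Abstract text of paper B..."},
]
texts = [p["title"] + tokenizer.sep_token + p["abstract"] for p in papers]

inputs = tokenizer(texts, padding=True, truncation=True,
                   max_length=512, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Use the first ([CLS]) token embedding as the document-level representation.
embeddings = outputs.last_hidden_state[:, 0, :]  # shape: (num_papers, hidden_size)
print(embeddings.shape)
```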

Model Capabilities

Scientific literature embedding representation generation
Document similarity computation
Academic paper recommendation

Use Cases

Academic Research
Related Paper Discovery
Find research literature related to a given paper through embedding similarity (see the similarity sketch after these use cases).
Achieved 93.6 MAP on the citation-relation tasks of the SciDocs evaluation.
Academic Recommendation System
Build a content-based paper recommendation system.
Achieved 54.3 nDCG on the recommendation task.
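
A sketch of the similarity step used for related-paper discovery, assuming embeddings were produced as in the encoding example above; cosine similarity is one common choice here, not necessarily the exact scoring used in the original evaluation:

```python
import torch
import torch.nn.functional as F

# Document embeddings as produced by the encoding sketch: (num_papers, hidden_size).
# Random values stand in for real embeddings here.
embeddings = torch.randn(5, 768)

# L2-normalize so the dot product equals cosine similarity.
normed = F.normalize(embeddings, p=2, dim=1)
similarity = normed @ normed.T          # (num_papers, num_papers)

# For a query paper (index 0), rank the other papers by similarity.
query_idx = 0
scores = similarity[query_idx].clone()
scores[query_idx] = float("-inf")       # exclude the query itself
top_k = torch.topk(scores, k=3)
print(top_k.indices.tolist(), top_k.values.tolist())
```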
Literature Analysis
Research Trend Analysis
Analyze disciplinary development trends through large-scale literature embedding clustering.
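
A sketch of embedding clustering for this kind of analysis, assuming document embeddings have already been computed; the use of scikit-learn's KMeans and the cluster count are illustrative choices, not part of the original model:

```python
import numpy as np
from sklearn.cluster import KMeans

# Document embeddings from SciNCL, stacked into a (num_papers, hidden_size) matrix.
# Random values stand in for real embeddings here.
embeddings = np.random.rand(1000, 768).astype(np.float32)

# Cluster papers into topical groups; k=20 is an arbitrary illustrative choice.
kmeans = KMeans(n_clusters=20, n_init=10, random_state=0)
labels = kmeans.fit_predict(embeddings)

# Cluster sizes give a coarse view of how papers distribute across topics;
# tracking these counts per publication year would surface trends over time.
unique, counts = np.unique(labels, return_counts=True)
print(dict(zip(unique.tolist(), counts.tolist())))
```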