Scitopicnomicembed

Developed by Corran
A sentence-transformers model fine-tuned from nomic-ai/nomic-embed-text-v1.5 and optimized for topic-similarity tasks on scientific literature
Downloads 114
Release Time: 2/2/2025

Model Overview

This model maps sentences and paragraphs into a 768-dimensional dense vector space, suitable for tasks such as semantic text similarity, semantic search, and paraphrase mining, with special optimization for scientific literature topic analysis.
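As a rough illustration of the mapping described above, the following sketch loads the model through the sentence-transformers library and compares two texts by cosine similarity. The model ID `Corran/SciTopicNomicEmbed` is an assumption inferred from the developer and model names on this page, and `trust_remote_code=True` is assumed because the nomic-embed base model ships custom code; verify both before relying on them. The helper names (`load_model`, `cosine_similarity`, `demo`) are hypothetical.

```python
import math

# Assumed Hugging Face model ID (not confirmed by this page); verify before use.
MODEL_ID = "Corran/SciTopicNomicEmbed"


def load_model(model_id=MODEL_ID):
    """Load the model with sentence-transformers (imported lazily)."""
    from sentence_transformers import SentenceTransformer

    # Assumption: the nomic-embed base model needs trust_remote_code=True.
    return SentenceTransformer(model_id, trust_remote_code=True)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def demo():
    """Encode two example abstracts and print their similarity (downloads the model)."""
    model = load_model()
    texts = [
        "Graph neural networks for predicting molecular properties.",
        "Message-passing models applied to quantum chemistry datasets.",
    ]
    embeddings = model.encode(texts)  # one 768-dimensional vector per text
    print(cosine_similarity(embeddings[0], embeddings[1]))
```

Calling `demo()` downloads the model weights, so it is left uncalled here. The base model's documentation describes task prefixes for inputs; whether this fine-tune expects them is not stated on this page.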

Model Features

Long Text Processing Capability
Supports sequences up to 8192 tokens, ideal for handling long paragraphs in scientific literature
Scientific Topic Optimization
Fine-tuned on the SciTopicTriplets dataset, particularly adept at analyzing topic similarity in scientific literature
Multi-level Embeddings
Trained with MatryoshkaLoss, so embeddings can be truncated to 768, 384, 256, 128, or 64 dimensions
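The usual way to consume Matryoshka-trained embeddings is to keep the leading dimensions of the full vector and re-normalize. The sketch below shows that step on plain Python lists; the function name `truncate_embedding` is hypothetical, and the list of sizes is taken from the feature description above.

```python
import math

# Embedding sizes listed for this model.
SUPPORTED_DIMS = (768, 384, 256, 128, 64)


def truncate_embedding(embedding, dim):
    """Keep the first `dim` values of an embedding and rescale to unit length."""
    if dim not in SUPPORTED_DIMS:
        raise ValueError(f"dim must be one of {SUPPORTED_DIMS}, got {dim}")
    if len(embedding) < dim:
        raise ValueError("embedding is shorter than the requested dimension")
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero prefix cannot be normalized
    return [x / norm for x in head]
```

Smaller vectors reduce storage and speed up search, usually at some cost in accuracy. Recent sentence-transformers releases also accept a `truncate_dim` argument when constructing a `SentenceTransformer`, which may make a manual step like this unnecessary.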

Model Capabilities

Semantic Text Similarity Calculation
Scientific Literature Topic Matching
Semantic Search
Text Clustering
Feature Extraction

Use Cases

Academic Research
Literature Recommendation System
Recommends relevant literature to researchers based on content similarity
Achieved a normalized discounted cumulative gain (NDCG) of 0.5664 on the SciGen evaluation set
Research Topic Analysis
Identifies and clusters related topics in scientific literature
Information Retrieval
Scientific Literature Retrieval
Enhances semantic search functionality in scientific databases
Achieved a precision@10 score of 0.9893
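For readers unfamiliar with the two metrics quoted above, here is a minimal sketch of how precision@k and NDCG@k are commonly computed from a ranked list of relevance labels. It is not the evaluation code behind the reported scores: this page does not state the NDCG cutoff or the relevance scale, and the sketch assumes the ideal ranking is a reordering of the same list. All function names are hypothetical.

```python
import math


def precision_at_k(relevance, k):
    """Fraction of the top-k ranked items that are relevant (label > 0)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for rel in relevance[:k] if rel > 0) / k


def dcg_at_k(relevance, k):
    """Discounted cumulative gain: each label is divided by log2(position + 1)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevance[:k]))


def ndcg_at_k(relevance, k):
    """DCG divided by the DCG of the same labels sorted best-first."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    if ideal == 0.0:
        return 0.0
    return dcg_at_k(relevance, k) / ideal
```

Here `relevance` holds one label per retrieved item, in ranked order, for example `[1, 0, 1, 1]` for binary judgments. Reported scores are typically averages of these per-query values over an evaluation set.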