Lodestone Base 4096 V1

Developed by Hum-Works
A sentence-transformers model developed by Hum that embeds long text of up to 4096 tokens, suitable for semantic search and clustering tasks
Downloads 132
Release Time: 8/25/2023

Model Overview

A long-text encoder based on the Transformer architecture that integrates FlashAttention, ALiBi (Attention with Linear Biases), and GLU (Gated Linear Units), and maps sentences and paragraphs to a 768-dimensional vector space
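As a rough illustration of how the encoder might be used, the sketch below loads the model through the sentence-transformers library and checks that each embedding has the 768 dimensions stated above. The repository ID `Hum-Works/lodestone-base-4096-v1` and the `trust_remote_code=True` flag are assumptions not stated on this page, and the helper names `load_encoder` and `embed` are hypothetical.

```python
from typing import Callable, Iterable, List, Sequence

EMBEDDING_DIM = 768  # vector size stated in the model overview


def load_encoder(model_id: str = "Hum-Works/lodestone-base-4096-v1"):
    """Load the model. The model ID and trust_remote_code are assumptions."""
    # Imported lazily so the rest of the sketch runs without the library installed.
    from sentence_transformers import SentenceTransformer

    # Assumption: the custom attention layers ship as remote code in the repo.
    return SentenceTransformer(model_id, trust_remote_code=True)


def embed(
    texts: Sequence[str],
    encode: Callable[[List[str]], Iterable[Iterable[float]]],
) -> List[List[float]]:
    """Encode texts and verify that every vector has EMBEDDING_DIM entries."""
    vectors = [[float(x) for x in row] for row in encode(list(texts))]
    for vector in vectors:
        if len(vector) != EMBEDDING_DIM:
            raise ValueError(f"expected {EMBEDDING_DIM} dims, got {len(vector)}")
    return vectors
```

With the library installed, this could be called as `embed(["first document", "second document"], load_encoder().encode)`. Passing the encode function in separately keeps the dimension check testable without downloading the model.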

Model Features

Ultra-long Context Support
ALiBi extends the input length to 4096 tokens, making the model suitable for processing long documents
Efficient Attention Mechanism
Integrates FlashAttention for computational efficiency and automatically uses the high-performance Triton implementation when it is available
Lightweight Design
Runs on either GPU or CPU, balancing performance and resource consumption
Multi-source Training Data
Fine-tuned on 1.5 billion sentence pairs from multiple domains (academic, Q&A, community discussions, etc.)
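
The ALiBi mechanism named under Ultra-long Context Support can be sketched as follows. Instead of learned position embeddings, each attention head adds a penalty to its attention scores that grows linearly with the distance between two tokens, which is what allows the input length to be extended. This is a minimal sketch, not the model's actual implementation: the symmetric (bidirectional) form of the bias and the power-of-two head count are assumptions, and the function names are hypothetical.

```python
from typing import List


def alibi_slopes(num_heads: int) -> List[float]:
    """Per-head slopes 2^(-8/n), 2^(-16/n), ..., for a power-of-two head count n."""
    if num_heads <= 0 or num_heads & (num_heads - 1):
        raise ValueError("this sketch only handles a power-of-two number of heads")
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


def alibi_bias(seq_len: int, slope: float) -> List[List[float]]:
    """Bias added to attention scores: -slope * |i - j| for query i and key j.

    Assumption: an encoder attends in both directions, so the distance is
    symmetric rather than causal.
    """
    return [[-slope * abs(i - j) for j in range(seq_len)] for i in range(seq_len)]


if __name__ == "__main__":
    slopes = alibi_slopes(8)
    # Nearby tokens receive a small penalty, distant tokens a larger one.
    for row in alibi_bias(4, slopes[0]):
        print(row)
```

Because the bias depends only on relative distance, the same formula applies at any sequence length, with no position table to resize.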

Model Capabilities

Text Vectorization
Semantic Similarity Calculation
Information Retrieval
Text Clustering
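
Semantic similarity and retrieval over embeddings usually reduce to comparing vectors. The sketch below uses cosine similarity, a common choice for sentence embeddings, though this page does not state which metric the model was trained for. The toy 2-dimensional vectors stand in for real 768-dimensional embeddings, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query: Sequence[float], documents: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query, d)) for i, d in enumerate(documents)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    query = [1.0, 0.0]
    documents = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    for index, score in rank_documents(query, documents):
        print(index, round(score, 3))
```

The same ranking function covers information retrieval (query against documents) and similar-question matching (question against questions); clustering would typically feed the same similarity scores into a grouping algorithm.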

Use Cases

Knowledge Management
Academic Literature Retrieval
Embeddings trained on S2ORC data can be used for paper recommendation systems
Community Content Processing
Q&A Pair Matching
Identify similar questions on platforms like StackExchange