
Bert Base 1024 Biencoder 64M Pairs

Developed by shreyansh26
A long-context bi-encoder built on MosaicML's pre-trained BERT with a 1024-token sequence length, producing sentence and paragraph embeddings.
Downloads: 19
Release date: 8/22/2023

Model Overview

This model maps sentences and paragraphs into a 768-dimensional dense vector space, suitable for tasks like clustering or semantic search.
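Embeddings from this vector space are typically compared with cosine similarity. Below is a minimal sketch of the usual pooling-and-scoring step using NumPy; note that the pooling strategy (mean pooling over non-padding tokens) is an assumption for illustration, not a confirmed detail of this model, and the random arrays stand in for real encoder hidden states.

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over non-padding positions.

    token_embeddings: (seq_len, 768) hidden states from the encoder
    attention_mask:   (seq_len,) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, None].astype(np.float64)
    return (token_embeddings * mask).sum(axis=0) / max(mask.sum(), 1e-9)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy demo: random "hidden states" of shape (seq_len=6, dim=768),
# where the last two positions are padding.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(6, 768))
mask = np.array([1, 1, 1, 1, 0, 0])
emb = mean_pool(hidden, mask)
print(emb.shape)                      # (768,)
print(cosine_similarity(emb, emb))    # close to 1.0
```

In a real pipeline, `hidden` would be the encoder's last hidden state for a tokenized input, and two texts would be compared via `cosine_similarity` of their pooled vectors.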

Model Features

Long-context support
Supports a sequence length of 1024 tokens, making it suitable for processing long documents and paragraphs
Large-scale training
Trained on 64M randomly sampled sentence/paragraph pairs
Efficient retrieval
Optimized for semantic search and information retrieval tasks

Model Capabilities

Sentence embeddings
Paragraph embeddings
Semantic similarity computation
Information retrieval
Document clustering

Use Cases

Information retrieval
Semantic search
Building semantic retrieval functionality for search engines
Performs well on multiple retrieval benchmarks
Question answering systems
Used to retrieve the most relevant document passages for questions
Text analysis
Document clustering
Grouping documents with similar content
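The retrieval use cases above all reduce to nearest-neighbor search over the embedding space: embed the corpus once, embed the query, and rank documents by cosine similarity. A brute-force sketch with NumPy follows; the random vectors are hypothetical stand-ins for the bi-encoder's 768-dimensional embeddings.

```python
import numpy as np

def top_k(query_vec: np.ndarray, corpus_vecs: np.ndarray, k: int = 3) -> list:
    """Return indices of the k corpus vectors most similar to the query.

    Vectors are L2-normalized first, so the dot product equals cosine
    similarity; a matrix-vector product then scores the whole corpus at once.
    """
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    scores = c @ q
    return list(np.argsort(-scores)[:k])

# Hypothetical demo: 100 random "document embeddings" and a query that is
# a slightly perturbed copy of document 42, so it should rank first.
rng = np.random.default_rng(1)
corpus = rng.normal(size=(100, 768))
query = corpus[42] + 0.01 * rng.normal(size=768)
print(top_k(query, corpus, k=3)[0])  # 42
```

For large corpora, the same idea is usually served by an approximate nearest-neighbor index (e.g. FAISS) rather than a full matrix product per query.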