
M2-BERT 80M 2k Retrieval

Developed by togethercomputer
This is an 80M-parameter M2-BERT checkpoint pre-trained with a sequence length of 2048 and fine-tuned for long-context retrieval tasks.
Downloads: 538
Release date: 11/13/2023

Model Overview

Monarch Mixer-BERT (M2-BERT) is a GEMM-based, sub-quadratic architecture. This checkpoint is optimized for long-context retrieval and generates high-quality embedding vectors for information retrieval.

Model Features

Long-sequence processing: supports input sequences of up to 2048 tokens, suitable for long text content.
Efficient retrieval: fine-tuned for retrieval tasks, producing high-quality 768-dimensional embedding vectors.
Sub-quadratic architecture: uses the Monarch Mixer architecture, achieving efficient GEMM-based computation.
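The checkpoint can be loaded through Hugging Face `transformers`. The sketch below follows the usage pattern published for M2-BERT retrieval checkpoints; the `trust_remote_code=True` flag (the Monarch Mixer code ships with the checkpoint), the `bert-base-uncased` tokenizer, and the `sentence_embedding` output key are assumptions taken from that pattern and should be verified against the current model card.

```python
def embed(text: str, max_seq_length: int = 2048):
    """Return a sentence embedding for `text` (sketch; 768-dim for this model)."""
    # transformers is imported lazily so the sketch can be read and reused
    # without the heavyweight dependencies installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model = AutoModelForSequenceClassification.from_pretrained(
        "togethercomputer/m2-bert-80M-2k-retrieval",
        trust_remote_code=True,  # assumed: custom Monarch Mixer code ships with the checkpoint
    )
    tokenizer = AutoTokenizer.from_pretrained(
        "bert-base-uncased", model_max_length=max_seq_length
    )
    inputs = tokenizer(
        text,
        return_tensors="pt",
        padding="max_length",
        truncation=True,
        max_length=max_seq_length,
    )
    outputs = model(**inputs)
    return outputs["sentence_embedding"]  # assumed output key; shape (1, 768)


if __name__ == "__main__":
    vec = embed("Monarch Mixer scales sub-quadratically in sequence length.")
    print(vec.shape)
```

The first call downloads the checkpoint, so it is kept behind the `__main__` guard.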

Model Capabilities

Long-text embedding generation
Sentence similarity calculation
Information retrieval

Use Cases

Information retrieval
Document retrieval: build document retrieval systems that find relevant documents for a given query.
Semantic search: supports search based on meaning rather than keyword matching.