
M2-BERT 80M 8k Retrieval

Developed by togethercomputer
This is an 80-million-parameter M2-BERT pre-trained checkpoint with a sequence length of 8192, fine-tuned for long-context retrieval tasks.
Downloads: 198
Release date: 11/4/2023

Model Overview

Monarch Mixer-BERT (M2-BERT) replaces BERT's attention and MLP blocks with Monarch-matrix-based mixing layers, a simple GEMM-based primitive that is sub-quadratic in both sequence length and model dimension, making the architecture well suited to long-context retrieval tasks.

Model Features

Long Sequence Processing
Supports sequences of up to 8192 tokens, making it suitable for long-context retrieval tasks.
Efficient Architecture
A simple GEMM-based architecture whose compute cost scales sub-quadratically with sequence length.
Pre-training and Fine-tuning
Pre-trained and then fine-tuned for retrieval, producing 768-dimensional retrieval embeddings (see the usage sketch below).
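
A minimal usage sketch with the Hugging Face transformers library. The repository id togethercomputer/m2-bert-80M-8k-retrieval and the sentence_embedding output key follow the published M2-BERT retrieval checkpoints, but verify both against the official model card:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MAX_SEQ_LEN = 8192  # the checkpoint's maximum context length

# trust_remote_code is required because M2-BERT ships custom modeling code.
model = AutoModelForSequenceClassification.from_pretrained(
    "togethercomputer/m2-bert-80M-8k-retrieval",
    trust_remote_code=True,
)
# M2-BERT reuses the standard BERT tokenizer.
tokenizer = AutoTokenizer.from_pretrained(
    "bert-base-uncased", model_max_length=MAX_SEQ_LEN
)

def embed(text: str) -> torch.Tensor:
    """Return a (1, 768) sentence embedding for `text`."""
    inputs = tokenizer(
        [text],
        return_tensors="pt",
        padding="max_length",  # M2-BERT expects fixed-length inputs
        truncation=True,
        max_length=MAX_SEQ_LEN,
        return_token_type_ids=False,
    )
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs["sentence_embedding"]

print(embed("An example sentence.").shape)  # torch.Size([1, 768])
```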

Model Capabilities

Sentence Similarity Calculation (see the sketch after this list)
Long Text Retrieval
Embedding Generation
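
Sentence similarity then reduces to cosine similarity between two embeddings; a short sketch reusing the embed() helper defined above:

```python
import torch.nn.functional as F

a = embed("How long a context can M2-BERT handle?")
b = embed("M2-BERT supports sequences of up to 8192 tokens.")

# Cosine similarity lies in [-1, 1]; higher means more semantically similar.
score = F.cosine_similarity(a, b).item()
print(f"cosine similarity: {score:.3f}")
```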

Use Cases

Information Retrieval
Document Retrieval
Used for retrieving relevant documents from a large corpus; a ranking sketch follows this list.
Question Answering Systems
Used to retrieve candidate answer passages in question-answering systems.
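
A common retrieval pattern is to embed the corpus once, then rank documents by cosine similarity to an embedded query. A minimal sketch, again reusing the embed() helper from the first example:

```python
import torch
import torch.nn.functional as F

corpus = [
    "Monarch Mixer is a sub-quadratic architecture built from GEMMs.",
    "BERT uses quadratic self-attention over the input sequence.",
    "M2-BERT-80M produces 768-dimensional retrieval embeddings.",
]

# Embed the corpus once and stack into a (num_docs, 768) matrix.
doc_vectors = torch.cat([embed(doc) for doc in corpus], dim=0)

query_vector = embed("Which model emits 768-dimensional embeddings?")

# Rank documents by cosine similarity to the query.
scores = F.cosine_similarity(query_vector, doc_vectors)
best = scores.argmax().item()
print(f"best match ({scores[best].item():.3f}): {corpus[best]}")
```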