
Mamba-7B-RW

Developed by TRI-ML
Mamba-7B is a 7-billion-parameter model based on the Mamba architecture, trained for multiple epochs (1.2 trillion tokens) on the RefinedWeb dataset. Mamba is a state space model that does not use self-attention and performs strongly on standard natural-language benchmarks.
Downloads: 188
Release date: 4/8/2024

Model Overview

Mamba-7B is an autoregressive language model based on the Mamba architecture, designed for text generation. It was trained on 1.2 trillion tokens of the RefinedWeb dataset and supports English.

Model Features

Mamba Architecture
Mamba is a state space model (SSM) that replaces self-attention with a recurrent state-space update, giving linear time complexity in sequence length and efficient inference (a minimal sketch of the recurrence follows this feature list).
Large-Scale Training Data
Trained on 1.2 trillion tokens of the RefinedWeb dataset, a large filtered and deduplicated English web corpus.
Efficient Inference
Because the recurrent state has a fixed size, the model needs no growing key-value cache during generation, so per-token inference cost and memory stay constant regardless of context length.
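
To make the linear-time claim concrete, the sketch below implements the basic discretized state-space recurrence h_t = A·h_{t-1} + B·x_t, y_t = C·h_t that Mamba-style layers build on. The parameter names and shapes here are illustrative assumptions, not the model's actual configuration; real Mamba layers make the parameters input-dependent ("selective") and compute the scan with a fused hardware-efficient kernel.

```python
import numpy as np

def ssm_scan(A, B, C, x):
    """Run the recurrence h_t = A @ h_{t-1} + B * x_t, y_t = C @ h_t.

    A: (N, N) state transition; B, C: (N,) input/output projections;
    x: (T,) input sequence. One fixed-size state update per token gives
    O(T) total cost, with no attention cache growing in T.
    """
    h = np.zeros(A.shape[0])   # fixed-size recurrent state
    y = np.empty_like(x, dtype=float)
    for t, x_t in enumerate(x):
        h = A @ h + B * x_t    # fold the new token into the state
        y[t] = C @ h           # read out this token's output
    return y

# Toy usage with a stable random system (illustrative values only).
rng = np.random.default_rng(0)
N = 4
A = 0.9 * np.eye(N)            # contraction keeps the state bounded
B, C = rng.standard_normal(N), rng.standard_normal(N)
print(ssm_scan(A, B, C, rng.standard_normal(8)))
```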

Model Capabilities

Text Generation (a loading-and-generation sketch follows this list)
Natural Language Understanding
Question Answering
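
The sketch below shows one way to load the model and generate text. It assumes the checkpoint is published on the Hugging Face Hub as tri-ml/mamba-7b-rw and loads through transformers' AutoModelForCausalLM with trust_remote_code=True; the repo id and loading path are assumptions, so check the official model card, since Mamba checkpoints have also been distributed through dedicated packages.

```python
# Hedged sketch: the repo id and trust_remote_code loading path are
# assumptions; consult the official model card for the supported path.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "tri-ml/mamba-7b-rw"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)

inputs = tokenizer("The Mamba architecture is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```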

Use Cases

Natural Language Processing
Text Generation
Generates coherent, contextually relevant text, suitable for content creation, dialogue systems, and similar applications.
Question Answering
Answers user queries, applicable in customer service, education, and other fields.
Achieves 33.3% accuracy on the MMLU dataset.