
MoLM-700M-4B

Developed by ibm-research
MoLM is a series of language models based on the Mixture of Experts (MoE) architecture. The 700M-4B version has a total of 4 billion parameters, with a computational cost equivalent to that of a dense model with 700 million parameters.
Downloads: 36
Release Date: 9/13/2023

Model Overview

The MoLM series of language models adopts the Mixture of Experts architecture, maintaining a high parameter count while reducing computational cost through a dynamic activation mechanism, making it suitable for text generation and understanding tasks.
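A minimal usage sketch, assuming the checkpoint is published on Hugging Face under the repository id ibm/MoLM-700M-4B and that, as a custom architecture, it requires trust_remote_code when loading; both details are assumptions rather than facts stated on this card:

```python
# Hypothetical usage sketch: the repository id and the trust_remote_code
# requirement are assumptions, not confirmed by this card.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "ibm/MoLM-700M-4B"  # assumed Hugging Face repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Mixture of Experts models reduce compute by", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```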

Model Features

Efficient Computing Architecture
Balances high parameter capacity with low computational cost through its Mixture of Experts design.
Modular Inference
Activates only a subset of expert modules per token (this model activates 4 modules); see the routing sketch after this list.
Large-scale Pretraining
Trained on 300 billion tokens of public data.
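To make the modular-inference idea concrete, the following toy sketch shows top-k expert routing, where each token is processed by only a few expert modules. The layer sizes, number of experts, and k = 4 are illustrative assumptions and do not reflect MoLM's actual implementation:

```python
# Toy top-k MoE routing sketch (illustrative only; not MoLM's real architecture).
import torch
import torch.nn as nn

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, num_experts=32, k=4):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, num_experts)   # scores every expert for each token
        self.experts = nn.ModuleList(
            [nn.Linear(d_model, d_model) for _ in range(num_experts)]
        )

    def forward(self, x):                                # x: (tokens, d_model)
        scores = self.router(x)                          # (tokens, num_experts)
        weights, idx = scores.topk(self.k, dim=-1)       # keep only the k best experts per token
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                       # run only the selected experts
            for e in idx[:, slot].unique().tolist():
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

layer = ToyMoELayer()
tokens = torch.randn(8, 64)
print(layer(tokens).shape)  # torch.Size([8, 64]); only 4 of 32 experts ran per token
```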

Model Capabilities

Text Generation
Language Understanding
Question Answering

Use Cases

Knowledge Q&A
Open-domain Q&A
Answers various common-sense questions
Achieves 16.49% accuracy in five-shot testing on TriviaQA.
Code Generation
Python Code Completion
Generates Python code snippets based on descriptions
Achieves a 20.27% pass@100 rate on the HumanEval benchmark.
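For context, pass@100 on HumanEval is conventionally computed with the unbiased estimator from the benchmark's original paper; whether the figure above was produced exactly this way is an assumption. A minimal sketch of that estimator:

```python
# Standard unbiased pass@k estimator (Chen et al., 2021):
# pass@k = 1 - C(n - c, k) / C(n, k), with n samples per problem, c of them correct.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k drawn samples passes, given c of n are correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical example: 200 generations per problem, 45 passing, estimating pass@100.
print(round(pass_at_k(200, 45, 100), 4))
```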