
Cerebras-GPT 590M

Developed by Cerebras
Cerebras-GPT 590M is a Transformer-based language model in the Cerebras-GPT model family. The family was developed to study scaling laws for large language models and to demonstrate the simplicity and scalability of training large language models on the Cerebras software and hardware stack.
Downloads: 2,430
Release date: 3/20/2023

Model Overview

Cerebras-GPT 590M is a GPT-3 style language model with 590M parameters, intended primarily for natural language processing tasks such as text generation and language understanding.
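For reference, a minimal generation sketch using the Hugging Face transformers library, assuming the checkpoint is available on the Hub under the cerebras/Cerebras-GPT-590M ID (verify the ID before use):

```python
# Minimal sketch: load Cerebras-GPT 590M with Hugging Face transformers
# and generate a short continuation. Assumes the checkpoint is hosted
# on the Hub as "cerebras/Cerebras-GPT-590M".
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cerebras/Cerebras-GPT-590M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Generative AI is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,   # length of the continuation
    do_sample=True,      # sample rather than greedy-decode
    top_k=50,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```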

Model Features

Compute-optimal training
Trained per the Chinchilla scaling law at roughly 20 tokens per model parameter, making the training run compute-optimal (a worked token-budget calculation follows this list)
Efficient training architecture
Trained on the Andromeda AI supercomputer, leveraging Cerebras' weight streaming technology for efficient scaling
Rich model family
Offers models at multiple scales, from 111M to 13B parameters, to match different compute budgets
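As a quick illustration of the 20-tokens-per-parameter rule above, the compute-optimal training budget for the 590M model works out to roughly 11.8B tokens; a back-of-the-envelope sketch, not a figure quoted from the model card:

```python
# Back-of-the-envelope Chinchilla budget: ~20 training tokens per parameter.
params = 590e6            # Cerebras-GPT 590M parameter count
tokens_per_param = 20     # ratio from the Chinchilla scaling law
optimal_tokens = params * tokens_per_param
print(f"Compute-optimal budget: {optimal_tokens / 1e9:.1f}B tokens")
# Prints: Compute-optimal budget: 11.8B tokens
```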

Model Capabilities

Text generation
Language understanding
Zero-shot learning
Five-shot learning (see the prompt sketch after this list)
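To make the five-shot setting concrete, here is a hypothetical prompt-construction sketch; the task, examples, and labels are invented for illustration, and the resulting string would be fed to the generation code shown earlier:

```python
# Hypothetical five-shot sentiment prompt: five labeled examples followed
# by the query. Task and examples are illustrative, not from the model card.
shots = [
    ("The movie was wonderful.", "positive"),
    ("I hated every minute of it.", "negative"),
    ("Great food and friendly service.", "positive"),
    ("The product broke on day one.", "negative"),
    ("A delightful surprise from start to finish.", "positive"),
]
query = "The battery life is terrible."
prompt = "".join(f"Review: {text}\nSentiment: {label}\n\n" for text, label in shots)
prompt += f"Review: {query}\nSentiment:"
# Pass `prompt` to tokenizer/model.generate() as in the earlier snippet.
```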

Use Cases

Research
Scaling-law research
Used to study how language model performance scales with model size
Natural language processing
Text generation
Generates coherent English text