
Cerebras-GPT 13B

Developed by Cerebras
Cerebras-GPT 13B is a large language model trained with an open architecture and an open dataset. It belongs to the Cerebras-GPT series, which aims to study the scaling laws of large language models and to demonstrate the simplicity and scalability of training on the Cerebras software and hardware stack.
Downloads 669
Release Date: 3/20/2023

Model Overview

Cerebras-GPT 13B is a large language model based on the Transformer architecture, used mainly for natural language processing tasks such as text generation and understanding. It is trained following the Chinchilla scaling law, making it compute-efficient for its size.
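
For orientation, here is a minimal sketch of loading the model with the Hugging Face transformers library. The repository id cerebras/Cerebras-GPT-13B and the memory notes are assumptions based on typical usage of models in this family, not details from this page:

# Minimal loading sketch (assumed repo id, illustrative settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cerebras/Cerebras-GPT-13B"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # ~26 GB in fp16; fp32 needs roughly twice that
    device_map="auto",          # requires the accelerate package; shards across devices
)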

Model Features

Rich model family
The Cerebras-GPT family includes models at scales from 111M to 13B parameters to suit different compute budgets.
Follows the scaling law
All models are trained according to the Chinchilla scaling law (roughly 20 training tokens per model parameter) to be compute-optimal; see the arithmetic sketch after this list.
Efficient training architecture
Trained on the Andromeda AI supercomputer using Cerebras' weight streaming technology, which scales training efficiently through simple data parallelism.
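
As a worked check of the 20-tokens-per-parameter rule, the 13B model implies a budget of roughly 260B training tokens. The figures below are illustrative arithmetic using the common 6ND estimate of training FLOPs, not official training logs:

# Chinchilla-style compute-optimal budget: ~20 training tokens per parameter.
# Illustrative arithmetic only, not numbers taken from this model card.
params = 13e9                   # 13B parameters
tokens = 20 * params            # compute-optimal token budget
flops = 6 * params * tokens     # standard 6*N*D estimate of training FLOPs

print(f"tokens: {tokens:.2e}")  # ~2.60e+11 (260B tokens)
print(f"FLOPs:  {flops:.2e}")   # ~2.03e+22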

Model Capabilities

Text generation
Natural language understanding
Zero-shot learning
Few-shot (five-shot) learning
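
Zero- and few-shot use amounts to plain prompting: labeled examples are concatenated ahead of the query and the model completes the pattern. Below is a minimal sketch of building a five-shot prompt; the task, labels, and helper function are illustrative, not part of the model card:

# Build a simple five-shot sentiment prompt. The task and examples here
# are illustrative; any text-completion task can be phrased this way.
examples = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I regret buying this blender.", "negative"),
    ("Service was quick and the staff were friendly.", "positive"),
    ("The plot made no sense at all.", "negative"),
    ("Best concert I have attended in years.", "positive"),
]

def five_shot_prompt(query: str) -> str:
    """Concatenate the labeled examples, then the unlabeled query."""
    shots = "\n".join(f"Review: {t}\nSentiment: {l}" for t, l in examples)
    return f"{shots}\nReview: {query}\nSentiment:"

print(five_shot_prompt("The battery died after two days."))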

Use Cases

Research
Research on the scaling laws of large language models
Used to study how performance scales with model size and training compute, and to validate compute-optimal training.
Natural language processing
Text generation
Used to generate coherent text such as articles and dialogue; a minimal example follows.
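
A minimal generation sketch using the transformers pipeline API; the repository id and sampling parameters are assumptions, chosen for illustration rather than recommended settings:

# Text-generation sketch with the transformers pipeline API.
# Sampling parameters are illustrative, not settings from the model card.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="cerebras/Cerebras-GPT-13B",  # assumed Hugging Face repo id
    device_map="auto",
)

out = generator(
    "Generative AI is",
    max_new_tokens=50,
    do_sample=True,
    temperature=0.7,
)
print(out[0]["generated_text"])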