GPT-2 774M FineWeb 150B
This model originates from karpathy's llm.c project and has been converted to HuggingFace format for research on bfloat16 performance; training consumed 150 billion tokens.
Downloads: 22
Release Time: 4/25/2025
Model Overview
This is a language model from the llm.c project, intended primarily for researching bfloat16 performance optimization and trained on the 100B-token FineWeb sample dataset.
Model Features
bfloat16 performance research
This model is specifically intended for studying performance optimization with the bfloat16 data type; a minimal loading sketch follows this list.
Large-scale training
Trained for 1.5 epochs on the 100B-token FineWeb sample dataset, for a total of 150 billion tokens.
Active development
Currently under active development; follow the llm.c project for the latest updates.
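As a minimal sketch of how the HuggingFace-format checkpoint could be loaded in bfloat16: the repository id below is a hypothetical placeholder and should be replaced with the actual repo for this checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- substitute the actual HuggingFace repository for this checkpoint.
repo_id = "karpathy/gpt2-774M-fineweb-150B"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# Load the weights directly in bfloat16 for performance experiments.
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
model.eval()

prompt = "The FineWeb dataset is"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```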
Model Capabilities
Language model training
Performance optimization research
Use Cases
Research
bfloat16 performance research
Study the performance of the bfloat16 data type in language model training and inference, as in the benchmarking sketch after this list.
Large-scale language model training
Explore methods for training language models on large-scale datasets.
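For the bfloat16 performance use case, a rough benchmarking sketch along these lines compares forward-pass throughput in float32 and bfloat16. The repo id, batch size, and sequence length are illustrative assumptions, and a CUDA device is assumed.

```python
import time
import torch
from transformers import AutoModelForCausalLM

repo_id = "karpathy/gpt2-774M-fineweb-150B"  # hypothetical repo id

def tokens_per_second(dtype, device="cuda", steps=20, batch=4, seq_len=512):
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=dtype).to(device).eval()
    input_ids = torch.randint(0, model.config.vocab_size, (batch, seq_len), device=device)
    with torch.no_grad():
        for _ in range(3):          # warm-up iterations
            model(input_ids)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(steps):      # timed forward passes
            model(input_ids)
        torch.cuda.synchronize()
    return (steps * batch * seq_len) / (time.perf_counter() - start)

for dtype in (torch.float32, torch.bfloat16):
    print(dtype, f"{tokens_per_second(dtype):,.0f} tokens/s")
```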