GPT-2 774M FineWeb 150B
This model originates from karpathy's llm.c project and has been converted to HuggingFace format for research on bfloat16 performance; training consumed 150 billion tokens.
Downloads: 22
Release Time: 4/25/2025
Model Overview
This is a language model from the llm.c project, intended primarily for researching bfloat16 performance optimization and trained on the 100B-token FineWeb sample dataset.
Model Features
bfloat16 performance research
This model is specifically intended for studying performance optimization with the bfloat16 data type; a minimal loading sketch follows this list.
Large-scale training
Trained for 1.5 epochs on the 100B-token FineWeb sample dataset, for a total of 150 billion tokens.
Active development
Currently under active development; follow the llm.c project for the latest updates.
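As a minimal sketch of how the HuggingFace-format checkpoint could be loaded in bfloat16: the repository id below is a hypothetical placeholder and should be replaced with the actual repo for this checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- substitute the actual HuggingFace repository for this checkpoint.
repo_id = "karpathy/gpt2-774M-fineweb-150B"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
# Load the weights directly in bfloat16 for performance experiments.
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
model.eval()

prompt = "The FineWeb dataset is"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```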
Model Capabilities
Language model training
Performance optimization research
Use Cases
Research
bfloat16 performance research
Study the performance of the bfloat16 data type in language model training and inference, as in the benchmarking sketch after this list.
Large-scale language model training
Explore methods for training language models on large-scale datasets.
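For the bfloat16 performance use case, a rough benchmarking sketch along these lines compares forward-pass throughput in float32 and bfloat16. The repo id, batch size, and sequence length are illustrative assumptions, and a CUDA device is assumed.

```python
import time
import torch
from transformers import AutoModelForCausalLM

repo_id = "karpathy/gpt2-774M-fineweb-150B"  # hypothetical repo id

def tokens_per_second(dtype, device="cuda", steps=20, batch=4, seq_len=512):
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=dtype).to(device).eval()
    input_ids = torch.randint(0, model.config.vocab_size, (batch, seq_len), device=device)
    with torch.no_grad():
        for _ in range(3):          # warm-up iterations
            model(input_ids)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(steps):      # timed forward passes
            model(input_ids)
        torch.cuda.synchronize()
    return (steps * batch * seq_len) / (time.perf_counter() - start)

for dtype in (torch.float32, torch.bfloat16):
    print(dtype, f"{tokens_per_second(dtype):,.0f} tokens/s")
```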