Arsh LLM

Developed by arshiaafshani
Arsh LLM is an open-source large language model designed for research. It was pretrained on the olmo-mix-1124 dataset using a T4 GPU, with a total training time of approximately 4-5 days.
Downloads 162
Release Time: 4/23/2025

Model Overview

This project aims to demonstrate that large models do not necessarily require top-tier hardware: optimized architectural design and phased training can make development efficient on modest resources. The current version is an initial iteration and requires further training.

Model Features

Hardware-Friendly Training
Training completed on consumer-grade T4 GPUs, reducing hardware barriers through a phased training strategy (8 parts, each taking 1-2 days).
Mixed Dataset Training
Pretrains first on the Pile dataset to stabilize model performance, then runs the main training on the olmo-mix-1124 dataset.
Open-Source Architecture Design
Draws on GPT-NeoX and Llama technical documentation, with AI-assisted design used to optimize the architecture (effectiveness pending verification).
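
The phased training strategy described under Hardware-Friendly Training can be sketched roughly as follows. This is a minimal illustration, not the project's actual training code: the names `phase_boundaries` and `run_phase`, the JSON checkpoint layout, and the step counts are all hypothetical, and the real optimizer step is replaced by a placeholder. It only shows how one long run can be cut into 8 resumable parts so that each session fits within a limited GPU time budget.

```python
import json
import os


def phase_boundaries(total_steps, num_phases):
    """Split total_steps into num_phases contiguous (start, end) ranges."""
    base, extra = divmod(total_steps, num_phases)
    bounds, start = [], 0
    for i in range(num_phases):
        # The first `extra` phases absorb one leftover step each.
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def run_phase(ckpt_path, total_steps, num_phases):
    """Run the next unfinished phase, resuming from a checkpoint if one exists."""
    state = {"step": 0, "phase": 0}
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)
    if state["phase"] >= num_phases:
        return state  # every phase is already done

    start, end = phase_boundaries(total_steps, num_phases)[state["phase"]]
    for _step in range(start, end):
        pass  # placeholder (assumption): one optimizer step would go here

    # Persist progress so the next session continues from this point.
    state = {"step": end, "phase": state["phase"] + 1}
    with open(ckpt_path, "w") as f:
        json.dump(state, f)
    return state
```

Calling `run_phase("ckpt.json", total_steps, 8)` once per session would advance the run by one part; in a real setup the checkpoint would also hold model and optimizer state rather than just counters.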

Model Capabilities

Text Generation
Research Assistance

Use Cases

Research Field
Literature Drafting Assistance
Helps researchers quickly generate draft papers or technical documents