Arsh Llm Gpt
A large language model built on the GPT-2 architecture, focused on research assistance and trained under limited hardware conditions
Downloads: 19
Release Time: 5/14/2025
Model Overview
The Arsh large language model is a research-assistance project built on the GPT-2 architecture. It was trained under limited hardware conditions using a phased training strategy, aiming to show that large language models do not necessarily require top-tier hardware to train.
Model Features
Limited Hardware Training
Training was completed on a T4 GPU using a phased strategy, with each phase taking 1-2 days
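A hedged sketch of the kind of memory-conscious configuration that makes GPT-2-scale training feasible on a single 16 GB T4: small per-device batches, gradient accumulation, mixed precision, and gradient checkpointing. The specific values are illustrative assumptions, not Arsh Llm's actual hyperparameters.

```python
# Sketch: memory-conscious setup for training a GPT-2-scale model on one 16 GB T4.
# All hyperparameters are illustrative assumptions, not the project's real settings.
from transformers import GPT2Config, GPT2LMHeadModel, TrainingArguments

model = GPT2LMHeadModel(GPT2Config())        # GPT-2 small layout (~124M params), random init

args = TrainingArguments(
    output_dir="checkpoints/phase-1",
    per_device_train_batch_size=4,           # small batch keeps activations within 16 GB
    gradient_accumulation_steps=16,          # effective batch of 64 sequences per step
    fp16=True,                               # mixed precision roughly halves activation memory
    gradient_checkpointing=True,             # recompute activations to trade compute for memory
    num_train_epochs=1,
    save_steps=1_000,                        # frequent checkpoints so a phase can be resumed
    logging_steps=100,
)
# A Trainer built from `model`, `args`, and a tokenized dataset would run one phase.
```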
Multi-stage Training
The training process was divided into 8 phases and completed in approximately 4-5 days in total
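One plausible way to implement such a phased schedule is to shard the corpus and resume each phase from the previous phase's checkpoint. The sketch below illustrates that idea only; the shard loader, paths, and hyperparameters are hypothetical placeholders, not the project's actual training script.

```python
# Sketch of an 8-phase schedule: each phase trains on one corpus shard and resumes
# from the previous phase's checkpoint. The placeholder shard, paths, and settings
# below are illustrative, not the project's real pipeline.
from datasets import Dataset
from transformers import (DataCollatorForLanguageModeling, GPT2Config, GPT2LMHeadModel,
                          GPT2TokenizerFast, Trainer, TrainingArguments)

NUM_PHASES = 8
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

def load_shard(phase: int) -> Dataset:
    # Placeholder shard; in practice this would return the tokenized slice of the
    # pretraining corpus assigned to this phase.
    texts = [f"placeholder document {i} for phase {phase}" for i in range(64)]
    return Dataset.from_dict(dict(tokenizer(texts, truncation=True, max_length=64)))

for phase in range(1, NUM_PHASES + 1):
    if phase == 1:
        model = GPT2LMHeadModel(GPT2Config())                      # fresh weights at the start
    else:
        model = GPT2LMHeadModel.from_pretrained(f"checkpoints/phase-{phase - 1}")  # resume

    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir=f"checkpoints/phase-{phase}",
            per_device_train_batch_size=4,
            gradient_accumulation_steps=16,
            fp16=True,
            num_train_epochs=1,
            report_to="none",
        ),
        train_dataset=load_shard(phase),
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
    trainer.save_model(f"checkpoints/phase-{phase}")   # next phase resumes from this directory
```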
Mixed Dataset
Trained on the olmo-mix-1124 dataset and fine-tuned on several open-source conversation datasets
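A hedged sketch of how the two data sources might be prepared: the pretraining mix streamed from the Hugging Face Hub, and a conversation dataset flattened into plain text for the same causal-LM objective. The conversation dataset shown (ultrachat_200k) is an illustrative stand-in, since this card does not name the exact sets used, and the olmo-mix-1124 repository id is an assumption about where the mix is hosted.

```python
# Sketch: preparing the two kinds of training data mentioned above.
# The conversation dataset below is an illustrative stand-in; the exact open-source
# sets used for fine-tuning are not listed in this card.
from datasets import load_dataset

# The pretraining corpus (olmo-mix-1124) would be streamed rather than fully downloaded, e.g.
#   pretrain = load_dataset("allenai/olmo-mix-1124", split="train", streaming=True)
# (repository id and loading options are assumptions about where the mix is hosted).

# Flatten a chat-style dataset into plain text so it fits the causal-LM objective.
chat = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft", streaming=True)

def flatten_dialogue(example):
    turns = [f"{msg['role']}: {msg['content']}" for msg in example["messages"]]
    return {"text": "\n".join(turns)}

chat_text = chat.map(flatten_dialogue)
print(next(iter(chat_text))["text"][:300])   # preview one flattened conversation
```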
Model Capabilities
Text Generation
Research Assistance
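A minimal text-generation sketch for the capability listed above, using the transformers pipeline. "gpt2" stands in for the Arsh Llm checkpoint because its exact Hub repository id is not given in this card.

```python
# Sketch: causal-LM text generation. "gpt2" is a stand-in checkpoint; substitute the
# Arsh Llm repository id here, since this card does not state it.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Key challenges when training language models on limited hardware include"
result = generator(prompt, max_new_tokens=100, do_sample=True, temperature=0.7, top_p=0.9)
print(result[0]["generated_text"])
```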
Use Cases
Research
Research Literature Assistance
Assisting researchers in literature analysis and content generation
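One way such literature assistance could look in practice is a summarization-style prompt over an abstract. This is a hedged usage sketch, again with "gpt2" as a stand-in checkpoint and an assumed prompt format, not a documented workflow of the model.

```python
# Sketch: a literature-assistance style prompt. "gpt2" again stands in for the Arsh Llm
# checkpoint; the prompt format and sample abstract are assumptions for illustration.
from transformers import pipeline

assistant = pipeline("text-generation", model="gpt2")

abstract = (
    "We study phased training of GPT-2-scale language models on a single T4 GPU, "
    "showing that careful scheduling keeps per-phase runtime to 1-2 days."
)
prompt = f"Abstract: {abstract}\n\nSummarize the main contribution in two sentences:"
print(assistant(prompt, max_new_tokens=80, do_sample=False)[0]["generated_text"])
```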