🚀 BrtGPT-124M-Base
This is a base model pre-trained on a large corpus of English sentences. It was trained on about 5 million tokens and is not intended for question answering.
⚠️ Important Note
THE MODEL WAS RETRAINED ON 5M TOKENS ON 14 JUNE 2025. IF YOU DOWNLOADED THE WEIGHTS BEFORE THAT DATE, RE-DOWNLOAD THEM!
🚀 Quick Start
You can use this model to generate English text, and it is free to use. Here is an example of how to load the model and generate a completion:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Bertug1911/BrtGPT-124m-Base"

# Load the tokenizer and the model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Math is so important because"
inputs = tokenizer(prompt, return_tensors="pt")

# Near-greedy decoding: top_k=1 with a very low temperature
output = model.generate(
    **inputs,
    max_new_tokens=50,
    temperature=0.01,
    top_k=1,
    do_sample=True,
)

generated_text = tokenizer.decode(output[0], skip_special_tokens=False)

# The decoded string uses byte-level BPE conventions: literal spaces separate
# tokens and "Ġ" marks a word boundary, so swap them to get readable text.
generated_text = generated_text.replace(" ", "")
generated_text = generated_text.replace("Ġ", " ")

print(generated_text)
```
✨ Features
- Trained on 5M Tokens: The model is trained on about 5 million English tokens.
- Decoder-Only Transformer: It uses a decoder-only transformer architecture.
- Free and Easy to Use: The model is free to download and easy to use via the links provided in this card.
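If you want to confirm the roughly 124M parameter count and the decoder-only layout locally, the standard `transformers` API can report both. This is a minimal sketch, not part of the official card; the exact fields printed depend on how the checkpoint was exported.

```python
from transformers import AutoConfig, AutoModelForCausalLM

model_name = "Bertug1911/BrtGPT-124m-Base"

# Inspect the configuration without loading the full weights
config = AutoConfig.from_pretrained(model_name)
print(config.architectures)  # architecture class stored in the checkpoint
print(config)                # hidden size, number of layers, vocab size, ...

# Loading the model lets us count the parameters directly
model = AutoModelForCausalLM.from_pretrained(model_name)
print(f"{model.num_parameters() / 1e6:.1f}M parameters")
```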
📦 Installation
The model only needs the Hugging Face `transformers` library and PyTorch, which you can install with `pip install transformers torch`.
💻 Usage Examples
Basic Usage
See the Quick Start example above for loading the model and generating a completion.
Advanced Usage
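The card does not prescribe advanced settings, so the following is only a sketch: it reuses the same checkpoint with looser sampling (higher temperature, nucleus sampling, a repetition penalty) and a small helper that undoes the byte-level "Ġ" markers. All parameter values here are assumptions to tune for your own prompts.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Bertug1911/BrtGPT-124m-Base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def clean(text: str) -> str:
    # Undo byte-level BPE spacing: drop literal spaces, turn "Ġ" into spaces.
    return text.replace(" ", "").replace("Ġ", " ")

prompt = "Math is so important because"
inputs = tokenizer(prompt, return_tensors="pt")

# Sampled decoding: more diverse than the near-greedy Quick Start settings.
output = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.8,        # assumed value; raise for more randomness
    top_k=50,               # assumed value
    top_p=0.95,             # nucleus sampling, assumed value
    repetition_penalty=1.2, # assumed value to reduce repetition loops
)

print(clean(tokenizer.decode(output[0], skip_special_tokens=False)))
```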
📚 Documentation
Model Details
| Property | Details |
|----------|---------|
| Developed by | Bertug Gunel (Bertuğ Günel) |
| Funded by | Nobody |
| Shared by | Nobody |
| Model Type | Decoder-Only Transformer |
| Language(s) (NLP) | English |
| License | CC-BY-NC-4.0 |
| Finetuned from model | Not Fine-Tuned |
Model Sources
- Repository: Coming soon!
- Paper: "Attention Is All You Need", arXiv:1706.03762
- Demo: The model itself serves as the demo.
Uses
Direct Use
Open-source models are often laborious to use and demand a lot of processing power. This model aims to avoid both problems: you can download and use it for FREE, with minimal setup, from the link below.
A web UI (Gradio, on Hugging Face Spaces) is ready. To use it freely and easily, visit: "https://huggingface.co/spaces/Bertug1911/BrtGPT-Web-UI"
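As an illustration of the low setup cost, here is a minimal sketch using the high-level `pipeline` API from `transformers`; the post-processing of the "Ġ" markers is the same assumption as in the Quick Start and may need adjusting.

```python
from transformers import pipeline

# The text-generation pipeline wraps the tokenizer and model in one object.
generator = pipeline("text-generation", model="Bertug1911/BrtGPT-124m-Base")

result = generator("Math is so important because", max_new_tokens=50)
text = result[0]["generated_text"]

# Undo the byte-level BPE spacing, as in the Quick Start example.
print(text.replace(" ", "").replace("Ġ", " "))
```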
Out-of-Scope Use
The model only generates (completes) English sentences using English tokens (the vocabulary also contains some Japanese/Chinese tokens). Do not try it with other languages.
Bias, Risks, and Limitations
The model can generate political content. Use it at your own risk.
Recommendations
There are no major known risks or biases. You can use it freely, but only non-commercially (see the license).
🔧 Technical Details
Training Details
| Data Type | Training Type | Tokens (Total) | Status |
|-----------|---------------|----------------|--------|
| Raw (sentences) | Pre-training | About 5M (5,000K) | FINISHED |
| Raw (sentences) | Fine-tuning (to improve model performance on tests and in use) | About 0.1M (100K) | FINISHED ON 17 JUNE |
| Instruction (coming soon!) | Instruction Tuning (IFT) | Coming soon! | SOON! (probably 5-15 July) |
NOTE: THE MODEL WAS RETRAINED ON 5M TOKENS ON 13 JUNE 2025. IF YOU DOWNLOADED THE WEIGHTS BEFORE THAT DATE, RE-DOWNLOAD THEM!
FINE-TUNING DETAILS:
You can access the fine-tuned model here: "https://huggingface.co/Bertug1911/BrtGPT-124m-FineTuned". NOTE: the fine-tuned repository contains a "model.safetensors" file holding weights fine-tuned from this base model.
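Assuming the linked repository exposes standard `transformers`-compatible weights (which the card's mention of "model.safetensors" suggests but does not guarantee), it can be loaded with the same API as the base model. A minimal sketch:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Fine-tuned checkpoint linked above (assumed transformers-compatible)
ft_name = "Bertug1911/BrtGPT-124m-FineTuned"

tokenizer = AutoTokenizer.from_pretrained(ft_name)
model = AutoModelForCausalLM.from_pretrained(ft_name)

inputs = tokenizer("Math is so important because", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=50, top_k=1, do_sample=True, temperature=0.01)

# Same byte-level BPE cleanup as for the base model
print(tokenizer.decode(output[0]).replace(" ", "").replace("Ġ", " "))
```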
Training Procedure
The model was trained on a single NVIDIA B200 GPU for 21.5 minutes.
Training Hyperparameters
- Training regime: FP16 precision, sparsity off
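Since training used FP16, the checkpoint can also be loaded in half precision at inference time to roughly halve memory use. This is an optional sketch, not something the card requires:

```python
import torch
from transformers import AutoModelForCausalLM

# Load the weights in half precision (FP16) to reduce memory use.
model = AutoModelForCausalLM.from_pretrained(
    "Bertug1911/BrtGPT-124m-Base",
    torch_dtype=torch.float16,
)
```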
Evaluation
NO EVALUATION YET (coming soon!)
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019); a rough consistency check of the figure below is sketched after the table.
| Property | Details |
|----------|---------|
| Hardware Type | GPU |
| Hours Used | 0.35 (21.5 minutes) |
| Cloud Provider | Runpod (https://www.runpod.io/) |
| Compute Region | EU |
| Carbon Emitted | 0.138 kg (138 g), roughly equivalent to an average light bulb burning for 2.6 hours |
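As a rough consistency check, a Lacoste et al. style estimate multiplies power draw, runtime, and grid carbon intensity. With assumed values of about 1 kW for the B200 and roughly 0.4 kg CO2-eq/kWh for the EU region (both assumptions, not figures from this card):

$$E \approx P \cdot t \cdot I \approx 1\ \text{kW} \times 0.35\ \text{h} \times 0.4\ \tfrac{\text{kg CO}_2\text{-eq}}{\text{kWh}} \approx 0.14\ \text{kg CO}_2\text{-eq},$$

which is in line with the 0.138 kg reported above.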
📄 License
The model is licensed under CC-BY-NC-4.0.
Model Card Authors
- Bertug Gunel
- Turkey/Eskisehir
Model Card Contact
bertugscpmail@gmail.com