A Pythia Chat Model of 31M Parameters
This is a text-generation model based on the EleutherAI/pythia-31m base model. It provides different ML formats and has been trained on multiple datasets.
Quick Start
Model Information
Recommended prompt format
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
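As a minimal sketch, the template above can be produced from a list of role/content messages. The helper name `build_chatml_prompt` is hypothetical, and the newline after the final `<|im_start|>assistant` is an assumption, since the card does not show what follows that line.

```python
def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts in the prompt format above."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    # Leave the assistant turn open so the model writes the reply.
    # (Assumption: a newline follows the assistant tag.)
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

The same function accepts longer conversations, including earlier assistant turns, as long as each entry carries a `role` and a `content` key.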
Recommended inference parameters
penalty_alpha: 0.5
top_k: 2
repetition_penalty: 1.0016
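A sketch of how these parameters might be passed to Hugging Face Transformers follows. In Transformers releases that ship contrastive search, setting `penalty_alpha` above 0 together with `top_k` above 1 selects that decoding method. The helper `generate_reply` and the default `max_new_tokens=128` are assumptions for illustration. The card does not give a repository ID, so loading the model and tokenizer is left to the caller.

```python
# Recommended inference parameters, copied from this card.
GENERATION_KWARGS = {
    "penalty_alpha": 0.5,
    "top_k": 2,
    "repetition_penalty": 1.0016,
}


def generate_reply(model, tokenizer, prompt, max_new_tokens=128):
    """Generate a completion for an already formatted prompt.

    `model` and `tokenizer` are assumed to be loaded by the caller, for
    example with AutoModelForCausalLM and AutoTokenizer.
    max_new_tokens=128 is an arbitrary choice, not a value from the card.
    """
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, **GENERATION_KWARGS
    )
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```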
⨠Features
This model can be used for text generation tasks. The widget example shows its application in different scenarios such as career counseling, answering questions about quantum computing applications, and providing advice on health improvement.
Usage Examples
Basic Usage
The widget examples demonstrate basic usage scenarios:
widget = {
    "messages": [
        {
            "role": "system",
            "content": "You are a career counselor. The user will provide you with an individual looking for guidance in their professional life, and your task is to assist them in determining what careers they are most suited for based on their skills, interests, and experience. You should also conduct research into the various options available, explain the job market trends in different industries, and advise on which qualifications would be beneficial for pursuing particular fields."
        },
        {
            "role": "user",
            "content": "Heya!"
        },
        {
            "role": "assistant",
            "content": "Hi! How may I help you?"
        },
        {
            "role": "user",
            "content": "I am interested in developing a career in software engineering. What would you recommend me to do?"
        },
    ]
}
Advanced Usage
The following code shows the training process of the model:
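from transformers import EarlyStoppingCallback, TrainingArguments
from trl import SFTTrainer

# Supervised fine-tuning configuration.
# model, train_dataset and eval_dataset are prepared elsewhere (not shown).
# Argument names follow the library versions used at training time;
# newer Transformers/TRL releases may name or place some of them differently.
SFTTrainer(
    model,
    train_dataset=train_dataset,
    dataset_text_field="text",
    eval_dataset=eval_dataset,
    max_seq_length=2048,
    packing=True,  # pack short examples into sequences of up to max_seq_length tokens
    args=TrainingArguments(
        learning_rate=2e-6,
        per_device_train_batch_size=1,
        per_device_eval_batch_size=1,
        gradient_accumulation_steps=16,  # 16 sequences per optimizer step per device
        lr_scheduler_type="cosine",
        num_train_epochs=1,
        logging_strategy="steps",
        save_strategy="steps",
        evaluation_strategy="steps",
        logging_steps=10,
        eval_steps=10,
        save_steps=10,
        warmup_steps=50,
        load_best_model_at_end=True,  # restore the checkpoint with the lowest eval_loss
        metric_for_best_model="eval_loss",
        greater_is_better=False,
        weight_decay=0.01,
        save_total_limit=10,
        neftune_noise_alpha=5,  # NEFTune: add noise to the embeddings during training
    ),
    callbacks=[
        # Stop after 3 evaluations in a row without an eval_loss
        # improvement of more than 0.005.
        EarlyStoppingCallback(
            early_stopping_patience=3,
            early_stopping_threshold=0.005
        ),
    ],
)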
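To make the numbers in the SFT configuration concrete, the sketch below works out the effective batch size and mimics the stopping rule in plain Python. It assumes a single device. The helper `evals_before_stop` and the loss values are hypothetical, and the rule is a simplified reading of `EarlyStoppingCallback`, not its actual implementation.

```python
PER_DEVICE_TRAIN_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 16
MAX_SEQ_LENGTH = 2048

# Sequences contributing to one optimizer step (single device assumed).
effective_batch_size = PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS
# With packing, each sequence holds up to MAX_SEQ_LENGTH tokens,
# so this is an upper bound on tokens per optimizer step.
tokens_per_step = effective_batch_size * MAX_SEQ_LENGTH


def evals_before_stop(eval_losses, patience=3, threshold=0.005):
    """Return the 1-based index of the evaluation that triggers stopping, or None."""
    best = None
    stale = 0  # consecutive evaluations without a large enough improvement
    for index, loss in enumerate(eval_losses, start=1):
        if best is None or (loss < best and best - loss > threshold):
            stale = 0
        else:
            stale += 1
        # Track the best loss seen so far, however small the improvement.
        if best is None or loss < best:
            best = loss
        if stale >= patience:
            return index
    return None


# Hypothetical eval_loss values, one per evaluation.
losses = [3.10, 3.00, 2.998, 2.997, 2.999, 2.90]
print(effective_batch_size, tokens_per_step, evals_before_stop(losses))
```

In this made-up run the third, fourth and fifth evaluations each fail to improve on the best loss by more than 0.005, so training would stop at the fifth evaluation and never reach the sixth.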
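from transformers import EarlyStoppingCallback, TrainingArguments
from trl import DPOTrainer

# Direct Preference Optimization (DPO) configuration.
# model, tokenizer, dataset and eval_dataset are prepared elsewhere (not shown).
DPOTrainer(
    model,
    beta=0.1,  # higher values keep the policy closer to the reference model
    train_dataset=dataset,
    tokenizer=tokenizer,
    eval_dataset=eval_dataset,
    max_length=1536,  # maximum tokens for prompt plus completion
    max_prompt_length=1024,  # maximum tokens for the prompt alone
    args=TrainingArguments(
        learning_rate=2e-6,
        per_device_train_batch_size=1,
        per_device_eval_batch_size=1,
        gradient_accumulation_steps=1,
        lr_scheduler_type="cosine",
        num_train_epochs=1,
        logging_strategy="steps",
        save_strategy="steps",
        evaluation_strategy="steps",
        logging_steps=1,
        eval_steps=1,
        save_steps=1,
        warmup_steps=0,
        load_best_model_at_end=True,
        metric_for_best_model="eval_loss",
        greater_is_better=False,
        weight_decay=0.0,
        neftune_noise_alpha=5,
        remove_unused_columns=False,  # keep the preference columns DPOTrainer reads
    ),
    callbacks=[
        EarlyStoppingCallback(
            early_stopping_patience=3,
            early_stopping_threshold=0.005
        ),
    ],
)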
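For readers unfamiliar with the role of `beta`, here is a small sketch of the standard sigmoid DPO loss for a single preference pair. It assumes the trainer's default loss type was used, which the card does not state. The function name `dpo_loss` and the log-probability values are hypothetical.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Sigmoid DPO loss for one pair, given summed sequence log-probabilities."""
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_margin - ref_margin)
    # -log(sigmoid(x)) written as log(1 + exp(-x)).
    return math.log1p(math.exp(-logits))


# Policy identical to the reference: the loss equals log(2).
baseline = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# Policy prefers the chosen answer more strongly than the reference does.
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)
print(baseline, improved)
```

The loss falls below `log(2)` only when the policy separates the chosen and rejected answers more than the reference model does, and `beta` scales how strongly that difference counts.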
Documentation
Datasets and parameters used for training
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 19.92 |
| AI2 Reasoning Challenge (25-Shot) | 22.70 |
| HellaSwag (10-Shot) | 25.60 |
| MMLU (5-Shot) | 23.24 |
| TruthfulQA (0-shot) | 0.00 |
| Winogrande (5-shot) | 47.99 |
| GSM8k (5-shot) | 0.00 |
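The reported average appears to be the unweighted mean of the six benchmark scores. A quick check, using only the values from the table:

```python
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 22.70,
    "HellaSwag (10-Shot)": 25.60,
    "MMLU (5-Shot)": 23.24,
    "TruthfulQA (0-shot)": 0.00,
    "Winogrande (5-shot)": 47.99,
    "GSM8k (5-shot)": 0.00,
}

# Unweighted mean over all six benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))
```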
License
The model is licensed under the Apache-2.0 license.