🚀 Model Card for acecalisto3/PhiCo-D-Instruck
This model card provides a comprehensive overview of the acecalisto3/PhiCo-D-Instruck model, a 🤗 Transformers model accessible on the Hugging Face Model Hub. It details the model's key features, usage scenarios, training process, evaluation results, and more.
✨ Features
- Instruction Following: Capable of generating responses based on a given context and instructions.
- Fine-Tuned for a Specific Task: Adapted from the `t5-base` model for InstrucText's instruction-following task.
- Seq2seq Architecture: 12 layers, 768 hidden units, and 12 attention heads (verifiable with the snippet after this list).
- Multifaceted Applications: Can be fine-tuned for downstream tasks like code generation and dialogue systems.
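These dimensions can be confirmed from the published configuration; a minimal sketch using the standard `T5Config` attribute names:

```python
from transformers import AutoConfig

# Fetch the model configuration without downloading the weights.
config = AutoConfig.from_pretrained("acecalisto3/PhiCo-D-Instruck")

print(config.num_layers)  # expected: 12 (encoder layers)
print(config.d_model)     # expected: 768 (hidden size)
print(config.num_heads)   # expected: 12 (attention heads)
```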
📦 Installation
Since it's a Hugging Face model, you can install the necessary libraries using pip. PyTorch and SentencePiece (required by the T5 tokenizer) are needed alongside Transformers:

```bash
pip install transformers sentencepiece torch
```
💻 Usage Examples
Basic Usage
```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model = T5ForConditionalGeneration.from_pretrained("acecalisto3/PhiCo-D-Instruck")
tokenizer = T5Tokenizer.from_pretrained("acecalisto3/PhiCo-D-Instruck")

context = "Your context goes here."
instructions = "Your instructions go here."

# Concatenate context and instructions into a single input sequence.
inputs = tokenizer.encode(f"{context} {instructions}", return_tensors="pt")

# Generate a response with beam search.
outputs = model.generate(inputs, max_length=50, num_beams=5, early_stopping=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
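The same generation can also be run through the high-level `pipeline` API; a minimal sketch using the standard `text2text-generation` task for T5-style models:

```python
from transformers import pipeline

# The pipeline bundles tokenization, generation, and decoding in one call.
generator = pipeline("text2text-generation", model="acecalisto3/PhiCo-D-Instruck")

result = generator("Your context goes here. Your instructions go here.",
                   max_length=50, num_beams=5)
print(result[0]["generated_text"])
```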
📚 Documentation
Model Details
Uses
Direct Use
The model can be directly applied to instruction-following tasks, generating responses according to the provided context and instructions.
Downstream Use
It can be fine-tuned for additional downstream tasks such as code generation, dialogue systems, and other applications that demand natural language understanding and generation; a minimal sketch follows.
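The sketch below outlines one way to set up such a fine-tuning run with `Seq2SeqTrainer`. It is illustrative only: the dataset name and its `input`/`target` columns are hypothetical placeholders, and the hyperparameters are not the ones used to train this model.

```python
from datasets import load_dataset
from transformers import (DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments, T5ForConditionalGeneration,
                          T5Tokenizer)

model = T5ForConditionalGeneration.from_pretrained("acecalisto3/PhiCo-D-Instruck")
tokenizer = T5Tokenizer.from_pretrained("acecalisto3/PhiCo-D-Instruck")

# Hypothetical dataset with "input" and "target" text columns.
dataset = load_dataset("your-org/your-task-dataset")

def preprocess(batch):
    model_inputs = tokenizer(batch["input"], truncation=True, max_length=512)
    # T5 uses the same tokenizer for sources and targets.
    model_inputs["labels"] = tokenizer(batch["target"], truncation=True,
                                       max_length=128)["input_ids"]
    return model_inputs

tokenized = dataset["train"].map(preprocess, batched=True,
                                 remove_columns=dataset["train"].column_names)

args = Seq2SeqTrainingArguments(
    output_dir="phico-d-finetuned",   # hypothetical output directory
    per_device_train_batch_size=8,
    num_train_epochs=3,
    fp16=True,                        # matches the fp16 regime noted below
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```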
Out-of-Scope Use
The model is not suitable for tasks requiring understanding of context beyond the given instructions, such as general world knowledge or domain-specific knowledge.
Training Details
Training Data
[PhiCo-D Dataset Card](https://huggingface.co/datasets/PhiCo-D)
Training Procedure
- Preprocessing: The data was tokenized using the T5 tokenizer (see the sketch after this list).
- Training Hyperparameters: The training regime was fp16 mixed precision.
- Speeds, Sizes, Times:
  - Number of training epochs: 5
  - Total training time: 2 days
  - Average time per batch: 1.5 seconds
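A minimal sketch of how one example could be tokenized under that scheme, assuming the context/instructions/response fields implied by the usage example above (the exact column names are not documented):

```python
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")

def tokenize_example(example):
    # Source side: context and instructions concatenated, as in the usage example.
    features = tokenizer(f"{example['context']} {example['instructions']}",
                         truncation=True, max_length=512)
    # Target side: the reference response becomes the label sequence.
    features["labels"] = tokenizer(example["response"],
                                   truncation=True, max_length=128)["input_ids"]
    return features
```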
Evaluation
Testing Data, Factors & Metrics
- Testing Data: [PhiCo-D Testing Data](https://huggingface.co/datasets/PhiCo-D)
- Factors: Diversity of contexts and instructions
- Metrics (a computation sketch follows this list):
  - BLEU-4
  - ROUGE-L
  - METEOR
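A minimal sketch of how these scores can be computed with the 🤗 `evaluate` library (the prediction and reference strings are placeholders):

```python
import evaluate

predictions = ["a generated response"]       # placeholder model outputs
references = ["the reference response"]      # placeholder gold responses

bleu = evaluate.load("bleu")      # corpus BLEU over up-to-4-grams (BLEU-4)
rouge = evaluate.load("rouge")    # reports rouge1/rouge2/rougeL
meteor = evaluate.load("meteor")

# BLEU expects one list of reference strings per prediction.
print(bleu.compute(predictions=predictions,
                   references=[[r] for r in references])["bleu"])
print(rouge.compute(predictions=predictions, references=references)["rougeL"])
print(meteor.compute(predictions=predictions, references=references)["meteor"])
```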
Results
| Metric | Score |
|--------|-------|
| BLEU-4 | 0.41 |
| ROUGE-L | 0.52 |
| METEOR | 0.45 |
Model Examination
PhiCo-D Model Interpretability
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
| Property | Details |
|----------|---------|
| Hardware Type | NVIDIA V100 |
| Hours used | 48 |
| Cloud Provider | Google Cloud |
| Compute Region | us-central1 |
| Carbon Emitted | 3200 grams of CO2eq |
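As a rough cross-check of the reported figure, assuming a ~300 W V100 board power and a grid intensity of ~0.22 kg CO2eq/kWh (both are assumptions, not values reported in this card):

```python
power_kw = 0.300    # assumed NVIDIA V100 TDP in kW (assumption, not reported)
hours = 48          # "Hours used" from the table above
intensity = 0.22    # assumed grid intensity in kg CO2eq/kWh (assumption)

emissions_g = power_kw * hours * intensity * 1000
print(f"{emissions_g:.0f} g CO2eq")  # ~3168 g, consistent with the reported 3200 g
```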
Technical Specifications
Model Architecture and Objective
The `acecalisto3/PhiCo-D-Instruck` model is based on the T5-base model architecture with a seq2seq objective.
Compute Infrastructure
- Hardware: NVIDIA V100 with 16 GB GPU memory
- Software: PyTorch 1.11, Transformers 4.20, CUDA 11.3
Citation
BibTeX:

```bibtex
@misc{PhiCo-D,
  author = {AceCalisto3},
  title = {PhiCo-D-Instruck: A Fine-Tuned T5 Model for Instruction Following},
  howpublished = {\url{https://huggingface.co/acecalisto3/PhiCo-D-Instruck}},
  year = {2023},
  note = {License: Apache-2.0},
}
```
APA:
AceCalisto3. (2023). PhiCo-D-Instruck: A Fine-Tuned T5 Model for Instruction Following. Retrieved from https://huggingface.co/acecalisto3/PhiCo-D-Instruck
Glossary
- seq2seq: Sequence-to-sequence models are used to transform one input sequence into another output sequence.
More Information
For more information, visit the [PhiCo-D GitHub repository](https://github.com/AceCalisto3/PhiCo-D).
Model Card Authors
AceCalisto3
Model Card Contact
For questions or concerns, please contact AceCalisto3 through their Hugging Face profile.
Bias, Risks, and Limitations
⚠️ Important Note
The model may exhibit biases inherited from the training data. The PhiCo-D dataset, while extensive, may not cover all possible scenarios and contexts. The model's responses are based on the given context and instructions and may not perform well if the context or instructions are unclear, ambiguous, or incomplete.
💡 Usage Tip
Users (both direct and downstream) should be aware of the risks, biases, and limitations of the model.