🚀 LaMini-T5-61M
LaMini-T5-61M is a fine-tuned model from the LaMini-LM series, designed to respond to natural-language instructions.
🚀 Quick Start
This model is one of our LaMini-LM series in the paper "[LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions](https://github.com/mbzuai-nlp/lamini-lm)". It is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on the [LaMini-instruction dataset](https://huggingface.co/datasets/MBZUAI/LaMini-instruction), which contains 2.58M samples for instruction fine-tuning. For more information about our dataset, please refer to our [project repository](https://github.com/mbzuai-nlp/lamini-lm/).
You can view the other models of the LaMini-LM series in our [project repository](https://github.com/mbzuai-nlp/lamini-lm/). Models marked with ✩ are those with the best overall performance given their size/architecture, hence we recommend using them. More details can be found in our paper.
✨ Features
- Instruction-based Response: Designed to respond to human instructions written in natural language.
💻 Usage Examples
Basic Usage
```python
from transformers import pipeline

checkpoint = "MBZUAI/LaMini-T5-61M"
model = pipeline('text2text-generation', model=checkpoint)

input_prompt = 'Please let me know your thoughts on the given place and why you think it deserves to be visited: \n"Barcelona, Spain"'
generated_text = model(input_prompt, max_length=512, do_sample=True)[0]['generated_text']

print("Response", generated_text)
```
🔧 Technical Details
Training Procedure
We initialize with [t5-small](https://huggingface.co/t5-small) and fine-tune it on our [LaMini-instruction dataset](https://huggingface.co/datasets/MBZUAI/LaMini-instruction). Its total number of parameters is 61M.
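For reference, the instruction/response pairs can be inspected with the `datasets` library. The snippet below is a hedged sketch: the split name `train` and the column names `instruction` and `response` are assumptions about the dataset layout, and downloading may require accepting the dataset's terms on the Hub.

```python
# Sketch of loading and inspecting the LaMini-instruction dataset.
# Split and column names are assumptions about the dataset layout.
from datasets import load_dataset

dataset = load_dataset("MBZUAI/LaMini-instruction", split="train")
print(dataset)                     # size and column names
print(dataset[0]["instruction"])   # one instruction
print(dataset[0]["response"])      # its target response
```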
Training Hyperparameters
The following hyperparameters were used during training; a sketch of how they map onto Hugging Face training arguments follows the list:
- learning_rate: 0.0005
- train_batch_size: 128
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 512
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
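As an illustration only, the hyperparameters above can be expressed as Hugging Face `Seq2SeqTrainingArguments`. This is a hedged reconstruction, not the authors' actual training script; the output directory and the per-device batch size (which depends on the number of devices used) are assumptions.

```python
# Illustrative mapping of the reported hyperparameters onto
# Seq2SeqTrainingArguments; not the authors' original configuration.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="lamini-t5-61m",       # hypothetical output directory
    learning_rate=5e-4,
    per_device_train_batch_size=128,  # train_batch_size as reported above
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=4,    # 128 * 4 = 512 total train batch size
    lr_scheduler_type="linear",
    num_train_epochs=5,
    seed=42,
)
```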
📚 Documentation
We conducted two sets of evaluations: automatic evaluation on downstream NLP tasks and human evaluation on user-oriented instructions. For more details, please refer to our paper.
📄 License
This model is licensed under CC BY-NC 4.0.

Citation
```bibtex
@article{lamini-lm,
  author     = {Minghao Wu and
                Abdul Waheed and
                Chiyu Zhang and
                Muhammad Abdul-Mageed and
                Alham Fikri Aji},
  title      = {LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions},
  journal    = {CoRR},
  volume     = {abs/2304.14402},
  year       = {2023},
  url        = {https://arxiv.org/abs/2304.14402},
  eprinttype = {arXiv},
  eprint     = {2304.14402}
}
```