🚀 LaMini-T5-738M
LaMini-T5-738M is a fine-tuned model in the LaMini-LM series, designed for text-to-text generation tasks and offering high-quality responses to natural language instructions.
🚀 Quick Start
This model is part of our LaMini-LM model series, presented in the paper "[LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions](https://github.com/mbzuai-nlp/lamini-lm)". It is a fine-tuned version of [t5-large](https://huggingface.co/t5-large) on the [LaMini-instruction dataset](https://huggingface.co/datasets/MBZUAI/LaMini-instruction), which contains 2.58M samples for instruction fine-tuning. For more details about our dataset, please visit our [project repository](https://github.com/mbzuai-nlp/lamini-lm/).
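If you would like to inspect the instruction data itself, the dataset linked above can be loaded with the `datasets` library. The snippet below is a minimal sketch, assuming the public dataset id `MBZUAI/LaMini-instruction` and its single `train` split:

```python
from datasets import load_dataset

# Minimal sketch: load the LaMini-instruction dataset referenced above.
# The "train" split name is an assumption based on the public dataset page.
dataset = load_dataset("MBZUAI/LaMini-instruction", split="train")
print(len(dataset))   # ~2.58M instruction/response pairs
print(dataset[0])     # inspect one record
```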
You can view other models in the LaMini-LM series below. Models marked with ✩ have the best overall performance given their size/architecture, and we recommend using them. More details can be found in our paper.
| Base model | LaMini-LM series (#parameters) | | | |
|---|---|---|---|---|
| T5 | [LaMini-T5-61M](https://huggingface.co/MBZUAI/lamini-t5-61m) | [LaMini-T5-223M](https://huggingface.co/MBZUAI/lamini-t5-223m) | [LaMini-T5-738M](https://huggingface.co/MBZUAI/lamini-t5-738m) | |
| Flan-T5 | [LaMini-Flan-T5-77M](https://huggingface.co/MBZUAI/lamini-flan-t5-77m)✩ | [LaMini-Flan-T5-248M](https://huggingface.co/MBZUAI/lamini-flan-t5-248m)✩ | [LaMini-Flan-T5-783M](https://huggingface.co/MBZUAI/lamini-flan-t5-783m)✩ | |
| Cerebras-GPT | [LaMini-Cerebras-111M](https://huggingface.co/MBZUAI/lamini-cerebras-111m) | [LaMini-Cerebras-256M](https://huggingface.co/MBZUAI/lamini-cerebras-256m) | [LaMini-Cerebras-590M](https://huggingface.co/MBZUAI/lamini-cerebras-590m) | [LaMini-Cerebras-1.3B](https://huggingface.co/MBZUAI/lamini-cerebras-1.3b) |
| GPT-2 | [LaMini-GPT-124M](https://huggingface.co/MBZUAI/lamini-gpt-124m)✩ | [LaMini-GPT-774M](https://huggingface.co/MBZUAI/lamini-gpt-774m)✩ | [LaMini-GPT-1.5B](https://huggingface.co/MBZUAI/lamini-gpt-1.5b)✩ | |
| GPT-Neo | [LaMini-Neo-125M](https://huggingface.co/MBZUAI/lamini-neo-125m) | [LaMini-Neo-1.3B](https://huggingface.co/MBZUAI/lamini-neo-1.3b) | | |
| GPT-J | coming soon | | | |
| LLaMA | coming soon | | | |
✨ Features
- Instruction-following: Designed to respond effectively to human instructions written in natural language.
- Fine-tuned: Based on [t5-large](https://huggingface.co/t5-large) and fine-tuned on a large-scale instruction dataset.
💻 Usage Examples
Basic Usage
```python
from transformers import pipeline

# Load the model with the text2text-generation pipeline
checkpoint = "MBZUAI/LaMini-T5-738M"
model = pipeline('text2text-generation', model=checkpoint)

input_prompt = 'Please let me know your thoughts on the given place and why you think it deserves to be visited: \n"Barcelona, Spain"'
generated_text = model(input_prompt, max_length=512, do_sample=True)[0]['generated_text']
print("Response:", generated_text)
```
🔧 Technical Details
Training Procedure
We initialize with [t5-large](https://huggingface.co/t5-large) and fine-tune it on our [LaMini-instruction dataset](https://huggingface.co/datasets/MBZUAI/LaMini-instruction). The model has a total of 738M parameters.
Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 128
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 512
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
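As a rough illustration only (the actual training script is not included in this card), the hyperparameters above would map onto a Hugging Face `Seq2SeqTrainingArguments` configuration along these lines; the output directory and the per-device interpretation of the batch size are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the reported hyperparameters, not the original script
training_args = Seq2SeqTrainingArguments(
    output_dir="lamini-t5-738m",        # placeholder output path
    learning_rate=5e-4,
    per_device_train_batch_size=128,    # reported train_batch_size
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=4,      # 128 * 4 = 512 total train batch size
    num_train_epochs=5,
    lr_scheduler_type="linear",
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```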
📚 Documentation
Evaluation
We conducted two sets of evaluations: automatic evaluation on downstream NLP tasks and human evaluation on user-oriented instructions. For more details, please refer to our paper.
Limitations
More information needed
📄 License
This model is released under the [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/) license.
📖 Citation
```bibtex
@article{lamini-lm,
  author       = {Minghao Wu and
                  Abdul Waheed and
                  Chiyu Zhang and
                  Muhammad Abdul-Mageed and
                  Alham Fikri Aji},
  title        = {LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions},
  journal      = {CoRR},
  volume       = {abs/2304.14402},
  year         = {2023},
  url          = {https://arxiv.org/abs/2304.14402},
  eprinttype   = {arXiv},
  eprint       = {2304.14402}
}
```