# 🚀 LaMini-Flan-T5-248M
LaMini-Flan-T5-248M is a fine-tuned model from the LaMini-LM series, designed to handle natural language instructions effectively.

This model belongs to our LaMini-LM model series, presented in the paper "LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions". It is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the [LaMini-instruction dataset](https://huggingface.co/datasets/MBZUAI/LaMini-instruction), which contains 2.58M samples for instruction fine-tuning. For more details about our dataset, please visit our project repository.

You can explore other models in the LaMini-LM series below. Models marked with ✩ offer the best overall performance given their size/architecture, so we recommend using them. More details are available in our paper.
| Base model | LaMini-LM series (#parameters) | | | |
|---|---|---|---|---|
| T5 | [LaMini-T5-61M](https://huggingface.co/MBZUAI/lamini-t5-61m) | [LaMini-T5-223M](https://huggingface.co/MBZUAI/lamini-t5-223m) | [LaMini-T5-738M](https://huggingface.co/MBZUAI/lamini-t5-738m) | |
| Flan-T5 | [LaMini-Flan-T5-77M](https://huggingface.co/MBZUAI/lamini-flan-t5-77m)✩ | [LaMini-Flan-T5-248M](https://huggingface.co/MBZUAI/lamini-flan-t5-248m)✩ | [LaMini-Flan-T5-783M](https://huggingface.co/MBZUAI/lamini-flan-t5-783m)✩ | |
| Cerebras-GPT | [LaMini-Cerebras-111M](https://huggingface.co/MBZUAI/lamini-cerebras-111m) | [LaMini-Cerebras-256M](https://huggingface.co/MBZUAI/lamini-cerebras-256m) | [LaMini-Cerebras-590M](https://huggingface.co/MBZUAI/lamini-cerebras-590m) | [LaMini-Cerebras-1.3B](https://huggingface.co/MBZUAI/lamini-cerebras-1.3b) |
| GPT-2 | [LaMini-GPT-124M](https://huggingface.co/MBZUAI/lamini-gpt-124m)✩ | [LaMini-GPT-774M](https://huggingface.co/MBZUAI/lamini-gpt-774m)✩ | [LaMini-GPT-1.5B](https://huggingface.co/MBZUAI/lamini-gpt-1.5b)✩ | |
| GPT-Neo | [LaMini-Neo-125M](https://huggingface.co/MBZUAI/lamini-neo-125m) | [LaMini-Neo-1.3B](https://huggingface.co/MBZUAI/lamini-neo-1.3b) | | |
| GPT-J | coming soon | | | |
| LLaMA | coming soon | | | |
## 🚀 Quick Start

### Intended use
We recommend using the model to respond to human instructions written in natural language.
### Usage Examples

#### Basic Usage
```python
from transformers import pipeline

# Load the model from the Hugging Face Hub
checkpoint = "MBZUAI/LaMini-Flan-T5-248M"
model = pipeline("text2text-generation", model=checkpoint)

input_prompt = 'Please let me know your thoughts on the given place and why you think it deserves to be visited: \n"Barcelona, Spain"'
generated_text = model(input_prompt, max_length=512, do_sample=True)[0]["generated_text"]

print("Response:", generated_text)
```
## 📚 Documentation

### Training Procedure
We initialize with [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) and fine-tune it on our [LaMini-instruction dataset](https://huggingface.co/datasets/MBZUAI/LaMini-instruction). Its total number of parameters is 248M.
### Training Hyperparameters
The following hyperparameters were used during training (a sketch of a comparable setup follows the list):
- learning_rate: 0.0005
- train_batch_size: 128
- eval_batch_size: 64
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 512
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
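
As a rough illustration of how these hyperparameters map onto the 🤗 Transformers training API, here is a minimal, hedged sketch using `Seq2SeqTrainer`. It is not our exact training script; in particular, the dataset column names (`instruction`, `response`), the maximum sequence lengths, and the output directory are assumptions.

```python
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

# Start from the flan-t5-base checkpoint, as described above
base_checkpoint = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(base_checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(base_checkpoint)

# LaMini-instruction dataset; the "instruction"/"response" column names are assumed here
dataset = load_dataset("MBZUAI/LaMini-instruction", split="train")

def preprocess(examples):
    # Instructions become encoder inputs, responses become labels (max lengths are assumptions)
    model_inputs = tokenizer(examples["instruction"], max_length=512, truncation=True)
    labels = tokenizer(text_target=examples["response"], max_length=512, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="lamini-flan-t5-248m",   # hypothetical output path
    learning_rate=5e-4,
    per_device_train_batch_size=128,    # with gradient_accumulation_steps=4 -> effective batch size 512
    per_device_eval_batch_size=64,
    gradient_accumulation_steps=4,
    num_train_epochs=5,
    lr_scheduler_type="linear",
    seed=42,
    # The default AdamW optimizer uses betas=(0.9, 0.999) and epsilon=1e-08, matching the settings listed above
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)

trainer.train()
```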
### Evaluation
We conducted two sets of evaluations: automatic evaluation on downstream NLP tasks and human evaluation on user-oriented instructions. For more details, please refer to our paper.
### Limitations
More information needed
## 📄 License
This model is licensed under CC BY-NC 4.0.
## 📖 Citation
```bibtex
@article{lamini-lm,
  author     = {Minghao Wu and
                Abdul Waheed and
                Chiyu Zhang and
                Muhammad Abdul-Mageed and
                Alham Fikri Aji},
  title      = {LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions},
  journal    = {CoRR},
  volume     = {abs/2304.14402},
  year       = {2023},
  url        = {https://arxiv.org/abs/2304.14402},
  eprinttype = {arXiv},
  eprint     = {2304.14402}
}
```