🚀 LaMini-GPT-1.5B
LaMini-GPT-1.5B is a fine-tuned text-generation model from the LaMini-LM series, offering high-quality responses to natural language instructions.
🚀 Quick Start
This model is one of our LaMini-LM model series introduced in the paper "[LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions](https://github.com/mbzuai-nlp/lamini-lm)". It is a fine-tuned version of [gpt2-xl](https://huggingface.co/gpt2-xl) on the [LaMini-instruction dataset](https://huggingface.co/datasets/MBZUAI/LaMini-instruction), which contains 2.58M samples for instruction fine-tuning. For more information about our dataset, please refer to our [project repository](https://github.com/mbzuai-nlp/lamini-lm/).
You can view other models of the LaMini-LM series in the following table. Models marked with ✩ have the best overall performance given their size/architecture, so we recommend using them. More details can be found in our paper.
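As a quick, hedged illustration of what the fine-tuning data looks like (this sketch is ours, not from the original card), the snippet below streams a couple of rows from the LaMini-instruction dataset with the 🤗 `datasets` library; the `instruction` and `response` column names follow the dataset card and should be checked against it.

```python
# Sketch: peek at the LaMini-instruction dataset used for fine-tuning.
# Assumes the `datasets` package is installed and the dataset exposes
# `instruction` and `response` columns (see the dataset card).
from datasets import load_dataset

# Streaming avoids downloading all 2.58M samples up front.
dataset = load_dataset("MBZUAI/LaMini-instruction", split="train", streaming=True)

for i, example in enumerate(dataset):
    if i >= 2:
        break
    print("Instruction:", example["instruction"])
    print("Response:", example["response"])
```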
✨ Features
- Instruction-following: Designed to respond to human instructions written in natural language.
- Fine-tuned: Fine-tuned on a large-scale instruction dataset for better performance.
💻 Usage Examples
Basic Usage
```python
from transformers import pipeline

# Checkpoint for this card; swap in another LaMini-LM repo id if needed.
checkpoint = "MBZUAI/LaMini-GPT-1.5B"
model = pipeline('text-generation', model=checkpoint)

instruction = 'Please let me know your thoughts on the given place and why you think it deserves to be visited: \n"Barcelona, Spain"'
# Wrap the instruction in the prompt template used during fine-tuning.
input_prompt = f"Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:"

generated_text = model(input_prompt, max_length=512, do_sample=True)[0]['generated_text']
print("Response:", generated_text)
```
Advanced Usage
Since this decoder-only model is fine-tuned with wrapper text, using the same wrapper text at inference time achieves the best performance. You can customize the `instruction` and the `input_prompt` according to your specific needs, as shown in the sketch below.
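For example, a small helper can hold the wrapper text so that only the instruction changes between calls. This is a sketch of ours (the `build_prompt` helper is hypothetical, not part of the LaMini-LM release), assuming the Hugging Face repo id `MBZUAI/LaMini-GPT-1.5B`:

```python
from transformers import pipeline

model = pipeline('text-generation', model="MBZUAI/LaMini-GPT-1.5B")

def build_prompt(instruction: str) -> str:
    """Wrap a raw instruction in the template the model was fine-tuned with."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:"
    )

# Any custom instruction can be substituted here.
instruction = "Summarize the main benefits of regular exercise in three sentences."
output = model(build_prompt(instruction), max_length=512, do_sample=True)[0]['generated_text']
print(output)
```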
🔧 Technical Details
We initialize with [gpt2-xl](https://huggingface.co/gpt2-xl) and fine-tune it on our [LaMini-instruction dataset](https://huggingface.co/datasets/MBZUAI/LaMini-instruction). Its total number of parameters is 1.5B.
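As a quick sanity check (our own sketch, not from the card), the parameter count can be confirmed by loading the checkpoint with `AutoModelForCausalLM` and summing over its tensors; the repo id `MBZUAI/LaMini-GPT-1.5B` is assumed:

```python
# Sketch: verify the ~1.5B parameter count stated above.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("MBZUAI/LaMini-GPT-1.5B")
n_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {n_params / 1e9:.2f}B")  # expected to be roughly 1.5
```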
📚 Documentation
We conducted two sets of evaluations: automatic evaluation on downstream NLP tasks and human evaluation on user-oriented instructions. For more detail, please refer to our paper.
📄 License
This model is released under the CC BY-NC 4.0 license.

Citation
```bibtex
@article{lamini-lm,
  author     = {Minghao Wu and
                Abdul Waheed and
                Chiyu Zhang and
                Muhammad Abdul-Mageed and
                Alham Fikri Aji},
  title      = {LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions},
  journal    = {CoRR},
  volume     = {abs/2304.14402},
  year       = {2023},
  url        = {https://arxiv.org/abs/2304.14402},
  eprinttype = {arXiv},
  eprint     = {2304.14402}
}
```