Platypus2-70B-instruct Open Source Large Model - Free Deployment to Support Instruction Following and Logical Reasoning

Platypus2 70B Instruct

Developed by garage-bAInd

Platypus2-70B-instruct is a large language model based on the LLaMA 2 architecture, created by merging models from garage-bAInd and upstageAI, focusing on instruction following and logical reasoning tasks.

Large Language Model

Transformers

English #STEM logical reasoning #Multi-task instruction fine-tuning #Academic research optimization

Downloads 1,332

Release Time: 8/4/2023

Model Overview

This model merges Platypus2-70B and Llama-2-70b-instruct-v2. It performs particularly well on STEM and logical reasoning tasks and suits scenarios that require complex problem-solving.

Model Features

Powerful logical reasoning capability

Trained on STEM and logic-based datasets, particularly adept at solving complex logical problems

Instruction optimization

Specially fine-tuned to better understand and follow user instructions

Model merging technology

Combines the strengths of two high-performance models (Platypus2-70B and Llama-2-70b-instruct-v2)

Model Capabilities

Text generation

Instruction following

Logical reasoning

STEM problem solving

Knowledge Q&A

Use Cases

Education

STEM teaching assistance

Helps students understand and solve science, technology, engineering and mathematics problems

Scored 71.84 on the ARC challenge benchmark (25-shot)

Research

Academic research assistance

Assists researchers with literature review and knowledge integration

Scored 70.48 on the MMLU benchmark (5-shot)

🚀 Platypus2-70B-instruct

Platypus2-70B-instruct is a powerful language model that combines the strengths of garage-bAInd/Platypus2-70B and upstage/Llama-2-70b-instruct-v2. It offers high performance in various language tasks, especially those related to STEM and logic.


🚀 Quick Start

Prompt Template

### Instruction:

<prompt> (without the <>)

### Response:
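
As a minimal sketch, the template above can be filled in programmatically. The helper name `build_prompt` is hypothetical, and the exact whitespace (one blank line around the instruction, a newline after `### Response:`) is an assumption read off the template as displayed here rather than something the model card specifies.

```python
def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Instruction/Response template.

    Assumption: one blank line separates each header from the instruction,
    and the prompt ends with a newline after "### Response:".
    """
    return f"### Instruction:\n\n{instruction.strip()}\n\n### Response:\n"


# Example: print the prompt that would be sent to the model.
print(build_prompt("Explain why the sky is blue."))
```

The returned string is what would be passed to a tokenizer; the model is expected to continue the text after the `### Response:` header.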

✨ Features

Powerful Architecture: Based on the LLaMA 2 transformer architecture, providing strong language understanding and generation capabilities.
Diverse Training Data: Trained on STEM and logic based datasets, enabling it to handle complex technical and logical tasks.
Fine-Tuned: Instruction fine-tuned using LoRA for better performance on specific tasks.

📦 Installation

To reproduce the evaluation results, you need to install the LM Evaluation Harness:

# clone repository
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
# change to repo directory
cd lm-evaluation-harness
# check out the correct commit
git checkout b281b0921b636bc36ad05c0b0b0763bd6dd43463
# install
pip install -e .

💻 Usage Examples

ARC

python main.py --model hf-causal-experimental --model_args pretrained=garage-bAInd/Platypus2-70B-instruct --tasks arc_challenge --batch_size 1 --no_cache --write_out --output_path results/Platypus2-70B-instruct/arc_challenge_25shot.json --device cuda --num_fewshot 25

HellaSwag

python main.py --model hf-causal-experimental --model_args pretrained=garage-bAInd/Platypus2-70B-instruct --tasks hellaswag --batch_size 1 --no_cache --write_out --output_path results/Platypus2-70B-instruct/hellaswag_10shot.json --device cuda --num_fewshot 10

MMLU

python main.py --model hf-causal-experimental --model_args pretrained=garage-bAInd/Platypus2-70B-instruct --tasks 'hendrycksTest-*' --batch_size 1 --no_cache --write_out --output_path results/Platypus2-70B-instruct/mmlu_5shot.json --device cuda --num_fewshot 5

TruthfulQA

python main.py --model hf-causal-experimental --model_args pretrained=garage-bAInd/Platypus2-70B-instruct --tasks truthfulqa_mc --batch_size 1 --no_cache --write_out --output_path results/Platypus2-70B-instruct/truthfulqa_0shot.json --device cuda
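
The four commands above differ only in the task name, the few-shot count, and the output file. The following sketch generates them from a small table; it uses only the flags that appear in the commands above, and the names `BENCHMARKS` and `build_command` are hypothetical helpers, not part of the LM Evaluation Harness.

```python
import shlex

MODEL = "garage-bAInd/Platypus2-70B-instruct"

# (output file stem, harness task name, number of few-shot examples)
BENCHMARKS = [
    ("arc_challenge", "arc_challenge", 25),
    ("hellaswag", "hellaswag", 10),
    ("mmlu", "hendrycksTest-*", 5),
    ("truthfulqa", "truthfulqa_mc", 0),
]


def build_command(stem: str, task: str, shots: int, model: str = MODEL) -> list[str]:
    """Return the argv list for one evaluation run."""
    output_path = f"results/{model.split('/')[-1]}/{stem}_{shots}shot.json"
    argv = [
        "python", "main.py",
        "--model", "hf-causal-experimental",
        "--model_args", f"pretrained={model}",
        "--tasks", task,
        "--batch_size", "1",
        "--no_cache",
        "--write_out",
        "--output_path", output_path,
        "--device", "cuda",
    ]
    # Mirror the commands above: the 0-shot run omits --num_fewshot.
    if shots > 0:
        argv += ["--num_fewshot", str(shots)]
    return argv


for stem, task, shots in BENCHMARKS:
    # shlex.join quotes arguments such as the MMLU task glob so the shell
    # does not expand them.
    print(shlex.join(build_command(stem, task, shots)))
```

Each printed line can be pasted into a shell from the `lm-evaluation-harness` directory, or the argv list can be handed to `subprocess.run` directly.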

📚 Documentation

Model Details

Property	Details
Trained by	Platypus2-70B trained by Cole Hunter & Ariel Lee; Llama-2-70b-instruct trained by upstageAI
Model Type	Platypus2-70B-instruct is an auto-regressive language model based on the LLaMA 2 transformer architecture.
Language(s)	English
License	Non-Commercial Creative Commons license (CC BY-NC-4.0)

Training Dataset

garage-bAInd/Platypus2-70B was trained using the STEM and logic based dataset garage-bAInd/Open-Platypus.

For more information, please refer to our paper and project webpage.

Training Procedure

garage-bAInd/Platypus2-70B was instruction fine-tuned using LoRA on 8 A100 80GB GPUs. For training details and inference instructions, please visit the Platypus GitHub repo.

Reproducing Evaluation Results

Each task was evaluated on a single A100 80GB GPU.

Limitations and bias

⚠️ Important Note

Llama 2 and fine-tuned variants are a new technology that carries risks with use. Testing conducted to date has been in English, and has not covered, nor could it cover all scenarios. For these reasons, as with all LLMs, Llama 2 and any fine-tuned variant's potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 2 variants, developers should perform safety testing and tuning tailored to their specific applications of the model.

Please see the Responsible Use Guide available at https://ai.meta.com/llama/responsible-use-guide/

🔧 Technical Details

garage-bAInd/Platypus2-70B was instruction fine-tuned using LoRA on 8 A100 80GB GPUs. For more details about training and inference, please refer to the Platypus GitHub repo.

📄 License

This model is released under the Non-Commercial Creative Commons license (CC BY-NC-4.0).

📄 Citations

@article{platypus2023,
    title={Platypus: Quick, Cheap, and Powerful Refinement of LLMs}, 
    author={Ariel N. Lee and Cole J. Hunter and Nataniel Ruiz},
    journal={arXiv preprint arXiv:2308.07317},
    year={2023}
}

@misc{touvron2023llama,
    title={Llama 2: Open Foundation and Fine-Tuned Chat Models}, 
    author={Hugo Touvron and Louis Martin and Kevin Stone and Peter Albert and Amjad Almahairi and Yasmine Babaei and Nikolay Bashlykov},
    year={2023},
    eprint={2307.09288},
    archivePrefix={arXiv},
}

@inproceedings{
    hu2022lora,
    title={Lo{RA}: Low-Rank Adaptation of Large Language Models},
    author={Edward J Hu and Yelong Shen and Phillip Wallis and Zeyuan Allen-Zhu and Yuanzhi Li and Shean Wang and Lu Wang and Weizhu Chen},
    booktitle={International Conference on Learning Representations},
    year={2022},
    url={https://openreview.net/forum?id=nZeVKeeFYf9}
}

📊 Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	66.89
ARC (25-shot)	71.84
HellaSwag (10-shot)	87.94
MMLU (5-shot)	70.48
TruthfulQA (0-shot)	62.26
Winogrande (5-shot)	82.72
GSM8K (5-shot)	40.56
DROP (3-shot)	52.41
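
As a quick check on the table above, the reported average appears to be the unweighted mean of the seven benchmark scores. A short sketch of that arithmetic (the dictionary simply restates the table values):

```python
# Benchmark scores copied from the leaderboard table above.
scores = {
    "ARC (25-shot)": 71.84,
    "HellaSwag (10-shot)": 87.94,
    "MMLU (5-shot)": 70.48,
    "TruthfulQA (0-shot)": 62.26,
    "Winogrande (5-shot)": 82.72,
    "GSM8K (5-shot)": 40.56,
    "DROP (3-shot)": 52.41,
}

# Unweighted mean across all seven benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Avg. {average:.2f}")  # prints "Avg. 66.89"
```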
