Solar Pro Preview: The most intelligent LLM on a single GPU
Solar Pro Preview is an advanced large language model with 22 billion parameters, designed to fit on a single GPU while delivering superior performance.
Quick Start
Solar Pro Preview is an instruction-tuned large language model (LLM) with 22 billion parameters. It is designed to run on a single GPU with 80GB of VRAM. This pre-release version shows great potential, though it has limitations in language coverage and a 4K maximum context length. The official Solar Pro will be released in November 2024 with extended language support and longer context windows.
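As a rough sanity check on the single-GPU claim, here is a back-of-envelope sketch assuming half-precision weights at 2 bytes per parameter; actual memory use also depends on the KV cache, activations, and runtime overhead.

```python
# Back-of-envelope memory estimate; 2 bytes/parameter is an assumption (bf16/fp16 weights).
params = 22e9
weight_gib = params * 2 / 1024**3  # roughly 41 GiB of weights
print(f"~{weight_gib:.0f} GiB for weights alone, leaving substantial headroom on an "
      "80GB GPU for the KV cache and activations at the 4K context length")
```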
Features
- Single-GPU Compatibility: Fits on a single GPU, making it accessible to users with standard GPU setups.
- High Performance: Outperforms many LLMs with fewer than 30 billion parameters and rivals models over three times its size, such as Llama 3.1 with 70 billion parameters.
- Enhanced Training: Developed with an enhanced depth up-scaling method and trained on a carefully curated dataset, which significantly improves performance on benchmarks such as MMLU-Pro and IFEval.
Installation
To use Solar Pro Preview, you need to install the necessary libraries.
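The exact installation commands are not included in this preview of the card; a minimal sketch of a typical setup follows. The package list and the absence of version pins are assumptions inferred from the usage example below, not requirements stated by the source.

```bash
# Minimal setup for the Python example below; exact tested versions are not specified here.
pip install torch transformers accelerate   # accelerate is assumed for device_map support
```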
Usage Examples
Basic Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model (trust_remote_code is required for the custom architecture)
tokenizer = AutoTokenizer.from_pretrained("upstage/solar-pro-preview-instruct")
model = AutoModelForCausalLM.from_pretrained(
    "upstage/solar-pro-preview-instruct",
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
)

# Build a ChatML-formatted prompt from the conversation
messages = [
    {"role": "user", "content": "Please, introduce yourself."},
]
prompt = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)

# Generate and decode the response
outputs = model.generate(prompt, max_new_tokens=512)
print(tokenizer.decode(outputs[0]))
```
Advanced Usage
Solar Pro Preview is also available as an API in the Upstage Console. For more information and other easy-to-use methods, visit our blog page.
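As an illustrative sketch only: if the Console exposes an OpenAI-compatible endpoint, a call might look like the following. The base URL, model identifier, and environment variable name are assumptions, not values taken from this document; consult the Upstage Console documentation for the actual settings.

```python
import os
from openai import OpenAI

# Hypothetical values: the base_url, model name, and env var are assumptions for illustration.
client = OpenAI(
    api_key=os.environ["UPSTAGE_API_KEY"],
    base_url="https://api.upstage.ai/v1/solar",
)
response = client.chat.completions.create(
    model="solar-pro",
    messages=[{"role": "user", "content": "Please, introduce yourself."}],
)
print(response.choices[0].message.content)
```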
Documentation
Chat Template
As an instruction-tuned model, Solar Pro Preview uses the ChatML template for optimal performance in conversational and instruction-following tasks. Here is an example:
```
<|im_start|>user
Please, introduce yourself.<|im_end|>
<|im_start|>assistant
```
Note that system prompts are not currently supported in Solar Pro Preview. This feature will be available in the official release.
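To inspect what the tokenizer's chat template actually renders, the same `apply_chat_template` call from the usage example can return a string instead of token IDs. This is a minimal sketch; the commented output is what the ChatML description above implies, not captured from a run.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("upstage/solar-pro-preview-instruct")

# Render the ChatML prompt as a string instead of token IDs
messages = [{"role": "user", "content": "Please, introduce yourself."}]
rendered = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(rendered)
# Expected (per the ChatML example above):
# <|im_start|>user
# Please, introduce yourself.<|im_end|>
# <|im_start|>assistant
```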
Technical Details
Solar Pro Preview is developed using an enhanced version of the previous depth up-scaling method. It scales a Phi-3-medium model with 14 billion parameters to 22 billion parameters. The carefully curated training strategy and dataset have significantly enhanced performance, especially on the MMLU-Pro and IFEval benchmarks.
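The enhanced recipe itself is not detailed here. As a rough, hypothetical illustration of the general idea behind depth up-scaling (duplicating decoder layers of a smaller base model to build a deeper one that is then further pretrained), the sketch below uses a tiny Llama-style model so it runs quickly; the layer counts and overlap are illustrative assumptions, not Solar Pro's actual configuration.

```python
import copy
import torch.nn as nn
from transformers import LlamaConfig, LlamaForCausalLM

# Tiny stand-in model; all sizes are illustrative assumptions, not Solar Pro's configuration.
config = LlamaConfig(hidden_size=256, intermediate_size=512, num_hidden_layers=8,
                     num_attention_heads=4, num_key_value_heads=4, vocab_size=1000)
base = LlamaForCausalLM(config)

layers = base.model.layers
n, overlap = len(layers), 2  # overlap = layers dropped from each copy (assumption)

# Keep the bottom (n - overlap) layers of one copy and the top (n - overlap) of another,
# then stack them; the deeper model would then be continually pretrained on curated data.
bottom = [copy.deepcopy(layer) for layer in layers[: n - overlap]]
top = [copy.deepcopy(layer) for layer in layers[overlap:]]
base.model.layers = nn.ModuleList(bottom + top)
base.config.num_hidden_layers = len(base.model.layers)

print(f"depth up-scaled from {n} to {len(base.model.layers)} decoder layers")
```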
License
Solar Pro Preview is released under the MIT License.
Evaluation
Solar Pro Preview was evaluated across a variety of benchmarks. The following table compares it with other models:
| | Solar-pro-preview | Phi-3-medium-4K-instruct | Phi-3.5-MoE-instruct | Gemma 2 27B IT | Llama-3.1-8B-instruct | Llama-3.1-70B-instruct |
|---|---|---|---|---|---|---|
| Release Date | 2024.09.08 | 2024.05.02 | 2024.08.20 | 2024.06.25 | 2024.06.18 | 2024.06.16 |
| Model size | 22B | 14B | 41.9B (6.6B) | 27B | 8B | 70B |
| License | MIT | MIT | MIT | gemma | [llama3.1](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B/blob/main/LICENSE) | [llama3.1](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B/blob/main/LICENSE) |
| MMLU | 79.14 | 78.02 | 78.66 | 76.13 | 68.25 | 82.09 |
| MMLU Pro | 52.11 | 47.51 | 46.99 | 45.68 | 37.88 | 53.01 |
| IFEval | 84.37 | 64.37 | 69.15 | 75.36 | 77.40 | 84.13 |
| ARC-C | 68.86 | 66.55 | 68.34 | 74.06 | 60.24 | 70.39 |
| GPQA | 36.38 | 35.78 | 34.38 | 36.38 | 35.26 | 41.06 |
| HellaSwag | 86.36 | 85.68 | 85.97 | 86.02 | 80.08 | 86.42 |
| EQBench | 77.91 | 76.78 | 77.22 | 80.32 | 65.80 | 82.52 |
| BigBench Hard | 67.31 | 63.09 | 62.58 | 64.88 | 51.06 | 69.54 |
| MUSR | 45.85 | 42.28 | 46.79 | 45.67 | 29.68 | 47.22 |
| GSM8K | 89.69 | 84.76 | 82.26 | 62.85 | 75.97 | 92.12 |
| MBPP | 61.59 | 60.27 | N/A (*) | 63.08 | 52.20 | 65.51 |
(*) Since the model tends to generate a chat template, the score can't be accurately determined.
Evaluation Protocol
The following table lists the evaluation tools and settings used:
| Benchmark | Evaluation setting | Metric | Evaluation tool |
|---|---|---|---|
| MMLU | 5-shot | macro_avg / acc | [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/928e8bb6f50d1e93ef5d0bcaa81f8c5fd9a6f4d8) #928e8bb |
| MMLU Pro | 5-shot | macro_avg / acc | [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/928e8bb6f50d1e93ef5d0bcaa81f8c5fd9a6f4d8) #928e8bb |
| IFEval | 0-shot, chat_template | mean of prompt_level_strict_acc and instruction_level_strict_acc | [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/928e8bb6f50d1e93ef5d0bcaa81f8c5fd9a6f4d8) #928e8bb |
| ARC-C | 25-shot | acc_norm | [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/928e8bb6f50d1e93ef5d0bcaa81f8c5fd9a6f4d8) #928e8bb |
| GPQA | 0-shot | acc_norm | [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/928e8bb6f50d1e93ef5d0bcaa81f8c5fd9a6f4d8) #928e8bb |
| HellaSwag | 10-shot | acc_norm | [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/928e8bb6f50d1e93ef5d0bcaa81f8c5fd9a6f4d8) #928e8bb |
| EQBench | 0-shot, chat_template | eqbench score | [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/928e8bb6f50d1e93ef5d0bcaa81f8c5fd9a6f4d8) #928e8bb |
| BigBench Hard | 3-shot | macro_avg / acc_norm | [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/928e8bb6f50d1e93ef5d0bcaa81f8c5fd9a6f4d8) #928e8bb |
| MUSR | 0-shot | macro_avg / acc_norm | [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/928e8bb6f50d1e93ef5d0bcaa81f8c5fd9a6f4d8) #928e8bb |
| GSM8K | 8-shot, CoT | acc, exact_match & strict_extract | [lm-eval-harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/928e8bb6f50d1e93ef5d0bcaa81f8c5fd9a6f4d8) #928e8bb |
| MBPP | 0-shot | pass@1 | [bigcode-evaluation-harness](https://github.com/bigcode-project/bigcode-evaluation-harness/tree/0f3e95f0806e78a4f432056cdb1be93604a51d69) #0f3e95f |
Results may vary slightly with batch size and experimental environment, such as the GPU type.
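For reference, the harness can also be driven from Python. The following is a hypothetical sketch of re-running the 5-shot MMLU setting; the pinned commit above may expose a slightly different API than current 0.4.x releases, and the arguments are assumptions rather than the authors' exact invocation.

```python
import lm_eval

# Hypothetical reproduction sketch; arguments are assumptions, not the authors' exact settings.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=upstage/solar-pro-preview-instruct,trust_remote_code=True,dtype=auto",
    tasks=["mmlu"],
    num_fewshot=5,
    batch_size=8,
)
print(results["results"])
```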
Contact us
For any questions and suggestions regarding the model, please visit the discussion board.