# SOLAR-0-70b-16bit Model
Solar, a new bot created by Upstage, is a fine-tuned Llama 2 model that ranks high on the HuggingFace Open LLM Leaderboard, demonstrating the power of open-source progress.
## Quick Start
Solar, a new bot developed by Upstage, is now accessible on Poe. As a top-ranked model on the HuggingFace Open LLM Leaderboard and a fine-tuned version of Llama 2, Solar showcases the advancements enabled by open source. Try it now at https://poe.com/Solar-0-70b
## Features
- Capable of handling 10k+ input tokens thanks to the `rope_scaling` option (see the sketch below).
- Tested on A100 80GB hardware.
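
The `rope_scaling` setting used throughout this card is a dynamic scaling factor of 2 applied at load time. The snippet below is an illustrative sketch (not from the original README) showing how that setting can be placed on the model config and inspected before loading the weights.

```python
# Illustrative sketch: dynamic RoPE scaling lets the model accept prompts well
# beyond Llama 2's 4096-token base context (the card reports 10k+ input tokens).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("upstage/Llama-2-70b-instruct-v2")
config.rope_scaling = {"type": "dynamic", "factor": 2}  # same values as the usage example
print(config.max_position_embeddings, config.rope_scaling)
```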
## Installation
No specific installation steps are provided in the original README.
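
No install commands are given, but the usage example below depends on a handful of libraries: `transformers` and `torch`, plus `accelerate` for `device_map="auto"` and `bitsandbytes` for `load_in_8bit=True`. A small sanity check, assuming the usual PyPI package names:

```python
# Sketch: confirm the libraries used by the usage example are importable.
import importlib

for pkg in ("torch", "transformers", "accelerate", "bitsandbytes"):
    try:
        importlib.import_module(pkg)
        print(f"{pkg}: available")
    except ImportError:
        print(f"{pkg}: missing (install it, e.g. with pip)")
```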
## Usage Examples

### Basic Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

tokenizer = AutoTokenizer.from_pretrained("upstage/Llama-2-70b-instruct-v2")

# Load the model in 8-bit across available GPUs, with dynamic RoPE scaling
# so that prompts beyond Llama 2's native context length are accepted.
model = AutoModelForCausalLM.from_pretrained(
    "upstage/Llama-2-70b-instruct-v2",
    device_map="auto",
    torch_dtype=torch.float16,
    load_in_8bit=True,
    rope_scaling={"type": "dynamic", "factor": 2},
)

prompt = "### User:\nThomas is healthy, but he has to go to the hospital. What could be the reasons?\n\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
inputs.pop("token_type_ids", None)  # drop token_type_ids if the tokenizer returns them

# Stream tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
output = model.generate(**inputs, streamer=streamer, use_cache=True, max_new_tokens=4096)
output_text = tokenizer.decode(output[0], skip_special_tokens=True)
```
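
Because this card covers the 16-bit checkpoints, you may want to skip the 8-bit quantization used above and keep the weights in plain float16. A sketch of that variant follows; it assumes enough GPU memory for a 70B model (roughly 140GB in fp16, spread across devices by `device_map="auto"`).

```python
import torch
from transformers import AutoModelForCausalLM

# Sketch: same loading call as above, but without 8-bit quantization.
model = AutoModelForCausalLM.from_pretrained(
    "upstage/Llama-2-70b-instruct-v2",
    device_map="auto",
    torch_dtype=torch.float16,
    rope_scaling={"type": "dynamic", "factor": 2},
)
```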
## Documentation

### Model Details
| Property | Details |
|----------|---------|
| Developed by | Upstage |
| Backbone Model | LLaMA-2 |
| Language(s) | English |
| Library | HuggingFace Transformers |
| License | Fine-tuned checkpoints are licensed under the Non-Commercial Creative Commons license ([CC BY-NC-4.0](https://creativecommons.org/licenses/by-nc/4.0/)) |
| Where to send comments | Feedback and comments can be submitted by opening an issue in the Hugging Face community tab of the model repository |
| Contact | For questions and comments about the model, please email contact@upstage.ai |
### Dataset Details

#### Used Datasets
- Orca-style dataset
- Alpaca-style dataset
- No other datasets were used besides those listed above
- No benchmark test sets or their training sets were used
#### Prompt Template
```
### System:
{System}

### User:
{User}

### Assistant:
{Assistant}
```
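
A small helper (hypothetical, not part of the model card) that fills this template and produces the same formatting as the prompt string in the Basic Usage example:

```python
# Hypothetical helper: assemble a prompt from the System/User/Assistant template.
def build_prompt(user_message: str, system_message: str = "") -> str:
    prompt = ""
    if system_message:
        prompt += f"### System:\n{system_message}\n\n"
    prompt += f"### User:\n{user_message}\n\n### Assistant:\n"
    return prompt


print(build_prompt("Thomas is healthy, but he has to go to the hospital. What could be the reasons?"))
```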
### Hardware and Software
- The model was tested on A100 80GB hardware (see Features above).
### Evaluation Results

#### Overview
- We conducted a performance evaluation following the tasks evaluated on the Open LLM Leaderboard. We evaluated the model on four benchmark datasets: ARC-Challenge, HellaSwag, MMLU, and TruthfulQA, using the [lm-evaluation-harness repository](https://github.com/EleutherAI/lm-evaluation-harness), specifically commit [b281b0921b636bc36ad05c0b0b0763bd6dd43463](https://github.com/EleutherAI/lm-evaluation-harness/tree/b281b0921b636bc36ad05c0b0b0763bd6dd43463).
- We used [MT-bench](https://github.com/lm-sys/FastChat/tree/main/fastchat/llm_judge), a set of challenging multi-turn open-ended questions, to evaluate the models.
#### Main Results

#### Scripts for H4 Score Reproduction
- Prepare evaluation environments:

```bash
# clone the repository
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
# change to the repository directory
cd lm-evaluation-harness
# check out the specific commit
git checkout b281b0921b636bc36ad05c0b0b0763bd6dd43463
```
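
Once the harness has produced per-task scores, the H4 score referenced above is commonly taken as the plain average of the four benchmarks. A minimal aggregation sketch with placeholder values (the actual results are not reproduced here):

```python
# Sketch: average the four benchmark scores into a single H4 number.
# Replace the placeholder zeros with the scores reported by lm-evaluation-harness.
scores = {
    "arc_challenge": 0.0,
    "hellaswag": 0.0,
    "mmlu": 0.0,
    "truthfulqa": 0.0,
}
h4_score = sum(scores.values()) / len(scores)
print(f"H4 score: {h4_score:.4f}")
```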
## License
The fine-tuned checkpoints of this model are licensed under the Non-Commercial Creative Commons license ([CC BY-NC-4.0](https://creativecommons.org/licenses/by-nc/4.0/)).
## Contact Us

### About Upstage
- Upstage is a company specialized in Large Language Models (LLMs) and AI. We will help you build private LLMs and related applications. If you have a dataset for building domain-specific LLMs or LLM applications, please contact us at contact@upstage.ai.
- As of August 1st, our 70B model holds the top spot on the Open LLM Leaderboard, making it the current leading performer globally.