🚀 Ziya-Writing-LLaMa-13B-v1
Ziya-Writing-LLaMa-13B-v1 is a large LLaMA-based writing model, instruction-tuned for better performance across a wide range of writing tasks.
✨ Features
Ziya Series Models
Brief Introduction
Ziya-Writing-LLaMa-13B-v1 is a 13-billion-parameter instruction fine-tuned model based on LLaMa, optimized specifically for writing. It handles many kinds of writing tasks, including official reports, speeches, creative copywriting, and more.
For more details, please refer to our official account article:
Ziya Large Model Series | The writing model ziya-writing is open-sourced! It's ready to use out of the box. Come and claim your exclusive writing assistant!
📦 Installation
Software Dependencies
```bash
pip install torch==1.12.1 tokenizers==0.13.3 git+https://github.com/huggingface/transformers
```
📚 Documentation
Model Taxonomy
| Property | Details |
| --- | --- |
| Demand | Writing |
| Task | AGI Model |
| Series | Ziya |
| Model | LLaMA |
| Parameter | 13B |
| Extra | English&Chinese |
Model Information
Supervised finetuning
We collected and cleaned a large amount of real human writing from the internet and used GPT-3.5 to generate corresponding writing instructions, all of which went through strict manual verification.
On top of this, we used a reward model together with rule-based cleaning logic to select the more challenging writing instructions, discarding trivial data while preserving instruction diversity.
We also used the evol-instruct method to generate roughly 300,000 high-quality general instructions. Mixing the general and writing instruction data gives ziya-writing both strong intent understanding and the ability to generate excellent responses.
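To make the evolution step concrete, here is a minimal sketch of the evol-instruct idea: an LLM repeatedly rewrites a seed instruction into harder or more constrained variants. The exact prompts and filtering rules used for ziya-writing are not published, so `llm_complete`, `EVOLUTION_PROMPTS`, and the quality check below are purely illustrative:

```python
import random

# Hypothetical evol-instruct loop: each round asks an LLM to rewrite an
# instruction into a harder or more specific variant, keeping the result
# only if it passes a trivial quality check.
EVOLUTION_PROMPTS = [
    "Rewrite the instruction so it requires deeper reasoning:\n{instruction}",
    "Add a concrete constraint (length, audience, or format) to:\n{instruction}",
    "Make the instruction more specific without changing its topic:\n{instruction}",
]

def evolve(instruction: str, llm_complete, rounds: int = 3) -> list[str]:
    """Return progressively harder variants of one seed instruction.
    `llm_complete` is a stand-in for any text-completion API."""
    variants = []
    current = instruction
    for _ in range(rounds):
        prompt = random.choice(EVOLUTION_PROMPTS).format(instruction=current)
        candidate = llm_complete(prompt)
        if candidate and candidate != current:  # minimal de-duplication check
            variants.append(candidate)
            current = candidate
    return variants
```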
Human-Feedback training
In our experiments, we found that training the model with reinforcement learning on a small amount of high-quality, human-annotated writing-ranking data effectively improves its writing performance.
To further improve the model, enabling it to fully understand human intentions while reducing hallucinations and unsafe outputs, we conducted Human-Feedback Training (HFT) on top of the instruction fine-tuned model. The training used reinforcement learning from human feedback: a reward model (RM) followed by PPO.
We implemented the HFT training process on an internally developed framework that can complete full-parameter training of Ziya-Writing-LLaMa-13B-v1 with as few as eight 40GB A100 GPUs. During PPO training we did not limit the length of generated samples, so that rewards for long-text tasks remain accurate. The total experience pool for each training run exceeded 100k samples, ensuring sufficient training.
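Our training code is not included in this card, but the core of training a reward model on ranking data is a standard pairwise loss: push the score of the human-preferred response above the score of the rejected one. The snippet below is a generic sketch of that loss, not our actual implementation:

```python
import torch
import torch.nn.functional as F

def reward_ranking_loss(chosen_rewards: torch.Tensor,
                        rejected_rewards: torch.Tensor) -> torch.Tensor:
    """Pairwise ranking loss for reward-model training.
    Both tensors have shape (batch,): one scalar reward per response."""
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Illustrative usage with dummy scores from a reward-model head:
chosen = torch.tensor([1.2, 0.3, 0.8])
rejected = torch.tensor([0.4, 0.5, -0.1])
print(reward_ranking_loss(chosen, rejected).item())
```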
Performance
Evaluating the quality of writing is quite subjective and hard to reduce to a precise accuracy or satisfaction score. We therefore used an anonymous, multi-annotator side-by-side evaluation mechanism and collected 100 writing instructions of varying difficulty for evaluation. We plan to make this evaluation set public in the future.
We use the win rate as the indicator of a model's quality. A model's win rate is calculated as:
Win Rate = (wins + draws / 2) / total annotations
Since most language models generate responses by sampling, a win rate above 55% indicates that the model significantly outperforms the other, a win rate below 45% indicates that it clearly lags behind, and a win rate between 45% and 55% means the two models are essentially on par.
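For concreteness, the formula can be computed as follows (the counts in the example are made up):

```python
def win_rate(wins: int, draws: int, total: int) -> float:
    """Win rate = (wins + draws / 2) / total annotations."""
    return (wins + draws / 2) / total

# Example: 60 wins and 20 draws out of 100 annotations -> 0.70 (70%)
print(win_rate(60, 20, 100))
```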
| Ziya-Writing-LLaMa-13B-v1 | Average Win Rate (%) | Maximum Win Rate (%) | Minimum Win Rate (%) |
| --- | --- | --- | --- |
| vs Ziya-LLaMa-13B-v1.1 | 70.7 | 73.5 | 69 |
| vs baichuan-vicuna-7b | 69.6 | 73.5 | 68 |
| vs Moss-16B | 65.1 | 69 | 62 |
| vs ChatGLM2-6B | 58.3 | 61.5 | 56 |
| vs Minimax-abab5 | 52.3 | 53 | 50.5 |
| vs GPT-3.5-turbo | 44.7 | 49.5 | 38 |
(Note: the maximum and minimum win rates are computed by scoring each annotator's results separately; the average win rate aggregates the results of all annotators.)
💻 Usage Examples
Basic Usage
```python
import torch
from transformers import AutoTokenizer, LlamaForCausalLM

device = torch.device("cuda")
query = "帮我写一份去西安的旅游计划"  # "Write me a travel plan for Xi'an"

# Load the model in fp16; device_map="auto" spreads it across available GPUs.
model = LlamaForCausalLM.from_pretrained(
    "IDEA-CCNL/Ziya-Writing-LLaMa-13B-v1",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "IDEA-CCNL/Ziya-Writing-LLaMa-13B-v1", use_fast=False
)

# The model expects the <human>:/<bot>: prompt format.
inputs = "<human>:" + query.strip() + "\n<bot>:"
input_ids = tokenizer(inputs, return_tensors="pt").input_ids.to(device)
generate_ids = model.generate(
    input_ids,
    max_new_tokens=2048,
    do_sample=True,
    top_p=0.85,
    temperature=0.85,
    repetition_penalty=1.0,
    eos_token_id=2,
    bos_token_id=1,
    pad_token_id=0,
)
output = tokenizer.batch_decode(generate_ids)[0]
print(output)
```
Advanced Usage
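This card ships no official advanced example, so the following sketch simply extends the `<human>:`/`<bot>:` format from Basic Usage to a multi-turn conversation. The turn-concatenation layout is an assumption, not a documented format:

```python
# Assumes `model` and `tokenizer` are already loaded as in Basic Usage.
def chat(model, tokenizer, history, query, max_new_tokens=2048):
    """Concatenate prior (human, bot) turns into one prompt and generate the
    next reply. The multi-turn layout here is an assumption, not an
    officially documented format."""
    prompt = ""
    for human, bot in history:
        prompt += f"<human>:{human}\n<bot>:{bot}\n"
    prompt += f"<human>:{query.strip()}\n<bot>:"
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    generate_ids = model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_p=0.85,
        temperature=0.85,
        repetition_penalty=1.0,
        eos_token_id=2,
        bos_token_id=1,
        pad_token_id=0,
    )
    # Decode only the newly generated tokens, dropping the prompt.
    return tokenizer.decode(
        generate_ids[0][input_ids.shape[1]:], skip_special_tokens=True
    )

history = []
reply = chat(model, tokenizer, history, "帮我写一份去西安的旅游计划")
history.append(("帮我写一份去西安的旅游计划", reply))
```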
📄 License
This project is licensed under the GPL-3.0 license.
🔧 Technical Details
Citation
If you use this resource in your work, please cite our paper:
```bibtex
@article{fengshenbang,
  author  = {Jiaxing Zhang and Ruyi Gan and Junjie Wang and Yuxiang Zhang and Lin Zhang and Ping Yang and Xinyu Gao and Ziwei Wu and Xiaoqun Dong and Junqing He and Jianheng Zhuo and Qi Yang and Yongfeng Huang and Xiayu Li and Yanghan Wu and Junyu Lu and Xinyu Zhu and Weifeng Chen and Ting Han and Kunhao Pan and Rui Wang and Hao Wang and Xiaojun Wu and Zhongshen Zeng and Chongpei Chen},
  title   = {Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence},
  journal = {CoRR},
  volume  = {abs/2209.02970},
  year    = {2022}
}
```
You can also cite our website:
```bibtex
@misc{Fengshenbang-LM,
  title        = {Fengshenbang-LM},
  author       = {IDEA-CCNL},
  year         = {2021},
  howpublished = {\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}
```
⚠️ Important Note
Due to the licensing restrictions of LLaMA weights, this model cannot be used for commercial purposes. Please strictly abide by the LLaMA usage policy.