# NorGPT-3B-summarization-peft
NorGPT-3B-summarization-peft is a Norwegian text-summarization model trained on top of the NorGPT-3B base model.
## Quick Start
NorGPT-3B-summarization-peft is trained on top of the NorGPT-3B model with an RLHF strategy on the NO-CNN-DailyMail dataset.

Different from step 2 of the original RLHF pipeline, we trained the reward model by estimating the semantic similarity between the candidate generated text and the human-annotated summary (golden summary) using the NorBERT model. Generated summaries with a higher cosine similarity to the golden summary are ranked higher when training the reward model.
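The card does not include the reward-model scoring code, so the snippet below is only a minimal sketch of the similarity-ranking idea: embed the candidate and the golden summary with a NorBERT encoder, mean-pool the token embeddings, and compare the vectors with cosine similarity. The checkpoint name `ltg/norbert2` and the mean-pooling choice are assumptions, not the exact setup used for training.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

# Assumed NorBERT checkpoint; the exact encoder used for the reward model is not specified in this card.
norbert_id = "ltg/norbert2"
norbert_tok = AutoTokenizer.from_pretrained(norbert_id)
norbert = AutoModel.from_pretrained(norbert_id)

def embed(text):
    """Mean-pool NorBERT token embeddings into a single sentence vector."""
    inputs = norbert_tok(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        hidden = norbert(**inputs).last_hidden_state      # (1, seq_len, hidden)
    mask = inputs["attention_mask"].unsqueeze(-1)          # (1, seq_len, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)    # (1, hidden)

def rank_candidates(candidates, golden_summary):
    """Rank candidate summaries by cosine similarity to the golden summary (highest first)."""
    golden_vec = embed(golden_summary)
    scored = [(F.cosine_similarity(embed(c), golden_vec).item(), c) for c in candidates]
    return sorted(scored, reverse=True)
```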
Prompt format:

```
Summarise the article:\\n{article} |||\\n{positive_sample}
```

Inference prompt:

```
Summarise the article:\\n{article} |||\\n
```
⨠Features
- Trained on the NorGPT-3B base model, leveraging its powerful language understanding capabilities.
- Uses the RLHF strategy for training, improving the quality of generated summaries.
- Estimates semantic similarity using the NorBERT model to train the reward model, enhancing the relevance of summaries.
## Installation
No model-specific installation is required; the model is loaded directly with the Hugging Face `transformers` library, as shown in the usage examples below.
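Assuming a standard Python environment, the dependencies imported in the examples below can be installed with pip; the original card does not pin package versions.

```bash
pip install transformers torch datasets pandas
```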
## Usage Examples
### Basic Usage
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NorGLM/NorGPT-3B-rfhl-summarization"

# Load the tokenizer and reuse the EOS token as the padding token.
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# Load the model across available devices in bfloat16.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map='auto',
    torch_dtype=torch.bfloat16
)
```
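As a quick end-to-end check, the sketch below summarizes a single article with the inference prompt shown above. The article text and the generation settings are placeholders, not values from the original card.

```python
# Hypothetical single-article example; replace `article` with real Norwegian news text.
article = "Her settes inn en norsk nyhetsartikkel som skal oppsummeres."

prompt = 'Summarise the article:\\n' + article + ' |||\\n'
inputs = tokenizer(prompt, return_tensors='pt').to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, do_sample=False, max_new_tokens=200)

# The summary follows the '|||\\n' separator in the decoded text.
summary = tokenizer.decode(output[0], skip_special_tokens=True).split("|||\\n")[-1]
print(summary)
```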
### Advanced Usage
Load the model and evaluate it on the test set of the NO-CNN-DailyMail dataset:
```python
import pandas as pd
import torch
from datasets import load_dataset

def generate_texts(model, tokenizer, prompts, max_seq_length=200, do_sample=True, top_p=0.95, top_k=10):
    results = []
    cnt = 0
    for prompt in prompts:
        cnt += 1
        pro_len = len(prompt.split())
        # Skip articles that are too long for the context window.
        if pro_len > 1024:
            results.append('')
            continue

        prompt = 'Summarise the article:\\n' + prompt + ' |||\\n'
        model_inputs = tokenizer(prompt, return_tensors='pt').to(model.device)
        output = model.generate(**model_inputs, do_sample=False, max_new_tokens=max_seq_length)
        result = tokenizer.decode(output[0], skip_special_tokens=True)
        # Keep only the generated summary after the '|||\\n' separator.
        result = result.split("|||\\n")[-1]
        results.append(result)
    return results

print("--LOADING EVAL DATAS---")
eval_data = load_dataset("NorGLM/NO-CNN-DailyMail", data_files="test.csv")
prompts = eval_data['train']['article']
positive_samples = eval_data['train']['positive_sample']

print("--MAKING PREDICTIONS---")
model.eval()
output_file = <output file name>  # path to the output CSV file
with torch.no_grad():
    results = generate_texts(model, tokenizer, prompts)

df = pd.DataFrame({'article': prompts, 'generated_text': results, 'positive_sample': positive_samples})
print("Save results to csv file...")
df.to_csv(output_file)
```
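To score the saved predictions against the golden summaries, one option is ROUGE via the Hugging Face `evaluate` library. This post-processing step is an assumption and not part of the original card.

```python
# Requires: pip install evaluate rouge_score
import evaluate
import pandas as pd

# Hypothetical follow-up: compare generated summaries with the golden summaries using ROUGE.
df = pd.read_csv(output_file)
rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=df['generated_text'].fillna('').tolist(),
    references=df['positive_sample'].tolist(),
)
print(scores)  # {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```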
## Documentation
### Training Split

We split the data across steps 1–3 of the RLHF training pipeline:

| RLHF step | Number of samples |
|-----------|-------------------|
| Step 1 | 61,181 |
| Step 2 | 16,798 |
| Step 3 | 9,758 |
## Technical Details
The model is trained on top of the NorGPT-3B model using the RLHF strategy. The reward model is trained by estimating the semantic similarity between candidate generated texts and the golden summary with the NorBERT model, so that candidates with a higher cosine similarity to the golden summary are ranked higher during reward-model training.
## License
The model is licensed under the CC BY-NC-SA 4.0 license.
## Citation Information
If you find our work helpful, please cite our paper:
```bibtex
@article{liu2023nlebench+,
  title={NLEBench+ NorGLM: A Comprehensive Empirical Analysis and Benchmark Dataset for Generative Language Models in Norwegian},
  author={Liu, Peng and Zhang, Lemei and Farup, Terje Nissen and Lauvrak, Even W and Ingvaldsen, Jon Espen and Eide, Simen and Gulla, Jon Atle and Yang, Zhirong},
  journal={arXiv preprint arXiv:2312.01314},
  year={2023}
}
```