# XGen-7B-8K-Base
This is an official research release of the XGen model family (7B) by Salesforce AI Research. It addresses long-sequence modeling, offering pre-trained models with different sequence lengths and an instruction-finetuned model for research purposes.
## Quick Start
The training data for the models is tokenized with the OpenAI Tiktoken library. To use this model, install the package via pip:

```bash
pip install tiktoken
```
The models can be used as auto-regressive samplers as follows:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the Tiktoken-based tokenizer (hence trust_remote_code) and the model in bfloat16.
tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Salesforce/xgen-7b-8k-base", torch_dtype=torch.bfloat16)

# Generate a continuation of a short prompt.
inputs = tokenizer("The world is", return_tensors="pt")
sample = model.generate(**inputs, max_length=128)
print(tokenizer.decode(sample[0]))
```
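For faster inference, the same pipeline can be run on a GPU with explicit decoding parameters. The snippet below is a minimal sketch, not part of the official release: the device placement and the sampling hyperparameters (`do_sample`, `temperature`, `top_p`, `max_new_tokens`) are illustrative assumptions, so adjust them to your hardware and use case.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)
# Assumes a CUDA device with enough memory to hold the 7B model in bfloat16.
model = AutoModelForCausalLM.from_pretrained(
    "Salesforce/xgen-7b-8k-base", torch_dtype=torch.bfloat16
).to("cuda")

inputs = tokenizer("The world is", return_tensors="pt").to("cuda")
# Sample instead of greedy decoding; these hyperparameters are illustrative only.
sample = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
    max_new_tokens=64,
)
print(tokenizer.decode(sample[0], skip_special_tokens=True))
```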
## Features
* Base models
* Instruction-finetuned models: supervised finetuning on public-domain instructional data. Released for research purposes only (see the loading sketch below).
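The instruction-finetuned checkpoint can be loaded the same way as the base model. The sketch below assumes the instruct variant is published under `Salesforce/xgen-7b-8k-inst` and uses a hypothetical "Human / Assistant" prompt format; check the model hub page for the exact identifier and prompting convention.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumed checkpoint name for the instruction-finetuned model (research use only).
ckpt = "Salesforce/xgen-7b-8k-inst"
tokenizer = AutoTokenizer.from_pretrained(ckpt, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(ckpt, torch_dtype=torch.bfloat16)

# Illustrative prompt format; verify the template expected by the instruct model.
prompt = "### Human: Summarize the benefits of long-context language models.\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```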
## Documentation
Title: Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length
Authors: Erik Nijkamp*, Tian Xie*, Hiroaki Hayashi*, Bo Pang*, Congying Xia*, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause, Senthil Purushwalkam, Tong Niu, Wojciech Kryscinski, Lidiya Murakhovs'ka, Prafulla Kumar Choubey, Alex Fabbri, Ye Liu, Rui Meng, Lifu Tu, Meghana Bhat, Chien-Sheng Wu, Silvio Savarese, Yingbo Zhou, Shafiq Rayhan Joty, Caiming Xiong.
(* indicates equal contribution)
Correspondence to: Shafiq Rayhan Joty, Caiming Xiong
## Technical Details
This release is for research purposes only in support of an academic paper. Our models, datasets, and code are not specifically designed or evaluated for all downstream purposes. We strongly recommend users evaluate and address potential concerns related to accuracy, safety, and fairness before deploying this model. We encourage users to consider the common limitations of AI, comply with applicable laws, and leverage best practices when selecting use cases, particularly for high-risk scenarios where errors or misuse could significantly impact people's lives, rights, or safety. For further guidance on use cases, refer to our AUP and AI AUP.
## License

The license for this project is Apache-2.0.
## Citation
```bibtex
@misc{XGen,
  title={Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length},
  author={Erik Nijkamp and Tian Xie and Hiroaki Hayashi and Bo Pang and Congying Xia and Chen Xing and Jesse Vig and Semih Yavuz and Philippe Laban and Ben Krause and Senthil Purushwalkam and Tong Niu and Wojciech Kryscinski and Lidiya Murakhovs'ka and Prafulla Kumar Choubey and Alex Fabbri and Ye Liu and Rui Meng and Lifu Tu and Meghana Bhat and Chien-Sheng Wu and Silvio Savarese and Yingbo Zhou and Shafiq Rayhan Joty and Caiming Xiong},
  howpublished={ArXiv},
  year={2023},
  url={https://arxiv.org/abs/2309.03450}
}
```