rwkv7-1.5B-world
This is an RWKV-7 model in flash-linear-attention format, capable of handling multiple languages including English, Chinese, and Japanese.
🚀 Quick Start
Before using this model, you need to install flash-linear-attention and the latest version of transformers:
pip install git+https://github.com/fla-org/flash-linear-attention
pip install 'transformers>=4.48.0'
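If the installation succeeded, both packages should import cleanly. A minimal sanity-check sketch (assuming the flash-linear-attention package is imported as fla, as in its repository):

import fla            # fails here if flash-linear-attention is missing
import transformers

print(transformers.__version__)   # should report 4.48.0 or newer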
✨ Features
- Multilingual Support: Supports multiple languages including English, Chinese, Japanese, Korean, French, Arabic, Spanish, and Portuguese.
- Flash-Linear Attention Format: Utilizes the flash-linear attention format for better performance.
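As a quick way to confirm that the checkpoint resolves to its flash-linear-attention implementation, you can inspect the model type declared in its config. This is only a sanity-check sketch; the exact "rwkv7" value is an assumption based on how the architecture is registered.

from transformers import AutoConfig

config = AutoConfig.from_pretrained('fla-hub/rwkv7-1.5B-world', trust_remote_code=True)
print(config.model_type)   # expected to be "rwkv7" for this checkpoint (assumption)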
📦 Installation
pip install git+https://github.com/fla-org/flash-linear-attention
pip install 'transformers>=4.48.0'
💻 Usage Examples
Basic Usage
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer; trust_remote_code is required for the RWKV World tokenizer.
model = AutoModelForCausalLM.from_pretrained('fla-hub/rwkv7-1.5B-world', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('fla-hub/rwkv7-1.5B-world', trust_remote_code=True)
model = model.cuda()

# Build a chat-formatted prompt.
prompt = "What is a large language model?"
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Sample a response.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096,
    do_sample=True,
    temperature=1.0,
    top_p=0.3,
    repetition_penalty=1.2
)

# Strip the prompt tokens so only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=False)[0]
print(response)
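For interactive use, you may prefer to stream tokens as they are produced instead of waiting for generate to finish. The variant below reuses model, tokenizer, and model_inputs from the example above and relies on transformers' TextStreamer; it is an illustrative sketch, not part of the official example.

from transformers import TextStreamer

# Print tokens to stdout as they are generated, skipping the echoed prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **model_inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.0,
    top_p=0.3,
    streamer=streamer
)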
📚 Documentation
Model Details
Model Description
- Developed by: Bo Peng, Yu Zhang, Songlin Yang, Ruichong Zhang
- Funded by: RWKV Project (under the LF AI & Data Foundation)
- Model type: RWKV7
- Language(s) (NLP): English, Chinese, Japanese, Korean, French, Arabic, Spanish, Portuguese
- License: Apache-2.0
- Parameter count: 1.52B
- Tokenizer: RWKV World tokenizer
- Vocabulary size: 65,536
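The parameter count and vocabulary size above are easy to double-check once the model is loaded (a quick sanity-check sketch, reusing model from the usage example):

num_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {num_params / 1e9:.2f}B")    # ~1.52B
print(f"vocab size: {model.config.vocab_size}")  # 65536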
Training Details
Training Data
This model was trained on the World v3 dataset, a total of 3.119 trillion tokens.
Training Hyperparameters
- Training regime: bfloat16 precision, learning rate decayed from 4e-4 to 1e-5 with a "delayed" cosine schedule, weight decay 0.1, and batch size increased partway through training (a schedule sketch follows this list)
- Final Loss: 1.9965
- Token Count: 3.119 trillion
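The exact shape of the "delayed" cosine decay is not spelled out here. One common reading is a schedule that holds the peak learning rate for an initial fraction of training and then follows a cosine curve down to the minimum; the sketch below only illustrates that reading, with a hypothetical hold fraction, and is not the training code that was actually used.

import math

def delayed_cosine_lr(step, total_steps, lr_max=4e-4, lr_min=1e-5, hold_frac=0.1):
    """Hold lr_max for the first hold_frac of training, then cosine-decay to lr_min."""
    hold_steps = int(total_steps * hold_frac)  # hold_frac is a hypothetical value
    if step < hold_steps:
        return lr_max
    progress = (step - hold_steps) / max(1, total_steps - hold_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))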
Evaluation
Metrics
lambada_openai:
- Before conversion: ppl 4.13, acc 69.4%
- After conversion: ppl 4.26, acc 68.8% (without applying the chat template)
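The numbers above come from the lambada_openai benchmark. For a quick, informal smoke test of the converted checkpoint, you can compute perplexity on a small piece of text with the already-loaded model and tokenizer, assuming the model follows the standard transformers causal-LM interface and returns a loss when labels are passed. This is not the benchmark itself, just a rough sanity check.

import torch

sample = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(sample, return_tensors="pt").to(model.device)

with torch.no_grad():
    # Passing labels makes the model return the average next-token cross-entropy loss.
    outputs = model(**inputs, labels=inputs["input_ids"])

print(f"perplexity on the sample: {torch.exp(outputs.loss).item():.2f}")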
FAQ
⚠️ Important Note
If the safetensors metadata is none, you need to upgrade transformers to >=4.48.0: pip install 'transformers>=4.48.0'
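To check whether this applies to your local copy, you can read the safetensors header directly. The sketch below assumes the checkpoint is stored as a single model.safetensors file; that filename is an assumption, so adjust it if the repository shards the weights.

from huggingface_hub import hf_hub_download
from safetensors import safe_open

# Download (or reuse the cached copy of) the weight file and inspect its header metadata.
path = hf_hub_download('fla-hub/rwkv7-1.5B-world', 'model.safetensors')  # filename is an assumption
with safe_open(path, framework="pt") as f:
    print(f.metadata())  # None here means you should upgrade transformers to >= 4.48.0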
📄 License
This model is licensed under the Apache-2.0 license.