rwkv7-1.5B-world
This is an RWKV-7 model in flash-linear-attention format, capable of handling multiple languages including English, Chinese, and Japanese.
🚀 Quick Start
Before using this model, you need to install flash-linear-attention and the latest version of transformers:
pip install git+https://github.com/fla-org/flash-linear-attention
pip install 'transformers>=4.48.0'
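If the installation succeeded, both packages should import cleanly. A minimal sanity-check sketch (assuming the flash-linear-attention package is imported as fla, as in its repository):

import fla            # fails here if flash-linear-attention is missing
import transformers

print(transformers.__version__)   # should report 4.48.0 or newer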
✨ Features
- Multilingual Support: Supports multiple languages including English, Chinese, Japanese, Korean, French, Arabic, Spanish, and Portuguese.
- Flash-Linear Attention Format: Utilizes the flash-linear attention format for better performance.
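As a quick way to confirm that the checkpoint resolves to its flash-linear-attention implementation, you can inspect the model type declared in its config. This is only a sanity-check sketch; the exact "rwkv7" value is an assumption based on how the architecture is registered.

from transformers import AutoConfig

config = AutoConfig.from_pretrained('fla-hub/rwkv7-1.5B-world', trust_remote_code=True)
print(config.model_type)   # expected to be "rwkv7" for this checkpoint (assumption)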
📦 Installation
pip install git+https://github.com/fla-org/flash-linear-attention
pip install 'transformers>=4.48.0'
💻 Usage Examples
Basic Usage
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer; trust_remote_code is required for the RWKV World tokenizer.
model = AutoModelForCausalLM.from_pretrained('fla-hub/rwkv7-1.5B-world', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('fla-hub/rwkv7-1.5B-world', trust_remote_code=True)
model = model.cuda()

# Build a chat-formatted prompt.
prompt = "What is a large language model?"
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Sample a response.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096,
    do_sample=True,
    temperature=1.0,
    top_p=0.3,
    repetition_penalty=1.2
)

# Strip the prompt tokens so only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=False)[0]
print(response)
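For interactive use, you may prefer to stream tokens as they are produced instead of waiting for generate to finish. The variant below reuses model, tokenizer, and model_inputs from the example above and relies on transformers' TextStreamer; it is an illustrative sketch, not part of the official example.

from transformers import TextStreamer

# Print tokens to stdout as they are generated, skipping the echoed prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **model_inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.0,
    top_p=0.3,
    streamer=streamer
)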
📚 Documentation
Model Details
Model Description
- Developed by: Bo Peng, Yu Zhang, Songlin Yang, Ruichong Zhang
- Funded by: RWKV Project (under the LF AI & Data Foundation)
- Model type: RWKV7
- Language(s) (NLP): English, Chinese, Japanese, Korean, French, Arabic, Spanish, Portuguese
- License: Apache-2.0
- Parameter count: 1.52B
- Tokenizer: RWKV World tokenizer
- Vocabulary size: 65,536
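The parameter count and vocabulary size above are easy to double-check once the model is loaded (a quick sanity-check sketch, reusing model from the usage example):

num_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {num_params / 1e9:.2f}B")    # ~1.52B
print(f"vocab size: {model.config.vocab_size}")  # 65536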
Training Details
Training Data
This model was trained on the World v3 dataset, a total of 3.119 trillion tokens.
Training Hyperparameters
- Training regime: bfloat16 precision, learning rate decayed from 4e-4 to 1e-5 with a "delayed" cosine schedule, weight decay 0.1, and batch size increased partway through training (a schedule sketch follows this list)
- Final Loss: 1.9965
- Token Count: 3.119 trillion
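The exact shape of the "delayed" cosine decay is not spelled out here. One common reading is a schedule that holds the peak learning rate for an initial fraction of training and then follows a cosine curve down to the minimum; the sketch below only illustrates that reading, with a hypothetical hold fraction, and is not the training code that was actually used.

import math

def delayed_cosine_lr(step, total_steps, lr_max=4e-4, lr_min=1e-5, hold_frac=0.1):
    """Hold lr_max for the first hold_frac of training, then cosine-decay to lr_min."""
    hold_steps = int(total_steps * hold_frac)  # hold_frac is a hypothetical value
    if step < hold_steps:
        return lr_max
    progress = (step - hold_steps) / max(1, total_steps - hold_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))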
Evaluation
Metrics
lambada_openai:
- Before conversion: ppl 4.13, acc 69.4%
- After conversion: ppl 4.26, acc 68.8% (without applying the chat template)
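The numbers above come from the lambada_openai benchmark. For a quick, informal smoke test of the converted checkpoint, you can compute perplexity on a small piece of text with the already-loaded model and tokenizer, assuming the model follows the standard transformers causal-LM interface and returns a loss when labels are passed. This is not the benchmark itself, just a rough sanity check.

import torch

sample = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(sample, return_tensors="pt").to(model.device)

with torch.no_grad():
    # Passing labels makes the model return the average next-token cross-entropy loss.
    outputs = model(**inputs, labels=inputs["input_ids"])

print(f"perplexity on the sample: {torch.exp(outputs.loss).item():.2f}")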
FAQ
⚠️ Important Note
If the safetensors metadata is none, you need to upgrade transformers to >=4.48.0: pip install 'transformers>=4.48.0'
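To check whether this applies to your local copy, you can read the safetensors header directly. The sketch below assumes the checkpoint is stored as a single model.safetensors file; that filename is an assumption, so adjust it if the repository shards the weights.

from huggingface_hub import hf_hub_download
from safetensors import safe_open

# Download (or reuse the cached copy of) the weight file and inspect its header metadata.
path = hf_hub_download('fla-hub/rwkv7-1.5B-world', 'model.safetensors')  # filename is an assumption
with safe_open(path, framework="pt") as f:
    print(f.metadata())  # None here means you should upgrade transformers to >= 4.48.0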
📄 License
This model is licensed under the Apache-2.0 license.