# rwkv7-2.9B-world
This is an RWKV-7 model in the flash-linear-attention format, designed for text generation tasks.
## Quick Start

Before using this model, install flash-linear-attention (version 0.1.2 or earlier) and transformers 4.48.0 or later:

```bash
pip install --no-use-pep517 flash-linear-attention==0.1.2
pip install 'transformers>=4.48.0'
```
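To confirm that compatible versions are installed, a quick check like the following can be used (a minimal sketch; the version bounds simply mirror the pip commands above):

```python
from importlib.metadata import version

# transformers should report >= 4.48.0 and flash-linear-attention 0.1.2,
# matching the pip commands above.
print("flash-linear-attention:", version("flash-linear-attention"))
print("transformers:", version("transformers"))
```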
## Features
- Multilingual Support: Supports multiple languages including English, Chinese, Japanese, Korean, French, Arabic, Spanish, and Portuguese.
- High Metrics: Achieves high accuracy in relevant tasks.
- Based on Strong Base Model: Built upon the BlinkDL/rwkv-7-world base model.
## Usage Examples

### Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code is required to load the custom RWKV-7 model and tokenizer code
model = AutoModelForCausalLM.from_pretrained('fla-hub/rwkv7-2.9B-world', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('fla-hub/rwkv7-2.9B-world', trust_remote_code=True)
model = model.cuda()

prompt = "What is a large language model?"
messages = [
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "I am a GPT-3 based model."},
    {"role": "user", "content": prompt}
]

# Render the chat history into a single prompt string using the model's chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024,
)
# Drop the prompt tokens so only the newly generated tokens are decoded
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=False)[0]
print(response)
```
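For interactive use, the same inputs can be streamed token by token with transformers' `TextStreamer`. This is not part of the original example, just a common pattern that should work with the snippet above:

```python
from transformers import TextStreamer

# Reuses `model`, `tokenizer`, and `model_inputs` from the example above.
# TextStreamer prints decoded tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **model_inputs,
    max_new_tokens=1024,
    streamer=streamer,
)
```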
## Documentation

### Model Details

#### Model Description
- Developed by: Bo Peng, Yu Zhang, Songlin Yang, Ruichong Zhang
- Funded by: RWKV Project (Under LF AI & Data Foundation)
- Model type: RWKV7
- Language(s) (NLP): Multilingual (English, Chinese, Japanese, Korean, French, Arabic, Spanish, Portuguese)
- License: Apache-2.0
- Parameter count: 2.9B
- Tokenizer: RWKV World tokenizer
- Vocabulary size: 65,536
#### Model Sources
### Uses

#### Direct Use

You can use this model just like any other Hugging Face model, as shown in the usage example above.
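For example, the standard text-generation pipeline should also work. This is a minimal sketch rather than an example from the original card; `device=0` assumes a CUDA GPU and can be dropped to run on CPU:

```python
from transformers import pipeline

# Loads the model and tokenizer in one call; trust_remote_code is needed here too.
pipe = pipeline(
    "text-generation",
    model="fla-hub/rwkv7-2.9B-world",
    trust_remote_code=True,
    device=0,  # first CUDA device; omit to run on CPU
)
print(pipe("What is a large language model?", max_new_tokens=128)[0]["generated_text"])
```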
#### Training Data

This model was trained on the World v3 dataset, for a total of 3.119 trillion tokens.
#### Training Hyperparameters

- Training regime: bfloat16; learning rate decayed from 4e-4 to 1e-5 with a "delayed" cosine schedule (sketched below); weight decay 0.1; batch size increased during the middle of training
- Final Loss: 1.8745
- Token Count: 3.119 trillion
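The learning-rate schedule can be pictured with a small sketch like the one below. The "delay" fraction and step counts here are illustrative assumptions, not the values used in training:

```python
import math

def delayed_cosine_lr(step, total_steps, lr_max=4e-4, lr_min=1e-5, delay_frac=0.1):
    """Hypothetical 'delayed' cosine decay: hold lr_max for the first
    delay_frac of training, then cosine-decay down to lr_min."""
    delay_steps = int(total_steps * delay_frac)
    if step < delay_steps:
        return lr_max
    progress = (step - delay_steps) / max(1, total_steps - delay_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))
```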
### FAQ

**Important Note:** If the safetensors metadata is none, upgrade transformers to 4.48.0 or later with `pip install 'transformers>=4.48.0'`.
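To inspect the metadata of a downloaded checkpoint, something like the following can be used. This is a sketch assuming the `safetensors` package and a locally downloaded `model.safetensors` file; the filename is hypothetical and may differ for sharded checkpoints:

```python
from safetensors import safe_open

# Path to a locally downloaded checkpoint file (hypothetical filename).
with safe_open("model.safetensors", framework="pt") as f:
    print(f.metadata())  # if this prints None, follow the note above and upgrade transformers
```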
## License
This model is released under the Apache-2.0 license.
| Property | Details |
|----------|---------|
| Model Type | RWKV7 |
| Training Data | Trained on the World v3 dataset with a total of 3.119 trillion tokens. |