# 🚀 rwkv7-0.1B-g1
This is an RWKV-7 g1 model based on flash-linear-attention. The g1 series is trained on substantially more data and incorporates deep-thinking (reasoning) capability.
## 🚀 Quick Start

Before using this model, install flash-linear-attention and the latest version of transformers:

```bash
pip install git+https://github.com/fla-org/flash-linear-attention
pip install 'transformers>=4.48.0'
```
## ✨ Key Features

- Multilingual support: English, Chinese, Japanese, Korean, French, Arabic, Spanish, Portuguese, and more.
- Deep thinking: the g1 model series incorporates deep-thinking (reasoning) capability.
## 📦 Installation

Install flash-linear-attention and the latest version of transformers:

```bash
pip install git+https://github.com/fla-org/flash-linear-attention
pip install 'transformers>=4.48.0'
```
## 💻 Usage Examples

### Basic Usage

You can use this model just like any other Hugging Face model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer (trust_remote_code is required for the RWKV-7 code path)
model = AutoModelForCausalLM.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True)
model = model.cuda()

prompt = "What is a large language model?"
messages = [
    {"role": "user", "content": prompt}
]

# Build the chat prompt; enable_thinking=True turns on the deep-thinking mode
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=1.0,
    top_p=0.3,
    repetition_penalty=1.2
)

# Keep only the newly generated tokens, dropping the prompt
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=False)[0]
print(response)
```
## 📚 Documentation

### Model Details
- Developed by: Bo Peng, Yu Zhang, Songlin Yang, Ruichong Zhang
- Funded by: the RWKV Project (under the LF AI & Data Foundation)
- Model type: RWKV7
- Language(s) (NLP): Multilingual
- License: Apache-2.0
- Parameter count: 191M
- Tokenizer: RWKV World tokenizer
- Vocabulary size: 65,536
### Model Sources

- Repository: https://github.com/fla-org/flash-linear-attention ; https://github.com/BlinkDL/RWKV-LM
- Paper: https://arxiv.org/abs/2503.14456
### Training Data

This model was trained on World v3.5, which contains a total of more than 5 trillion tokens.
## 🔧 Technical Details

### FAQ

Q: The safetensors metadata is none.

A: Upgrade transformers to >=4.48.0: `pip install 'transformers>=4.48.0'`
### Thinking Prompt

```
<|rwkv_tokenizer_end_of_text|>User: <your question>
Assistant: <think
```

Do not close the `<think` bracket!
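As a minimal sketch, the thinking prompt can also be driven by hand instead of via the chat template. The model ID, sampling settings, and prompt shape follow this card; the question text and the exact single-newline turn separator are illustrative assumptions:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True).cuda()
tokenizer = AutoTokenizer.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True)

# Raw thinking prompt: token ID 0 first, then a User turn, then an Assistant turn
# that ends with an unclosed "<think" so the model continues its reasoning.
prompt = (
    "<|rwkv_tokenizer_end_of_text|>User: What is a large language model?\n"
    "Assistant: <think"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=1.0,
    top_p=0.3,
    repetition_penalty=1.2,
)

# Decode only the newly generated tokens (the model's reasoning plus its answer).
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=False))
```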
### Additional Prompting Notes

⚠️ Important: Always prepend `<|rwkv_tokenizer_end_of_text|>` (token ID = 0) to your prompt. Because of how the state is initialized, the model cannot properly process the very first token it receives.
An incorrect prompt:

```
Mathews lifted a dark brow. "Are you sure about that? I mean, wouldn't it be better to wait until Dale is home safe and sound?"
"The longer I wait to tell her, the worse it will be for both of us."
"Good luck. You're going to need it," said
```

The model will fail to recall `Mathews`, because it is the first token of the input.
A correct prompt:

```
<|rwkv_tokenizer_end_of_text|>Mathews lifted a dark brow. "Are you sure about that? I mean, wouldn't it be better to wait until Dale is home safe and sound?"
"The longer I wait to tell her, the worse it will be for both of us."
"Good luck. You're going to need it," said
```

The model will output `Mathews` as expected.
- Without this token: lambada_openai ppl=13.84, acc=48.13%
- With this token: lambada_openai ppl=12.36, acc=49.12%
Note: this phenomenon is very rare in Transformers but pronounced in RNNs. We suspect the model uses the first token to anchor the state, so that it can better absorb information from the subsequent tokens.
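A minimal sketch of doing this prepending programmatically; the `prepend_bos` helper name is my own illustration, and the expectation that the special string maps to token ID 0 follows the note above:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True)

def prepend_bos(text: str) -> str:
    # Token ID 0 is <|rwkv_tokenizer_end_of_text|>; placing it first lets the model
    # spend its state-initialization quirk on this token instead of your real first token.
    return "<|rwkv_tokenizer_end_of_text|>" + text

prompt = prepend_bos('Mathews lifted a dark brow. "Are you sure about that?"')
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
print(input_ids[0, :3])  # the first ID should be 0, per the note above
```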
## 📄 License

This model is released under the Apache-2.0 license.