# 🚀 rwkv7-0.1B-g1
This is an RWKV-7 g1 model based on flash-linear-attention. The g1 series is trained on substantially more data and incorporates deep-thinking (reasoning) capability.
## 🚀 Quick Start

Before using this model, install flash-linear-attention and the latest version of transformers:

```bash
pip install git+https://github.com/fla-org/flash-linear-attention
pip install 'transformers>=4.48.0'
```
## ✨ Key Features

- Multilingual support: English, Chinese, Japanese, Korean, French, Arabic, Spanish, Portuguese, and more.
- Deep thinking: the g1 model series incorporates deep-thinking (reasoning) capability.
## 📦 Installation

Install flash-linear-attention and the latest version of transformers:

```bash
pip install git+https://github.com/fla-org/flash-linear-attention
pip install 'transformers>=4.48.0'
```
## 💻 Usage Examples

### Basic Usage

You can use this model just like any other Hugging Face model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer (trust_remote_code is required for the RWKV-7 code path)
model = AutoModelForCausalLM.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True)
model = model.cuda()

prompt = "What is a large language model?"
messages = [
    {"role": "user", "content": prompt}
]

# Build the chat prompt; enable_thinking=True turns on the deep-thinking mode
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=1.0,
    top_p=0.3,
    repetition_penalty=1.2
)

# Keep only the newly generated tokens, dropping the prompt
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=False)[0]
print(response)
```
## 📚 Documentation

### Model Details
- Developed by: Bo Peng, Yu Zhang, Songlin Yang, Ruichong Zhang
- Funded by: the RWKV Project (under the LF AI & Data Foundation)
- Model type: RWKV7
- Language(s) (NLP): Multilingual
- License: Apache-2.0
- Parameter count: 191M
- Tokenizer: RWKV World tokenizer
- Vocabulary size: 65,536
### Model Sources

- Repository: https://github.com/fla-org/flash-linear-attention ; https://github.com/BlinkDL/RWKV-LM
- Paper: https://arxiv.org/abs/2503.14456
### Training Data

This model was trained on World v3.5, which contains a total of more than 5 trillion tokens.
## 🔧 Technical Details

### FAQ

Q: The safetensors metadata is none.

A: Upgrade transformers to >=4.48.0: `pip install 'transformers>=4.48.0'`
### Thinking Prompt

```
<|rwkv_tokenizer_end_of_text|>User: <your question>
Assistant: <think
```

Do not close the `<think` bracket!
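As a minimal sketch, the thinking prompt can also be driven by hand instead of via the chat template. The model ID, sampling settings, and prompt shape follow this card; the question text and the exact single-newline turn separator are illustrative assumptions:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True).cuda()
tokenizer = AutoTokenizer.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True)

# Raw thinking prompt: token ID 0 first, then a User turn, then an Assistant turn
# that ends with an unclosed "<think" so the model continues its reasoning.
prompt = (
    "<|rwkv_tokenizer_end_of_text|>User: What is a large language model?\n"
    "Assistant: <think"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=1.0,
    top_p=0.3,
    repetition_penalty=1.2,
)

# Decode only the newly generated tokens (the model's reasoning plus its answer).
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=False))
```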
### Additional Prompting Notes

⚠️ Important: Always prepend `<|rwkv_tokenizer_end_of_text|>` (token ID = 0) to your prompt. Because of how the state is initialized, the model cannot properly process the very first token it receives.
An incorrect prompt:

```
Mathews lifted a dark brow. "Are you sure about that? I mean, wouldn't it be better to wait until Dale is home safe and sound?"
"The longer I wait to tell her, the worse it will be for both of us."
"Good luck. You're going to need it," said
```

The model will fail to recall `Mathews`, because it is the first token of the input.
A correct prompt:

```
<|rwkv_tokenizer_end_of_text|>Mathews lifted a dark brow. "Are you sure about that? I mean, wouldn't it be better to wait until Dale is home safe and sound?"
"The longer I wait to tell her, the worse it will be for both of us."
"Good luck. You're going to need it," said
```

The model will output `Mathews` as expected.
- Without this token: lambada_openai ppl=13.84, acc=48.13%
- With this token: lambada_openai ppl=12.36, acc=49.12%
Note: this phenomenon is very rare in Transformers but pronounced in RNNs. We suspect the model uses the first token to anchor the state, so that it can better absorb information from the subsequent tokens.
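A minimal sketch of doing this prepending programmatically; the `prepend_bos` helper name is my own illustration, and the expectation that the special string maps to token ID 0 follows the note above:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True)

def prepend_bos(text: str) -> str:
    # Token ID 0 is <|rwkv_tokenizer_end_of_text|>; placing it first lets the model
    # spend its state-initialization quirk on this token instead of your real first token.
    return "<|rwkv_tokenizer_end_of_text|>" + text

prompt = prepend_bos('Mathews lifted a dark brow. "Are you sure about that?"')
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
print(input_ids[0, :3])  # the first ID should be 0, per the note above
```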
## 📄 License

This model is released under the Apache-2.0 license.