🚀 rwkv7-0.1B-g1
This is an RWKV-7 g1 model built on flash-linear-attention. The g1 series is trained on substantially more data and incorporates deep-thinking (reasoning) capability.
🚀 Quick Start
Before using this model, install flash-linear-attention and the latest version of transformers:

```bash
pip install git+https://github.com/fla-org/flash-linear-attention
pip install 'transformers>=4.48.0'
```
✨ Key Features
- Multilingual support: English, Chinese, Japanese, Korean, French, Arabic, Spanish, Portuguese, and more.
- Deep thinking: the g1 model series incorporates deep-thinking (reasoning) capability.
📦 Installation
Install flash-linear-attention and the latest version of transformers:

```bash
pip install git+https://github.com/fla-org/flash-linear-attention
pip install 'transformers>=4.48.0'
```
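To confirm the installation worked, a quick check like the one below can help (a minimal sketch; note that the flash-linear-attention package is imported as the fla module):

```python
# Quick sanity check (a sketch): flash-linear-attention installs the `fla` module,
# and transformers must be >= 4.48.0 for this model.
import fla  # noqa: F401
import transformers

print("transformers", transformers.__version__)  # should be >= 4.48.0
```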
💻 Usage Examples
Basic Usage
You can use this model just like any other HuggingFace model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True)

model = model.cuda()
prompt = "What is a large language model?"
messages = [
    {"role": "user", "content": prompt}
]
# Build the chat prompt; enable_thinking=True turns on the model's thinking mode.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=1.0,
    top_p=0.3,
    repetition_penalty=1.2
)
# Strip the prompt tokens, keeping only the newly generated continuation.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=False)[0]
print(response)
```
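If you would rather see tokens printed as they are generated, transformers' TextStreamer can be attached to the same call; a minimal sketch reusing model, tokenizer, and model_inputs from above:

```python
from transformers import TextStreamer

# Stream decoded text to stdout as it is generated; skip_prompt hides the prompt,
# and special tokens are kept so the thinking markup remains visible.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=False)

_ = model.generate(
    **model_inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=1.0,
    top_p=0.3,
    repetition_penalty=1.2,
    streamer=streamer,
)
```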
📚 Documentation
Model Details
- Developed by: Bo Peng, Yu Zhang, Songlin Yang, Ruichong Zhang
- Funded by: RWKV Project (under the LF AI & Data Foundation)
- Model type: RWKV7
- Language(s) (NLP): Multilingual
- License: Apache-2.0
- Parameter count: 191M
- Tokenizer: RWKV World tokenizer
- Vocabulary size: 65,536
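These figures can be cross-checked against the loaded checkpoint; a small sketch, assuming model is loaded as in the usage example above and that the config exposes a standard vocab_size field:

```python
# Cross-check the listed figures (a sketch; `model` loaded as in the usage example).
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")     # expected: ~191M
print("vocab size:", model.config.vocab_size)  # expected: 65536 (assumes a standard config field)
```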
Model Sources
- Repository: https://github.com/fla-org/flash-linear-attention ; https://github.com/BlinkDL/RWKV-LM
- Paper: https://arxiv.org/abs/2503.14456
Training Data
This model was trained on World v3.5, totaling more than 5 trillion tokens.
🔧 Technical Details
FAQ
Q: safetensors metadata is none.
A: Upgrade transformers to >= 4.48.0: pip install 'transformers>=4.48.0'
Thinking Prompt

```
<|rwkv_tokenizer_end_of_text|>User: <your question here>
Assistant: <think
```

Do NOT close the bracket of <think!
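Below is a minimal sketch of driving this raw-completion format without the chat template, reusing the model and tokenizer from the usage example above (it assumes the tokenizer maps the literal <|rwkv_tokenizer_end_of_text|> string to its special token; the sampling settings are simply copied from the usage example):

```python
# Build the raw thinking prompt by hand, mirroring the template above.
# Note: "<think" is deliberately left unclosed.
question = "What is a large language model?"
prompt = f"<|rwkv_tokenizer_end_of_text|>User: {question}\nAssistant: <think"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=1.0,
    top_p=0.3,
    repetition_penalty=1.2,
)
# Decode only the newly generated continuation.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=False))
```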
Additional Notes on Prompting
⚠️ Important Note
Always prepend <|rwkv_tokenizer_end_of_text|> (token id = 0) to your prompt. Due to state initialization issues, the model cannot attend to the very first token it receives (a token-level code sketch is given at the end of this section).
Incorrect prompt example:

```
Mathews lifted a dark brow. "Are you sure about that? I mean, wouldn't it be better to wait until Dale is home safe and sound?"
"The longer I wait to tell her, the worse it will be for both of us."
"Good luck. You're going to need it," said
```
The model will not recall Mathews, because it is the first token of the input.
Correct prompt example:

```
<|rwkv_tokenizer_end_of_text|>Mathews lifted a dark brow. "Are you sure about that? I mean, wouldn't it be better to wait until Dale is home safe and sound?"
"The longer I wait to tell her, the worse it will be for both of us."
"Good luck. You're going to need it," said
```
The model will output Mathews as expected.
Without this token: lambada_openai ppl=13.84 acc=48.13%
With this token added: lambada_openai ppl=12.36 acc=49.12%
Note: this phenomenon is very rare for Transformers but pronounced for RNNs. We hypothesize that the model uses the first token to anchor its state, so that it can better absorb information from subsequent tokens.
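For raw-completion use like the LAMBADA example above, the prefix can also be added at the token level instead of as a string; a small sketch, assuming token id 0 is <|rwkv_tokenizer_end_of_text|> as stated above and reusing model and tokenizer from the usage example:

```python
import torch

# Prepend token id 0 (<|rwkv_tokenizer_end_of_text|>) to an encoded raw prompt
# (a sketch; `model` and `tokenizer` as loaded in the usage example above).
raw_prompt = "Mathews lifted a dark brow."  # any raw continuation prompt
ids = tokenizer(raw_prompt, return_tensors="pt").input_ids
ids = torch.cat([torch.zeros(1, 1, dtype=ids.dtype), ids], dim=1).to(model.device)

out = model.generate(ids, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(out[0][ids.shape[1]:]))
```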
📄 License
This model is released under the Apache-2.0 license.