🚀 rwkv7-0.1B-g1
This is an RWKV-7 g1 model built on flash-linear-attention. The g1 series is trained on substantially more data and incorporates deep-thinking (reasoning) capability.
🚀 Quick Start
Before using this model, install flash-linear-attention and the latest version of transformers:

```bash
pip install git+https://github.com/fla-org/flash-linear-attention
pip install 'transformers>=4.48.0'
```
✨ Key Features
- Multilingual support: English, Chinese, Japanese, Korean, French, Arabic, Spanish, Portuguese, and more.
- Deep thinking: the g1 model series incorporates deep-thinking (reasoning) capability.
📦 Installation
Install flash-linear-attention and the latest version of transformers:

```bash
pip install git+https://github.com/fla-org/flash-linear-attention
pip install 'transformers>=4.48.0'
```
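To confirm the installation worked, a quick check like the one below can help (a minimal sketch; note that the flash-linear-attention package is imported as the fla module):

```python
# Quick sanity check (a sketch): flash-linear-attention installs the `fla` module,
# and transformers must be >= 4.48.0 for this model.
import fla  # noqa: F401
import transformers

print("transformers", transformers.__version__)  # should be >= 4.48.0
```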
💻 Usage Examples
Basic Usage
You can use this model just like any other HuggingFace model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained('fla-hub/rwkv7-0.1B-g1', trust_remote_code=True)

model = model.cuda()
prompt = "What is a large language model?"
messages = [
    {"role": "user", "content": prompt}
]
# Build the chat prompt; enable_thinking=True turns on the model's thinking mode.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=1.0,
    top_p=0.3,
    repetition_penalty=1.2
)
# Strip the prompt tokens, keeping only the newly generated continuation.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=False)[0]
print(response)
```
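If you would rather see tokens printed as they are generated, transformers' TextStreamer can be attached to the same call; a minimal sketch reusing model, tokenizer, and model_inputs from above:

```python
from transformers import TextStreamer

# Stream decoded text to stdout as it is generated; skip_prompt hides the prompt,
# and special tokens are kept so the thinking markup remains visible.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=False)

_ = model.generate(
    **model_inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=1.0,
    top_p=0.3,
    repetition_penalty=1.2,
    streamer=streamer,
)
```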
📚 Documentation
Model Details
- Developed by: Bo Peng, Yu Zhang, Songlin Yang, Ruichong Zhang
- Funded by: RWKV Project (under the LF AI & Data Foundation)
- Model type: RWKV7
- Language(s) (NLP): Multilingual
- License: Apache-2.0
- Parameter count: 191M
- Tokenizer: RWKV World tokenizer
- Vocabulary size: 65,536
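These figures can be cross-checked against the loaded checkpoint; a small sketch, assuming model is loaded as in the usage example above and that the config exposes a standard vocab_size field:

```python
# Cross-check the listed figures (a sketch; `model` loaded as in the usage example).
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")     # expected: ~191M
print("vocab size:", model.config.vocab_size)  # expected: 65536 (assumes a standard config field)
```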
Model Sources
- Repository: https://github.com/fla-org/flash-linear-attention ; https://github.com/BlinkDL/RWKV-LM
- Paper: https://arxiv.org/abs/2503.14456
Training Data
This model was trained on World v3.5, totaling more than 5 trillion tokens.
🔧 Technical Details
FAQ
Q: safetensors metadata is none.
A: Upgrade transformers to >= 4.48.0: pip install 'transformers>=4.48.0'
Thinking Prompt

```
<|rwkv_tokenizer_end_of_text|>User: <your question here>
Assistant: <think
```

Do NOT close the bracket of <think!
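Below is a minimal sketch of driving this raw-completion format without the chat template, reusing the model and tokenizer from the usage example above (it assumes the tokenizer maps the literal <|rwkv_tokenizer_end_of_text|> string to its special token; the sampling settings are simply copied from the usage example):

```python
# Build the raw thinking prompt by hand, mirroring the template above.
# Note: "<think" is deliberately left unclosed.
question = "What is a large language model?"
prompt = f"<|rwkv_tokenizer_end_of_text|>User: {question}\nAssistant: <think"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=1.0,
    top_p=0.3,
    repetition_penalty=1.2,
)
# Decode only the newly generated continuation.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=False))
```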
Additional Notes on Prompting
⚠️ Important Note
Always prepend <|rwkv_tokenizer_end_of_text|> (token id = 0) to your prompt. Due to state initialization issues, the model cannot attend to the very first token it receives (a token-level code sketch is given at the end of this section).
Incorrect prompt example:

```
Mathews lifted a dark brow. "Are you sure about that? I mean, wouldn't it be better to wait until Dale is home safe and sound?"
"The longer I wait to tell her, the worse it will be for both of us."
"Good luck. You're going to need it," said
```
The model will not recall Mathews, because it is the first token of the input.
Correct prompt example:

```
<|rwkv_tokenizer_end_of_text|>Mathews lifted a dark brow. "Are you sure about that? I mean, wouldn't it be better to wait until Dale is home safe and sound?"
"The longer I wait to tell her, the worse it will be for both of us."
"Good luck. You're going to need it," said
```
The model will output Mathews as expected.
Without this token: lambada_openai ppl=13.84 acc=48.13%
With this token added: lambada_openai ppl=12.36 acc=49.12%
Note: this phenomenon is very rare for Transformers but pronounced for RNNs. We hypothesize that the model uses the first token to anchor its state, so that it can better absorb information from subsequent tokens.
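For raw-completion use like the LAMBADA example above, the prefix can also be added at the token level instead of as a string; a small sketch, assuming token id 0 is <|rwkv_tokenizer_end_of_text|> as stated above and reusing model and tokenizer from the usage example:

```python
import torch

# Prepend token id 0 (<|rwkv_tokenizer_end_of_text|>) to an encoded raw prompt
# (a sketch; `model` and `tokenizer` as loaded in the usage example above).
raw_prompt = "Mathews lifted a dark brow."  # any raw continuation prompt
ids = tokenizer(raw_prompt, return_tensors="pt").input_ids
ids = torch.cat([torch.zeros(1, 1, dtype=ids.dtype), ids], dim=1).to(model.device)

out = model.generate(ids, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(out[0][ids.shape[1]:]))
```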
📄 License
This model is released under the Apache-2.0 license.