🚀 RWKV-4 | 1B5-Parameter Chat Version (Raven) Model Card
RWKV is a project led by Bo Peng. You can learn more about the model architecture in Johan Wind's blog posts here and here, and dig deeper into the project by joining the RWKV Discord server.

🚀 Quick Start
Model Overview
The description below comes from the original repository:
RWKV is an RNN with Transformer-level large language model performance. It can be trained directly like a GPT (parallelizable). It combines the best of RNNs and Transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" context length, and free sentence embeddings.
✨ Key Features
RWKV combines the strengths of RNNs and Transformers:
- Transformer-level large language model performance.
- Trains directly like a GPT, with parallelizable training.
- Fast inference with low VRAM usage.
- Fast training, "infinite" context length, and free sentence embeddings.
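To give an intuition for the RNN-style inference mentioned above, here is a toy sketch (not the actual RWKV kernel; the update rule and names are simplified assumptions for illustration) of how a fixed-size running state can replace a growing key-value cache, so each new token costs O(1) extra memory:

```python
import math

def recurrent_step(state, k, v, decay=0.9):
    """Fold one (key, value) pair into a fixed-size running state.

    Toy illustration only: keeps an exponentially decayed weighted sum
    (numerator) and weight total (denominator), so the state never grows
    with sequence length, unlike a Transformer KV cache.
    """
    num, den = state
    num = decay * num + math.exp(k) * v
    den = decay * den + math.exp(k)
    return (num, den)

# Process a short "sequence" of (key, value) pairs one step at a time.
state = (0.0, 0.0)
for k, v in [(0.1, 1.0), (0.3, 2.0), (0.2, 3.0)]:
    state = recurrent_step(state, k, v)

# The output is a weighted average over the whole history,
# read out from a constant-size state.
output = state[0] / state[1]
```

The real RWKV time-mixing computation is more elaborate (learned per-channel decays, a bonus term for the current token, numerical stabilization), but the constant-size-state idea is the same.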
📚 Documentation
Model Details
Details of the architecture can be found in the blog posts mentioned above and in the blog post about the Hugging Face integration.
Usage
Converting the original weights to the Hugging Face format
You can use the convert_rwkv_checkpoint_to_hf.py script, specifying the repo ID of the original weights, the checkpoint filename, and the output directory. Optionally, you can push the converted model directly to the Hugging Face Hub by passing the --push_to_hub flag together with a --model_name argument specifying where to push the converted weights.
```shell
python convert_rwkv_checkpoint_to_hf.py --repo_id RAW_HUB_REPO --checkpoint_file RAW_FILE --output_dir OUTPUT_DIR --push_to_hub --model_name dummy_user/converted-rwkv
```
Generating text
You can use the AutoModelForCausalLM and AutoTokenizer classes to generate text with the model. Expand the sections below to see how to run the model in different scenarios:
The "Raven" models need to be prompted in a specific way; see the integration blog post for more details.
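As a rough illustration of that prompting convention, here is a minimal sketch of an instruction template. The `Bob`/`Alice` role names follow the convention used in the upstream RWKV examples, but treat the exact template as an assumption and consult the integration blog post for the authoritative format:

```python
def raven_prompt(instruction: str) -> str:
    """Wrap an instruction in a Raven-style chat template.

    Assumption: Bob is the user role and Alice is the assistant role,
    as in the upstream RWKV chat examples; verify against the
    integration blog post before relying on this exact format.
    """
    return f"Bob: {instruction.strip()}\n\nAlice:"

prompt = raven_prompt("Explain what an RNN is in one sentence.")
```

The model then continues the text after `Alice:`, producing the assistant's reply.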
Running the model on a CPU
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-raven-1b5")
tokenizer = AutoTokenizer.from_pretrained("RWKV/rwkv-raven-1b5")

prompt = "\nIn a shocking finding, scientist discovered a herd of dragons living in a remote, previously unexplored valley, in Tibet. Even more surprising to the researchers was the fact that the dragons spoke perfect Chinese."

inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(inputs["input_ids"], max_new_tokens=40)
print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))
```
Running the model on a single GPU
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-raven-1b5").to(0)
tokenizer = AutoTokenizer.from_pretrained("RWKV/rwkv-raven-1b5")

prompt = "\nIn a shocking finding, scientist discovered a herd of dragons living in a remote, previously unexplored valley, in Tibet. Even more surprising to the researchers was the fact that the dragons spoke perfect Chinese."

inputs = tokenizer(prompt, return_tensors="pt").to(0)
output = model.generate(inputs["input_ids"], max_new_tokens=40)
print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))
```
Running the model on a GPU in half precision
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-raven-1b5", torch_dtype=torch.float16).to(0)
tokenizer = AutoTokenizer.from_pretrained("RWKV/rwkv-raven-1b5")

prompt = "\nIn a shocking finding, scientist discovered a herd of dragons living in a remote, previously unexplored valley, in Tibet. Even more surprising to the researchers was the fact that the dragons spoke perfect Chinese."

inputs = tokenizer(prompt, return_tensors="pt").to(0)
output = model.generate(inputs["input_ids"], max_new_tokens=40)
print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))
```
Running the model on multiple GPUs
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("RWKV/rwkv-raven-1b5", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("RWKV/rwkv-raven-1b5")

prompt = "\nIn a shocking finding, scientist discovered a herd of dragons living in a remote, previously unexplored valley, in Tibet. Even more surprising to the researchers was the fact that the dragons spoke perfect Chinese."

inputs = tokenizer(prompt, return_tensors="pt").to(0)
output = model.generate(inputs["input_ids"], max_new_tokens=40)
print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))
```
📄 License
If you use this model, please consider citing the original work; the original repository is available here.