ARWKV-R1-7B Open-Source AI Model - A pure RNN with an efficient recurrent mechanism for a wide range of tasks

ARWKV R1 7B

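Developed by RWKV-Red-Team

A pure RNN-based model with 7 billion parameters, trained through knowledge distillation, that demonstrates RWKV-7's efficient recurrent mechanism and an architecture without self-attention.

Large Language Model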

Transformers

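Multilingual. Open-source license: Apache-2.0. Tags: #pure-RNN-architecture #efficient-knowledge-distillation #constant-VRAM-usage

Downloads: 113

Release date: 2/7/2025

Model Overview

ARWKV-R1-7B is a hybrid-architecture model that combines RWKV-7 time mixing with a Transformer MLP. It specializes in text generation and features an efficient recurrent mechanism and constant VRAM usage.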

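Model Features

Efficient recurrent mechanism

Adopts RWKV-7's efficient recurrent mechanism with no self-attention, so computation is fully O(n).

Constant VRAM usage

The model maintains constant VRAM usage during inference and is suited to training and inference on a single GPU.

Knowledge distillation training

Trained with three-stage knowledge distillation from DeepSeek-R1-Distill-Qwen-1.5B.

Hybrid architecture

Combines the strengths of RWKV-7 time mixing and the Transformer MLP to improve model performance.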

Model Capabilities

Text generation

Question answering

Knowledge distillation

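Use Cases

Question answering

World-class question-answering AI

Provides accurate, succinct answers and suits a variety of question-answering scenarios.

Scores 67.25 on the MMLU benchmark.

Mathematical reasoning

Solving math problems

Can solve basic math problems and suits educational scenarios.

Scores 56.06 on the GSM8K benchmark.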

🚀 ARWKV🪿

This is a text generation model with RWKV-7 time mixing and a Transformer MLP. It uses RWKV-7's efficient recurrent mechanism, keeps VRAM usage constant, and can be trained on a single GPU.

🚀 Quick Start

Inference on AMD Radeon GPUs

git clone https://github.com/MollySophia/llama.cpp.git -b rwkv-v7
cd llama.cpp
HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -R)" \
    cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release \
    && cmake --build build --config Release -- -j 16
cd ./build/bin

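Convert the safetensors model to GGUF (convert_hf_to_gguf.py is in the llama.cpp repository root, so adjust the path if you are still in build/bin)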

python ./convert_hf_to_gguf.py [model_dir]

Quantize the model

./llama-quantize [model_dir] [Quantization_accuracy]

Run inference from the web UI with llama-server

./llama-server -m [model_dir] -t [use_cpu_thread_number] -ngl 99 --host [host_number] --port [port_number]

For -DAMDGPU_TARGETS in the cmake command above, use gfx1100 for Radeon 7000 series GPUs and gfx1030 for Radeon 6000 series GPUs.
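The target selection in the note above can be captured in a small lookup. The following is only a sketch: the helper name `cmake_configure_args` and the series keys are hypothetical, and only the two GPU series named in the note are covered.

```python
# Hypothetical helper: maps a Radeon series to the AMDGPU_TARGETS value named
# in the note above and assembles the cmake configure arguments.
AMDGPU_TARGETS = {
    "radeon-7000": "gfx1100",
    "radeon-6000": "gfx1030",
}


def cmake_configure_args(series: str) -> list[str]:
    """Return the cmake configure command for the given Radeon series."""
    target = AMDGPU_TARGETS[series]  # KeyError for series the note does not cover
    return [
        "cmake", "-S", ".", "-B", "build",
        "-DGGML_HIP=ON",
        f"-DAMDGPU_TARGETS={target}",
        "-DCMAKE_BUILD_TYPE=Release",
    ]


print(" ".join(cmake_configure_args("radeon-7000")))
```

The HIPCXX and HIP_PATH environment variables from the build command still have to be set separately.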

Inference on Nvidia GPUs

git clone https://github.com/MollySophia/llama.cpp.git -b rwkv-v7
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
cd ./build/bin

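Convert the safetensors model to GGUF (convert_hf_to_gguf.py is in the llama.cpp repository root, so adjust the path if you are still in build/bin)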

python ./convert_hf_to_gguf.py [model_dir]

Quantize the model

./llama-quantize [model_dir] [Quantization_accuracy]

Run inference from the web UI with llama-server

./llama-server -m [model_dir] -t [use_cpu_thread_number] -ngl 99 --host [host_number] --port [port_number]
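Besides the web UI, a running llama-server can also be queried programmatically. The sketch below assumes the server exposes llama.cpp's OpenAI-compatible `/v1/chat/completions` endpoint and that the fork used here keeps it. The function names, the `max_tokens` value, and the host and port in the usage note are illustrative assumptions.

```python
import json
import urllib.request

SYSTEM_PROMPT = "You are a world class trivia AI - provide accurate, succinct responses. "


def build_chat_request(host: str, port: int, question: str) -> urllib.request.Request:
    """Build a POST request for the (assumed) OpenAI-compatible chat endpoint."""
    payload = {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
        "max_tokens": 512,  # illustrative limit, not taken from the model card
    }
    return urllib.request.Request(
        url=f"http://{host}:{port}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(host: str, port: int, question: str) -> str:
    """Send the question to a running server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(host, port, question)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server started as above, `ask("127.0.0.1", 8080, "Which river is the Amazon rainforest named after?")` would return the model's reply, where host and port must match the values passed to `--host` and `--port`.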

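✨ Key Features

Efficient recurrent mechanism: uses RWKV-7 time mixing for efficient recurrent processing.
Low VRAM usage: VRAM usage is constant, so training on a single GPU is possible.
Fast inference: uses no self-attention; computation is fully O(n).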
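To illustrate why a recurrent model can keep memory constant while doing O(n) work, here is a toy decayed linear recurrence in plain Python. It is not RWKV-7's actual update rule: the state layout, the scalar `decay`, and all function names are simplifying assumptions chosen to show that the state has a fixed size however many tokens are processed.

```python
def recurrent_step(state, key, value, decay):
    """Update a d x d state in place: state = decay * state + outer(key, value)."""
    d = len(key)
    for i in range(d):
        for j in range(d):
            state[i][j] = decay * state[i][j] + key[i] * value[j]
    return state


def read_out(state, query):
    """Return query @ state, a length-d vector."""
    d = len(query)
    return [sum(query[i] * state[i][j] for i in range(d)) for j in range(d)]


def run(tokens, d=2, decay=0.9):
    """Process (query, key, value) triples one at a time with a fixed-size state."""
    state = [[0.0] * d for _ in range(d)]
    outputs = []
    for query, key, value in tokens:
        recurrent_step(state, key, value, decay)  # constant work per token
        outputs.append(read_out(state, query))
    return outputs, state
```

Each token costs the same amount of work and the state stays d x d, so total time grows linearly with sequence length. A self-attention layer instead keeps a key/value cache that grows with every token.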

📦 Installation

pip3 install --upgrade transformers rwkv-fla

Before training, run the following command.

export WKV_MODE=chunk
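When training is launched from a Python script or notebook rather than a shell, the same setting can be applied in-process. This sketch assumes the kernel library reads `WKV_MODE` when it is imported, which the model card does not state, so the assignment is placed before any training imports.

```python
import os

# Assumption: WKV_MODE is read at import time, so set it before importing
# the training stack (for example, transformers and the RWKV kernels).
os.environ["WKV_MODE"] = "chunk"

print(os.environ["WKV_MODE"])
```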

💻 Usage Example

Basic usage

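import threading

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

# Load the model; trust_remote_code allows the repository's custom model code to run.
model = AutoModelForCausalLM.from_pretrained(
    "RWKV-Red-Team/ARWKV-R1-7B",
    device_map="auto",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "RWKV-Red-Team/ARWKV-R1-7B"
)

system_prompt = "You are a world class trivia AI - provide accurate, succinct responses. "
# Question taken from the example output shown after this code.
prompt = "The world's largest rainforest, home to approximately three million species of plants and animals, is named after which river?"
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": prompt},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
text = text + "<think>"  # open the reasoning block so the model starts by thinking
print(text)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# skip_prompt=False, so the stream echoes the prompt before the generated text.
streamer = TextIteratorStreamer(tokenizer, skip_prompt=False, skip_special_tokens=False)

generation_kwargs = dict(
    model_inputs,
    streamer=streamer,
    max_new_tokens=8192,
    do_sample=True,
    tokenizer=tokenizer,  # generate() needs the tokenizer when stop_strings is set
    stop_strings=["<｜end▁of▁sentence｜>"],
)
# Generate in a background thread so the main thread can consume the stream.
thread = threading.Thread(target=model.generate, kwargs=generation_kwargs)
thread.start()

print("Streaming output:")
for new_text in streamer:
    print(new_text, end="", flush=True)

thread.join()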

Example output:

<｜begin▁of▁sentence｜>You are a world class trivia AI - provide accurate, succinct responses. <｜User｜>The world's largest rainforest, home to approximately three million species of plants and animals, is named after which river?<｜Assistant｜><think>
Okay, so I'm trying to solve this question about the world's largest rainforest and which river it's named after. Hmm, first, I think rainforest names often have links related to the region it's in. The most famous rainforest in the world is the Amazon. I remember hearing a lot about it being called that because rainforests are connected to specific river systems. 

Now, I'm trying to recall which river is named after the Amazon. I think it's the Amazon River. But I want to be sure. Let me see... the Amazon is a major rainforest located in South America. The Amazon River flows through it, which is why it's named after it. That makes sense because it's a very important river. I recall reading somewhere that all the rainforests are named after rivers related to their regions. So if the Amazon is named after its River, then the name would naturally be related to its source.

I wonder if it's the Amazon itself that's named after it, or another river named after it. But the official name for the Amazon is the Amazon Rainforest. The most significant rainforest in the world is the Amazon, and its name probably started with river-sounding names.
</think>

The largest rainforest located in South America is the Amazon. It is named after the river named after it, which is the Amazon River. Therefore, the Amazon River is the name given to the Amazon Rain Forest.
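Because the output wraps the model's reasoning in `<think>` ... `</think>` before the final answer, downstream code usually wants the two parts separated. The helper below is a sketch with a hypothetical name, `split_reasoning`. It relies only on the tags and the end-of-sentence token visible in the example above.

```python
def split_reasoning(generated: str) -> tuple[str, str]:
    """Split generated text into (reasoning, answer) using the <think> tags."""
    open_tag, close_tag = "<think>", "</think>"
    # Drop everything up to and including the opening tag, if it is present.
    _, found_open, tail = generated.partition(open_tag)
    body = tail if found_open else generated
    reasoning, found_close, answer = body.partition(close_tag)
    if not found_close:
        # Generation stopped before the reasoning block was closed.
        return reasoning.strip(), ""
    answer = answer.replace("<｜end▁of▁sentence｜>", "")
    return reasoning.strip(), answer.strip()
```

For instance, `split_reasoning(full_text)[1]` gives only the final answer, and an empty answer signals that generation hit `max_new_tokens` while still reasoning.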

📚 Documentation

Benchmarks

Benchmark	Qwen2.5-7B-Instruct	ARWKV_7B	ARWKV_R1_7B
MMLU	`71.72`	`62.41`	`67.25` ↗️
GSM8K	`82.34`	`39.95`	`56.06` ↗️
WinoGrande	`71.35`	`68.67`	`51.93` ↘️
IfEval	`73.62`	`52.16`	`60.31` ↗️
Arc-c	`54.86`	`52.22`	`44.11` ↘️
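The arrows in the table compare ARWKV_R1_7B against ARWKV_7B. As a quick worked check, the differences can be computed directly from the table values. The dictionary layout and the function name `deltas` are illustrative.

```python
# Scores copied from the benchmark table: (ARWKV_7B, ARWKV_R1_7B).
SCORES = {
    "MMLU": (62.41, 67.25),
    "GSM8K": (39.95, 56.06),
    "WinoGrande": (68.67, 51.93),
    "IfEval": (52.16, 60.31),
    "Arc-c": (52.22, 44.11),
}


def deltas(scores):
    """Return the ARWKV_R1_7B minus ARWKV_7B difference for each benchmark."""
    return {name: round(r1 - base, 2) for name, (base, r1) in scores.items()}


for name, delta in deltas(SCORES).items():
    print(f"{name:<11}{delta:+.2f}")
```

The R1 variant gains on MMLU (+4.84), GSM8K (+16.11), and IfEval (+8.15), and loses on WinoGrande (-16.74) and Arc-c (-8.11).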

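Key Specifications

Component	Specification	Notes
Architecture	RWKV-7 TimeMix + SwiGLU	Hybrid design
Context window	2048 training CTX	Preview limitation
Training tokens	40M	Focused on knowledge distillation
Precision	FP16 inference recommended (requires 16 GB VRAM)	15% improvement over BF16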
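The 16 GB figure is consistent with a back-of-the-envelope estimate of the weight memory. The sketch below assumes exactly 7 billion parameters stored at 2 bytes each. It ignores the recurrent state, activations, and framework overhead, so it is a lower bound rather than a measured number.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone, in GiB (FP16 = 2 bytes)."""
    return num_params * bytes_per_param / 2**30


print(f"{weight_memory_gib(7e9):.1f} GiB")  # prints "13.0 GiB"
```

Roughly 13 GiB of weights leaves limited headroom on a 16 GB card.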

Architecture Highlights

Core modification flow

Transformer Decoder Layer:
- Multi-head Latent Attention(MLA)
+ RWKV-7 Time Mixing (Eq.3)
- RoPE Positional Encoding
+ State Recurrence
= Hybrid Layer Output
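The flow above swaps the attention block for recurrent time mixing while keeping the Transformer-style MLP, which the specification table lists as SwiGLU. The following plain-Python sketch shows the shape of such a layer. The residual wiring, the omitted normalization, and the `time_mix` callable (a stand-in for RWKV-7 time mixing) are assumptions for illustration. Only the SwiGLU formula itself, down(silu(gate(x)) * up(x)), is standard.

```python
import math


def silu(x):
    """SiLU activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def swiglu_mlp(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: down(silu(gate(x)) * up(x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


def hybrid_layer(x, time_mix, w_gate, w_up, w_down):
    """Residual time mixing followed by a residual SwiGLU MLP (norms omitted)."""
    x = [a + b for a, b in zip(x, time_mix(x))]
    return [a + b for a, b in zip(x, swiglu_mlp(x, w_gate, w_up, w_down))]
```

Passing a stateful callable as `time_mix` is where the state recurrence from the flow above would plug in. Since that state carries the sequence order, this is consistent with the flow's removal of RoPE.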