
ARWKV R1 1B5

Developed by RWKV-Red-Team
ARWKV-R1-1B5 is an early preview of a 1.5-billion-parameter RNN-based model, trained through three-stage knowledge distillation from DeepSeek-R1-Distill-Qwen-1.5B, with a context length of 2k.
Downloads: 164
Release Time: 2/7/2025

Model Overview

ARWKV-R1-1B5 is a hybrid-architecture model that pairs RWKV-7 time mixing with a Transformer MLP, showcasing RWKV-7's efficient recurrent mechanism and the advantage of dispensing with self-attention.
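To make the hybrid layout concrete, below is a minimal, illustrative PyTorch sketch of one decoder block: a greatly simplified RWKV-7-style recurrent token mixer (diagonal-decay state update only; the real RWKV-7 kernel has additional terms) standing in for self-attention, followed by a standard Transformer MLP. All class names, dimensions, and the exact update rule are simplifications for illustration, not the model's actual implementation.

```python
import torch
import torch.nn as nn


class SimpleTimeMixing(nn.Module):
    """Greatly simplified RWKV-7-style recurrent token mixer (illustration only).
    It keeps just a diagonal-decay state update to show how tokens can be mixed
    without self-attention."""

    def __init__(self, d_model: int):
        super().__init__()
        self.receptance = nn.Linear(d_model, d_model, bias=False)  # r: how much to read out
        self.key = nn.Linear(d_model, d_model, bias=False)         # k: where to write
        self.value = nn.Linear(d_model, d_model, bias=False)       # v: what to write
        self.decay = nn.Parameter(torch.zeros(d_model))            # per-channel forgetting rate
        self.output = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x: torch.Tensor, state: torch.Tensor):
        # x: (batch, seq, d_model); state: (batch, d_model, d_model) fixed-size memory
        w = torch.sigmoid(self.decay)                               # decay in (0, 1)
        ys = []
        for t in range(x.shape[1]):                                 # one update per token -> O(n)
            xt = x[:, t]
            r, k, v = self.receptance(xt), self.key(xt), self.value(xt)
            # forget a little along each channel, then write the new value/key outer product
            state = state * w.view(1, 1, -1) + v.unsqueeze(-1) * k.unsqueeze(-2)
            ys.append((state @ r.unsqueeze(-1)).squeeze(-1))        # read out with receptance
        return self.output(torch.stack(ys, dim=1)), state


class HybridBlock(nn.Module):
    """One decoder block: RWKV-7-style time mixing in place of self-attention,
    followed by a standard Transformer MLP, each with a residual connection."""

    def __init__(self, d_model: int, d_ffn: int):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.ln2 = nn.LayerNorm(d_model)
        self.time_mixing = SimpleTimeMixing(d_model)
        self.mlp = nn.Sequential(                                   # standard Transformer feed-forward
            nn.Linear(d_model, d_ffn),
            nn.SiLU(),
            nn.Linear(d_ffn, d_model),
        )

    def forward(self, x: torch.Tensor, state: torch.Tensor):
        h, state = self.time_mixing(self.ln1(x), state)
        x = x + h                                                   # residual around the mixer
        x = x + self.mlp(self.ln2(x))                               # residual around the MLP
        return x, state


# quick smoke test with illustrative sizes
block = HybridBlock(d_model=64, d_ffn=256)
x = torch.randn(2, 5, 64)                                           # (batch=2, seq=5, d_model=64)
y, state = block(x, torch.zeros(2, 64, 64))
print(y.shape, state.shape)                                         # (2, 5, 64) and (2, 64, 64)
```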

Model Features

Efficient Recurrent Mechanism
Built on RWKV-7's efficient recurrent mechanism, replacing self-attention entirely and scaling linearly (O(n)) with sequence length.
Constant Memory Usage
Inference memory stays constant regardless of sequence length, making the model suitable for single-GPU training and inference (see the toy sketch after this list).
Hybrid Architecture Design
Combines RWKV-7 time mixing with a Transformer MLP, balancing model quality and inference efficiency.
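The constant-memory property follows directly from the recurrence: the state has a fixed shape, so the footprint does not grow with the number of tokens processed. The toy loop below illustrates this with made-up sizes and stand-in tensors, not the model's real state layout; a Transformer with self-attention would instead grow its KV cache with every token.

```python
import torch

d_model = 8
state = torch.zeros(d_model, d_model)           # fixed-size memory, independent of length
decay = torch.full((d_model,), 0.9)             # illustrative per-channel decay

for t in range(10_000):                         # O(n) total work, O(1) memory
    k = torch.randn(d_model)                    # stand-ins for the per-token key/value
    v = torch.randn(d_model)
    state = state * decay + torch.outer(v, k)   # overwrite in place; no history is kept

print(state.shape)                              # torch.Size([8, 8]) -- unchanged after 10k steps
```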

Model Capabilities

Text Generation
Multilingual Support
Efficient Inference
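A minimal text-generation example with Hugging Face transformers is sketched below. The repository id, the chat-template usage, and the need for trust_remote_code are assumptions; verify them against the official model card before running.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id; check the official model card for the exact name.
repo_id = "RWKV-Red-Team/ARWKV-R1-1B5"

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

messages = [{"role": "user", "content": "In which year did the French Revolution begin?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```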

Use Cases

General Q&A
Trivia Q&A
Acts as a world-class trivia AI, providing accurate and concise answers.
Translation
Multilingual Translation
Supports translation tasks between Chinese and English.
Chemical Equations
Chemical Equation Generation
Generates chemical equations.