RWKV-5 World 7B Open-Source Large Language Model: Free Deployment for Efficient Chinese Text Generation

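Developed by SmerkyG

RWKV-5 Eagle 7B is a 7B-parameter large language model built on the RWKV architecture. It supports Chinese text generation.

Large Language Model

Transformers

License: Apache-2.0 #long-text-generation #zero-shot-inference #Chinese-dialogue

Downloads: 19

Released: 3/22/2024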

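Model Overview

This is a language model based on the RWKV architecture that has not been instruction-tuned. It is suited to general text generation and can be loaded through the Hugging Face Transformers library.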

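Model Features

Efficient inference architecture

Uses the RWKV-5 architecture, which runs as a recurrent network at inference time, so the per-token cost does not grow with context length as it does in a conventional Transformer.

Chinese optimization

Tuned specifically for Chinese text generation tasks.

Multi-precision support

Supports FP32 and FP16 precision to suit different hardware environments.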
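
To make the FP32 versus FP16 trade-off concrete, the following is a rough back-of-the-envelope sketch of how much memory the weights alone would need at each precision. The parameter count is an assumption taken from the "7B" in the model name, not an exact figure, and the helper name `weight_memory_gib` is hypothetical. Activations, the recurrent state, and framework overhead are not included.

```python
# Rough weight-memory estimate at different precisions.
# NOMINAL_PARAMS is an assumption (the "7B" in the model name), not an exact count.
NOMINAL_PARAMS = 7_000_000_000

# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Return the approximate size of the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 3)


if __name__ == "__main__":
    for dtype in ("float32", "float16"):
        print(f"{dtype}: ~{weight_memory_gib(NOMINAL_PARAMS, dtype):.1f} GiB")
```

Under this assumption the weights come to roughly 26 GiB in FP32 and 13 GiB in FP16, which is one reason FP16 is the usual choice on a single GPU.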

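Model Capabilities

Chinese text generation

Question answering systems

Knowledge Q&A

Tourist attraction introductions

Animal knowledge Q&A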

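Use Cases

Information lookup

Tourist attraction introductions

Generates a detailed introduction to a given city or attraction.

For example, an introduction to Beijing's attractions that covers the Forbidden City, the Great Wall, and others.

Animal knowledge Q&A

Answers questions about the traits and habits of a given animal.

For example, the giant panda's diet and conservation status.

Dialogue systems

Intelligent assistant

Build a question-answering assistant that gives detailed, expert-style responses.

The sample dialogues show the model keeping the context coherent.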
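
As an illustration of how earlier turns could be carried into a new request, here is a minimal sketch that joins prior turns using the same `User:` / `Assistant:` layout that this page's usage examples employ for a single turn. Extending that layout to several turns is an assumption, and `build_chat_prompt` is a hypothetical helper, not part of the model's or the library's API.

```python
def build_chat_prompt(history, user_message):
    """Build a multi-turn prompt.

    history: list of (user_text, assistant_text) pairs from earlier turns.
    user_message: the new user message the model should answer.
    """
    def clean(text):
        # Same normalisation as the single-turn template: trim the text,
        # unify line endings, and collapse blank lines inside a turn.
        return text.strip().replace("\r\n", "\n").replace("\n\n", "\n")

    parts = []
    for user_text, assistant_text in history:
        parts.append(f"User: {clean(user_text)}")
        parts.append(f"Assistant: {clean(assistant_text)}")
    parts.append(f"User: {clean(user_message)}")
    # Leave the final assistant turn open for the model to complete.
    parts.append("Assistant:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    history = [("hi", "Hi. I am your assistant.")]
    print(build_chat_prompt(history, "介紹一下大熊貓"))
```

Because the model is not instruction-tuned, whether it stays on topic across turns depends on the prompt, so this layout is a starting point to experiment with rather than a guaranteed format.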

🚀 Huggingface RWKV-5 Eagle 7B Model

This project implements the RWKV-5 Eagle 7B model on top of the Hugging Face Transformers library. The model provides text generation, runs on CPU or GPU, and supports batch inference.

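🚀 Quick Start

⚠️ Important

The following is the Hugging Face Transformers implementation of the RWKV-5 Eagle 7B model, and it works only with the Hugging Face Transformers library. To obtain the full model weights for use with other RWKV libraries, see here. This model is not instruction-tuned (an update is planned).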

Running the Model

💻 Usage Examples

Basic Usage

Running on CPU

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def generate_prompt(instruction, input=""):
    instruction = instruction.strip().replace('\r\n','\n').replace('\n\n','\n')
    input = input.strip().replace('\r\n','\n').replace('\n\n','\n')
    if input:
        return f"""Instruction: {instruction}

Input: {input}

Response:"""
    else:
        return f"""User: hi

Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.

User: {instruction}

Assistant:"""


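# Load the weights in full precision (FP32) for CPU inference.
# trust_remote_code=True lets Transformers run the custom model code shipped with the repository.
model = AutoModelForCausalLM.from_pretrained("RWKV/HF_v5-Eagle-7B", trust_remote_code=True).to(torch.float32)
tokenizer = AutoTokenizer.from_pretrained("RWKV/HF_v5-Eagle-7B", trust_remote_code=True)

text = "請介紹北京的旅遊景點"  # "Please introduce Beijing's tourist attractions"
prompt = generate_prompt(text)

inputs = tokenizer(prompt, return_tensors="pt")
# Sample up to 333 new tokens with nucleus sampling (top_p=0.3); top_k=0 disables top-k filtering.
output = model.generate(inputs["input_ids"], max_new_tokens=333, do_sample=True, temperature=1.0, top_p=0.3, top_k=0)
# The decoded text contains the prompt followed by the generated continuation.
print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))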

Output:

User: hi

Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.

User: 請介紹北京的旅遊景點

Assistant: 北京是中國的首都，擁有眾多的旅遊景點，以下是其中一些著名的景點：
1. 故宮：位於北京市中心，是明清兩代的皇宮，內有大量的文物和藝術品。
2. 天安門廣場：是中國最著名的廣場之一，是中國人民政治協商會議的舊址，也是中國人民政治協商會議的中心。
3. 頤和園：是中國古代皇家園林之一，有著悠久的歷史和豐富的文化內涵。
4. 長城：是中國古代的一道長城，全長約萬里，是中國最著名的旅遊景點之一。
5. 北京大學：是中國著名的高等教育機構之一，有著悠久的歷史和豐富的文化內涵。
6. 北京動物園：是中國最大的動物園之一，有著豐富的動物資源和豐富的文化內涵。
7. 故宮博物院：是中國最著名的博物館之一，收藏了大量的文物和藝術品，是中國最重要的文化遺產之一。
8. 天壇：是中國古代皇家

Running on GPU

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def generate_prompt(instruction, input=""):
    instruction = instruction.strip().replace('\r\n','\n').replace('\n\n','\n')
    input = input.strip().replace('\r\n','\n').replace('\n\n','\n')
    if input:
        return f"""Instruction: {instruction}

Input: {input}

Response:"""
    else:
        return f"""User: hi

Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.

User: {instruction}

Assistant:"""


# Load the weights in half precision (FP16) and move the model to GPU 0.
model = AutoModelForCausalLM.from_pretrained("RWKV/HF_v5-Eagle-7B", trust_remote_code=True, torch_dtype=torch.float16).to(0)
tokenizer = AutoTokenizer.from_pretrained("RWKV/HF_v5-Eagle-7B", trust_remote_code=True)

text = "介紹一下大熊貓"  # "Tell me about the giant panda"
prompt = generate_prompt(text)

# The input tensors must be on the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt").to(0)
output = model.generate(inputs["input_ids"], max_new_tokens=128, do_sample=True, temperature=1.0, top_p=0.3, top_k=0)
print(tokenizer.decode(output[0].tolist(), skip_special_tokens=True))

Output:

User: hi

Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.

User: 介紹一下大熊貓

Assistant: 大熊貓是一種中國特有的哺乳動物，也是中國的國寶之一。它們的外貌特徵是圓形的黑白相間的身體，有著黑色的毛髮和白色的耳朵。大熊貓的食物主要是竹子，它們會在竹林中尋找竹子，並且會將竹子放在竹籠中進行儲存。大熊貓的壽命約為20至30年，但由於棲息地的喪失和人類活動的
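
As the output above shows, the decoded text echoes the whole prompt before the model's answer. A minimal sketch of separating the answer from the echoed prompt follows. `extract_reply` is a hypothetical helper, and it assumes the decoded string begins with the exact prompt text, falling back to the last `Assistant:` marker when it does not.

```python
def extract_reply(decoded: str, prompt: str) -> str:
    """Return only the text generated after the prompt."""
    if decoded.startswith(prompt):
        # Common case: the decoded output is the prompt plus the continuation.
        return decoded[len(prompt):].strip()
    # Fallback: keep whatever follows the last "Assistant:" marker.
    marker = "Assistant:"
    idx = decoded.rfind(marker)
    if idx != -1:
        return decoded[idx + len(marker):].strip()
    return decoded.strip()


if __name__ == "__main__":
    prompt = "User: 介紹一下大熊貓\n\nAssistant:"
    decoded = prompt + " 大熊貓是一種中國特有的哺乳動物"
    print(extract_reply(decoded, prompt))
```

An alternative that avoids string matching is to slice off the prompt tokens before decoding, for instance by dropping the first `inputs["input_ids"].shape[1]` tokens of each output sequence.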

Advanced Usage

Batch Inference

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def generate_prompt(instruction, input=""):
    instruction = instruction.strip().replace('\r\n', '\n').replace('\n\n', '\n')
    input = input.strip().replace('\r\n', '\n').replace('\n\n', '\n')
    if input:
        return f"""Instruction: {instruction}

Input: {input}

Response:"""
    else:
        return f"""User: hi

Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.

User: {instruction}

Assistant:"""

model = AutoModelForCausalLM.from_pretrained("RWKV/HF_v5-Eagle-7B", trust_remote_code=True).to(torch.float32)
tokenizer = AutoTokenizer.from_pretrained("RWKV/HF_v5-Eagle-7B", trust_remote_code=True)

# Three prompts: Beijing's attractions, the giant panda, and the city name Ulanqab.
texts = ["請介紹北京的旅遊景點", "介紹一下大熊貓", "烏蘭察布"]
prompts = [generate_prompt(text) for text in texts]

# padding=True pads the prompts to a common length so they can be stacked into one batch.
inputs = tokenizer(prompts, return_tensors="pt", padding=True)
outputs = model.generate(inputs["input_ids"], max_new_tokens=128, do_sample=True, temperature=1.0, top_p=0.3, top_k=0)

# Decode and print each sequence in the batch.
for output in outputs:
    print(tokenizer.decode(output.tolist(), skip_special_tokens=True))

Output:

User: hi

Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.

User: 請介紹北京的旅遊景點

Assistant: 北京是中國的首都，擁有豐富的旅遊資源和歷史文化遺產。以下是一些北京的旅遊景點：
1. 故宮：位於北京市中心，是明清兩代的皇宮，是中國最大的古代宮殿建築群之一。
2. 天安門廣場：位於北京市中心，是中國最著名的城市廣場之一，也是中國最大的城市廣場。
3. 頤和
User: hi

Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.

User: 介紹一下大熊貓

Assistant: 大熊貓是一種生活在中國中部地區的哺乳動物，也是中國的國寶之一。它們的外貌特徵是圓形的黑白相間的身體，有著黑色的毛髮和圓圓的眼睛。大熊貓是一種瀕危物種，目前只有在野外的幾個保護區才能看到它們的身影。大熊貓的食物主要是竹子，它們會在竹子上尋找食物，並且可以通
User: hi

Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.

User: 烏蘭察布

Assistant: 烏蘭察布是中國新疆維吾爾自治區的一個縣級市，位於新疆維吾爾自治區中部，是新疆的第二大城市。烏蘭察布市是新疆的第一大城市，也是新疆的重要城市之一。烏蘭察布市是新疆的經濟中心，也是新疆的重要交通樞紐之一。烏蘭察布市的人口約為2.5萬人，其中漢族佔絕大多數。烏
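
The batch example above sends all three prompts to the model in a single call. For a longer list of prompts, one option is to split the list into fixed-size batches and run generation once per batch. The sketch below shows only the splitting step. `chunk_prompts` and the batch size are hypothetical, and a suitable batch size depends on available memory.

```python
from typing import Iterator, List


def chunk_prompts(prompts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `prompts`, each with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(prompts), batch_size):
        yield prompts[start:start + batch_size]


if __name__ == "__main__":
    texts = ["請介紹北京的旅遊景點", "介紹一下大熊貓", "烏蘭察布"]
    for batch in chunk_prompts(texts, batch_size=2):
        # Each batch would be tokenized with padding=True and passed to model.generate.
        print(batch)
```

Each yielded batch can then go through the same tokenize, generate, and decode steps shown in the batch inference example.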