Goku 8x22B V0.1
A multilingual large language model fine-tuned from Mixtral-8x22B-v0.1, with 141 billion total parameters and 35 billion active parameters.
Downloads: 35
Release date: 4/12/2024
Model Overview
A mixture-of-experts model fine-tuned on the guanaco-sharegpt-style dataset, supporting multilingual text generation tasks.
Model Highlights
Mixture-of-Experts Architecture
Combines 8 expert networks; only a subset of experts is activated per inference step, enabling efficient computation.
Multilingual Support
Natively supports French, Italian, German, Spanish, and English.
Instruction Fine-Tuning
Optimized on the guanaco-sharegpt-style dataset, strengthening dialogue and instruction-following ability.
Model Capabilities
Multilingual text generation
Long-text comprehension
Code generation
Basic reasoning
Story writing
Use Cases
Content Creation
Story generation
Produces coherent long-form narrative text, such as the Dragon Ball themed story shown in the example below.
Technical Applications
Code assistance
Generates and explains programming code.
🚀 Goku-8x22B-v0.1 (Goku 141b - A35b)
Goku-8x22B-v0.1 is a fine-tuned version of v2ray/Mixtral-8x22B-v0.1 on the philschmid/guanaco-sharegpt-style dataset. The model has 141 billion parameters in total, of which only 35 billion are active at any time.
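The gap between total and active parameters comes from Mixtral-style sparse routing: each MoE layer holds 8 expert feed-forward networks but routes every token to only the top 2 of them. The sketch below is a minimal, illustrative top-2 router in plain PyTorch, not the actual Mixtral implementation; all dimensions and names are made up for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    """Illustrative Mixtral-style sparse MoE layer: 8 experts, top-2 routing."""
    def __init__(self, hidden=64, ffn=256, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(hidden, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, ffn), nn.SiLU(), nn.Linear(ffn, hidden))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                           # x: (tokens, hidden)
        logits = self.router(x)                     # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, -1)  # pick 2 experts per token
        weights = F.softmax(weights, dim=-1)        # renormalize their gate weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):   # experts a token is not routed to
            for k in range(self.top_k):             # are never executed for it
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

Because only the selected experts run for each token, a forward pass touches roughly 35 of the 141 billion weights.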

🚀 Quick Start
Use a pipeline as a high-level helper:
from transformers import pipeline
pipe = pipeline("text-generation", model="MaziyarPanahi/Goku-8x22B-v0.1")
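A quick smoke test of the pipeline (the prompt and generation settings here are arbitrary examples, not values from the model card):

output = pipe(
    "Write a short story about Goku.",
    max_new_tokens=128,
    do_sample=True,
    temperature=0.6,
)
print(output[0]["generated_text"])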
Load the model directly:
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("MaziyarPanahi/Goku-8x22B-v0.1")
model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/Goku-8x22B-v0.1")
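Note that a 141B-parameter checkpoint will not fit on a single GPU in full precision. A common loading pattern is half precision plus automatic device placement; this sketch assumes the accelerate package is installed, and the prompt is just an example:

import torch

model = AutoModelForCausalLM.from_pretrained(
    "MaziyarPanahi/Goku-8x22B-v0.1",
    torch_dtype=torch.bfloat16,  # halves memory relative to fp32
    device_map="auto",           # shards the model across available GPUs
)
inputs = tokenizer("Bonjour, comment", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=40)[0]))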
Load via adapter
If you have already downloaded one of the following models, v2ray/Mixtral-8x22B-v0.1 or mistral-community/Mixtral-8x22B-v0.1 (they are identical), you can also load just the adapter with PEFT.
# assuming you have already downloaded the base model
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline

model_id = "v2ray/Mixtral-8x22B-v0.1"
peft_model_id = "~/.cache/huggingface/hub/models--MaziyarPanahi--Goku-8x22B-v0.1/adapter"

tokenizer = AutoTokenizer.from_pretrained(peft_model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# I added 2 new tokens for the ChatML template, so the embedding matrix
# must be resized to match the tokenizer's vocabulary.
# This step is required if you are using PEFT/adapters.
model.resize_token_embeddings(len(tokenizer))
model.load_adapter(peft_model_id)
# You can even use TextStreamer and a text-generation pipeline with your adapter
streamer = TextStreamer(tokenizer)
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=750,
    temperature=0.6,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    repetition_penalty=1.1,
    return_full_text=False,
    add_special_tokens=False,
    streamer=streamer,
)
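Since the adapter's new tokens belong to the ChatML template, prompts should presumably follow the ChatML layout. A minimal sketch, assuming the standard ChatML convention (the exact template is not given in the model card):

prompt = (
    "<|im_start|>user\n"
    "Tell me a short story about Goku.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
result = pipe(prompt)  # tokens are printed as they stream through TextStreamer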
💻 Usage Examples
Basic Usage
Goku-8x22B-v0.1 has been tested on text generation, long-context question answering, coding, and some reasoning tasks. In the next release, I will use more datasets related to math and coding.
Here is an example story generated by MaziyarPanahi/Goku-8x22B-v0.1:
Goku had heard a commotion from his house but when he went to check he saw nothing. He thought to himself, "I'll let it go, it was probably just a bird or something. I'm sure it will be fine." But that was when he heard the commotion again, so he went outside and this time he saw two figures on the horizon. One of the figures was a giant pinkish-purple creature, while the other was small, pink, ball-shaped thing.
As the figures approached, Goku realized the large creature was his former enemy, the powerful Majin Buu. And the smaller creature was Kirby, a powerful Star Warrior from the planet Popstar. Goku couldn't believe his eyes.
The two creatures approached Goku menacingly. "Kirby and I have teamed up," said Majin Buu. "We're going to destroy the world!"
Goku was taken aback by the statement. He had never considered the possibility of these two powerful creatures joining forces. He knew he had to put a stop to them, before they could cause any more damage.
He took a deep breath and faced the two creatures. "You two won't get away with this," Goku said firmly. "I won't let you destroy the world."
Majin Buu scoffed, "You can't stop us! Kirby and I are too powerful!"
Goku quickly formed an energy ball in his hands and faced the two creatures. "We'll see about that," he said.
The battle that ensued was intense. The two creatures worked together, using their powerful energy attacks to try to overcome Goku. But Goku kept fighting, using his own powerful energy attacks to counter their moves.
After what seemed like an eternity, Goku managed to get the upper hand. He used a powerful energy attack to defeat the two creatures. After they were defeated, Goku looked around and saw the damage that had been caused by the battle. He knew he still had a lot of work ahead of him in order to prevent any further destruction, but he was determined to do his best.
He summoned all of his power and focused it into a powerful energy attack. The energy spread throughout his body and he felt his power grow stronger. With a battle cry, he launched the attack at the two creatures.
The energy hit them both, sending them flying back, stunned for a moment. Goku continued to pressure them with his energy attacks, but they soon recovered and began to counter-attack with their own energy blasts.
Goku knew he had to act quickly if he was going to defeat them. He focused his energy into one powerful attack, and launched it at Kirby. The attack hit and the Star Warrior was sent flying away.
Goku then focused his attention on Majin Buu. He launched a series of energy attacks, using his signature technique, the Kamehameha, and managed to defeat the powerful creature.
After the battle, Goku looked around at the destruction that had been caused by the two creatures. He knew he still had a lot of work ahead of him in order to prevent any further destruction, but he was determined to do his best.
With the two creatures defeated, Goku knew he still had a job to do. He took a deep breath and set out to repair the damage that had been caused by the two powerful creatures. He worked for hours, using his energy to put everything back in order and ensuring that the world was safe once again.
Goku's journey was long and hard but, in the end, he was successful. He defeated two powerful enemies and saved the world from destroyers. Thanks to his hard work, the world was able to heal and once again become a place of peace and prosperity.
📄 License
This project is licensed under the Apache-2.0 license.