extended-mind-mpt-7b: an open-source model with external memory retrieval

Extended Mind Mpt 7b

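Developed by normalcomputing

An Extended Mind Transformer adapted from the Mosaic ML architecture that can retrieve and attend to an external memory store

Large language model

Transformers

#external-memory-retrieval #knowledge-augmentation-without-fine-tuning #dynamic-memory-updates

Downloads: 111

Released: 10/20/2023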

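Model overview

The model implements the Extended Mind method described in the paper. It retrieves and attends to an external key-value store (the memories) and uses the original model weights without any fine-tuning.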

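Model features

External memory integration

External memories are passed in as a sequence of token ids; the model generates and caches the memory representations automatically.

Dynamic memory updates

Memories can be replaced at any time with the clear_memories() method and the memory_ids attribute.

Citation retrieval

The model can report the indices of the memories it used during generation, which makes outputs easier to interpret.

Flexible configuration

Parameters control the memory type, similarity masking, special-token handling, and more.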

Model capabilities

Text generation

External memory retrieval

Context-aware reasoning

Multi-turn dialogue support

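Use cases

Knowledge question answering

Question answering over external knowledge

Injecting an external knowledge source such as Wikipedia lets the model answer questions that require domain knowledge.

The example shows the model accurately answering specific questions, such as when the mathematician Alexander Grothendieck acquired French citizenship.

Research assistance

Academic literature analysis

Injecting scholarly content such as paper abstracts helps with literature reviews and with linking related knowledge.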

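🚀 Extended-Mind-Mpt-7b Model Card

Extended-Mind-Mpt-7b is part of the Extended Mind Transformers collection and implements the methods described in our paper. The model retrieves and attends to an external cache of key-value pairs (the memories) and has not been fine-tuned: the original model weights are unchanged.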

Github：https://github.com/normal-computing/extended-mind-transformers/
ArXiv：https://arxiv.org/abs/2406.02332

Original architecture and code: Mosaic ML
Developed by: Normal Computing, adapted from Mosaic ML
License: Apache 2.0

🚀 Quick Start

Use the model as described below: pass in external memories, and the model draws on them during text generation.

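✨ Key Features

External memory support: accepts external memories and uses them for text generation.
Memory retrieval and attention: automatically retrieves and attends to the external memories while generating text.
Rich configuration: exposes several parameters for adjusting the model's behavior.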


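💻 Usage Examples

Basic usage

Passing external memories to the model is straightforward: supply the token ids when you instantiate the model, as in the example below. On the first call to model.generate(), the model generates and caches the memories internally. To update the memories, run the following commands in order: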

model.clear_memories()
model.memory_ids = list_of_new_token_ids
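
The two-step update can be wrapped in a small helper. This is a minimal sketch, not part of the library: `refresh_memories` is a hypothetical name, and it assumes only what the text states, namely that the model exposes `clear_memories()` and a writable `memory_ids` attribute, and that the tokenizer returns an object with `input_ids`.

```python
def refresh_memories(model, tokenizer, text):
    """Replace the model's external memories with the tokens of `text`.

    Hypothetical helper: relies only on clear_memories() and memory_ids,
    the two pieces of the update sequence described in the model card.
    """
    new_ids = tokenizer(text).input_ids  # token ids for the new memory text
    model.clear_memories()               # drop the previously cached memories
    model.memory_ids = new_ids           # store the new ids on the model
    return len(new_ids)                  # number of memory tokens now stored
```

Calling `refresh_memories(model_hf, tokenizer_hf, new_text)` before the next `generate()` call would swap in a new document; clearing first follows the order given in the command sequence.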

Set trust_remote_code=True to avoid warnings. Pass the memories to the model as a list of token ids.

from transformers import AutoModelForCausalLM, AutoTokenizer

ag_wiki_entry = """Alexander Grothendieck (/ˈɡroʊtəndiːk/; German pronunciation: [ˌalɛˈksandɐ ˈɡʁoːtn̩ˌdiːk] (listen); French: [ɡʁɔtɛndik]; 28 March 1928 – 13 November 2014) was a stateless (and then, since 1971, French) mathematician who became the leading figure in the creation of modern algebraic geometry.[7][8] His research extended the scope of the field and added elements of commutative algebra, homological algebra, sheaf theory, and category theory to its foundations, while his so-called "relative" perspective led to revolutionary advances in many areas of pure mathematics.[7][9] He is considered by many to be the greatest mathematician of the twentieth century.[10][11]"""

tokenizer_hf = AutoTokenizer.from_pretrained("normalcomputing/extended-mind-mpt-7b")
memories = tokenizer_hf(ag_wiki_entry).input_ids

model_hf = AutoModelForCausalLM.from_pretrained("normalcomputing/extended-mind-mpt-7b", external_memories=memories, trust_remote_code=True)

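After that, generate text as usual; the model uses the memories automatically during generation. Any configuration parameter can be overridden by passing a new value to model.generate() (the example below sets topk):

inputs = "When did Alexander Grothendieck become a French citizen?"
inputs = tokenizer_hf(inputs, return_tensors="pt").input_ids

outputs = model_hf.generate(inputs, max_length=40, topk=2)
# As in the original example, this assumes generate() returns a dict-like output with a 'sequences' entry
tokenizer_hf.decode(outputs['sequences'][0], skip_special_tokens=True)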

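Advanced usage

Citations

Setting output_retrieved_memory_idx=True in the call to model.generate() returns the indices of the memories used during generation. The demo notebook walks through an example in detail.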
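
To turn retrieved indices into readable citations, one option is to decode the memory tokens around each index. The sketch below is an illustration under stated assumptions: it treats the retrieved indices as integer positions into the memory token list, which this page does not confirm (the demo notebook shows the actual output format). `cited_spans` and its `window` parameter are hypothetical names, and `decode` can be any callable that maps a list of token ids to text, such as `tokenizer_hf.decode`.

```python
def cited_spans(memory_ids, retrieved_idx, decode, window=2):
    """Decode a small window of memory tokens around each retrieved index.

    Assumption: `retrieved_idx` holds integer positions into `memory_ids`.
    Returns a dict mapping each valid index to its decoded context.
    """
    spans = {}
    for idx in sorted(set(retrieved_idx)):
        if not 0 <= idx < len(memory_ids):
            continue  # skip indices that fall outside the memory
        lo = max(0, idx - window)
        hi = min(len(memory_ids), idx + window + 1)
        spans[idx] = decode(memory_ids[lo:hi])
    return spans
```

A wider `window` gives more context per citation at the cost of overlapping spans when several retrieved indices sit close together.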

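Additional configuration

The model accepts several other parameters:

Parameter	Details
`memory_type`	string, optional, defaults to `manual`: whether external memories are stored manually or in a vector database.
`mask_by_sim`	bool, optional, defaults to `True`: whether to mask retrieved memories by similarity.
`sim_threshold`	float, optional, defaults to `0.25`: the threshold for masking retrieved memories.
`tokenizer_all_special_ids`	list, optional, defaults to `[0, 50278]`: ids of the special tokens to remove from the memories.
`remove_special_tokens`	bool, optional, defaults to `True`: whether to remove memories that correspond to the tokenizer's special ids.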
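
To see how these parameters could interact, here is a minimal pure-Python sketch of top-k memory selection with similarity masking and special-token removal. It is not the model's implementation: the function names are hypothetical, cosine similarity is an assumed scoring function, and it assumes that with `mask_by_sim` enabled, memories scoring below `sim_threshold` are discarded. The `topk=2` default mirrors the generation example, not a documented default.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def strip_special_ids(memory_ids, special_ids=(0, 50278)):
    """Drop memory tokens whose id is one of the tokenizer's special ids."""
    special = set(special_ids)
    return [t for t in memory_ids if t not in special]

def select_memories(query, keys, topk=2, mask_by_sim=True, sim_threshold=0.25):
    """Return indices of up to `topk` keys most similar to `query`.

    Assumption: when mask_by_sim is True, keys whose similarity is below
    sim_threshold are masked out even if they rank in the top k.
    """
    scored = [(cosine(query, k), i) for i, k in enumerate(keys)]
    scored.sort(key=lambda p: (-p[0], p[1]))  # best score first, ties by index
    top = scored[:topk]
    if mask_by_sim:
        top = [(s, i) for s, i in top if s >= sim_threshold]
    return [i for _, i in top]
```

With masking on, a query can retrieve fewer than `topk` memories when few keys are similar enough; turning `mask_by_sim` off always returns the `topk` best-ranked keys.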