LLaMA-1B-dj-refine-150B: An Open-Source Large Language Model That Outperforms Comparable Models

Llama 1B Dj Refine 150B

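Developed by datajuicer

A large language model based on the OpenLLaMA architecture and pre-trained on RedPajama and Pile data refined with Data-Juicer. It outperforms comparable models at the 1.3B-parameter scale.

Large Language Model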

Transformers

License: Apache-2.0 #Trained on refined RedPajama and Pile #Lightweight 1.3B #Leading HELM benchmark scores

Downloads: 2,834

Release date: 10/30/2023

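Model Overview

This is a reference large language model released by Data-Juicer. It adopts the LLaMA-1.3B architecture, is trained on refined datasets, and is suitable for a variety of natural language processing tasks.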
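As a minimal sketch of how such a checkpoint might be loaded for text generation with the Transformers library: the repository id below is an assumption composed from the developer and model names on this page, and `build_prompt` is a hypothetical helper, not a template prescribed by the model authors.

```python
# Assumed Hugging Face repo id (developer name + model name from this page);
# verify it on the Hub before relying on it.
MODEL_ID = "datajuicer/LLaMA-1B-dj-refine-150B"


def build_prompt(question: str) -> str:
    """Format a plain completion-style prompt (hypothetical template)."""
    return f"Question: {question.strip()}\nAnswer:"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the checkpoint and greedily decode a continuation of `prompt`."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate(build_prompt("What is the capital of France?"))` would download the weights on first use. Because this is a base pre-trained model rather than an instruction-tuned one, a completion-style prompt is likely a better fit than a chat-style one.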

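Model Features

High-quality training data

Trained on RedPajama and Pile datasets refined with Data-Juicer, which are of higher quality than the original datasets

Efficient training

Reaches strong performance with only 150 billion training tokens, fewer than the 300 to 350 billion used for Pythia-1.4B and Falcon-1.3B

Strong performance

Averages 34.21 across 16 HELM benchmark tasks, surpassing comparable models such as Falcon-1.3B and Pythia-1.4B

Model Capabilities

Text generation

Language understanding

Question answering

Text summarization

Use Cases

Research applications

Language model benchmarking

Evaluating and comparing the performance of different language models

Performs well on the HELM benchmark

Commercial applications

Intelligent customer service

Building English-language customer service systems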

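🚀 Reference Large Language Model

This is a reference large language model from Data-Juicer. It performs well across a range of tasks and is intended to support related research and applications.

🚀 Quick Start

Our first data-centric large language model competition has begun! Please visit the official FT-Data Ranker website for details: 1B track, 7B track.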

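✨ Key Features

Model architecture: adopts the LLaMA-1.3B architecture and uses the OpenLLaMA implementation.
Pre-training data: pre-trained on 150 billion tokens of RedPajama and Pile data refined with Data-Juicer.
Performance: achieves an average score of 34.21 across 16 HELM tasks, surpassing Falcon-1.3B (trained on 350 billion tokens from RefinedWeb), Pythia-1.4B (trained on 300 billion tokens from the original Pile), and Open-LLaMA-1.3B (trained on 150 billion tokens from the original RedPajama and Pile).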
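The token budgets quoted above can be put side by side with a little arithmetic. The sketch below only restates the figures from this page (in billions of tokens); the function names are illustrative, and no baseline scores are included because the page does not give them.

```python
# Training-token budgets in billions, as stated in the feature list.
TOKENS_B = {
    "LLaMA-1B-dj-refine-150B": 150,
    "Falcon-1.3B": 350,
    "Pythia-1.4B": 300,
    "Open-LLaMA-1.3B": 150,
}


def budget_ratio(model: str, baseline: str) -> float:
    """Return the model's token budget as a fraction of the baseline's."""
    return TOKENS_B[model] / TOKENS_B[baseline]


def summarize(reference: str = "LLaMA-1B-dj-refine-150B") -> dict:
    """Compare the reference model's budget against every other model."""
    return {
        name: round(budget_ratio(reference, name), 3)
        for name in TOKENS_B
        if name != reference
    }
```

Read this way, the model sees roughly 43% of Falcon-1.3B's tokens and half of Pythia-1.4B's. It matches Open-LLaMA-1.3B's budget exactly, so the comparison with Open-LLaMA-1.3B isolates the effect of data refinement rather than data volume.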

📚 Documentation

For more details, please refer to our paper.

📄 License

This project is released under the Apache-2.0 license.

📦 Training Datasets

Training data (19 datasets refined by Data-Juicer):
datajuicer/redpajama-wiki-refined-by-data-juicer
datajuicer/redpajama-arxiv-refined-by-data-juicer
datajuicer/redpajama-c4-refined-by-data-juicer
datajuicer/redpajama-book-refined-by-data-juicer
datajuicer/redpajama-cc-2019-30-refined-by-data-juicer
datajuicer/redpajama-cc-2020-05-refined-by-data-juicer
datajuicer/redpajama-cc-2021-04-refined-by-data-juicer
datajuicer/redpajama-cc-2022-05-refined-by-data-juicer
datajuicer/redpajama-cc-2023-06-refined-by-data-juicer
datajuicer/redpajama-pile-stackexchange-refined-by-data-juicer
datajuicer/redpajama-stack-code-refined-by-data-juicer
datajuicer/the-pile-nih-refined-by-data-juicer
datajuicer/the-pile-europarl-refined-by-data-juicer
datajuicer/the-pile-philpaper-refined-by-data-juicer
datajuicer/the-pile-pubmed-abstracts-refined-by-data-juicer
datajuicer/the-pile-pubmed-central-refined-by-data-juicer
datajuicer/the-pile-freelaw-refined-by-data-juicer
datajuicer/the-pile-hackernews-refined-by-data-juicer
datajuicer/the-pile-uspto-refined-by-data-juicer
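The training datasets share a regular naming scheme, so the full list can be generated rather than typed out. The sketch below assumes the repository ids follow the pattern `datajuicer/<source>-<subset>-refined-by-data-juicer` with no spaces, and that each repository exposes a `train` split; `peek` is a hypothetical helper built on the `datasets` library's streaming mode.

```python
from itertools import islice

# Subset names taken from the training-data listing on this page.
REDPAJAMA_SUBSETS = [
    "wiki", "arxiv", "c4", "book",
    "cc-2019-30", "cc-2020-05", "cc-2021-04", "cc-2022-05", "cc-2023-06",
    "pile-stackexchange", "stack-code",
]
PILE_SUBSETS = [
    "nih", "europarl", "philpaper", "pubmed-abstracts",
    "pubmed-central", "freelaw", "hackernews", "uspto",
]
SUFFIX = "refined-by-data-juicer"


def dataset_ids() -> list:
    """Build the repository ids for all refined training datasets."""
    ids = [f"datajuicer/redpajama-{s}-{SUFFIX}" for s in REDPAJAMA_SUBSETS]
    ids += [f"datajuicer/the-pile-{s}-{SUFFIX}" for s in PILE_SUBSETS]
    return ids


def peek(repo_id: str, n: int = 3) -> list:
    """Stream the first n records without downloading the whole dataset."""
    # Imported lazily so the id helpers work without `datasets` installed.
    from datasets import load_dataset

    # The split name "train" is an assumption; check the dataset page.
    stream = load_dataset(repo_id, split="train", streaming=True)
    return list(islice(stream, n))
```

For example, `peek(dataset_ids()[0])` would fetch a few records from the refined Wikipedia subset. Streaming avoids pulling the full corpus to disk just to inspect a handful of samples.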
