LLaMA-1B-dj-refine-150B: an open-source large language model that outperforms comparable models

Llama 1B Dj Refine 150B

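Developed by datajuicer

A large language model based on the OpenLLaMA architecture and pre-trained on RedPajama and Pile datasets refined with Data-Juicer. It outperforms comparable models at the 1.3B-parameter scale.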

Large language model

Transformers

License: Apache-2.0 #Trained on refined RedPajama and Pile #Lightweight 1.3B model #Leading HELM benchmark results

Downloads: 2,834

Released: 10/30/2023

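Model Overview

This model is a reference large language model released by Data-Juicer. It uses the LLaMA-1.3B architecture, is trained on refined datasets, and is suited to a range of natural language processing tasks.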

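Model Features

High-quality training data

Trained on RedPajama and Pile datasets refined with Data-Juicer, which are of higher quality than the original datasets.

Efficient training

Reaches its reported performance with only 150 billion training tokens, compared with the 300 to 350 billion tokens used for Pythia-1.4B and Falcon-1.3B.

Strong performance

Averages 34.21 across 16 HELM benchmark tasks, ahead of comparable models such as Falcon-1.3B and Pythia-1.4B.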

Model Capabilities

Text generation

Language understanding

Question answering

Text summarization
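
To make the text generation capability concrete, here is a minimal sketch using the Hugging Face Transformers library. It rests on assumptions not stated on this page: the Hub repository id is inferred from the model name and the listed developer, the `generation_settings` helper is hypothetical, and the `AutoTokenizer` / `AutoModelForCausalLM` classes are assumed to resolve this checkpoint correctly. Treat it as a starting point rather than the authors' recommended usage.

```python
"""Sketch: text generation with the model via Hugging Face Transformers."""

# Assumption: repository id inferred from the model name and the
# "datajuicer" developer shown on this page; verify it before use.
MODEL_ID = "datajuicer/LLaMA-1B-dj-refine-150B"


def generation_settings(max_new_tokens: int = 64) -> dict:
    """Return illustrative keyword arguments for `model.generate` (hypothetical helper)."""
    # Greedy decoding keeps the output deterministic for a quick check.
    return {"max_new_tokens": max_new_tokens, "do_sample": False}


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Download the model (network access required) and complete `prompt`."""
    # Imported lazily so this file can be loaded without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize the prompt into PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **generation_settings(max_new_tokens))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Data quality matters for language models because")` would download the weights on first use and return the prompt followed by the model's continuation. Because this is a base model pre-trained mostly on English text, plain completion prompts are likely a better fit than chat-style instructions.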

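Use Cases

Research applications

Language model benchmarking

Used to evaluate and compare the performance of different language models

Performs well on the HELM benchmark

Commercial applications

Intelligent customer service

Used to build English-language customer service systems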

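🚀 Reference Large Language Model Project

This is a reference large language model from Data-Juicer. It performs well across multiple tasks and supports research and applications in related areas.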

🚀 Quick Start

Our first data-centric large language model competition has begun. See the FT-Data Ranker official website for details: 1B track, 7B track.

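✨ Key Features

Model architecture: LLaMA-1.3B, using the OpenLLaMA implementation.
Pre-training data: 150 billion tokens from RedPajama and the Pile, refined with Data-Juicer.
Performance: an average score of 34.21 over 16 HELM tasks, exceeding Falcon-1.3B (trained on 350 billion tokens from RefinedWeb), Pythia-1.4B (trained on 300 billion tokens from the original Pile), and Open-LLaMA-1.3B (trained on 150 billion tokens from the original RedPajama and Pile).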
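
The token budgets quoted above can be compared with a little arithmetic. The sketch below uses only the figures given in this section; the dictionary and function names are illustrative, and no HELM scores for the other models are included because this page does not report them.

```python
# Pre-training token budgets quoted in this section, in billions of tokens.
TOKEN_BUDGET_B = {
    "LLaMA-1B-dj-refine-150B": 150,
    "Falcon-1.3B": 350,
    "Pythia-1.4B": 300,
    "Open-LLaMA-1.3B": 150,
}

REFERENCE = "LLaMA-1B-dj-refine-150B"


def budget_ratio(model: str, reference: str = REFERENCE) -> float:
    """Return how many times more tokens `model` was trained on than `reference`."""
    return TOKEN_BUDGET_B[model] / TOKEN_BUDGET_B[reference]


for name, budget in TOKEN_BUDGET_B.items():
    print(f"{name}: {budget}B tokens, {budget_ratio(name):.2f}x the reference budget")
```

By this measure Pythia-1.4B used twice the token budget and Falcon-1.3B about 2.33 times, while Open-LLaMA-1.3B used the same budget on unrefined data, which makes it the closest like-for-like comparison.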

📚 Documentation

For more details, please refer to our paper.

[Figure: exp_llama]

📄 License

This project is released under the Apache-2.0 license.

📦 Training Datasets

Training data:
datajuicer/redpajama-wiki-refined-by-data-juicer
datajuicer/redpajama-arxiv-refined-by-data-juicer
datajuicer/redpajama-c4-refined-by-data-juicer
datajuicer/redpajama-book-refined-by-data-juicer
datajuicer/redpajama-cc-2019-30-refined-by-data-juicer
datajuicer/redpajama-cc-2020-05-refined-by-data-juicer
datajuicer/redpajama-cc-2021-04-refined-by-data-juicer
datajuicer/redpajama-cc-2022-05-refined-by-data-juicer
datajuicer/redpajama-cc-2023-06-refined-by-data-juicer
datajuicer/redpajama-pile-stackexchange-refined-by-data-juicer
datajuicer/redpajama-stack-code-refined-by-data-juicer
datajuicer/the-pile-nih-refined-by-data-juicer
datajuicer/the-pile-europarl-refined-by-data-juicer
datajuicer/the-pile-philpaper-refined-by-data-juicer
datajuicer/the-pile-pubmed-abstracts-refined-by-data-juicer
datajuicer/the-pile-pubmed-central-refined-by-data-juicer
datajuicer/the-pile-freelaw-refined-by-data-juicer
datajuicer/the-pile-hackernews-refined-by-data-juicer
datajuicer/the-pile-uspto-refined-by-data-juicer
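
The dataset identifiers above follow a regular naming pattern, so they can be generated rather than typed out. The following sketch rebuilds the full list from the subset names given in this section; the variable and function names are illustrative, and the assumption that every identifier follows the `datajuicer/<family>-<subset>-refined-by-data-juicer` pattern comes only from the list itself.

```python
# Common suffix shared by every dataset identifier in the list.
SUFFIX = "-refined-by-data-juicer"

# Subset names taken from the list, grouped by the source corpus prefix.
SUBSETS = {
    "redpajama": [
        "wiki", "arxiv", "c4", "book",
        "cc-2019-30", "cc-2020-05", "cc-2021-04", "cc-2022-05", "cc-2023-06",
        "pile-stackexchange", "stack-code",
    ],
    "the-pile": [
        "nih", "europarl", "philpaper",
        "pubmed-abstracts", "pubmed-central",
        "freelaw", "hackernews", "uspto",
    ],
}


def dataset_ids() -> list[str]:
    """Return every training dataset identifier in the order listed."""
    return [
        f"datajuicer/{family}-{subset}{SUFFIX}"
        for family, subsets in SUBSETS.items()
        for subset in subsets
    ]


ids = dataset_ids()
print(f"{len(ids)} datasets")
for dataset_id in ids:
    print(dataset_id)
```

Each identifier could then be passed to a dataset loader of your choice. Whether a given repository loads directly depends on how its files are laid out, which this page does not describe.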
