LLaMA-1B-dj-refine-150B open-source large language model: outperforms comparable models and is free to use


Llama 1B Dj Refine 150B

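Developed by datajuicer

A large language model based on the OpenLLaMA architecture and pre-trained on RedPajama and Pile datasets refined with Data-Juicer. It outperforms other models at the same 1.3B-parameter scale.

Large language model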

Transformers

Open-source license: Apache-2.0 #Trained on refined RedPajama and Pile #Lightweight and efficient at 1.3B parameters #Leads on HELM benchmarks

Downloads: 2,834

Release date: 10/30/2023

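Model overview

This is a reference large language model released by Data-Juicer. It uses the LLaMA-1.3B architecture, is trained on refined datasets, and is suited to a range of natural language processing tasks.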
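To make the 1.3B-parameter size concrete, the sketch below estimates how much memory the weights alone would occupy at common precisions. It is a rough calculation only: the parameter count is the nominal figure from the model name rather than an exact count, the helper name `weight_memory_gb` is made up for this example, and activations, KV cache, and framework overhead are not included.

```python
# Rough weight-memory estimate for a 1.3B-parameter model.
# Parameters only: activations, KV cache and framework overhead are excluded.
NUM_PARAMS = 1_300_000_000  # nominal size from the model name, not an exact count

# Storage size of one parameter at each precision, in bytes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Return the weight footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
```

Under these assumptions the half-precision weights come to roughly 2.6 GB, which is why a model of this size is usually considered practical on a single consumer GPU.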

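Model features

High-quality training data

Uses RedPajama and Pile datasets refined with Data-Juicer, giving higher data quality than the original datasets.

Efficient training

Reaches strong performance after training on only 150B tokens, making it more training-efficient than comparable models.

Strong performance

Scores an average of 34.21 across 16 HELM benchmark tasks, outperforming comparable models such as Falcon-1.3B and Pythia-1.4B.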

Model capabilities

Text generation

Language understanding

Knowledge question answering

Text summarization

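Use cases

Research applications

Language model benchmarking

Used to evaluate and compare the performance of different language models.

Showed strong results on the HELM benchmarks.

Commercial applications

Intelligent customer service

Used to build English-language intelligent customer service systems.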
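Because this is a pre-trained base model rather than an instruction-tuned one, a customer-service prototype would probably have to rely on few-shot prompting. The sketch below shows one possible way to assemble such a prompt. The example question-answer pairs, the `Customer:`/`Agent:` labels, and the function name `build_support_prompt` are all invented for illustration and do not come from the model's documentation.

```python
# Hypothetical support examples, invented purely for illustration.
FEW_SHOT_EXAMPLES = [
    ("How do I reset my password?",
     "Open Settings, choose Account, then select Reset password."),
    ("Can I change my delivery address?",
     "Yes, you can edit the address until the order ships."),
]


def build_support_prompt(question: str, examples=FEW_SHOT_EXAMPLES) -> str:
    """Format few-shot examples plus the new question as a completion prompt."""
    lines = ["The following are customer questions and helpful support answers.", ""]
    for example_question, example_answer in examples:
        lines += [f"Customer: {example_question}", f"Agent: {example_answer}", ""]
    # End on an open "Agent:" turn so the model continues with an answer.
    lines += [f"Customer: {question}", "Agent:"]
    return "\n".join(lines)


print(build_support_prompt("Where can I download my invoice?"))
```

The resulting string would be passed to the model as an ordinary text-completion prompt. Output quality for this use is not something the source reports, so it would need to be evaluated separately.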

🚀 Data-Juicer's Reference LLM

This project is a reference large language model (LLM) built with Data-Juicer. It is trained on data refined by Data-Juicer and performs strongly for its size.

🚀 Quick Start

Overview

This model is a reference LLM from Data-Juicer.

The model architecture is LLaMA-1.3B, using the OpenLLaMA implementation. The model is pre-trained on 150B tokens of RedPajama and Pile data refined with Data-Juicer.
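Since the page lists the model under Transformers, loading it would presumably follow the standard Hugging Face pattern. The following is a minimal sketch, not an official usage snippet: the repository ID `datajuicer/LLaMA-1B-dj-refine-150B` is assumed from the page title and developer name, the function name `generate_text` is made up for this example, and both `transformers` and `torch` are assumed to be installed.

```python
# Assumed Hub repository ID, inferred from the page title and developer name.
MODEL_ID = "datajuicer/LLaMA-1B-dj-refine-150B"


def generate_text(prompt: str, max_new_tokens: int = 64) -> str:
    """Download the model (on first call) and return a text continuation of the prompt."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
    )
    # Decode the full sequence: the prompt followed by the generated continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(generate_text("Data quality matters for language models because"))
```

The function reloads the model on every call to keep the sketch short. A real application would load the tokenizer and model once and reuse them.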

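It achieves an average score of 34.21 across 16 HELM tasks, beating Falcon-1.3B (trained on 350B tokens from RefinedWeb), Pythia-1.4B (trained on 300B tokens from the original Pile), and Open-LLaMA-1.3B (trained on 150B tokens from the original RedPajama and Pile).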
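The token budgets quoted in this comparison can be put side by side to show the efficiency claim. The sketch below uses only the training-token figures stated above. The dictionary and the helper `budget_ratio` are illustrative names, and no HELM scores for the baseline models are included because the text does not give them.

```python
# Pre-training token budgets quoted in the comparison, in billions of tokens.
TRAINING_TOKENS_B = {
    "LLaMA-1B-dj-refine-150B": 150,
    "Open-LLaMA-1.3B": 150,
    "Pythia-1.4B": 300,
    "Falcon-1.3B": 350,
}


def budget_ratio(baseline: str, model: str = "LLaMA-1B-dj-refine-150B") -> float:
    """Return how many times more tokens the baseline saw than the given model."""
    return TRAINING_TOKENS_B[baseline] / TRAINING_TOKENS_B[model]


for name, tokens in TRAINING_TOKENS_B.items():
    print(f"{name}: {tokens}B tokens ({budget_ratio(name):.2f}x)")
```

By these figures, Pythia-1.4B saw twice as many tokens and Falcon-1.3B about 2.33 times as many. Open-LLaMA-1.3B used the same 150B budget, so that comparison isolates the effect of the data refinement.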

For more details, see the paper.


📦 Datasets

Datasets:

datajuicer/redpajama-wiki-refined-by-data-juicer
datajuicer/redpajama-arxiv-refined-by-data-juicer
datajuicer/redpajama-c4-refined-by-data-juicer
datajuicer/redpajama-book-refined-by-data-juicer
datajuicer/redpajama-cc-2019-30-refined-by-data-juicer
datajuicer/redpajama-cc-2020-05-refined-by-data-juicer
datajuicer/redpajama-cc-2021-04-refined-by-data-juicer
datajuicer/redpajama-cc-2022-05-refined-by-data-juicer
datajuicer/redpajama-cc-2023-06-refined-by-data-juicer
datajuicer/redpajama-pile-stackexchange-refined-by-data-juicer
datajuicer/redpajama-stack-code-refined-by-data-juicer
datajuicer/the-pile-nih-refined-by-data-juicer
datajuicer/the-pile-europarl-refined-by-data-juicer
datajuicer/the-pile-philpaper-refined-by-data-juicer
datajuicer/the-pile-pubmed-abstracts-refined-by-data-juicer
datajuicer/the-pile-pubmed-central-refined-by-data-juicer
datajuicer/the-pile-freelaw-refined-by-data-juicer
datajuicer/the-pile-hackernews-refined-by-data-juicer
datajuicer/the-pile-uspto-refined-by-data-juicer
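The dataset identifiers listed here follow a regular naming pattern, so they can be grouped by source programmatically. The sketch below is one way to do that. The grouping rule (names starting with `the-pile-` versus everything else) is an assumption based only on the identifiers themselves, and `group_by_source` is an illustrative helper name.

```python
from collections import defaultdict

# Dataset repository IDs exactly as listed on this page.
DATASET_IDS = [
    "datajuicer/redpajama-wiki-refined-by-data-juicer",
    "datajuicer/redpajama-arxiv-refined-by-data-juicer",
    "datajuicer/redpajama-c4-refined-by-data-juicer",
    "datajuicer/redpajama-book-refined-by-data-juicer",
    "datajuicer/redpajama-cc-2019-30-refined-by-data-juicer",
    "datajuicer/redpajama-cc-2020-05-refined-by-data-juicer",
    "datajuicer/redpajama-cc-2021-04-refined-by-data-juicer",
    "datajuicer/redpajama-cc-2022-05-refined-by-data-juicer",
    "datajuicer/redpajama-cc-2023-06-refined-by-data-juicer",
    "datajuicer/redpajama-pile-stackexchange-refined-by-data-juicer",
    "datajuicer/redpajama-stack-code-refined-by-data-juicer",
    "datajuicer/the-pile-nih-refined-by-data-juicer",
    "datajuicer/the-pile-europarl-refined-by-data-juicer",
    "datajuicer/the-pile-philpaper-refined-by-data-juicer",
    "datajuicer/the-pile-pubmed-abstracts-refined-by-data-juicer",
    "datajuicer/the-pile-pubmed-central-refined-by-data-juicer",
    "datajuicer/the-pile-freelaw-refined-by-data-juicer",
    "datajuicer/the-pile-hackernews-refined-by-data-juicer",
    "datajuicer/the-pile-uspto-refined-by-data-juicer",
]


def group_by_source(dataset_ids):
    """Group repository IDs by name prefix (assumed rule: 'the-pile-' vs. the rest)."""
    groups = defaultdict(list)
    for repo_id in dataset_ids:
        name = repo_id.split("/", 1)[1]  # drop the "datajuicer/" namespace
        source = "the-pile" if name.startswith("the-pile-") else "redpajama"
        groups[source].append(repo_id)
    return dict(groups)


groups = group_by_source(DATASET_IDS)
for source, ids in groups.items():
    print(f"{source}: {len(ids)} subsets")
```

An individual subset could then presumably be fetched with `datasets.load_dataset(repo_id)` from the Hugging Face `datasets` library, though the available splits and configurations of each repository are not described here.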
