LLaMA-1B-dj-refine-150B
Based on the OpenLLaMA architecture, this large language model is pre-trained on Data-Juicer-refined RedPajama and Pile datasets and outperforms other models at the same 1.3B-parameter scale.
Downloads: 2,834
Release time: 10/30/2023
Model Overview
This model is a reference language model released by the Data-Juicer team. It adopts the OpenLLaMA architecture at the 1.3B-parameter scale and is trained on Data-Juicer-refined datasets, making it suitable for a variety of natural language processing tasks. A minimal usage sketch follows below.
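The following is a minimal loading-and-generation sketch using the Hugging Face transformers library. The repository id `datajuicer/LLaMA-1B-dj-refine-150B` is an assumption and should be verified against the model's actual hub page; the prompt and generation length are purely illustrative.

```python
# Minimal sketch: load and query the model with Hugging Face transformers.
# The repo id below is an assumption; verify it on the model hub page.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "datajuicer/LLaMA-1B-dj-refine-150B"  # assumed hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation from an illustrative prompt.
inputs = tokenizer("Data quality matters because", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```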
Model Features
High-quality training data
Pre-trained on RedPajama and Pile data refined with Data-Juicer, giving higher data quality than the original datasets.
Efficient training
Reaches strong performance after training on only 150 billion tokens, showing higher training efficiency than comparable models.
Superior performance
Achieves an average score of 34.21 across 16 HELM benchmark tasks, surpassing comparable models such as Falcon-1.3B and Pythia-1.4B.
Model Capabilities
Text generation (see the example after this list)
Language understanding
Knowledge Q&A
Text summarization
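To illustrate the text-generation capability listed above, here is a short sketch using the transformers text-generation pipeline. The repo id remains an assumption, and the sampling settings are illustrative, not recommended defaults.

```python
# Minimal text-generation sketch via the transformers pipeline.
# The repo id is an assumption; swap in the model's actual hub id.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="datajuicer/LLaMA-1B-dj-refine-150B",  # assumed hub id
)

result = generator(
    "Data-Juicer refines training data by",
    max_new_tokens=50,   # illustrative generation length
    do_sample=True,      # illustrative sampling setting
)
print(result[0]["generated_text"])
```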
Use Cases
Research applications
Language model benchmarking
Used to evaluate and compare the performance of different language models
Performs strongly on HELM benchmark tasks (an informal evaluation sketch follows below)
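HELM runs its own evaluation harness, which is not reproduced here. Purely as an informal stand-in for quick model comparison, the sketch below computes perplexity on a sample sentence with transformers; the repo id is an assumption and the sample text is arbitrary.

```python
# Rough perplexity sketch for informal comparison between models.
# This is NOT the HELM harness; it only illustrates one common metric.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "datajuicer/LLaMA-1B-dj-refine-150B"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

text = "The quick brown fox jumps over the lazy dog."  # arbitrary sample
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels=input_ids makes the model return the mean
    # next-token cross-entropy loss over the sequence.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print(f"Perplexity: {torch.exp(loss).item():.2f}")
```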
Commercial applications
Intelligent customer service
Used to build English-language intelligent customer-service systems