🚀 Meta-Llama-3-8B-Instruct GPTQ Quantized Version
This is a GPTQ-quantized version of the Meta-Llama-3-8B-Instruct model, offering an efficient way to run the model with reduced memory requirements.
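For a rough sense of the savings (illustrative only; this card does not state the GPTQ bit width, so 4-bit is assumed), the weight footprint shrinks roughly four-fold versus bf16:

```python
# Rough weight-only memory footprint for an 8B-parameter model.
# Assumes 4-bit GPTQ (not stated in this card) and ignores group
# scales/zero-points, activations, and the KV cache.
params = 8e9
print(f"bf16 weights:       {params * 2 / 2**30:.1f} GiB")    # 2 bytes per parameter
print(f"4-bit GPTQ weights: {params * 0.5 / 2**30:.1f} GiB")  # 0.5 bytes per parameter
```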
✨ Features
- Powerful Language Generation: Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models. The instruction-tuned models are optimized for assistant-like chat, while the pretrained models can be adapted for a variety of natural language generation tasks.
- Multiple Sizes: Llama 3 comes in two sizes, 8B and 70B parameters, each with pretrained and instruction-tuned variants.
- Improved Inference: Both the 8B and 70B versions use Grouped-Query Attention (GQA) for improved inference scalability (see the sketch below).
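For intuition, here is a minimal, self-contained sketch of grouped-query attention, in which several query heads share one key/value head to shrink the KV cache. The head counts and dimensions are illustrative, not Llama 3's actual configuration:

```python
import torch

def grouped_query_attention(q, k, v, n_groups):
    """Attention where each key/value head serves a group of query heads."""
    # q: (batch, n_q_heads, seq_len, head_dim)
    # k, v: (batch, n_kv_heads, seq_len, head_dim), n_q_heads == n_kv_heads * n_groups
    head_dim = q.shape[-1]
    # Repeat each KV head so it is shared across its group of query heads.
    k = k.repeat_interleave(n_groups, dim=1)
    v = v.repeat_interleave(n_groups, dim=1)
    scores = (q @ k.transpose(-2, -1)) / head_dim**0.5
    return torch.softmax(scores, dim=-1) @ v

# Illustrative shapes: 8 query heads sharing 2 KV heads (groups of 4).
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 2, 16, 64)
v = torch.randn(1, 2, 16, 64)
print(grouped_query_attention(q, k, v, n_groups=4).shape)  # (1, 8, 16, 64)
```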
📦 Installation
This repository contains two versions of Meta-Llama-3-8B-Instruct, for use with transformers and with the original llama3 codebase.
Use with transformers

```python
import transformers
import torch

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# device_map="auto" places the model across available devices (requires accelerate).
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Render the chat messages into Llama 3's prompt format.
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Stop on either the generic EOS token or the chat end-of-turn token.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# Print only the newly generated text, without the prompt prefix.
print(outputs[0]["generated_text"][len(prompt):])
```
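Two terminators are supplied because Llama 3 Instruct models may end a turn with either the standard end-of-sequence token or the chat-specific `<|eot_id|>` token; stopping on both prevents the model from generating past the end of its answer.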
Use with llama3

Please follow the instructions in the [repository](https://github.com/meta-llama/llama3).

To download the original checkpoints, see the example command below using huggingface-cli:

```bash
huggingface-cli download meta-llama/Meta-Llama-3-8B-Instruct --include "original/*" --local-dir Meta-Llama-3-8B-Instruct
```
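The `--include "original/*"` filter limits the download to the `original/` directory, which contains the native-format checkpoint and tokenizer consumed by the llama3 codebase.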
💻 Usage Examples
Basic Usage
The transformers pipeline example in the Installation section above demonstrates basic text generation and applies here unchanged. The sketch below shows how GPTQ-quantized weights can be loaded without the pipeline helper.
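Because this repository hosts GPTQ-quantized weights, they can typically be loaded directly through transformers, which picks up the quantization config stored with the checkpoint (the optimum and auto-gptq packages must be installed). This is a minimal sketch; the repository id below is a placeholder, not the actual hosted name:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id: substitute the actual location of these GPTQ weights.
model_id = "your-org/Meta-Llama-3-8B-Instruct-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# transformers reads the GPTQ quantization config stored with the checkpoint.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Stop on either the generic EOS token or the chat end-of-turn token.
terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")]
outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```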
📚 Documentation
Model Details
| Property | Details |
|----------|---------|
| Model Type | Meta Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. |
| Training Data | A new mix of publicly available online data. The fine-tuning data includes publicly available instruction datasets, as well as over 10M human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data. |
| Params | 8B and 70B |
| Context length | 8k |
| GQA | Yes |
| Token count | 15T+ |
| Knowledge cutoff | March 2023 (8B), December 2023 (70B) |
| Model Release Date | April 18, 2024 |
| Status | This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback. |
| License | A custom commercial license is available at: https://llama.meta.com/llama3/license |
Intended Use
**Intended Use Cases** Llama 3 is intended for commercial and research use in English. Instruction-tuned models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks.

**Out-of-scope** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in any other way that is prohibited by the Acceptable Use Policy and Llama 3 Community License. Use in languages other than English.
⚠️ Important Note
Developers may fine-tune Llama 3 models for languages beyond English provided they comply with the Llama 3 Community License and the Acceptable Use Policy.
Hardware and Software
**Training Factors** We used custom training libraries, Meta's Research SuperCluster, and production clusters for pretraining. Fine-tuning, annotation, and evaluation were also performed on third-party cloud compute.
| | Time (GPU hours) | Power Consumption (W) | Carbon Emitted (tCO2eq) |
|---|---|---|---|
| Llama 3 8B | 1.3M | 700 | 390 |
| Llama 3 70B | 6.4M | 700 | 1900 |
| Total | 7.7M | | 2290 |
**CO2 emissions during pretraining** Time refers to the total GPU time required for training each model. Power Consumption is the peak power capacity per GPU device for the GPUs used, adjusted for power usage efficiency. 100% of the emissions are directly offset by Meta's sustainability program, and because we are openly releasing these models, the pretraining costs do not need to be incurred by others.
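As a rough cross-check of these figures (an illustrative back-of-envelope, not part of Meta's stated methodology), multiplying GPU hours by the 700 W peak power gives the energy budget, and the reported tCO2eq values then imply a grid carbon intensity of roughly 0.43 kg CO2eq per kWh:

```python
# Back-of-envelope check of the table above (illustrative only).
for name, gpu_hours, tco2eq in [("8B", 1.3e6, 390), ("70B", 6.4e6, 1900)]:
    energy_kwh = gpu_hours * 700 / 1000      # 700 W peak power per GPU
    intensity = tco2eq * 1000 / energy_kwh   # implied kg CO2eq per kWh
    print(f"Llama 3 {name}: {energy_kwh / 1e3:,.0f} MWh, ~{intensity:.2f} kg CO2eq/kWh")
```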
Training Data
**Overview** Llama 3 was pretrained on over 15 trillion tokens of data from publicly available sources. The fine-tuning data includes publicly available instruction datasets, as well as over 10M human-annotated examples. Neither the pretraining nor the fine-tuning datasets include Meta user data.

**Data Freshness** The pretraining data has a cutoff of March 2023 for the 8B model and December 2023 for the 70B model.
Benchmarks
Results for the Llama 3 models on standard automatic benchmarks are reported in the upstream [Meta-Llama-3-8B-Instruct model card](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct).
License
A custom commercial license is available at: https://llama.meta.com/llama3/license
The full license agreement can be found in the LICENSE file. The agreement also includes the Meta Llama 3 Community License Agreement and the Meta Llama 3 Acceptable Use Policy.

Where to send questions or comments about the model: instructions for providing feedback are in the model [README](https://github.com/meta-llama/llama3). For more technical information about generation parameters and recipes for using Llama 3 in applications, see [llama-recipes](https://github.com/meta-llama/llama-recipes).