Llama-3.1-NemoGuard-8B-Topic-Control
Llama-3.1-NemoGuard-8B-Topic-Control is a model designed for topical and dialogue moderation in human-assistant interactions. It can be used in task-oriented dialogue agents and custom policy-based moderation, ensuring user prompts align with specified rules. This model is ready for commercial use.
🚀 Quick Start
You can try out the model here: Llama-3.1-NemoGuard-8B-Topic-Control
✨ Features
- Input Moderation: Ensures user prompts are consistent with system prompt rules.
- Customizable Rules: Allows for specifying allowed and disallowed topics, personas, and conversation boundaries.
- Binary Output: Returns a clear "on-topic" or "off-topic" response.
- Commercial Use: Ready for commercial applications.
📦 Installation
No dedicated installation steps are provided. The model is served through the supported inference engines listed under Inference below (TRT-LLM, vLLM, or Hugging Face), for example as a NIM endpoint as shown in the NeMo Guardrails integration example.
💻 Usage Examples
Basic Usage
The prompt template consists of two key sections: system instruction and conversation history.
System Instruction
The system instruction part of the prompt serves as a comprehensive guideline to steer the conversation. It includes the core rules and any persona assignment, and must end with the following response directive:

```text
If any of the above conditions are violated, please respond with "off-topic". Otherwise, respond with "on-topic". You must respond with "on-topic" or "off-topic".
```
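As a sketch, the system instruction can be assembled programmatically. The helper below is illustrative (not part of the official API); the mandatory closing directive is quoted verbatim from the template above, while the rule wording is application-specific:

```python
def build_system_instruction(rules):
    """Assemble a topic-control system instruction from a list of rules.

    The closing sentence enforcing the binary "on-topic"/"off-topic"
    response is required by the prompt template; the rules themselves
    are application-specific.
    """
    lines = ["You are to adhere to the following conversation rules:"]
    lines += [f"- {rule}" for rule in rules]
    lines.append(
        'If any of the above conditions are violated, please respond with '
        '"off-topic". Otherwise, respond with "on-topic". '
        'You must respond with "on-topic" or "off-topic".'
    )
    return "\n".join(lines)

# Example: the polite, travel-free assistant used in the conversation below.
instruction = build_system_instruction([
    "Always use a polite tone.",
    "Do not engage in any talk about travelling and touristic destinations.",
])
```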
Conversation History
The conversation history maintains a sequential record of user prompts and LLM responses.
```json
[
  {
    "role": "system",
    "content": "In the next conversation always use a polite tone and do not engage in any talk about travelling and touristic destinations"
  },
  {
    "role": "user",
    "content": "Hi there!"
  },
  {
    "role": "assistant",
    "content": "Hello! How can I help today?"
  },
  {
    "role": "user",
    "content": "Do you know which is the most popular beach in Barcelona?"
  }
]
```
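The conversation above can be sent to any OpenAI-compatible chat endpoint. A minimal stdlib-only sketch is shown below; the base URL and model name are assumptions that match a local NIM deployment like the one in the NeMo Guardrails config later in this card:

```python
import json
from urllib import request

MESSAGES = [
    {"role": "system", "content": (
        "In the next conversation always use a polite tone and do not "
        "engage in any talk about travelling and touristic destinations")},
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Hello! How can I help today?"},
    {"role": "user",
     "content": "Do you know which is the most popular beach in Barcelona?"},
]

def build_payload(messages, model="llama-3.1-nemoguard-8b-topic-control"):
    """Build an OpenAI-style chat completion request body."""
    # The conversation must end with a user message to be moderated.
    assert messages[-1]["role"] == "user"
    return {"model": model, "messages": messages, "temperature": 0.0}

def classify(messages, base_url="http://localhost:8000/v1"):
    """POST the request to the (assumed) local NIM endpoint and return
    the model's reply, which should be 'on-topic' or 'off-topic'."""
    req = request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(messages)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"].strip()
```

Given the travel-related final user turn, the expected label for this conversation is "off-topic".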
Advanced Usage
Integrating with NeMo Guardrails:
To integrate the topic control model with NeMo Guardrails, create a config.yml
file similar to the following example:
```yaml
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct

  - type: "topic_control"
    engine: nim
    parameters:
      base_url: "http://localhost:8000/v1"
      model_name: "llama-3.1-nemoguard-8b-topic-control"

rails:
  input:
    flows:
      - topic safety check input $model=topic_control
```
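Conceptually, the input rail in the config above gates each user turn through the topic-control model before the main LLM answers. A stdlib-only sketch of that flow, with the checker stubbed out for illustration (a real deployment would call the topic-control NIM endpoint instead):

```python
def input_rail(messages, topic_checker):
    """Run the topic safety check on the last user turn.

    `topic_checker` maps a conversation to "on-topic" or "off-topic";
    off-topic inputs are refused before reaching the main model.
    """
    label = topic_checker(messages)
    if label == "off-topic":
        return "I'm sorry, I can't help with that topic."
    return None  # None means: let the main LLM generate the reply

# Stub checker for illustration only; flags mentions of "beach" as off-topic.
stub = lambda msgs: ("off-topic"
                     if "beach" in msgs[-1]["content"].lower()
                     else "on-topic")

refusal = input_rail(
    [{"role": "user", "content": "Best beach in Barcelona?"}], stub)
```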
📚 Documentation
Model Overview
The base large language model (LLM) is the multilingual Llama-3.1-8B-Instruct model from Meta. Llama-3.1-TopicGuard is LoRA-tuned on a topic-following dataset generated synthetically with Mixtral-8x7B-Instruct-v0.1.
License/Terms of Use
Use of this model is governed by the NVIDIA Open Model License Agreement. Additional Information: Llama 3.1 Community License Agreement. Built with Llama.
Reference(s)
Related paper:
```bibtex
@article{sreedhar2024canttalkaboutthis,
  title={CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues},
  author={Sreedhar, Makesh Narsimhan and Rebedea, Traian and Ghosh, Shaona and Zeng, Jiaqi and Parisien, Christopher},
  journal={arXiv preprint arXiv:2404.03820},
  year={2024}
}
```
Model Architecture
- Architecture Type: Transformer
- Network Architecture: Based on the Llama-3.1-8B-Instruct model from Meta (Model Card). Parameter-Efficient Fine-Tuning (PEFT) is performed with the following parameters:
- Rank: 8
- Alpha: 32
- Targeted low rank adaptation modules: 'k_proj', 'q_proj', 'v_proj', 'o_proj', 'up_proj', 'down_proj', 'gate_proj'.
- Training Method: A system instruction and a synthetically generated dataset are used to instruction-tune the base model to detect on-topic or off-topic user messages.
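The PEFT hyperparameters above map directly onto a Hugging Face `peft` `LoraConfig`. They are sketched here as a plain dict so the values are explicit; with `peft` installed, `LoraConfig(**LORA_KWARGS)` would be the assumed equivalent (the `task_type` value is an assumption, not stated in this card):

```python
# LoRA hyperparameters reported in this model card.
LORA_KWARGS = {
    "r": 8,            # rank
    "lora_alpha": 32,  # alpha
    "target_modules": [  # targeted low-rank adaptation modules
        "k_proj", "q_proj", "v_proj", "o_proj",
        "up_proj", "down_proj", "gate_proj",
    ],
    "task_type": "CAUSAL_LM",  # assumption: causal-LM instruction tuning
}
```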
Input
- Input Type(s): Text
- Input Format(s): String
- Input Parameters: 1D (One-Dimensional) List: System prompt with topical instructions, followed by a conversation structured as a list of user and assistant messages.
- Other Properties Related to Input: The conversation should end with a user message for topical moderation. The input format follows the [OpenAI Chat specification](https://platform.openai.com/docs/guides/text-generation).
Output
- Output Type(s): Text
- Output Format: String
- Output Parameters: 1D (One-Dimensional)
- Other Properties Related to Output: The response is a binary string label determining if the last user turn in the input conversation respects the topical instruction. The label options are either "on-topic" or "off-topic".
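Since the model's reply is a bare string label, a small normalization helper avoids mis-handling whitespace or casing variation. This is a defensive sketch, not part of the official API:

```python
def parse_label(raw):
    """Normalize the topic-control model's reply to 'on-topic' or 'off-topic'."""
    text = raw.strip().lower()
    # Check "off-topic" first: "on-topic" is not a prefix of it, but keeping
    # the more restrictive label first makes the intent explicit.
    if text.startswith("off-topic"):
        return "off-topic"
    if text.startswith("on-topic"):
        return "on-topic"
    raise ValueError(f"unexpected topic-control output: {raw!r}")
```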
Software Integration
- Runtime Engine(s): PyTorch
- Libraries: Meta's llama-recipes, HuggingFace transformers library, HuggingFace peft library
- Supported Hardware Platform(s): NVIDIA Ampere (A100 80GB, A100 40GB)
- Preferred/Supported Operating System(s): Linux (Ubuntu)
Model Version(s)
Llama-3.1-TopicGuard
Training, Testing, and Evaluation Datasets
Training Dataset
- Link: CantTalkAboutThis dataset
- Data Collection Method by dataset: Synthetic
- Labeling Method by dataset: Synthetic
- Properties: Contains 1,080 on-topic multi-turn conversations covering 540 distinct topical instructions from various domains. For each on-topic conversation, off-topic/distractor turns are also generated.
Testing Dataset
- Link: CantTalkAboutThis topic-following dataset
- Data Collection Method by dataset: Hybrid: Synthetic, Human
- Labeling Method by dataset: Hybrid: Synthetic, Human
- Properties: A smaller, human-annotated subset of the synthetically created test set. The test set contains conversations on a different domain (banking).
Evaluation Dataset
- Link: CantTalkAboutThis evaluation set
- Data Collection Method by dataset: Synthetic
- Labeling Method by dataset: Synthetic
- Properties: Contains 20 multi-turn conversations on 10 different scenarios in the travel domain.
Inference
- Engine: TRT-LLM/vLLM/Hugging Face
- Test Hardware: A100 80GB
Ethical Considerations
NVIDIA believes Trustworthy AI is a shared responsibility. Developers should ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Report security vulnerabilities or NVIDIA AI Concerns here.
Explainability
Field | Response |
---|---|
Intended Application & Domain | Dialogue Agents and Guardrails |
Model Type | Transformer |
Intended Users | Developers building task-oriented dialogue assistants who want to specify the dialogue policy in natural language. Also useful as a topical guardrail in NeMo Guardrails. |
Output | Text - Binary label determining if the last user turn in the input conversation respects the topical instruction. The label options are either "on-topic" or "off-topic". |
Describe how the model works | The model receives the dialogue policy and the current conversation, ending with the last user turn, in the prompt of an LLM (Llama-3.1-8B-Instruct). A binary decision is returned, specifying whether the input is on-topic or not. |
Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of | Not Applicable |
Technical Limitations | The model was trained on 9 domains. Strong generalization in other domains is suggested, but thorough testing is recommended for out-of-domain prompts. |
Verified to have met prescribed NVIDIA quality standards | Yes |
Performance Metrics | F1, Accuracy |
Potential Known Risks | Potential risks include the dialogue agent engaging in user content that is not on-topic. |
Licensing | Governing NVIDIA Download Terms & Third-Party Component Attribution Terms (Hugging Face LORA weights) GOVERNING TERMS: Use of this model is governed by the NVIDIA Open Model License Agreement. Additional Information: Llama 3.1 Community License Agreement. Built with Llama. |
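The F1 and accuracy metrics named above can be computed directly over the binary labels. A minimal sketch, treating "off-topic" as the positive class (an assumption; a topical guardrail typically cares most about catching off-topic turns):

```python
def f1_and_accuracy(y_true, y_pred, positive="off-topic"):
    """Compute F1 (for the given positive class) and accuracy over labels."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return f1, accuracy
```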
Bias
Field | Response |
---|---|
Participation considerations from adversely impacted groups protected classes in model design and testing | Not Applicable |
Measures taken to mitigate against unwanted bias | None |
Safety & Security
Field | Response |
---|---|
Model Application(s) | Dialogue agents for topic / dialogue moderation |
Describe the life critical impact (if present) | Not Applicable |
Use Case Restrictions | Should not be used for any use case other than text-based topic and dialogue moderation in task-oriented dialogue agents. |
Model and dataset restrictions | Abide by the NVIDIA Open Model License Agreement. Additional Information: Llama 3.1 Community License Agreement. Built with Llama. |
Privacy
Field | Response |
---|---|
Generatable or reverse engineerable personal data? | None |
Personal data used to create this model? | None |
Was consent obtained for any personal data used? | Not Applicable |
How often is dataset reviewed? | Before Every Release |
Is a mechanism in place to honor data subject right of access or deletion of personal data? | Not Applicable |
If personal data was collected for the development of the model, was it collected directly by NVIDIA? | Not Applicable |
🔧 Technical Details
The model is based on the Llama-3.1-8B-Instruct model from Meta and uses Parameter-Efficient Fine-Tuning (PEFT) with the network architecture parameters listed above. It is trained on a synthetic dataset and tested on a human-annotated subset. The model's performance is evaluated using F1 and accuracy metrics.
📄 License
Use of this model is governed by the NVIDIA Open Model License Agreement. Additional Information: Llama 3.1 Community License Agreement. Built with Llama.

