đ Llama-Primus-Nemotron-70B-Instruct
The Llama-Primus-Nemotron-70B-Instruct model is an enhanced text - generation model in the field of cybersecurity, achieving significant improvements in relevant benchmarks.
đ Quick Start
This README provides detailed information about the Llama - Primus - Nemotron - 70B - Instruct model, including its introduction, benchmark results, training datasets, and acknowledgments.
⨠Features
- Enhanced Performance: Achieves an 18.18% improvement in aggregate scores across several public cybersecurity benchmarks.
- Cybersecurity Focus: Built upon large - scale cybersecurity corpora training.
- Safety and Toxicity Resistance: Demonstrates better performance in safety and toxicity metrics.
đĻ Installation
No installation steps are provided in the original document, so this section is skipped.
đģ Usage Examples
No code examples are provided in the original document, so this section is skipped.
đ Documentation
Introduction
The Llama - Primus - Nemotron series builds upon nvidia/Llama - 3.1 - Nemotron - 70B - Instruct
through continued training. Following the same methodology as described in the Primus paper, we first performed pre - training on large - scale cybersecurity corpora (over 10B tokens) to obtain Llama - Primus - Nemotron - Base. We then conducted supervised - finetuning and applied DELLA to merge with the original Nemotron, resulting in Llama - Primus - Nemotron - 70B - Instruct.
Llama - Primus - Nemotron - 70B - Instruct achieves an 18.18% improvement in aggregate scores across several public cybersecurity benchmarks, while maintaining same performance in general - purpose instruction following benchmark (Arena Hard).
Benchmark Results
Cybersecurity
Metric (5 - shot, w/ chat template) |
Llama - 3.1 - Nemotron - 70B - Instruct |
Llama - Primus - Nemotron - 70B - Instruct |
CTI - Bench (MCQ) |
0.6320 |
0.7148 |
CTI - Bench (CVE â CWE) |
0.6020 |
0.6770 |
CTI - Bench (CVSS, lower is better) |
1.4523 |
1.2469 |
CTI - Bench (ATE) |
0.4284 |
0.5039 |
CyberMetric (500) |
0.9240 |
0.9280 |
SecEval |
0.6875 |
0.7095 |
CISSP (Exam Questions) |
0.8428 |
0.8625 |
Aggregate |
2.6644 |
3.1488 â18.18% đĨ |
CTI - Bench(CVSS) is scored using Mean Absolute Deviation (lower is better), CTI - ATE uses F1 score, and the others use accuracy. The aggregate score (Agg.) is the sum of all benchmarks, with CTI - Bench(CVSS) negated.
References:
General Chat Performance
Metric |
Llama - 3.1 - Nemotron - 70B - Instruct |
Llama - Primus - Nemotron - 70B - Instruct |
Arena Hard |
85.1 |
85.8 |
Reference:
Safety & Toxicity
Metric |
Llama - 3.1 - Nemotron - 70B - Instruct |
Primus - Labor - 70B (Llama - 3.1 - Nemotron - 70B - Instruct) đĨ |
dan (Jailbreak) |
43.14% |
61.96% |
encoding (Jailbreak) |
93.37% |
96.87% |
goodside (Hallucination / Prompt Injection) |
75.00% |
72.50% |
latentinjection (Prompt Injection) |
62.46% |
70.35% |
leakreplay (Copyright) |
88.23% |
92.43% |
malwaregen (Disallowed content) |
18.99% |
25.84% |
realtoxicityprompts (Disallowed content) |
97.55% |
98.25% |
snowball (Hallucination) |
100.00% |
100.00% |
xss (Prompt Injection) |
81.67% |
100.00% |
XSTest (Over Refusal) |
94.40% |
97.20% |
References:
Training Datasets
Pre - training:
- Primus - Seed - V2 (0.417B): An enhanced version of Primus - Seed, enriched with blogs, news, books, websites, Wikipedia, MITRE and Trend Micro knowledge.
- Primus - FineWeb (2.57B): Cybersecurity text filtered from FineWeb - edu - score - 2. Link
- Primus - Nemotron - CC (7.6B): Cybersecurity text filtered from Nemotron - CC.
SFT:
â ī¸ Important Note
Datasets Primus - Seed - V2 and Primus - Nemotron - CC are not yet open - sourced and are currently under discussion. Feel free to reach out if you're interested.
đĄ Usage Tip
No Trend Micro customer information is included.
About Primus
Primus is Trend Micro's pioneering family of lightweight, state - of - the - art open cybersecurity language models and datasets. Developed through our cutting - edge research initiatives and advanced technology, these resources share the innovative foundation that powers our enterprise - class Trend Cybertron solution. As an industry leader in cybersecurity, Trend Micro is proud to contribute these powerful, efficiency - optimized models and datasets to the community, while maintaining the excellence and reliability that define our global security standards.
Acknowledgments
We would like to thank NVIDIA for generously providing computing resources (Taipei - 1), which enabled the training and development of this model.
đ License
This model is based on the MIT license, but you must also comply with the Llama 3.1 Community License Agreement.
Property |
Details |
Model Type |
Text Generation |
Base Model |
trend - cybertron/Llama - Primus - Nemotron - 70B - Base |
Training Datasets |
trendmicro - ailab/Primus - FineWeb, trendmicro - ailab/Primus - Instruct |
Library Name |
transformers |
Tags |
cybersecurity |
License |
MIT (also comply with Llama 3.1 Community License Agreement) |
Extra Gated Fields |
Affiliation (text), Country (country), I want to use this model for (Research, Commercial, Other), Job title (Student, Research graduate, AI researcher, AI developer/engineer, Cybersecurity researcher, Reporter, Other), geo (ip_location) |