🚀 Foundation-Sec-8B - Model Card
Foundation-Sec-8B is an 8-billion parameter base language model tailored for cybersecurity applications. It extends the Llama-3.1-8B model through continued pre-training on a curated cybersecurity corpus, enabling it to understand security concepts and serve as a base for various security applications.
📦 Installation
The usage example below requires the `torch` and `transformers` Python packages (installable, for example, with `pip install torch transformers`).
✨ Features
- Domain-Adapted: Specialized for cybersecurity through continued pre-training on a curated corpus of cybersecurity-specific text.
- Local Deployment: Enables organizations to build and deploy local AI-driven security tools, reducing cloud-based service dependency.
- Multiple Use Cases: Designed for threat detection, vulnerability assessment, security automation, and attack simulation.
💻 Usage Examples
Basic Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("fdtn-ai/Foundation-Sec-8B")
model = AutoModelForCausalLM.from_pretrained("fdtn-ai/Foundation-Sec-8B")

prompt = """CVE-2021-44228 is a remote code execution flaw in Apache Log4j2 via unsafe JNDI lookups (“Log4Shell”). The CWE is CWE-502.
CVE-2017-0144 is a remote code execution vulnerability in Microsoft’s SMBv1 server (“EternalBlue”) due to a buffer overflow. The CWE is CWE-119.
CVE-2014-0160 is an information-disclosure bug in OpenSSL’s heartbeat extension (“Heartbleed”) causing out-of-bounds reads. The CWE is CWE-125.
CVE-2017-5638 is a remote code execution issue in Apache Struts 2’s Jakarta Multipart parser stemming from improper input validation of the Content-Type header. The CWE is CWE-20.
CVE-2019-0708 is a remote code execution vulnerability in Microsoft’s Remote Desktop Services (“BlueKeep”) triggered by a use-after-free. The CWE is CWE-416.
CVE-2015-10011 is a vulnerability about OpenDNS OpenResolve improper log output neutralization. The CWE is"""

inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    inputs["input_ids"],
    max_new_tokens=3,
    do_sample=True,
    temperature=0.1,
    top_p=0.9,
)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
response = response.replace(prompt, "").strip()
print(response)
```
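As an optional variant of the example above, the checkpoint can also be loaded in reduced precision with automatic device placement to cut memory use on a GPU. The bfloat16 dtype and `device_map="auto"` settings below are convenience assumptions (they require the `accelerate` package), not requirements stated in the card; the snippet reuses the `prompt` string defined above.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumed convenience settings: bfloat16 weights and automatic device placement
# (requires the `accelerate` package). Not prescribed by the model card.
tokenizer = AutoTokenizer.from_pretrained("fdtn-ai/Foundation-Sec-8B")
model = AutoModelForCausalLM.from_pretrained(
    "fdtn-ai/Foundation-Sec-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# `prompt` is the few-shot CVE-to-CWE string from the example above.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=3,
    do_sample=True,
    temperature=0.1,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True).replace(prompt, "").strip())
```

With the low temperature and 3-token budget, the model is expected to complete the few-shot pattern with a single CWE identifier.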
📚 Documentation
Model Information
| Property | Details |
|----------|---------|
| Model Name | Foundation-Sec-8B (Llama-3.1-FoundationAI-SecurityLLM-base-8B) |
| Model Developer | Amin Karbasi and team at Foundation AI (Cisco) |
| Technical Report | https://arxiv.org/abs/2504.21039 |
| Model Card Contact | For general questions: karbasi@cisco.com; for technical questions: paulkass@cisco.com |
| Model Release Date | April 28, 2025 |
| Supported Language(s) | English |
| Model Architecture | Auto-regressive language model using an optimized transformer architecture (Meta Llama-3.1-8B backbone) |
| Training Objective | Continued pre-training on a cybersecurity-specific corpus |
| Training Data Status | Static model trained on an offline dataset. Future tuned models will use updated data. |
| License | Apache 2.0 |
Intended Use
Intended Use Cases
Foundation-Sec-8B is designed for security practitioners, researchers, and developers. It is optimized for three core use-case categories:
- SOC Acceleration: Automating triage, summarization, case note generation, and evidence collection.
- Proactive Threat Defense: Simulating attacks, prioritizing vulnerabilities, mapping TTPs, and modeling attacker behavior.
- Engineering Enablement: Providing security assistance, validating configurations, assessing compliance evidence, and improving security posture.
The model is suitable for local deployment in environments prioritizing data security, regulatory compliance, and operational control.
Downstream Use
Foundation-Sec-8B can be used directly for security-related language tasks and as a starting point for fine-tuning in various cybersecurity workflows, such as:
- Summarization
- Summarizing detection playbooks and incident reports.
- Consolidating fragmented analyst notes into structured case summaries.
- Classification
- Mapping threats to MITRE ATT&CK techniques (see the example prompt at the end of this subsection).
- Prioritizing vulnerabilities based on contextual risk.
- Classifying security-relevant emails and leaked file contents.
- Named Entity Recognition
- Extracting compliance evidence from documents.
- Building network behavior profiles from technical manuals.
- Question & Answer
- Assisting SOC analysts with alert triage and investigation.
- Responding to cloud security and software compliance queries.
- Reasoning and Text Generation
- Generating red - team attack plans and threat models.
- Predicting attacker next steps in active investigations.
- Enriching vulnerability scan results with contextual insights.
For fine-tuning questions, contact Paul Kassianik (paulkass@cisco.com) or Dhruv Kedia (dkedia@cisco.com).
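As a concrete illustration of the classification use case listed above (mapping threats to MITRE ATT&CK techniques), a few-shot prompt might look like the sketch below. The behavior descriptions and prompt format are illustrative assumptions rather than an official recipe, and the snippet reuses the `tokenizer` and `model` objects from the usage example.

```python
# Illustrative few-shot prompt for mapping observed behavior to MITRE ATT&CK technique IDs.
# The examples and format are assumptions for demonstration purposes only.
attack_prompt = """Observed behavior: An email with a malicious attachment lures a user into opening it. Technique: T1566 (Phishing)
Observed behavior: PowerShell downloads and runs a second-stage payload. Technique: T1059 (Command and Scripting Interpreter)
Observed behavior: Credentials are extracted from LSASS process memory. Technique: T1003 (OS Credential Dumping)
Observed behavior: Files on network shares are encrypted and a ransom note is dropped. Technique:"""

inputs = tokenizer(attack_prompt, return_tensors="pt")
outputs = model.generate(inputs["input_ids"], max_new_tokens=10, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True).replace(attack_prompt, "").strip())
```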
Out-of-Scope Use
The following uses are not recommended:
- Generating harmful content
- Generating malware or other malicious code.
- Creating phishing content or social engineering scripts.
- Developing attack plans targeting specific organizations.
- Designing exploitation techniques for vulnerabilities without a legitimate security research purpose.
- Critical security decisions without human oversight
- Autonomous security decision-making without human review.
- Critical infrastructure protection without expert supervision.
- Final determination of security compliance without human verification.
- Autonomous vulnerability remediation without testing.
- Legal or medical advice
- Providing legal advice regarding security regulations, compliance requirements, or intellectual property disputes.
- Providing legal advice on security issues with reference to legal statutes, precedents, or case law.
- Providing medical advice regarding the health impacts of security incidents.
- Non-security use cases
- The model is optimized for cybersecurity and may not perform well on general tasks.
- Violation of Laws or Regulations
- Any use that violates applicable laws or regulations.
Training and Evaluation
Training Data
Foundation-Sec-8B was pre-trained on approximately 5.1 billion tokens of cybersecurity-specific data curated by Cisco’s Foundation AI team. The dataset was collected from public web sources through a multi-stage pipeline including web crawling, relevancy filtering, deduplication, and quality filtering.
Data cutoff: April 10th, 2025.
More details are available in the technical report.
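The card describes the data pipeline only at a high level. As a rough, hypothetical illustration of just the deduplication stage (not Cisco's actual pipeline, which is not public), exact-duplicate removal over crawled documents could look like this:

```python
import hashlib

def deduplicate(documents: list[str]) -> list[str]:
    """Drop exact duplicates by hashing whitespace-normalized, lowercased text.

    A simplified, hypothetical stand-in for one stage of the multi-stage pipeline
    (crawling, relevancy filtering, deduplication, quality filtering) described above.
    """
    seen: set[str] = set()
    unique: list[str] = []
    for doc in documents:
        digest = hashlib.sha256(" ".join(doc.lower().split()).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

sample = ["An advisory about a patched flaw.", "an advisory  about a patched flaw.", "A different advisory."]
print(len(deduplicate(sample)))  # 2: the first two normalize to the same text
```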
Training Setup
Foundation-Sec-8B is based on the Llama 3.1 8B architecture and was pre-trained on Cisco Foundation AI’s internal compute cluster.
Key training details:
- Continued pre-training for cybersecurity specialization.
- 4096-token sequence length.
- Optimizer: AdamW.
More details are available in the technical report.
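The card confirms only the continued pre-training objective, the 4096-token sequence length, and the AdamW optimizer. A minimal, hypothetical sketch of what such a setup could look like is below; the backbone checkpoint name, learning rate, toy corpus, and single-device loop are all assumptions, and a real run would need the training infrastructure described in the technical report.

```python
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForCausalLM

SEQ_LEN = 4096  # sequence length stated in the card

# Llama-3.1-8B is the stated backbone; the exact checkpoint name and access are assumptions,
# and loading an 8B model for training requires substantial memory.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B")

# Pack a (placeholder) corpus into fixed-length blocks for causal-LM continued pre-training.
corpus = ["placeholder cybersecurity document ...", "placeholder security advisory ..."]
token_ids: list[int] = []
for doc in corpus:
    token_ids.extend(tokenizer(doc)["input_ids"])
blocks = [token_ids[i:i + SEQ_LEN] for i in range(0, len(token_ids) - SEQ_LEN + 1, SEQ_LEN)]

# AdamW is the optimizer named in the card; the learning rate is an assumption.
optimizer = AdamW(model.parameters(), lr=1e-5)

model.train()
for block in blocks:  # empty for this toy corpus; a real corpus yields many 4096-token blocks
    batch = torch.tensor([block])
    loss = model(input_ids=batch, labels=batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```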
Evaluation
Foundation-Sec-8B was benchmarked on cybersecurity and general reasoning tasks using a 5-shot prompting setup (temperature = 0.3).
| Benchmark | Foundation-Sec-8B | Llama 3.1 8B | Llama 3.1 70B |
|-----------|-------------------|--------------|---------------|
| CTI-MCQA | 67.39 | 64.14 | 68.23 |
| CTI-RCM | 75.26 | 66.43 | 72.66 |
Benchmark Overview:
- CTI-MCQA: 2,500 multiple-choice questions testing cybersecurity knowledge across frameworks like MITRE ATT&CK, NIST, GDPR, and threat intelligence best practices.
- CTI-RCM: 900+ vulnerability root cause mapping examples linking CVEs to CWE categories, assessing deep understanding of security weaknesses.
Key highlights:
- +3 to +9 point gains over Llama-3.1-8B on security-specific benchmarks.
- Comparable or better performance than Llama-3.1-70B on cyber threat intelligence tasks.
- Minimal drop (~2%) in general language reasoning (MMLU) despite cybersecurity specialization.
For full benchmark details, refer to the technical report.
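For orientation, a rough sketch of how the 5-shot, temperature-0.3 setup described at the start of this section could be scored on a CVE-to-CWE mapping task (in the spirit of CTI-RCM) is shown below. It reuses the `tokenizer` and `model` objects and the CVE/CWE pairs from the usage example; the actual benchmark data and harness are described in the technical report, not here.

```python
# Hypothetical exact-match scorer in the spirit of CTI-RCM (CVE description -> CWE category).
# Shots and the held-out item recycle pairs already shown in the usage example; four shots
# are used here for brevity, whereas the reported results use a 5-shot setup.
shots = [
    "CVE-2021-44228 is a remote code execution flaw in Apache Log4j2 via unsafe JNDI lookups. The CWE is CWE-502.",
    "CVE-2014-0160 is an information-disclosure bug in OpenSSL's heartbeat extension causing out-of-bounds reads. The CWE is CWE-125.",
    "CVE-2017-5638 is a remote code execution issue in Apache Struts 2's Jakarta Multipart parser. The CWE is CWE-20.",
    "CVE-2019-0708 is a remote code execution vulnerability in Microsoft's Remote Desktop Services triggered by a use-after-free. The CWE is CWE-416.",
]
eval_items = [
    ("CVE-2017-0144 is a remote code execution vulnerability in Microsoft's SMBv1 server due to a buffer overflow. The CWE is", "CWE-119"),
]

correct = 0
for question, gold in eval_items:
    eval_prompt = "\n".join(shots + [question])
    inputs = tokenizer(eval_prompt, return_tensors="pt")
    outputs = model.generate(inputs["input_ids"], max_new_tokens=5, do_sample=True, temperature=0.3)
    completion = tokenizer.decode(outputs[0], skip_special_tokens=True).replace(eval_prompt, "").strip()
    correct += int(gold in completion)

print(f"exact-match accuracy: {correct / len(eval_items):.2%}")
```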
Limitations
Foundation-Sec-8B has several limitations:
- Domain-specific knowledge limitations
- May not be familiar with recent vulnerabilities, exploits, or novel attack vectors after the training cutoff date.
- Limited knowledge of specialized or proprietary security systems or tools.
- Potential biases
- May reflect biases present in security literature and documentation.
- Difficulty recognizing novel attack vectors due to training on known patterns.
- Security practices may be biased towards certain technological ecosystems.
- Geographic and cultural biases in security approaches.
- Security risks
- Cannot verify user identity or intentions.
- Adversarial prompting may bypass safety mechanisms.
- May inadvertently provide information that could be misused if prompting guardrails are not in place.
- Contextual blindness
- May struggle to understand the complex interrelationships between systems, users, and data that are needed for accurate context.
- Technical limitations
- Performance varies with how security concepts are described in the prompt.
- May not fully understand complex, multi-step security scenarios without clear explanations.
- Cannot access external systems or actively scan environments.
- Cannot independently verify output factual accuracy.
- Ethical considerations
- The dual-use nature of security knowledge requires careful consideration of appropriate use cases.
Recommendations
To address the limitations:
- Human oversight
- Have qualified security professionals review model outputs before implementation.
- Use the model as an assistive tool, not a replacement for expert judgment.
- Implement a human-in-the-loop approach for security-critical applications.
- System design safeguards
- Implement additional validation layers for applications.
- Consider architectural constraints to limit potentially harmful actions.
- Deploy the model in environments with appropriate access controls.
- Prompt engineering
- Use carefully designed prompts that encourage ethical security practices.
- Include explicit instructions on responsible disclosure and ethical hacking principles.
- Structure interactions to minimize harmful output risks.
- Knowledge supplementation
- Supplement the model with up-to-date security feeds and databases.
- Implement retrieval-augmented generation for current threat intelligence sources (a minimal sketch follows this list).
- Usage policies
- Develop and enforce clear acceptable use policies for applications.
- Implement monitoring and auditing for high - risk applications.
- Create end-user documentation about the model's limitations.
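A minimal sketch of the retrieval-augmented generation pattern mentioned under Knowledge supplementation above follows. The keyword-overlap retriever and the in-memory advisory snippets are stand-in assumptions for a real threat-intelligence feed and a proper vector store, and the snippet reuses the `tokenizer` and `model` objects from the usage example.

```python
# Minimal RAG sketch: retrieve the most relevant advisory snippets for a question,
# prepend them as context, then generate an answer with the model. The retriever
# (naive keyword overlap) and the snippets are illustrative stand-ins only.
advisories = [
    "Advisory A: vendor patches released for a critical authentication bypass in a VPN product.",
    "Advisory B: active exploitation observed against internet-facing VPN appliances.",
    "Advisory C: ransomware campaign abuses remote monitoring tools for initial access.",
]

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by keyword overlap with the query and return the top k."""
    query_terms = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(query_terms & set(d.lower().split())), reverse=True)
    return ranked[:k]

question = "What recent activity targets VPN appliances, and what should we prioritize?"
context = "\n".join(retrieve(question, advisories))
rag_prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

inputs = tokenizer(rag_prompt, return_tensors="pt")
outputs = model.generate(inputs["input_ids"], max_new_tokens=64, do_sample=True, temperature=0.3, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True).replace(rag_prompt, "").strip())
```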
🔧 Technical Details
- Model Architecture: Auto-regressive language model using an optimized transformer architecture (Meta Llama-3.1-8B backbone).
- Training Setup: Continued pre-training on a cybersecurity-specific corpus, 4096-token sequence length, AdamW optimizer.
📄 License
The model is licensed under the Apache 2.0 license.