🚀 Foundation-Sec-8B-Q8_0-GGUF Model Card
This model is a quantized version of fdtn-ai/Foundation-Sec-8B, converted into an 8-bit (Q8_0) GGUF checkpoint with llama.cpp. It keeps the original 8-billion-parameter model's cybersecurity focus while roughly halving the memory needed for inference: from about 16 GB (BF16, 2 bytes per weight) to around 8.54 GB (Q8_0, roughly 1 byte per weight plus per-block scales).
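The arithmetic behind those figures is straightforward; here is a back-of-the-envelope check (the parameter count, taken here as 8.03B for Llama 3.1 8B, is approximate):

```python
params = 8.03e9                    # Llama 3.1 8B parameter count (approximate)
bf16_gb = params * 2 / 1e9         # BF16 stores 2 bytes per weight
q8_0_gb = params * 34 / 32 / 1e9   # Q8_0: 32 int8 weights + one fp16 scale per 32-weight block
print(f"BF16 ~{bf16_gb:.1f} GB, Q8_0 ~{q8_0_gb:.2f} GB")  # BF16 ~16.1 GB, Q8_0 ~8.53 GB
```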
✨ Features
- Cybersecurity Specialization: Continued-pretrained on a curated cybersecurity corpus, the model excels at tasks like threat-intelligence summarization, vulnerability classification, incident-triage assistance, and generating red-team simulation prompts and security workflows.
- Quantization Benefits: Reduces memory footprint while retaining the model's capabilities.
📦 Installation
Install llama.cpp on macOS
You can use Homebrew:
```bash
brew install llama-cpp
```
Or build from source with CMake:
```bash
brew install cmake
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build --config Release
sudo cp build/bin/llama-cli /usr/local/bin/
```
💻 Usage Examples
Basic Usage
The command below runs a few-shot prompt of CVE-to-CWE mappings and asks the model to complete the final one; `-n 128` caps generation at 128 tokens:
```bash
llama-cli -m foundation-sec-8b-q8_0.gguf -p "CVE-2021-44228 is a remote code execution flaw in Apache Log4j2 via unsafe JNDI lookups (\"Log4Shell\"). The CWE is CWE-502.\n\nCVE-2017-0144 is a remote code execution vulnerability in Microsoft's SMBv1 server (\"EternalBlue\") due to a buffer overflow. The CWE is CWE-119.\n\nCVE-2014-0160 is an information-disclosure bug in OpenSSL's heartbeat extension (\"Heartbleed\") due to out-of-bounds reads. The CWE is CWE-125.\n\nCVE-2017-5638 is a remote code execution issue in Apache Struts 2's Jakarta Multipart parser stemming from improper input validation of the Content-Type header. The CWE is CWE-20.\n\nCVE-2019-0708 is a remote code execution vulnerability in Microsoft's Remote Desktop Services (\"BlueKeep\") triggered by a use-after-free. The CWE is CWE-416.\n\nCVE-2015-10011 is a vulnerability about OpenDNS OpenResolve improper log output neutralization. The CWE is" -n 128
```
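The same completion can be driven from Python through the llama-cpp-python bindings mentioned under Quantization Details. A minimal sketch, assuming the GGUF file sits in the working directory (install with `pip install llama-cpp-python`):

```python
from llama_cpp import Llama

# Model path is an assumption; point it at your local GGUF file.
# n_ctx sets the context window size.
llm = Llama(model_path="foundation-sec-8b-q8_0.gguf", n_ctx=4096)

# Few-shot CVE-to-CWE prompt, abbreviated from the CLI example above.
prompt = (
    "CVE-2014-0160 is an information-disclosure bug in OpenSSL's heartbeat "
    "extension (\"Heartbleed\") due to out-of-bounds reads. The CWE is CWE-125.\n\n"
    "CVE-2019-0708 is a remote code execution vulnerability in Microsoft's Remote "
    "Desktop Services (\"BlueKeep\") triggered by a use-after-free. The CWE is"
)

# Short, deterministic completion; stop at the end of the line.
out = llm(prompt, max_tokens=8, stop=["\n"], temperature=0.0)
print(out["choices"][0]["text"].strip())  # expected: CWE-416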
📚 Documentation
Model Description
`fdtn-ai/Foundation-Sec-8B-Q8_0-GGUF` is an 8-bit quantized version of Foundation-Sec-8B. The base model has 8B parameters, is built on Llama 3.1, and was further pretrained on a carefully selected cybersecurity text corpus (such as CVEs, threat-intelligence reports, exploit write-ups, and compliance guides). It was initially released on April 28, 2025, under the Apache 2.0 license. The model excels at tasks such as:
- Threat intelligence summarization (e.g., summarizing CVE details)
- Vulnerability classification (mapping CVEs/CWEs to MITRE ATT&CK)
- Incident triage assistance (extracting IoCs, summarizing log data)
- Red-team simulation prompts and security-workflow generation
For the foundational architecture, training data, evaluation results, and known limitations, please refer to the original model card.
Quantization Details
- Quantization Scheme: Q8_0, 8-bit block quantization with minimal precision loss (see the dequantization sketch after this list)
- Toolchain: Converted to GGUF format via llama.cpp's export utilities (v0.1.81 or newer).
- Resulting File Size: Approximately 8.54 GB on disk (raw GGUF blob)
- Runtime Footprint: Around 8.54 GB of RAM when loaded on CPU with llama.cpp
- Format:
  - File extension: `.gguf`
  - Internally contains:
    - Metadata (architecture, tokenizer vocab, hyperparameters)
    - Vocabulary list (BPE tokens)
    - Weight tensors (for each layer and head) stored in 8-bit quantized form
  - Compatible with both the llama-cpp-python wrapper (`llama_cpp`) and the llama.cpp C++ CLI inference engines
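Concretely, Q8_0 packs weights into blocks of 32, each stored as one fp16 scale plus 32 signed 8-bit values (34 bytes per 32 weights, hence the ~1.07 bytes/weight footprint). This minimal NumPy sketch illustrates the quantize/dequantize round-trip; it mirrors the scheme but is not llama.cpp's actual implementation:

```python
import numpy as np

def q8_0_roundtrip(weights: np.ndarray) -> np.ndarray:
    """Quantize a float array Q8_0-style (blocks of 32) and dequantize it back."""
    out = np.empty_like(weights, dtype=np.float32)
    for i in range(0, len(weights), 32):
        block = weights[i:i + 32]
        scale = float(np.abs(block).max()) / 127.0 or 1.0  # one scale per block
        q = np.round(block / scale).astype(np.int8)        # values fall in [-127, 127]
        out[i:i + 32] = q.astype(np.float32) * scale       # dequantized weights
    return out

w = np.random.randn(64).astype(np.float32)
print(np.abs(w - q8_0_roundtrip(w)).max())  # small per-block rounding error
```

The round-trip error is bounded by half a quantization step per block, which is why Q8_0 is usually described as near-lossless.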
📄 License
This model is licensed under the Apache 2.0 license, the same as the base model.
References
- Original Model Card: fdtn-ai/Foundation-Sec-8B (April 28, 2025), continued pretraining of Llama 3.1 8B on cybersecurity data.
- llama.cpp GGUF Quantization: Gerganov, G. et al. (2023). llama.cpp: LLM inference in C/C++. GitHub repository. https://github.com/ggml-org/llama.cpp
- ZeroQuant: Yao, Z. et al. (2022). "ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers." arXiv:2206.01861.
- SmoothQuant: Xiao, G. et al. (2022). "SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models." arXiv:2211.10438.
Contact: For questions about usage, quantization details, or license terms, please open an issue on the Hugging Face repo or contact paulkass@cisco.com.