🚀 Foundation-Sec-8B-Q8_0-GGUF Model Card
This model is a quantized version of fdtn-ai/Foundation-Sec-8B, converted into an 8-bit (Q8_0) GGUF checkpoint with llama.cpp. It keeps the original 8-billion-parameter model's cybersecurity focus while roughly halving the memory needed for inference: from about 16 GB (BF16, 2 bytes per weight) to around 8.54 GB (Q8_0, roughly 1 byte per weight plus per-block scales).
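The arithmetic behind those figures is straightforward; here is a back-of-the-envelope check (the parameter count, taken here as 8.03B for Llama 3.1 8B, is approximate):

```python
params = 8.03e9                    # Llama 3.1 8B parameter count (approximate)
bf16_gb = params * 2 / 1e9         # BF16 stores 2 bytes per weight
q8_0_gb = params * 34 / 32 / 1e9   # Q8_0: 32 int8 weights + one fp16 scale per 32-weight block
print(f"BF16 ~{bf16_gb:.1f} GB, Q8_0 ~{q8_0_gb:.2f} GB")  # BF16 ~16.1 GB, Q8_0 ~8.53 GB
```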
✨ Features
- Cybersecurity Specialization: Continued-pretrained on a curated cybersecurity corpus, the model excels at tasks like threat-intelligence summarization, vulnerability classification, incident-triage assistance, and generating red-team simulation prompts and security workflows.
- Quantization Benefits: Reduces memory footprint while retaining the model's capabilities.
📦 Installation
Install llama.cpp on macOS
You can use Homebrew:
```bash
brew install llama-cpp
```
Or build from source with CMake:
```bash
brew install cmake
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build --config Release
sudo cp build/bin/llama-cli /usr/local/bin/
```
💻 Usage Examples
Basic Usage
The command below runs a few-shot prompt of CVE-to-CWE mappings and asks the model to complete the final one; `-n 128` caps generation at 128 tokens:
```bash
llama-cli -m foundation-sec-8b-q8_0.gguf -p "CVE-2021-44228 is a remote code execution flaw in Apache Log4j2 via unsafe JNDI lookups (\"Log4Shell\"). The CWE is CWE-502.\n\nCVE-2017-0144 is a remote code execution vulnerability in Microsoft's SMBv1 server (\"EternalBlue\") due to a buffer overflow. The CWE is CWE-119.\n\nCVE-2014-0160 is an information-disclosure bug in OpenSSL's heartbeat extension (\"Heartbleed\") due to out-of-bounds reads. The CWE is CWE-125.\n\nCVE-2017-5638 is a remote code execution issue in Apache Struts 2's Jakarta Multipart parser stemming from improper input validation of the Content-Type header. The CWE is CWE-20.\n\nCVE-2019-0708 is a remote code execution vulnerability in Microsoft's Remote Desktop Services (\"BlueKeep\") triggered by a use-after-free. The CWE is CWE-416.\n\nCVE-2015-10011 is a vulnerability about OpenDNS OpenResolve improper log output neutralization. The CWE is" -n 128
```
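The same completion can be driven from Python through the llama-cpp-python bindings mentioned under Quantization Details. A minimal sketch, assuming the GGUF file sits in the working directory (install with `pip install llama-cpp-python`):

```python
from llama_cpp import Llama

# Model path is an assumption; point it at your local GGUF file.
# n_ctx sets the context window size.
llm = Llama(model_path="foundation-sec-8b-q8_0.gguf", n_ctx=4096)

# Few-shot CVE-to-CWE prompt, abbreviated from the CLI example above.
prompt = (
    "CVE-2014-0160 is an information-disclosure bug in OpenSSL's heartbeat "
    "extension (\"Heartbleed\") due to out-of-bounds reads. The CWE is CWE-125.\n\n"
    "CVE-2019-0708 is a remote code execution vulnerability in Microsoft's Remote "
    "Desktop Services (\"BlueKeep\") triggered by a use-after-free. The CWE is"
)

# Short, deterministic completion; stop at the end of the line.
out = llm(prompt, max_tokens=8, stop=["\n"], temperature=0.0)
print(out["choices"][0]["text"].strip())  # expected: CWE-416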
📚 Documentation
Model Description
`fdtn-ai/Foundation-Sec-8B-Q8_0-GGUF` is an 8-bit quantized version of Foundation-Sec-8B. The base model has 8B parameters, is built on Llama 3.1, and was further pretrained on a carefully selected cybersecurity text corpus (such as CVEs, threat-intelligence reports, exploit write-ups, and compliance guides). It was initially released on April 28, 2025, under the Apache 2.0 license. The model excels at tasks such as:
- Threat intelligence summarization (e.g., summarizing CVE details)
- Vulnerability classification (mapping CVEs/CWEs to MITRE ATT&CK)
- Incident triage assistance (extracting IoCs, summarizing log data)
- Red-team simulation prompts and security-workflow generation
For the foundational architecture, training data, evaluation results, and known limitations, please refer to the original model card.
Quantization Details
- Quantization Scheme: Q8_0, 8-bit block quantization with minimal precision loss (see the dequantization sketch after this list)
- Toolchain: Converted to GGUF format via llama.cpp's export utilities (v0.1.81 or newer).
- Resulting File Size: Approximately 8.54 GB on disk (raw GGUF blob)
- Runtime Footprint: Around 8.54 GB of RAM when loaded on CPU with llama.cpp
- Format:
  - File extension: `.gguf`
  - Internally contains:
    - Metadata (architecture, tokenizer vocab, hyperparameters)
    - Vocabulary list (BPE tokens)
    - Weight tensors (for each layer and head) stored in 8-bit quantized form
  - Compatible with both the llama-cpp-python wrapper (`llama_cpp`) and the llama.cpp C++ CLI inference engines
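Concretely, Q8_0 packs weights into blocks of 32, each stored as one fp16 scale plus 32 signed 8-bit values (34 bytes per 32 weights, hence the ~1.07 bytes/weight footprint). This minimal NumPy sketch illustrates the quantize/dequantize round-trip; it mirrors the scheme but is not llama.cpp's actual implementation:

```python
import numpy as np

def q8_0_roundtrip(weights: np.ndarray) -> np.ndarray:
    """Quantize a float array Q8_0-style (blocks of 32) and dequantize it back."""
    out = np.empty_like(weights, dtype=np.float32)
    for i in range(0, len(weights), 32):
        block = weights[i:i + 32]
        scale = float(np.abs(block).max()) / 127.0 or 1.0  # one scale per block
        q = np.round(block / scale).astype(np.int8)        # values fall in [-127, 127]
        out[i:i + 32] = q.astype(np.float32) * scale       # dequantized weights
    return out

w = np.random.randn(64).astype(np.float32)
print(np.abs(w - q8_0_roundtrip(w)).max())  # small per-block rounding error
```

The round-trip error is bounded by half a quantization step per block, which is why Q8_0 is usually described as near-lossless.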
📄 License
This model is licensed under the Apache 2.0 license, the same as the base model.
References
- Original Model Card: fdtn-ai/Foundation-Sec-8B (April 28, 2025), continued pretraining of Llama 3.1 8B on cybersecurity data.
- llama.cpp GGUF Quantization: Gerganov, G. et al. (2023). llama.cpp: LLM inference in C/C++. GitHub repository. https://github.com/ggml-org/llama.cpp
- ZeroQuant: Yao, Z. et al. (2022). "ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers." arXiv:2206.01861.
- SmoothQuant: Xiao, G. et al. (2022). "SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models." arXiv:2211.10438.
Contact: For questions about usage, quantization details, or license terms, please open an issue on the Hugging Face repo or contact paulkass@cisco.com.