🚀 MITRE-v15-tactic-bert-case-based
This model is a fine-tuned version of mitre-bert-base-cased, trained on the MITRE ATT&CK version 15 procedure dataset. It achieves the following results on the evaluation set:
- Loss: 0.057
- Accuracy: 0.87
✨ Features
- Text Classification: The fine-tuned model identifies the MITRE ATT&CK tactic(s) that a sentence describes. Because a single sentence or attack may fall under multiple tactics, this is a multi-label classification task.
- Cybersecurity Focus: Primarily fine-tuned for text classification in the field of cybersecurity.
📦 Installation
No model-specific installation is required; the usage example below only needs the Hugging Face `transformers` library together with PyTorch and NumPy (for example, `pip install transformers torch numpy`).
💻 Usage Examples
Basic Usage
You can use the model with PyTorch as follows:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import numpy as np
import torch

model_id = "sarahwei/MITRE-tactic-bert-case-based"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)

question = 'An attacker performs a SQL injection.'
inputs = tokenizer(question, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits

# Multi-label prediction: score each tactic independently with a sigmoid
# and keep every label whose probability reaches the 0.5 threshold.
# (The .float() cast avoids bfloat16-to-NumPy conversion issues.)
sigmoid = torch.nn.Sigmoid()
probs = sigmoid(logits.squeeze().cpu().float())
predictions = np.zeros(probs.shape)
predictions[np.where(probs >= 0.5)] = 1
predicted_labels = [model.config.id2label[idx] for idx, label in enumerate(predictions) if label == 1.0]
print(predicted_labels)
```
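Because a procedure sentence can map to several tactics at once, the example applies an element-wise sigmoid rather than a softmax, so each tactic is scored independently against the 0.5 threshold.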
Advanced Usage
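A natural extension is to score several sentences in one batch. The sketch below reuses the `model` and `tokenizer` objects from the basic example; the input sentences are purely illustrative.

```python
import torch

# Illustrative inputs, not taken from the training data.
sentences = [
    "The malware exfiltrates credentials over HTTPS.",
    "The actor created a scheduled task to maintain persistence.",
]

# Pad/truncate so the batch forms a single tensor.
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# One sigmoid score per tactic, per sentence.
probs = torch.sigmoid(logits.float())
for sentence, row in zip(sentences, probs):
    labels = [model.config.id2label[i] for i, p in enumerate(row) if p >= 0.5]
    print(f"{sentence} -> {labels}")
```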
📚 Documentation
Intended uses & limitations
You can use the fine-tuned model for text classification, specifically to identify the tactic(s) that a sentence belongs to in the MITRE ATT&CK framework. Note that a sentence or an attack might be associated with multiple tactics.
Because the model is fine-tuned mainly on cybersecurity text, it may not yield satisfactory results for input sentences that are unrelated to attacks.
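To see which tactics the classifier can emit, you can inspect the label mapping stored in the model config (this reuses the `model` object from the usage example above):

```python
# Maps class indices to MITRE ATT&CK tactic names.
print(model.config.id2label)
```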
Training procedure
Training parameters
| Property | Details |
|----------|---------|
| Learning Rate | 5e-05 |
| Train Batch Size | 8 |
| Eval Batch Size | 8 |
| Seed | 0 |
| Optimizer | Adam with betas=(0.9, 0.999) and epsilon=1e-08 |
| LR Scheduler Type | linear |
| Num Epochs | 10 |
| Warmup Ratio | 0.01 |
| Weight Decay | 0.001 |
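As a rough sketch, these hyperparameters map onto `transformers.TrainingArguments` as follows. The `output_dir` value is a hypothetical placeholder, and the dataset and `Trainer` wiring are omitted because the card does not describe them:

```python
from transformers import TrainingArguments

# Hyperparameters mirror the table above; output_dir is an assumption.
training_args = TrainingArguments(
    output_dir="mitre-tactic-bert-finetune",  # hypothetical placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=0,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    warmup_ratio=0.01,
    weight_decay=0.001,
)
```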
Training results
| Step | Training Loss | Validation Loss | F1 | ROC AUC | Accuracy |
|------|---------------|-----------------|----|---------|----------|
| 100 | 0.409400 | 0.142982 | 0.740000 | 0.803830 | 0.610000 |
| 200 | 0.106500 | 0.093503 | 0.818182 | 0.868382 | 0.720000 |
| 300 | 0.070200 | 0.065937 | 0.893617 | 0.930366 | 0.810000 |
| 400 | 0.045500 | 0.061865 | 0.892704 | 0.926625 | 0.830000 |
| 500 | 0.033600 | 0.057814 | 0.902954 | 0.938630 | 0.860000 |
| 600 | 0.026000 | 0.062982 | 0.894515 | 0.934107 | 0.840000 |
| 700 | 0.021900 | 0.056275 | 0.904564 | 0.946113 | 0.870000 |
| 800 | 0.017700 | 0.061058 | 0.887967 | 0.937067 | 0.860000 |
| 900 | 0.016100 | 0.058965 | 0.890756 | 0.933716 | 0.870000 |
| 1000 | 0.014200 | 0.055885 | 0.903766 | 0.942372 | 0.880000 |
| 1100 | 0.013200 | 0.056888 | 0.895397 | 0.937849 | 0.880000 |
| 1200 | 0.012700 | 0.057484 | 0.895397 | 0.937849 | 0.870000 |
📄 License
This model is licensed under the Apache 2.0 license.