Unsupervised parsimonious cluster-based anomaly detection (PCAD)

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Group anomaly detection (AD), i.e., detection of clusters of anomalous samples in a test batch, where the samples in a given cluster exhibit a common pattern of atypicality (relative to a null model), has important applications to discovering unknown classes present in a test data batch and, equivalently, to zero-day threat detection in a security context. When the feature space is large, clusters may manifest anomalies on very small feature subsets, a situation that is well captured by the parsimonious mixture modelling (PMM) framework. Thus, we propose a generalized likelihood ratio test-like (GLRT-like) group AD framework, with PMMs used for both the null hypothesis and the alternative hypothesis (that an anomalous cluster is present), and with the Bayesian Information Criterion (BIC) used to adjudicate between these hypotheses. We demonstrate our approach on network traffic data sets, detecting Zeus (web) bots and peer-to-peer traffic as zero-day activities. Our PCAD achieves substantially better detection results than a previous group AD method applied to this domain.
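The BIC-based choice between a null and an alternative hypothesis described in the abstract can be illustrated with a toy sketch. This is not the authors' PMM or their PCAD procedure: it assumes a one-dimensional Gaussian null model whose parameters are already known (so it contributes no free parameters), and an alternative that adds a single free Gaussian cluster fitted by EM. The feature-subset (parsimony) aspect of PMMs is not modelled, and all function names, the initialization, and the variance floor are hypothetical choices made only for illustration.

```python
import numpy as np


def gauss_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian, evaluated elementwise."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def bic(loglik, n_params, n_samples):
    """BIC = -2 * log-likelihood + n_params * ln(n_samples); lower is better."""
    return -2.0 * loglik + n_params * np.log(n_samples)


def fit_alternative(x, null_mean, null_var, n_iter=100):
    """EM for a two-component mixture: the fixed null plus one free cluster.

    Returns the batch log-likelihood under the fitted mixture.
    """
    # Assumed initialization: start the new cluster at the sample that is
    # least likely under the null model.
    weight = 0.1
    mean = x[np.argmin(gauss_logpdf(x, null_mean, null_var))]
    var = null_var
    for _ in range(n_iter):
        # E-step: responsibility of the new cluster for each sample.
        log_null = np.log1p(-weight) + gauss_logpdf(x, null_mean, null_var)
        log_new = np.log(weight) + gauss_logpdf(x, mean, var)
        resp = np.exp(log_new - np.logaddexp(log_null, log_new))
        # M-step: update only the new cluster; the null stays fixed.
        total = max(resp.sum(), 1e-12)
        weight = float(np.clip(total / x.size, 1e-6, 1.0 - 1e-6))
        mean = (resp * x).sum() / total
        # Assumed variance floor, to avoid collapsing onto a single sample.
        var = max((resp * (x - mean) ** 2).sum() / total, 1e-3 * null_var)
    log_null = np.log1p(-weight) + gauss_logpdf(x, null_mean, null_var)
    log_new = np.log(weight) + gauss_logpdf(x, mean, var)
    return np.logaddexp(log_null, log_new).sum()


def detect_group_anomaly(x, null_mean, null_var):
    """Declare an anomalous cluster when the alternative has the lower BIC."""
    x = np.asarray(x, dtype=float)
    bic_null = bic(gauss_logpdf(x, null_mean, null_var).sum(), 0, x.size)
    # Three free parameters: mixing weight, cluster mean, cluster variance.
    bic_alt = bic(fit_alternative(x, null_mean, null_var), 3, x.size)
    return bool(bic_alt < bic_null)


# Synthetic test batch: 200 null samples plus a tight cluster far from the null.
rng = np.random.default_rng(0)
batch = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(6.0, 0.5, 40)])
print(detect_group_anomaly(batch, null_mean=0.0, null_var=1.0))
```

The penalty term is what keeps the test from always favouring the richer model: the extra cluster must improve the log-likelihood by more than its parameter cost before it is accepted.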

Original language: English (US)
Title of host publication: 2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings
Editors: Nelly Pustelnik, Zheng-Hua Tan, Zhanyu Ma, Jan Larsen
Publisher: IEEE Computer Society
Volume: 2018-September
ISBN (Electronic): 9781538654774
DOI: https://doi.org/10.1109/MLSP.2018.8517014
State: Published - Oct 31 2018
Event: 28th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Aalborg, Denmark
Duration: Sep 17 2018 to Sep 20 2018

Other

Other: 28th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018
Country: Denmark
City: Aalborg
Period: 9/17/18 to 9/20/18

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Signal Processing

Cite this

Miller, D. J., Kesidis, G., & Qiu, Z. (2018). Unsupervised parsimonious cluster-based anomaly detection (PCAD). In N. Pustelnik, Z-H. Tan, Z. Ma, & J. Larsen (Eds.), 2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings (Vol. 2018-September). [8517014] IEEE Computer Society. https://doi.org/10.1109/MLSP.2018.8517014
@inproceedings{e8f26ff28efe48faa8453fd65e66c029,
title = "Unsupervised parsimonious cluster-based anomaly detection (PCAD)",
abstract = "Group anomaly detection (AD), i.e. detection of clusters of anomalous samples in a test batch, with the samples in a given such cluster exhibiting a common pattern of atypicality (relative to a null model) has important applications to discovering unknown classes present in a test data batch and, equivalently, to zero-day threat detection in a security context. When the feature space is large, clusters may manifest anomalies on very small feature subsets, which is well-captured by the parsimonious mixture modelling (PMM) framework. Thus, we propose a generalized likelihood ratio test (GLRT-like) group AD framework, with PMMs used for both the null and the alternative hypothesis (that an anomalous cluster is present), and with the Bayesian Information Criterion (BIC) used to adjudicate between these hypotheses. We demonstrate our approach on network traffic data sets, detecting Zeus (web) bots and peer-to-peer traffic as zero-day activities. Our PCAD achieves substantially better detection results than a previous group AD method applied to this domain.",
author = "Miller, {David Jonathan} and George Kesidis and Zhicong Qiu",
year = "2018",
month = "10",
day = "31",
doi = "10.1109/MLSP.2018.8517014",
language = "English (US)",
volume = "2018-September",
editor = "Nelly Pustelnik and Zheng-Hua Tan and Zhanyu Ma and Jan Larsen",
booktitle = "2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings",
publisher = "IEEE Computer Society",
address = "United States",

}

TY  - GEN
T1  - Unsupervised parsimonious cluster-based anomaly detection (PCAD)
AU  - Miller, David Jonathan
AU  - Kesidis, George
AU  - Qiu, Zhicong
PY  - 2018/10/31
Y1  - 2018/10/31
AB  - Group anomaly detection (AD), i.e. detection of clusters of anomalous samples in a test batch, with the samples in a given such cluster exhibiting a common pattern of atypicality (relative to a null model) has important applications to discovering unknown classes present in a test data batch and, equivalently, to zero-day threat detection in a security context. When the feature space is large, clusters may manifest anomalies on very small feature subsets, which is well-captured by the parsimonious mixture modelling (PMM) framework. Thus, we propose a generalized likelihood ratio test (GLRT-like) group AD framework, with PMMs used for both the null and the alternative hypothesis (that an anomalous cluster is present), and with the Bayesian Information Criterion (BIC) used to adjudicate between these hypotheses. We demonstrate our approach on network traffic data sets, detecting Zeus (web) bots and peer-to-peer traffic as zero-day activities. Our PCAD achieves substantially better detection results than a previous group AD method applied to this domain.
UR  - http://www.scopus.com/inward/record.url?scp=85057051473&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=85057051473&partnerID=8YFLogxK
U2  - 10.1109/MLSP.2018.8517014
DO  - 10.1109/MLSP.2018.8517014
M3  - Conference contribution
AN  - SCOPUS:85057051473
VL  - 2018-September
BT  - 2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings
A2  - Pustelnik, Nelly
A2  - Tan, Zheng-Hua
A2  - Ma, Zhanyu
A2  - Larsen, Jan
PB  - IEEE Computer Society
ER  -