TY - JOUR
T1 - Ranking causal anomalies for system fault diagnosis via temporal and dynamical analysis on vanishing correlations
AU - Cheng, Wei
AU - Ni, Jingchao
AU - Zhang, Kai
AU - Chen, Haifeng
AU - Jiang, Guofei
AU - Shi, Yu
AU - Zhang, Xiang
AU - Wang, Wei
N1 - Funding Information:
Wei Wang is partially supported by the National Science Foundation under grants IIS-1313606 and DBI-1565137, and by the National Institutes of Health under grant number R01GM115833-01.
PY - 2017/6
Y1 - 2017/6
N2 - Detecting system anomalies is an important problem in many fields such as security, fault management, and industrial optimization. Recently, the invariant network has been shown to be powerful in characterizing complex system behaviours. In the invariant network, a node represents a system component and an edge indicates a stable, significant interaction between two components. The structure and evolution of the invariant network, in particular the vanishing correlations, can shed important light on locating causal anomalies and performing diagnosis. However, existing approaches to detecting causal anomalies with the invariant network often use the percentage of vanishing correlations to rank possible causal components, which has several limitations: (1) fault propagation in the network is ignored, (2) the root causal anomalies may not always be the nodes with a high percentage of vanishing correlations, (3) temporal patterns of vanishing correlations are not exploited for robust detection, and (4) prior knowledge on anomalous nodes is not exploited for (semi-)supervised detection. To address these limitations, in this article we propose a network-diffusion-based framework to identify significant causal anomalies and rank them. Our approach can effectively model fault propagation over the entire invariant network and can perform joint inference on both the structural and the time-evolving broken invariance patterns. As a result, it can locate high-confidence anomalies that are truly responsible for the vanishing correlations and can compensate for unstructured measurement noise in the system. Moreover, when prior knowledge on the anomalous status of some nodes is available at certain time points, our approach can leverage it to further enhance the anomaly inference accuracy. When the prior knowledge is noisy, our approach also automatically learns reliable information and reduces the impact of noise. Extensive experiments on synthetic datasets, bank information system datasets, and coal plant cyber-physical system datasets demonstrate the effectiveness of our approach.
AB - Detecting system anomalies is an important problem in many fields such as security, fault management, and industrial optimization. Recently, the invariant network has been shown to be powerful in characterizing complex system behaviours. In the invariant network, a node represents a system component and an edge indicates a stable, significant interaction between two components. The structure and evolution of the invariant network, in particular the vanishing correlations, can shed important light on locating causal anomalies and performing diagnosis. However, existing approaches to detecting causal anomalies with the invariant network often use the percentage of vanishing correlations to rank possible causal components, which has several limitations: (1) fault propagation in the network is ignored, (2) the root causal anomalies may not always be the nodes with a high percentage of vanishing correlations, (3) temporal patterns of vanishing correlations are not exploited for robust detection, and (4) prior knowledge on anomalous nodes is not exploited for (semi-)supervised detection. To address these limitations, in this article we propose a network-diffusion-based framework to identify significant causal anomalies and rank them. Our approach can effectively model fault propagation over the entire invariant network and can perform joint inference on both the structural and the time-evolving broken invariance patterns. As a result, it can locate high-confidence anomalies that are truly responsible for the vanishing correlations and can compensate for unstructured measurement noise in the system. Moreover, when prior knowledge on the anomalous status of some nodes is available at certain time points, our approach can leverage it to further enhance the anomaly inference accuracy. When the prior knowledge is noisy, our approach also automatically learns reliable information and reduces the impact of noise. Extensive experiments on synthetic datasets, bank information system datasets, and coal plant cyber-physical system datasets demonstrate the effectiveness of our approach.
UR - http://www.scopus.com/inward/record.url?scp=85023164652&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85023164652&partnerID=8YFLogxK
U2 - 10.1145/3046946
DO - 10.1145/3046946
M3 - Article
AN - SCOPUS:85023164652
VL - 11
JO - ACM Transactions on Knowledge Discovery from Data
JF - ACM Transactions on Knowledge Discovery from Data
SN - 1556-4681
IS - 4
M1 - 3046946
ER -