LP-explain: Local pictorial explanation for outliers

Haoyu Liu, Fenglong Ma, Yaqing Wang, Shibo He, Jiming Chen, Jing Gao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Outlier detection is of vital importance for various fields and applications. Existing works mainly focus on identifying outliers in underlying datasets, while how to provide sense-making explanations is largely ignored. In this paper, we propose to visualize data points in a set of scatter plots on two-dimensional (2-D) feature spaces that can provide meaningful explanations of the outlying behavior of outliers. Data are typically multidimensional, and the number of 2-D feature combinations can be huge. Also, outliers may have diverse characteristics, and thus global scatter plots containing all outliers may degrade the explanation effectiveness for those outliers that are abnormal only in idiosyncratic 2-D spaces. To address this problem, we propose a new outlier explanation approach, called LP-Explain, which tries to identify the set of best Local Pictorial explanations (defined as the scatter plots in the 2-D space of the feature pairs) that can Explain the behavior of each cluster of outliers. We first define an effective measure to quantify the similarity between outliers, and then cluster outliers into different groups based on their abnormal feature pairs. We then propose to weigh the importance of feature pairs within each cluster through a multi-task learning framework to select the set of top feature pairs that best explain the various outlier clusters. By adjusting a user-defined parameter indicating the 'localization level', the proposed method can attain both global and local results for the explanation of the outliers. 2-D visual explanations can be plotted for the top-weighted feature pairs of each cluster. We conduct experiments on various public datasets, which show that the proposed approach can provide more meaningful explanations of the outlying behavior in a dataset.
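The pipeline described in the abstract (score outliers on 2-D feature pairs, group outliers with similar abnormal pairs, pick the top pairs per group) can be illustrated with a rough sketch. This is not the authors' method: the abstract does not specify the similarity measure or the multi-task learning objective, so the sketch substitutes a k-nearest-inlier distance score, a greedy cosine-similarity grouping, and a per-cluster mean score. All function names, the `k`, `threshold`, and `top_n` parameters, and the synthetic data are hypothetical.

```python
from itertools import combinations
import math
import random


def pair_scores(X, outlier_idx, k=5):
    """Score each outlier on every feature pair by its mean distance to the
    k nearest inliers in that 2-D plane (an assumed stand-in measure)."""
    pairs = list(combinations(range(len(X[0])), 2))
    skip = set(outlier_idx)
    inliers = [row for i, row in enumerate(X) if i not in skip]
    scores = []
    for o in outlier_idx:
        row_scores = []
        for a, b in pairs:
            dists = sorted(
                math.sqrt((p[a] - X[o][a]) ** 2 + (p[b] - X[o][b]) ** 2)
                for p in inliers
            )
            row_scores.append(sum(dists[:k]) / k)
        scores.append(row_scores)
    return pairs, scores


def cosine(u, v):
    """Cosine similarity between two score profiles."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def group_outliers(scores, threshold=0.9):
    """Greedy grouping: an outlier joins the first cluster whose first member
    has a similar score profile, otherwise it starts a new cluster."""
    clusters = []
    for i, profile in enumerate(scores):
        for members in clusters:
            if cosine(profile, scores[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def top_pairs(pairs, scores, clusters, top_n=1):
    """Rank feature pairs per cluster by mean score (a simple substitute for
    the learned feature-pair weights mentioned in the abstract)."""
    explanations = []
    for members in clusters:
        mean = [
            sum(scores[m][j] for m in members) / len(members)
            for j in range(len(pairs))
        ]
        ranked = sorted(range(len(pairs)), key=lambda j: mean[j], reverse=True)
        explanations.append([pairs[j] for j in ranked[:top_n]])
    return explanations


# Synthetic data (hypothetical): 196 Gaussian inliers plus two planted groups
# of outliers, one abnormal in features (0, 1) and one in features (2, 3).
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
X[:4] = [[8, 8, 0, 0], [8.5, 7.5, 0, 0], [0, 0, 8, 8], [0, 0, 7.5, 8.5]]
outlier_idx = [0, 1, 2, 3]

pairs, scores = pair_scores(X, outlier_idx)
clusters = group_outliers(scores)
explanations = top_pairs(pairs, scores, clusters)
print(clusters, explanations)
```

In this sketch the `threshold` argument plays a role loosely analogous to the 'localization level': a low value merges all outliers into one group and yields a single global ranking of feature pairs, while a higher value yields separate rankings per group. Each selected pair could then be drawn as one scatter plot with the cluster's outliers highlighted.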

Original language: English (US)
Title of host publication: Proceedings - 20th IEEE International Conference on Data Mining, ICDM 2020
Editors: Claudia Plant, Haixun Wang, Alfredo Cuzzocrea, Carlo Zaniolo, Xindong Wu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 372-381
Number of pages: 10
ISBN (Electronic): 9781728183169
DOIs
State: Published - Nov 2020
Event: 20th IEEE International Conference on Data Mining, ICDM 2020 - Virtual, Sorrento, Italy
Duration: Nov 17 2020 to Nov 20 2020

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
Volume: 2020-November
ISSN (Print): 1550-4786

Conference

Conference: 20th IEEE International Conference on Data Mining, ICDM 2020
Country: Italy
City: Virtual, Sorrento
Period: 11/17/20 to 11/20/20

All Science Journal Classification (ASJC) codes

  • Engineering(all)
