Discovering reliable evidence of data misuse by exploiting rule redundancy

L. Genga, Nicola Zannone, Anna Squicciarini

Research output: Contribution to journal › Article

Abstract

Big Data offers opportunities for in-depth data analytics and advanced personalized services. Yet, while valuable, data analytics might rely on data that should not have been used due to, e.g., privacy constraints from the data subject or regulations. As decision makers and data controllers often act outside any control mechanism and with no requirement of transparency, it is challenging to verify whether constraints on data usage are actually satisfied. In this work, we relate the problem of finding evidence of data misuse to the identification of unique decision rules, i.e. rules that have likely been used for decision making. Accordingly, we propose an approach to find reliable evidence of data misuse in the context of classification problems using association rule mining, along with novel metrics to assess the level of redundancy among decision rules. Our proposed approach is able to identify the use of sensitive information in decisional processes along with their context. We evaluated our approach through both controlled experiments and two case studies using real-life event data. The results show that our approach finds more reliable evidence of data misuse compared to previous work.

Original language: English (US)
Article number: 101577
Journal: Computers and Security
Volume: 87
DOI: 10.1016/j.cose.2019.101577
State: Published - Nov 1 2019

Fingerprint

Association rules
Redundancy
Transparency
Decision making
Controllers
Evidence
Experiments
Big data
Privacy
Decision maker
Regulation

All Science Journal Classification (ASJC) codes

  • Computer Science(all)
  • Law

Cite this

@article{3d72bfae2c644f8da3c4129210cb6857,
title = "Discovering reliable evidence of data misuse by exploiting rule redundancy",
abstract = "Big Data offers opportunities for in-depth data analytics and advanced personalized services. Yet, while valuable, data analytics might rely on data that should not have been used due to, e.g., privacy constraints from the data subject or regulations. As decision makers and data controllers often act outside any control mechanism and with no requirement of transparency, it is challenging to verify whether constraints on data usage are actually satisfied. In this work, we relate the problem of finding evidence of data misuse to the identification of unique decision rules, i.e. rules that have likely been used for decision making. Accordingly, we propose an approach to find reliable evidence of data misuse in the context of classification problems using association rule mining, along with novel metrics to assess the level of redundancy among decision rules. Our proposed approach is able to identify the use of sensitive information in decisional processes along with their context. We evaluated our approach through both controlled experiments and two case studies using real-life event data. The results show that our approach finds more reliable evidence of data misuse compared to previous work.",
author = "L. Genga and Nicola Zannone and Anna Squicciarini",
year = "2019",
month = "11",
day = "1",
doi = "10.1016/j.cose.2019.101577",
language = "English (US)",
volume = "87",
journal = "Computers and Security",
issn = "0167-4048",
publisher = "Elsevier Limited",

}

Discovering reliable evidence of data misuse by exploiting rule redundancy. / Genga, L.; Zannone, Nicola; Squicciarini, Anna.

In: Computers and Security, Vol. 87, 101577, 01.11.2019.

TY - JOUR

T1 - Discovering reliable evidence of data misuse by exploiting rule redundancy

AU - Genga, L.

AU - Zannone, Nicola

AU - Squicciarini, Anna

PY - 2019/11/1

Y1 - 2019/11/1

N2 - Big Data offers opportunities for in-depth data analytics and advanced personalized services. Yet, while valuable, data analytics might rely on data that should not have been used due to, e.g., privacy constraints from the data subject or regulations. As decision makers and data controllers often act outside any control mechanism and with no requirement of transparency, it is challenging to verify whether constraints on data usage are actually satisfied. In this work, we relate the problem of finding evidence of data misuse to the identification of unique decision rules, i.e. rules that have likely been used for decision making. Accordingly, we propose an approach to find reliable evidence of data misuse in the context of classification problems using association rule mining, along with novel metrics to assess the level of redundancy among decision rules. Our proposed approach is able to identify the use of sensitive information in decisional processes along with their context. We evaluated our approach through both controlled experiments and two case studies using real-life event data. The results show that our approach finds more reliable evidence of data misuse compared to previous work.

UR - http://www.scopus.com/inward/record.url?scp=85069971314&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85069971314&partnerID=8YFLogxK

U2 - 10.1016/j.cose.2019.101577

DO - 10.1016/j.cose.2019.101577

M3 - Article

AN - SCOPUS:85069971314

VL - 87

JO - Computers and Security

JF - Computers and Security

SN - 0167-4048

M1 - 101577

ER -