TY - GEN
T1 - Butterfly
T2 - 2008 IEEE 24th International Conference on Data Engineering, ICDE'08
AU - Wang, Ting
AU - Liu, Ling
PY - 2008
Y1 - 2008
N2 - Privacy preservation in data mining demands protecting both input and output privacy. The former refers to sanitizing the raw data itself before mining is performed. The latter refers to protecting the mining output (model/pattern) against malicious pattern-based inference attacks. Preserving input privacy does not necessarily preserve output privacy. This work studies the problem of protecting output privacy in the context of frequent pattern mining over data streams. After exposing the privacy breaches that exist in current stream mining systems, we propose Butterfly, a lightweight countermeasure that can effectively eliminate these breaches without explicitly detecting them while minimizing the loss of output accuracy. We further optimize the basic scheme by taking into account two types of semantic constraints, aiming to maximally preserve utility-related semantics while maintaining the hard privacy and accuracy guarantees. We conduct extensive experiments over real-life datasets to show the effectiveness and efficiency of our approach.
AB - Privacy preservation in data mining demands protecting both input and output privacy. The former refers to sanitizing the raw data itself before mining is performed. The latter refers to protecting the mining output (model/pattern) against malicious pattern-based inference attacks. Preserving input privacy does not necessarily preserve output privacy. This work studies the problem of protecting output privacy in the context of frequent pattern mining over data streams. After exposing the privacy breaches that exist in current stream mining systems, we propose Butterfly, a lightweight countermeasure that can effectively eliminate these breaches without explicitly detecting them while minimizing the loss of output accuracy. We further optimize the basic scheme by taking into account two types of semantic constraints, aiming to maximally preserve utility-related semantics while maintaining the hard privacy and accuracy guarantees. We conduct extensive experiments over real-life datasets to show the effectiveness and efficiency of our approach.
UR - http://www.scopus.com/inward/record.url?scp=52649159141&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=52649159141&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2008.4497526
DO - 10.1109/ICDE.2008.4497526
M3 - Conference contribution
AN - SCOPUS:52649159141
SN - 9781424418374
T3 - Proceedings - International Conference on Data Engineering
SP - 1170
EP - 1179
BT - Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, ICDE'08
Y2 - 7 April 2008 through 12 April 2008
ER -