TY - GEN
T1 - De-Health
T2 - 36th IEEE International Conference on Data Engineering, ICDE 2020
AU - Ji, Shouling
AU - Gu, Qinchen
AU - Weng, Haiqin
AU - Liu, Qianjun
AU - Zhou, Pan
AU - Chen, Jing
AU - Li, Zhao
AU - Beyah, Raheem
AU - Wang, Ting
N1 - Funding Information:
This work was partly supported by the National Key Research and Development Program of China under No. 2018YFB0804102, NSFC under Nos. 61772466, U1936215, and U1836202, the Zhejiang Provincial Natural Science Foundation for Distinguished Young Scholars under No. LR19F020003, the Provincial Key Research and Development Program of Zhejiang, China under No. 2019C01055, the Ant Financial Research Funding, and the Alibaba-ZJU Joint Research Institute of Frontier Technologies.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/4
Y1 - 2020/4
AB - In this paper, we study the privacy of online health data. We present a novel online health data De-Anonymization (DA) framework, named De-Health. Leveraging two real-world online health datasets, WebMD and HealthBoards, we validate the DA efficacy of De-Health. We also present a linkage attack framework that can link online health/medical information to real-world people. Through a proof-of-concept attack, we link 347 out of 2805 WebMD users to real-world people, and find the full names, medical/health information, birthdates, phone numbers, and other sensitive information for most of the re-identified users. This clearly illustrates the fragility of the privacy of those who use online health forums.
UR - http://www.scopus.com/inward/record.url?scp=85085865197&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85085865197&partnerID=8YFLogxK
U2 - 10.1109/ICDE48307.2020.00143
DO - 10.1109/ICDE48307.2020.00143
M3 - Conference contribution
AN - SCOPUS:85085865197
T3 - Proceedings - International Conference on Data Engineering
SP - 1609
EP - 1620
BT - Proceedings - 2020 IEEE 36th International Conference on Data Engineering, ICDE 2020
PB - IEEE Computer Society
Y2 - 20 April 2020 through 24 April 2020
ER -