Robust Traceability from Trace Amounts

Cynthia Dwork, Adam Davison Smith, Thomas Steinke, Jonathan Ullman, Salil Vadhan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

21 Citations (Scopus)

Abstract

The privacy risks inherent in the release of a large number of summary statistics were illustrated by Homer et al. (PLoS Genetics, 2008), who considered the case of 1-way marginals of SNP allele frequencies obtained in a genome-wide association study: Given a large number of minor allele frequencies from a case group of individuals diagnosed with a particular disease, together with the genomic data of a single target individual and statistics from a sizable reference dataset independently drawn from the same population, an attacker can determine with high confidence whether or not the target is in the case group. In this work we describe and analyze a simple attack that succeeds even if the summary statistics are significantly distorted, whether due to measurement error or noise intentionally introduced to protect privacy. Our attack only requires that the vector of distorted summary statistics is close to the vector of true marginals in L1 norm. Moreover, the reference pool required by previous attacks can be replaced by a single sample drawn from the underlying population. The new attack, which is not specific to genomics and which handles Gaussian as well as Bernoulli data, significantly generalizes recent lower bounds on the noise needed to ensure differential privacy (Bun, Ullman, and Vadhan, STOC 2014; Steinke and Ullman, 2015), obviating the need for the attacker to control the exact distribution of the data.
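The setting in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes an inner-product test statistic (the target minus one reference sample, dotted with the distorted marginals), attributes encoded as ±1, per-coordinate noise bounded by a hypothetical `alpha` (which keeps the average L1 error at most `alpha`), and a threshold derived from Hoeffding's inequality rather than from the paper's analysis. All function and parameter names are illustrative.

```python
import math
import random


def sample_row(rng, p):
    # One individual: attribute j is +1 with probability (1 + p[j]) / 2, else -1,
    # so the population mean of attribute j is p[j].
    return [1 if rng.random() < (1 + pj) / 2 else -1 for pj in p]


def distorted_marginals(rng, case_group, alpha):
    # Column means of the case group, each shifted by noise in [-alpha, alpha]
    # and clipped back into [-1, 1]. This stands in for measurement error or
    # privacy noise; the noise model is an assumption of this sketch.
    n = len(case_group)
    d = len(case_group[0])
    out = []
    for j in range(d):
        mean_j = sum(row[j] for row in case_group) / n
        noisy = mean_j + rng.uniform(-alpha, alpha)
        out.append(max(-1.0, min(1.0, noisy)))
    return out


def trace_score(target, reference, q):
    # Assumed test statistic: <target - reference, q>.
    return sum((y - z) * qj for y, z, qj in zip(target, reference, q))


def threshold(d, delta):
    # For a target outside the case group, each term (y_j - z_j) * q_j has mean
    # zero and lies in [-2, 2], so Hoeffding's inequality bounds the chance of
    # exceeding this value by delta.
    return math.sqrt(8 * d * math.log(1 / delta))


def run_demo(seed=0, n=10, d=40000, alpha=0.1, delta=1e-3):
    rng = random.Random(seed)
    p = [rng.uniform(-1, 1) for _ in range(d)]            # unknown population means
    case_group = [sample_row(rng, p) for _ in range(n)]   # the private dataset
    q = distorted_marginals(rng, case_group, alpha)       # what gets released
    reference = sample_row(rng, p)                        # single reference sample
    outsider = sample_row(rng, p)                         # fresh draw, not in the case group
    tau = threshold(d, delta)
    score_in = trace_score(case_group[0], reference, q)
    score_out = trace_score(outsider, reference, q)
    return score_in, score_out, tau


if __name__ == "__main__":
    score_in, score_out, tau = run_demo()
    print("member score:", score_in, "declared IN:", score_in > tau)
    print("outsider score:", score_out, "declared IN:", score_out > tau)
```

In this toy model a member of the case group contributes roughly (1 - p_j^2)/n to each coordinate of the score, so the expected score grows like d/n while its fluctuations grow like the square root of d. That is why the sketch uses many more attributes than individuals.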

Original language: English (US)
Title of host publication: Proceedings - 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015
Publisher: IEEE Computer Society
Pages: 650-669
Number of pages: 20
ISBN (Electronic): 9781467381918
DOIs: https://doi.org/10.1109/FOCS.2015.46
State: Published - Dec 11 2015
Event: 56th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2015 - Berkeley, United States
Duration: Oct 17 2015 to Oct 20 2015

Publication series

Name: Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS
Volume: 2015-December
ISSN (Print): 0272-5428

Other

Other: 56th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2015
Country: United States
City: Berkeley
Period: 10/17/15 to 10/20/15

Fingerprint

Statistics
Measurement errors
Genes
Genetics
Genomics

All Science Journal Classification (ASJC) codes

  • Computer Science(all)

Cite this

Dwork, C., Smith, A. D., Steinke, T., Ullman, J., & Vadhan, S. (2015). Robust Traceability from Trace Amounts. In Proceedings - 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015 (pp. 650-669). [7354420] (Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS; Vol. 2015-December). IEEE Computer Society. https://doi.org/10.1109/FOCS.2015.46
Dwork, Cynthia ; Smith, Adam Davison ; Steinke, Thomas ; Ullman, Jonathan ; Vadhan, Salil. / Robust Traceability from Trace Amounts. Proceedings - 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015. IEEE Computer Society, 2015. pp. 650-669 (Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS).
@inproceedings{4d211d295eb84d6da06de60a20d3ea85,
title = "Robust Traceability from Trace Amounts",
abstract = "The privacy risks inherent in the release of a large number of summary statistics were illustrated by Homer et al. (PLoS Genetics, 2008), who considered the case of 1-way marginals of SNP allele frequencies obtained in a genome-wide association study: Given a large number of minor allele frequencies from a case group of individuals diagnosed with a particular disease, together with the genomic data of a single target individual and statistics from a sizable reference dataset independently drawn from the same population, an attacker can determine with high confidence whether or not the target is in the case group. In this work we describe and analyze a simple attack that succeeds even if the summary statistics are significantly distorted, whether due to measurement error or noise intentionally introduced to protect privacy. Our attack only requires that the vector of distorted summary statistics is close to the vector of true marginals in L1 norm. Moreover, the reference pool required by previous attacks can be replaced by a single sample drawn from the underlying population. The new attack, which is not specific to genomics and which handles Gaussian as well as Bernoulli data, significantly generalizes recent lower bounds on the noise needed to ensure differential privacy (Bun, Ullman, and Vadhan, STOC 2014; Steinke and Ullman, 2015), obviating the need for the attacker to control the exact distribution of the data.",
author = "Cynthia Dwork and Smith, {Adam Davison} and Thomas Steinke and Jonathan Ullman and Salil Vadhan",
year = "2015",
month = "12",
day = "11",
doi = "10.1109/FOCS.2015.46",
language = "English (US)",
series = "Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS",
publisher = "IEEE Computer Society",
pages = "650--669",
booktitle = "Proceedings - 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015",
address = "United States",

}

Dwork, C, Smith, AD, Steinke, T, Ullman, J & Vadhan, S 2015, Robust Traceability from Trace Amounts. in Proceedings - 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015., 7354420, Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS, vol. 2015-December, IEEE Computer Society, pp. 650-669, 56th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, United States, 10/17/15. https://doi.org/10.1109/FOCS.2015.46

Robust Traceability from Trace Amounts. / Dwork, Cynthia; Smith, Adam Davison; Steinke, Thomas; Ullman, Jonathan; Vadhan, Salil.

Proceedings - 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015. IEEE Computer Society, 2015. p. 650-669 7354420 (Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS; Vol. 2015-December).

TY - GEN

T1 - Robust Traceability from Trace Amounts

AU - Dwork, Cynthia

AU - Smith, Adam Davison

AU - Steinke, Thomas

AU - Ullman, Jonathan

AU - Vadhan, Salil

PY - 2015/12/11

Y1 - 2015/12/11

AB - The privacy risks inherent in the release of a large number of summary statistics were illustrated by Homer et al. (PLoS Genetics, 2008), who considered the case of 1-way marginals of SNP allele frequencies obtained in a genome-wide association study: Given a large number of minor allele frequencies from a case group of individuals diagnosed with a particular disease, together with the genomic data of a single target individual and statistics from a sizable reference dataset independently drawn from the same population, an attacker can determine with high confidence whether or not the target is in the case group. In this work we describe and analyze a simple attack that succeeds even if the summary statistics are significantly distorted, whether due to measurement error or noise intentionally introduced to protect privacy. Our attack only requires that the vector of distorted summary statistics is close to the vector of true marginals in L1 norm. Moreover, the reference pool required by previous attacks can be replaced by a single sample drawn from the underlying population. The new attack, which is not specific to genomics and which handles Gaussian as well as Bernoulli data, significantly generalizes recent lower bounds on the noise needed to ensure differential privacy (Bun, Ullman, and Vadhan, STOC 2014; Steinke and Ullman, 2015), obviating the need for the attacker to control the exact distribution of the data.

UR - http://www.scopus.com/inward/record.url?scp=84960325629&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84960325629&partnerID=8YFLogxK

U2 - 10.1109/FOCS.2015.46

DO - 10.1109/FOCS.2015.46

M3 - Conference contribution

AN - SCOPUS:84960325629

T3 - Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS

SP - 650

EP - 669

BT - Proceedings - 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015

PB - IEEE Computer Society

ER -

Dwork C, Smith AD, Steinke T, Ullman J, Vadhan S. Robust Traceability from Trace Amounts. In Proceedings - 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015. IEEE Computer Society. 2015. p. 650-669. 7354420. (Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS). https://doi.org/10.1109/FOCS.2015.46