Designing statistical privacy for your data

Ashwin Machanavajjhala, Daniel Kifer

Research output: Contribution to journal › Article

11 Citations (Scopus)

Abstract

Preparing data for public release requires significant attention to fundamental principles of privacy. If the data curator chooses a privacy definition wisely, the sensitive information will be protected. Algorithms that satisfy such a definition are called privacy mechanisms. The curator first chooses a privacy definition, then a privacy mechanism that satisfies it. The curator runs the privacy mechanism on the sensitive data and grants external users access to its output, the sanitized data. The curator must also consider the effect on privacy when mechanisms that do not satisfy the same privacy definition are used. One difficulty in designing privacy definitions is accounting for public knowledge of constraints the input database must satisfy. Constraints may correlate the values of different records, arising from functional dependencies across attributes or from prior exact releases of histograms. The correlations arising from such constraints provide inference channels that attackers could use to learn sensitive information.
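The workflow the abstract describes — choose a privacy definition, then a mechanism satisfying it, then release only the mechanism's output — can be sketched with the Laplace mechanism for differential privacy, a standard textbook example. This sketch is not taken from the article itself; the function name and parameters are illustrative.

```python
import math
import random

def laplace_mechanism(true_answer: float, sensitivity: float,
                      epsilon: float, rng: random.Random) -> float:
    """Illustrative epsilon-differentially-private release of a numeric query.

    Adds Laplace noise with scale b = sensitivity / epsilon, the calibration
    that makes the output satisfy epsilon-differential privacy.
    """
    b = sensitivity / epsilon
    # Inverse-CDF sampling of Laplace(0, b): u uniform in (-0.5, 0.5)
    u = rng.random() - 0.5
    noise = -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_answer + noise

# The curator runs the mechanism and releases only the sanitized output.
# Example: a count query has sensitivity 1 (one record changes it by at most 1).
rng = random.Random(0)
noisy_count = laplace_mechanism(true_answer=1234.0, sensitivity=1.0,
                                epsilon=0.5, rng=rng)
```

Smaller epsilon means a larger noise scale and stronger protection; the definition (here, differential privacy) is chosen first, and the noise calibration follows from it.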

Original language: English (US)
Pages (from-to): 58-67
Number of pages: 10
Journal: Communications of the ACM
Volume: 58
Issue number: 3
DOI: 10.1145/2660766
State: Published - Mar 1 2015

All Science Journal Classification (ASJC) codes

  • Computer Science (all)

Cite this

Machanavajjhala, Ashwin; Kifer, Daniel. Designing statistical privacy for your data. In: Communications of the ACM. 2015; Vol. 58, No. 3, pp. 58-67.
@article{4e509e68126b421ab9855e4d568acf85,
  title     = "Designing statistical privacy for your data",
  author    = "Ashwin Machanavajjhala and Daniel Kifer",
  journal   = "Communications of the ACM",
  year      = "2015",
  month     = "3",
  day       = "1",
  volume    = "58",
  number    = "3",
  pages     = "58--67",
  doi       = "10.1145/2660766",
  issn      = "0001-0782",
  publisher = "Association for Computing Machinery (ACM)",
  language  = "English (US)",
}
