Designing statistical privacy for your data

Ashwin Machanavajjhala, Daniel Kifer

Research output: Contribution to journal › Article › peer-review

17 Scopus citations


Preparing data for public release requires significant attention to fundamental principles of privacy. If the data curator chooses a privacy definition wisely, the sensitive information will be protected. Algorithms that satisfy a given privacy definition are called privacy mechanisms. The curator first chooses a privacy definition, then a privacy mechanism that satisfies it. The curator runs the mechanism on the sensitive data and grants external users access only to the mechanism's output, the sanitized data. The curator must also consider the combined effect on privacy when mechanisms that do not satisfy the same privacy definition are run on the data. One difficulty in designing privacy definitions is accounting for public knowledge of constraints the input database must satisfy. Constraints may correlate the values of different records, arising, for example, from functional dependencies across attributes or from prior exact releases of histograms. Correlations arising from constraints provide inference channels that attackers could use to learn sensitive information.
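The abstract describes the curator's workflow in general terms. As a concrete, hypothetical illustration (not a construction taken from this article), the sketch below shows one well-known privacy mechanism, the Laplace mechanism, which satisfies the privacy definition of epsilon-differential privacy for numeric queries of bounded sensitivity; the variable names and example values are assumptions for illustration.

```python
import numpy as np

def laplace_mechanism(true_value, epsilon, sensitivity=1.0, rng=None):
    """Add Laplace(sensitivity / epsilon) noise to a numeric query answer.

    For a query whose global sensitivity is `sensitivity`, this mechanism
    satisfies epsilon-differential privacy, one common privacy definition.
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_value + noise

# The curator runs the mechanism on the sensitive data and releases
# only the sanitized output, never the true value.
sensitive_count = 120  # hypothetical: e.g., number of records matching a query
sanitized = laplace_mechanism(sensitive_count, epsilon=0.5)
```

Smaller values of epsilon mean more noise and stronger privacy; the curator's choice of definition and its parameters governs this trade-off.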

Original language: English (US)
Pages (from-to): 58-67
Number of pages: 10
Journal: Communications of the ACM
Issue number: 3
State: Published - Mar 1, 2015

All Science Journal Classification (ASJC) codes

  • Computer Science(all)