Statistical privacy is the art of releasing datasets that provide useful information about population trends without revealing private information about any individual. Recent high-profile attacks on datasets released by AOL and Netflix demonstrate the need for rigorous application-specific privacy definitions to guide the anonymization of data. The goal of this project is to develop modular components, called privacy axioms, that can be chained together to create customized privacy definitions and anonymized data for statistical privacy applications. Such modularity can enable data curators without extensive expertise in statistical privacy to release anonymized data while providing privacy guarantees that are more interpretable and reliable.
Intellectual merit: this project is designed to provide a unifying framework for statistical privacy that can deepen the understanding of privacy issues and guide the safe anonymization and release of sensitive data. In addition to theoretical developments, the research plan targets specific existing applications at Penn State and the U.S. Census Bureau.
Broader impact: the systematic approach to privacy pursued by this project can enable access to and analysis of anonymized data in domains where access to data is otherwise heavily restricted.
This project aims to build upon the investigator's prior experience with outreach programs such as the Summer Research Opportunities Program (SROP) by involving undergraduates in the proposed research. To prepare students for future work that requires analysis of anonymized data, this research is also being integrated into machine learning courses at Penn State.
For further information see the project web site at the URL:
- Effective start/end date: 2/1/11 → 12/31/17
- National Science Foundation: $437,501.00