TY - GEN
T1 - Private empirical risk minimization: Efficient algorithms and tight error bounds
T2 - 55th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2014
AU - Bassily, Raef
AU - Smith, Adam
AU - Thakurta, Abhradeep
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2014/12/7
Y1 - 2014/12/7
AB - Convex empirical risk minimization is a basic tool in machine learning and statistics. We provide new algorithms and matching lower bounds for differentially private convex empirical risk minimization assuming only that each data point's contribution to the loss function is Lipschitz and that the domain of optimization is bounded. We provide a separate set of algorithms and matching lower bounds for the setting in which the loss functions are known to also be strongly convex. Our algorithms run in polynomial time, and in some cases even match the optimal non-private running time (as measured by oracle complexity). We give separate algorithms (and lower bounds) for (ε, 0)- and (ε, δ)-differential privacy; perhaps surprisingly, the techniques used for designing optimal algorithms in the two cases are completely different. Our lower bounds apply even to very simple, smooth function families, such as linear and quadratic functions. This implies that algorithms from previous work can be used to obtain optimal error rates, under the additional assumption that the contribution of each data point to the loss function is smooth. We show that simple approaches to smoothing arbitrary loss functions (in order to apply previous techniques) do not yield optimal error rates. In particular, optimal algorithms were not previously known for problems such as training support vector machines and the high-dimensional median.
UR - http://www.scopus.com/inward/record.url?scp=84920025979&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84920025979&partnerID=8YFLogxK
U2 - 10.1109/FOCS.2014.56
DO - 10.1109/FOCS.2014.56
M3 - Conference contribution
AN - SCOPUS:84920025979
T3 - Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS
SP - 464
EP - 473
BT - Proceedings - Annual IEEE Symposium on Foundations of Computer Science, FOCS
PB - IEEE Computer Society
Y2 - 18 October 2014 through 21 October 2014
ER -