TY - JOUR
T1 - Privacy preserving based logistic regression on big data
AU - Fan, Yongkai
AU - Bai, Jianrong
AU - Lei, Xia
AU - Zhang, Yuqing
AU - Zhang, Bin
AU - Li, Kuan Ching
AU - Tan, Gang
N1 - Funding Information:
This work was partially supported by the National Key R&D Program of China (No. 2018YFB0803700), CERNET Innovation Project (NGII20180406), and Fundamental Research Funds for the Central Universities.
Funding Information:
Gang Tan received his B.E. in Computer Science from Tsinghua University in 1999 and his Ph.D. in Computer Science from Princeton University in 2005. He is an Associate Professor at Penn State University, University Park, USA. He received an NSF CAREER Award and the James F. Will Career Development Professorship. He leads the Security of Software (SOS) lab at Penn State. He is interested in methodologies that help create reliable and secure software systems.
Publisher Copyright:
© 2020 Elsevier Ltd
PY - 2020/12/1
Y1 - 2020/12/1
N2 - Cloud computing offers strong computing power and large storage capacity. Combined with cloud computing, machine learning algorithms make the processing of large-scale data practical. Logistic regression is a widely used machine learning classification algorithm that can be implemented in the cloud. However, data privacy cannot be guaranteed in big data processing, because the training data may be leaked. To prevent privacy leakage when logistic regression runs in the cloud and to improve the efficiency of processing training data, this paper proposes a Privacy Preserving Logistic Regression Algorithm (PPLRA). Homomorphic encryption is used to encrypt private data when they are uploaded for training. Moreover, the Sigmoid function in logistic regression is approximated using Taylor's theorem so that it can be computed securely under homomorphic encryption. Experimental results show that PPLRA is effective at preserving data privacy and processes data more efficiently. A comparison with the Non-Privacy Preserving Logistic Regression Algorithm (NPPLRA) shows that computational efficiency is improved by about 1.2 times.
AB - Cloud computing offers strong computing power and large storage capacity. Combined with cloud computing, machine learning algorithms make the processing of large-scale data practical. Logistic regression is a widely used machine learning classification algorithm that can be implemented in the cloud. However, data privacy cannot be guaranteed in big data processing, because the training data may be leaked. To prevent privacy leakage when logistic regression runs in the cloud and to improve the efficiency of processing training data, this paper proposes a Privacy Preserving Logistic Regression Algorithm (PPLRA). Homomorphic encryption is used to encrypt private data when they are uploaded for training. Moreover, the Sigmoid function in logistic regression is approximated using Taylor's theorem so that it can be computed securely under homomorphic encryption. Experimental results show that PPLRA is effective at preserving data privacy and processes data more efficiently. A comparison with the Non-Privacy Preserving Logistic Regression Algorithm (NPPLRA) shows that computational efficiency is improved by about 1.2 times.
UR - http://www.scopus.com/inward/record.url?scp=85091038508&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85091038508&partnerID=8YFLogxK
U2 - 10.1016/j.jnca.2020.102769
DO - 10.1016/j.jnca.2020.102769
M3 - Article
AN - SCOPUS:85091038508
SN - 1084-8045
VL - 171
JO - Journal of Network and Computer Applications
JF - Journal of Network and Computer Applications
M1 - 102769
ER -