TY - JOUR
T1 - Do machine learning methods outperform traditional statistical models in crime prediction? A comparison between logistic regression and neural networks
AU - Na, Chongmin
AU - Oh, Gyeongseok
AU - Song, Juyoung
AU - Park, Hyoungah
N1 - Publisher Copyright: © 2021, Seoul National University - Graduate School of Public Administration. All rights reserved.
PY - 2021
Y1 - 2021
AB - Although machine learning (ML) methods have recently gained popularity in both academia and industry as alternative risk assessment tools for efficient decision-making, the existing literature reports inconsistent findings on their competitiveness and utility in predicting various outcomes. Drawing on a sample of the general youth population in the U.S., we compared the predictive accuracy of logistic regression (LR) and neural networks (NNs), the most widely applied approaches in conventional statistics and contemporary ML, respectively, using many theoretically relevant predictors of future arrest. Even after fully implementing the rigorous ML protocols for model tuning and the up-sampling and down-sampling procedures recommended in recent literature to optimize learning algorithms, NNs did not yield substantially better performance than LR when applied to a conventional dataset with a relatively small sample size and a limited number of predictors. Nonetheless, we encourage more rigorous, comprehensive, and diverse evaluation research to fully understand the predictive potential of ML and the conditions under which modern ML methods can outperform conventional parametric statistical models.
UR - http://www.scopus.com/inward/record.url?scp=85112186490&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85112186490&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:85112186490
VL - 36
SP - 1
EP - 13
JO - Korean Journal of Policy Studies
JF - Korean Journal of Policy Studies
SN - 1225-5017
IS - 1
ER -