Identifying Homeless Youth At-Risk of Substance Use Disorder: Data-Driven Insights for Policymakers

Maryam Tabar, Heesoo Park, Stephanie Winkler, Dongwon Lee, Anamika Barman-Adhikari, Amulya Yadav

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Substance Use Disorder (SUD) is a devastating disease that leads to significant mental and behavioral impairments. Its negative effects are more severe among homeless youth than among their stably housed counterparts because of their high-risk behaviors. To assist policymakers in devising effective and accurate long-term strategies to mitigate SUD, it is necessary to critically analyze environmental, psychological, and other factors associated with SUD among homeless youth. Unfortunately, there is no definitive data-driven study of the factors associated with SUD among homeless youth. The few prior studies (i) do not analyze how the factors associated with SUD vary across geographic regions; and (ii) consider only a few contributing factors to SUD in relatively small samples. This work aims to fill this gap by making the following three contributions: (i) we use a real-world dataset collected from ∼1,400 homeless youth (across six American states) to build accurate Machine Learning (ML) models for predicting the susceptibility of homeless youth to SUD; (ii) we find a representative set of factors associated with SUD among this population by analyzing feature importance values associated with our ML models; and (iii) we investigate the effect of geographical heterogeneity on the factors associated with SUD. Our results show that adaptively boosted decision trees achieve the best predictive performance among the several algorithms evaluated on the SUD prediction task, with an Area Under the ROC Curve of 0.85. We also find that both Post-Traumatic Stress Disorder (PTSD) and depression are very strongly associated with SUD among homeless youth because of this population's propensity to self-medicate to alleviate stress. This work was done in collaboration with social work scientists, who are currently evaluating the results for potential future deployment.
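
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an Area Under the ROC Curve can be computed from binary labels and model scores. It is not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical and serve only as illustration. The sketch uses the pairwise-ranking interpretation of AUC, the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy, made-up values for illustration only (not data from the study).
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.7]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Under this reading, a value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of positives above negatives.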

Original language: English (US)
Title of host publication: KDD 2020 - Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 3092-3100
Number of pages: 9
ISBN (Electronic): 9781450379984
State: Published - Aug 23 2020
Event: 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2020 - Virtual, Online, United States
Duration: Aug 23 2020 → Aug 27 2020

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Conference

Conference: 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2020
Country: United States
City: Virtual, Online
Period: 8/23/20 → 8/27/20

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
