Estimating the size of hard-to-reach populations is an important problem in many fields. The network scale-up method (NSUM) is a relatively new approach for estimating the size of these populations by asking respondents the question, “How many X’s do you know?”, where X is the population of interest (e.g., “How many female sex workers do you know?”). The answers to these questions form aggregated relational data (ARD). The NSUM has been used to estimate the size of a variety of subpopulations, including female sex workers, drug users, and even children who have been hospitalized for choking. Within the network scale-up methodology, there are many estimators for the size of the hidden population, including direct estimators, maximum likelihood estimators, and Bayesian estimators. In this article, we first provide an in-depth analysis of the properties of ARD and the techniques used to collect the data. Then, we comprehensively review the different estimation methods in terms of the assumptions behind each model, the relationships between the estimators, and the practical considerations of implementing the methods. We apply many of the models discussed in the review to one canonical dataset and compare their performance and unique features; these comparisons are presented in the supplementary materials. Finally, we provide a summary of the dominant methods and an extensive list of applications, and discuss open problems and potential research directions in this area.
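To make the idea of scaling up ARD concrete, the following is a minimal sketch of the basic scale-up estimator, one of the simplest direct estimators in the NSUM literature. The abstract itself does not state any formula, so everything here is an illustrative assumption: the estimator takes the form N_H ≈ N · Σ y_i / Σ d_i, where y_i is the number of hidden-population members respondent i reports knowing, d_i is that respondent's personal network size (degree), and N is the total population size. Degrees are assumed to be estimated from "How many X's do you know?" answers for groups of known size. The function names and the toy numbers are hypothetical and are not taken from the article or its supplementary materials.

```python
def estimate_degrees(known_counts, known_sizes, total_population):
    """Estimate each respondent's degree from ARD on known-size groups.

    known_counts[i][k] is how many members of known group k respondent i
    reports knowing; known_sizes[k] is the true size of group k.
    """
    total_known = sum(known_sizes)
    if total_known <= 0:
        raise ValueError("known group sizes must sum to a positive number")
    return [total_population * sum(row) / total_known for row in known_counts]


def scale_up_estimate(hidden_counts, degrees, total_population):
    """Basic scale-up estimate of the hidden population size.

    hidden_counts[i] is how many hidden-population members respondent i
    reports knowing; degrees[i] is that respondent's (estimated) degree.
    """
    total_degree = sum(degrees)
    if total_degree <= 0:
        raise ValueError("degrees must sum to a positive number")
    return total_population * sum(hidden_counts) / total_degree


# Toy data (invented for illustration only).
total_population = 1_000_000
known_sizes = [10_000, 20_000]            # sizes of two known groups
known_counts = [[1, 2], [2, 4], [0, 3]]   # three respondents' ARD answers
hidden_counts = [1, 0, 1]                 # answers about the hidden group

degrees = estimate_degrees(known_counts, known_sizes, total_population)
estimate = scale_up_estimate(hidden_counts, degrees, total_population)
print(degrees, estimate)
```

In this toy example the three respondents are estimated to know 100, 200, and 100 people, and together they report knowing 2 hidden-population members, so the sketch scales 2/400 up to the full population. The maximum likelihood and Bayesian estimators mentioned in the abstract relax the assumptions behind this simple ratio and are not represented here.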
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty