TY - JOUR
T1 - Network Sampling with Memory
T2 - A Proposal for More Efficient Sampling from Social Networks
AU - Mouw, Ted
AU - Verdery, Ashton M.
N1 - Funding Information: The authors disclosed receipt of the following financial support for the research and/or authorship of this article: The authors are grateful for the support provided by the Carolina Population Center of the University of North Carolina, Chapel Hill to both Mouw and Verdery that aided this research. This paper uses data from Add Health, a program project designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris, and funded by a grant P01-HD31921 from the National Institute of Child Health and Human Development, with cooperative funding from 17 other agencies.
PY - 2012/8
Y1 - 2012/8
N2 - Techniques for sampling from networks have grown into an important area of research across several fields. For sociologists, the possibility of sampling from a network is appealing for two reasons: (1) A network sample can yield substantively interesting data about network structures and social interactions, and (2) it is useful in situations in which study populations are difficult or impossible to survey with traditional sampling approaches because of the lack of a sampling frame. Despite its appeal, methodological concerns about the precision and accuracy of network-based sampling methods remain. In particular, recent research has shown that sampling from a network using a random walk–based approach such as respondent-driven sampling (RDS) can result in a high design effect (DE): the ratio of the sampling variance to the sampling variance of simple random sampling (SRS). A high DE means that more cases must be collected to achieve the same level of precision as SRS. In this article, we propose an alternative strategy, network sampling with memory (NSM), which collects network data from respondents to reduce DEs and, correspondingly, the number of interviews needed to achieve a given level of statistical power. NSM combines a “list” mode, in which all individuals on the revealed network list are sampled with the same cumulative probability, with a “search” mode, which gives priority to bridge nodes connecting the current sample to unexplored parts of the network. We test the relative efficiency of NSM compared with RDS and SRS on 162 school and university networks from the National Longitudinal Study of Adolescent Health and Facebook that range in size from 110 to 16,278 nodes. The results show that the average DE for NSM on these 162 networks is 1.16, which is very close to the efficiency of a simple random sample (DE = 1) and 98.5 percent lower than the average DE we observed for RDS.
UR - http://www.scopus.com/inward/record.url?scp=84993791807&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84993791807&partnerID=8YFLogxK
U2 - 10.1177/0081175012461248
DO - 10.1177/0081175012461248
M3 - Article
AN - SCOPUS:84993791807
VL - 42
SP - 206
EP - 256
JO - Sociological Methodology
JF - Sociological Methodology
SN - 0081-1750
IS - 1
ER -