Assessing respondent-driven sampling

Sharad Goel, Matthew J. Salganik

Research output: Contribution to journal › Article › peer-review

287 Scopus citations


Respondent-driven sampling (RDS) is a network-based technique for estimating traits in hard-to-reach populations, for example, the prevalence of HIV among drug injectors. In recent years RDS has been used in more than 120 studies in more than 20 countries and by leading public health organizations, including the Centers for Disease Control and Prevention in the United States. Despite the widespread use and growing popularity of RDS, there has been little empirical validation of the methodology. Here we investigate the performance of RDS by simulating sampling from 85 known network populations. Across a variety of traits we find that RDS is substantially less accurate than generally acknowledged and that reported RDS confidence intervals are misleadingly narrow. Moreover, because we model a best-case scenario in which the theoretical RDS sampling assumptions hold exactly, it is unlikely that RDS performs any better in practice than in our simulations. Notably, the poor performance of RDS is driven not by the bias but by the high variance of estimates, a possibility that had been largely overlooked in the RDS literature. Given the consistency of our results across networks and our generous sampling conditions, we conclude that RDS as currently practiced may not be suitable for key aspects of public health surveillance where it is now extensively applied.
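
The variance effect described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' code or data: the two-group synthetic network, the function names, and all parameter values are hypothetical. It idealizes RDS as a with-replacement random walk on the network, weights each respondent by inverse degree (a commonly used RDS estimator form), and compares the spread of the resulting estimates with that of simple random samples of the same size.

```python
import random
import statistics


def make_network(n, frac_trait, p_in, p_out, rng):
    """Hypothetical two-group network: the first frac_trait*n nodes have the trait.

    Ties form with probability p_in within a group and p_out across groups.
    A ring of ties is added so that every node has at least two neighbors.
    """
    trait = [1 if i < int(frac_trait * n) else 0 for i in range(n)]
    nbrs = [set() for _ in range(n)]
    for i in range(n):
        ring = (i + 1) % n
        nbrs[i].add(ring)
        nbrs[ring].add(i)
        for j in range(i + 1, n):
            p = p_in if trait[i] == trait[j] else p_out
            if rng.random() < p:
                nbrs[i].add(j)
                nbrs[j].add(i)
    return trait, [sorted(s) for s in nbrs]


def rds_estimate(trait, nbrs, sample_size, rng):
    """Idealized RDS: random walk from a random seed, inverse-degree weights."""
    node = rng.randrange(len(trait))
    num = den = 0.0
    for _ in range(sample_size):
        node = rng.choice(nbrs[node])  # one recruit per respondent
        weight = 1.0 / len(nbrs[node])
        num += weight * trait[node]
        den += weight
    return num / den


def srs_estimate(trait, sample_size, rng):
    """Simple random sample with replacement, for comparison."""
    picks = rng.choices(range(len(trait)), k=sample_size)
    return sum(trait[i] for i in picks) / sample_size


def compare(seed=0, n=300, sample_size=100, replicates=200):
    """Return true prevalence and the mean and spread of both estimators."""
    rng = random.Random(seed)
    trait, nbrs = make_network(n, 0.3, 0.1, 0.005, rng)
    rds = [rds_estimate(trait, nbrs, sample_size, rng) for _ in range(replicates)]
    srs = [srs_estimate(trait, sample_size, rng) for _ in range(replicates)]
    return {
        "truth": sum(trait) / n,
        "rds_mean": statistics.mean(rds),
        "rds_sd": statistics.pstdev(rds),
        "srs_mean": statistics.mean(srs),
        "srs_sd": statistics.pstdev(srs),
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.3f}")
```

In this toy setting the strong within-group clustering keeps the walk inside one group for long stretches, so the RDS estimates vary far more between replicates than the simple random sample estimates do. The numbers it produces say nothing about the 85 networks studied in the paper.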

Original language: English (US)
Pages (from-to): 6743-6747
Number of pages: 5
Journal: Proceedings of the National Academy of Sciences of the United States of America
Issue number: 15
State: Published - Apr 13 2010

All Science Journal Classification (ASJC) codes

  • General

Keywords


  • Disease surveillance
  • Snowball sampling
  • Social networks

