Variance estimation, design effects, and sample size calculations for respondent-driven sampling

Research output: Contribution to journal › Article › peer-review

293 Scopus citations


Hidden populations, such as injection drug users and sex workers, are central to a number of public health problems. However, because of the nature of these groups, it is difficult to collect accurate information about them, and this difficulty complicates disease prevention efforts. A recently developed statistical approach called respondent-driven sampling improves our ability to study hidden populations by allowing researchers to make unbiased estimates of the prevalence of certain traits in these populations. Yet, not enough is known about the sample-to-sample variability of these prevalence estimates. In this paper, we present a bootstrap method for constructing confidence intervals around respondent-driven sampling estimates and demonstrate in simulations that it outperforms the naive method currently in use. We also use simulations and real data to estimate the design effects for respondent-driven sampling in a number of situations. We conclude with practical advice about the power calculations that are needed to determine the appropriate sample size for a study using respondent-driven sampling. In general, we recommend a sample size twice as large as would be needed under simple random sampling.
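
The closing recommendation can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes the standard normal-approximation formula for the sample size needed to estimate a proportion under simple random sampling, and then multiplies by a design effect of 2, the value recommended in the abstract. The function names `srs_sample_size` and `rds_sample_size` and the example prevalence and margin of error are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist


def srs_sample_size(p, margin, confidence=0.95):
    """Unrounded sample size for estimating a proportion p to within
    +/- margin under simple random sampling.

    Assumption: textbook normal-approximation formula
    n = z^2 * p * (1 - p) / margin^2 (not specific to the paper).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z ** 2 * p * (1 - p) / margin ** 2


def rds_sample_size(p, margin, design_effect=2.0, confidence=0.95):
    """Inflate the simple-random-sampling size by a design effect.

    The default of 2 follows the abstract's general recommendation;
    a given study may warrant a different value.
    """
    return math.ceil(design_effect * srs_sample_size(p, margin, confidence))


# Hypothetical planning scenario: expected prevalence of 30%,
# desired margin of error of 5 percentage points, 95% confidence.
n_srs = math.ceil(srs_sample_size(0.30, 0.05))
n_rds = rds_sample_size(0.30, 0.05)
print(n_srs, n_rds)  # 323 646
```

Rounding up is applied after multiplying by the design effect, so the result is the smallest whole number of respondents that meets the inflated target.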

Original language: English (US)
Pages (from-to): i98-i112
Journal: Journal of Urban Health
Issue number: 7 SUPPL.
State: Published - Nov 2006

All Science Journal Classification (ASJC) codes

  • Health (social science)
  • Public Health, Environmental and Occupational Health
  • Urban Studies


Keywords

  • Design effects
  • Hidden populations
  • Power analysis
  • Respondent-driven sampling
  • Sample size
  • Snowball sampling
  • Variance estimation


