How many people do you know in prison? Using overdispersion in count data to estimate social structure in networks

Tian Zheng, Matthew J. Salganik, Andrew Gelman

Research output: Contribution to journal › Article › peer-review

120 Scopus citations


Networks - sets of objects connected by relationships - are important in a number of fields. The study of networks has long been central to sociology, where researchers have attempted to understand the causes and consequences of the structure of relationships in large groups of people. Using insight from previous network research, Killworth et al. and McCarty et al. have developed and evaluated a method for estimating the sizes of hard-to-count populations using network data collected from a simple random sample of Americans. In this article we show how, using a multilevel overdispersed Poisson regression model, these data also can be used to estimate aspects of social structure in the population. Our work goes beyond most previous research on networks by using variation, as well as average responses, as a source of information. We apply our method to the data of McCarty et al. and find that Americans vary greatly in their number of acquaintances. Further, Americans show great variation in propensity to form ties to people in some groups (e.g., males in prison, the homeless, and American Indians), but little variation for other groups (e.g., twins, people named Michael or Nicole). We also explore other features of these data and consider ways in which survey data can be used to estimate network structure.
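The role of overdispersion described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' multilevel model; it only shows why variation in counts carries information beyond the average. It assumes, for illustration, that overdispersion is parameterized as the ratio of variance to mean, and it draws counts from a gamma-Poisson mixture (one standard construction of the negative binomial distribution). The function names, the two scenarios, and all numeric values are hypothetical.

```python
import math
import random
from statistics import mean, variance


def poisson_draw(rate, rng):
    """Draw one Poisson count using Knuth's multiplication method.

    Adequate for the small rates used in this illustration.
    """
    if rate <= 0.0:
        return 0
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(avg, overdispersion, n, rng):
    """Simulate n responses to "how many X do you know?".

    Assumed parameterization: variance = overdispersion * avg.
    overdispersion == 1 gives plain Poisson counts; larger values are
    produced by letting each respondent's rate vary (gamma-Poisson mixture).
    """
    if overdispersion <= 1.0:
        return [poisson_draw(avg, rng) for _ in range(n)]
    # Gamma(shape, scale) rates have mean shape * scale, and the mixed
    # counts have variance avg * (1 + scale), so scale = overdispersion - 1.
    scale = overdispersion - 1.0
    shape = avg / scale
    return [poisson_draw(rng.gammavariate(shape, scale), rng) for _ in range(n)]


def dispersion_ratio(counts):
    """Sample variance divided by sample mean (about 1 for Poisson data)."""
    return variance(counts) / mean(counts)


rng = random.Random(0)
# Hypothetical group whose members are spread evenly across respondents.
even = simulate_counts(avg=1.0, overdispersion=1.0, n=20000, rng=rng)
# Hypothetical group concentrated among a minority of respondents.
clustered = simulate_counts(avg=1.0, overdispersion=8.0, n=20000, rng=rng)

print("even group:      mean %.2f, variance/mean %.2f"
      % (mean(even), dispersion_ratio(even)))
print("clustered group: mean %.2f, variance/mean %.2f"
      % (mean(clustered), dispersion_ratio(clustered)))
```

Both simulated groups have the same average number of acquaintances per respondent, so a method that looks only at averages cannot tell them apart. The variance-to-mean ratio separates them, which is the kind of signal the abstract refers to when it mentions using variation as a source of information.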

Original language: English (US)
Pages (from-to): 409-423
Number of pages: 15
Journal: Journal of the American Statistical Association
Issue number: 474
State: Published - Jun 2006

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty


Keywords

  • Negative binomial distribution
  • Overdispersion
  • Sampling
  • Social networks
  • Social structure

