Finding the traces of behavioral and cognitive processes in big data and naturally occurring datasets

Alexandra Paxton, Thomas L. Griffiths

Research output: Contribution to journal; Article; peer-review

38 Scopus citations


Today, people generate and store more data than ever before as they interact with both real and virtual environments. These digital traces of behavior and cognition offer cognitive scientists and psychologists an unprecedented opportunity to test theories outside the laboratory. Despite general excitement about big data and naturally occurring datasets among researchers, three “gaps” stand in the way of their wider adoption in theory-driven research: the imagination gap, the skills gap, and the culture gap. We outline an approach to bridging these three gaps while respecting our responsibilities to the public as participants in and consumers of the resulting research. To that end, we introduce Data on the Mind, a community-focused initiative aimed at meeting the unprecedented challenges and opportunities of theory-driven research with big data and naturally occurring datasets. We argue that big data and naturally occurring datasets are most powerfully used to supplement, not supplant, traditional experimental paradigms in order to understand human behavior and cognition, and we highlight emerging ethical issues related to the collection, sharing, and use of these powerful datasets.

Original language: English (US)
Pages (from-to): 1630-1638
Number of pages: 9
Journal: Behavior Research Methods
Issue number: 5
State: Published - Oct 1 2017
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Experimental and Cognitive Psychology
  • Developmental and Educational Psychology
  • Arts and Humanities (miscellaneous)
  • Psychology (miscellaneous)
  • General Psychology


Keywords

  • Big data
  • Data on the Mind
  • Naturally occurring datasets
  • Online experiments
  • Open science

