Probabilistic Topic Models

Mark Steyvers, Tom Griffiths

Research output: Chapter in Book/Report/Conference proceeding › Chapter

989 Scopus citations

Abstract

Many chapters in this book illustrate that applying a statistical method such as latent semantic analysis (LSA; Landauer and Dumais, 1997; Landauer, Foltz, and Laham, 1998) to large databases can yield insight into human cognition. The LSA approach makes three claims: that semantic information can be derived from a word-document co-occurrence matrix; that dimensionality reduction is an essential part of this derivation; and that words and documents can be represented as points in Euclidean space. This chapter pursues an approach that is consistent with the first two of these claims, but differs in the third, describing a class of statistical models in which the semantic properties of words and documents are expressed in terms of probabilistic topics.
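The chapter itself is not reproduced here, but as a rough illustration of the pipeline the abstract describes, the sketch below builds a word-document co-occurrence matrix and fits a probabilistic topic model, in which each topic is a distribution over words and each document is a mixture of topics. The toy corpus, the choice of two topics, and the use of scikit-learn's LatentDirichletAllocation are illustrative assumptions, not the chapter's own implementation.

```python
# Minimal sketch (illustrative, not from the chapter): word-document counts -> topic model.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

documents = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stocks fell as markets reacted to the news",
    "investors bought shares after the earnings report",
]

# Word-document co-occurrence counts (documents x vocabulary).
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(documents)

# Probabilistic topic model: dimensionality reduction, but the reduced
# representation is a set of probability distributions rather than points
# in Euclidean space.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(counts)  # document-topic proportions

# Inspect the most probable words in each topic.
vocab = vectorizer.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top_words = [vocab[i] for i in topic.argsort()[-5:][::-1]]
    print(f"Topic {k}: {top_words}")
```

In this toy setting the two recovered topics tend to separate the pet-related vocabulary from the finance-related vocabulary, which is the kind of interpretable structure the probabilistic-topic representation is meant to expose.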

Original language: English (US)
Title of host publication: Handbook of Latent Semantic Analysis
Publisher: Taylor and Francis
Pages: 427-448
Number of pages: 22
ISBN (Electronic): 9781135603281
ISBN (Print): 9780203936399
DOIs
State: Published - Jan 1 2007
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • General Psychology
