Learning Latent Factors From Diversified Projections and Its Applications to Over-Estimated and Weak Factors

Jianqing Fan, Yuan Liao

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


Estimation and application of factor models often rely on the crucial condition that the number of latent factors is consistently estimated, which in turn requires that the factors be relatively strong, the data be stationary and weakly serially dependent, and the sample size be fairly large. In practical applications, one or several of these conditions may fail, and in these cases it is difficult to analyze the eigenvectors of the data matrix. To address this issue, we propose simple estimators of the latent factors using cross-sectional projections of the panel data, by weighted averages with predetermined weights. These weights are chosen to diversify away the idiosyncratic components, resulting in “diversified factors.” Because the projections are conducted cross-sectionally, they are robust to serial conditions, easy to analyze, and work even for time series of finite length. We formally prove that this procedure is robust to over-estimating the number of factors, and illustrate it in several applications, including post-selection inference, big data forecasts, large covariance estimation, and factor specification tests. We also recommend several choices for the diversified weights. Supplementary materials for this article are available online.
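
The idea of estimating factors by cross-sectional weighted averages can be illustrated with a minimal simulation sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the function names `diversified_factors` and `span_r2`, the simulated loading structure, and the ±1 sign-pattern weights (which are not claimed to be among the authors' recommended choices). The working number of factors `R = 4` deliberately exceeds the true number `r = 2`, to mimic over-estimation.

```python
import numpy as np

def diversified_factors(X, W):
    """Cross-sectional weighted averages of an (N, T) panel X
    using predetermined (N, R) weights W. Returns an (R, T) array."""
    N = X.shape[0]
    return W.T @ X / N

def span_r2(F_true, F_hat):
    """Uncentered R^2 from regressing each true factor on the estimated ones.
    Values near 1 mean the estimates span the true factor space."""
    A, B = F_hat.T, F_true.T                      # (T, R) and (T, r)
    coef, *_ = np.linalg.lstsq(A, B, rcond=None)  # least-squares fit
    resid = B - A @ coef
    return 1.0 - (resid**2).sum(axis=0) / (B**2).sum(axis=0)

rng = np.random.default_rng(0)
N, T, r, R = 400, 50, 2, 4   # r true factors, R >= r working factors (assumed sizes)

# Hypothetical loadings: factor 1 loads on every unit, factor 2 loads with
# opposite signs on the two halves of the cross-section, plus noise.
idx = np.arange(N)
half = np.where(idx < N // 2, 1.0, -1.0)
Lambda = np.column_stack([np.ones(N), half]) + 0.5 * rng.normal(size=(N, r))

F = rng.normal(size=(r, T))   # latent factors
U = rng.normal(size=(N, T))   # idiosyncratic components
X = Lambda @ F + U            # observed panel

# Predetermined weights: fixed sign patterns chosen without looking at U,
# so averaging over N units diversifies the idiosyncratic noise away.
quarter = np.where((idx // (N // 4)) % 2 == 0, 1.0, -1.0)
alternating = np.where(idx % 2 == 0, 1.0, -1.0)
W = np.column_stack([np.ones(N), half, quarter, alternating])

F_hat = diversified_factors(X, W)
r2 = span_r2(F, F_hat)
print("R^2 of true factors on diversified factors:", r2)
```

In this sketch the weights are informative only because the first two columns of `W` line up with the simulated loadings; weights unrelated to the loadings would average the factor signal away together with the noise. No eigen-decomposition of the data matrix is involved, and the two extra columns of `W` contribute only averaged noise.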

Original language: English (US)
Pages (from-to): 909-924
Number of pages: 16
Journal: Journal of the American Statistical Association
Issue number: 538
State: Published - 2022
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty


Keywords

  • Factor-augmented regression
  • Large dimensions
  • Over-estimating the number of factors
  • Principal components
  • Random projections

