Abstract
Estimation and application of factor models often rely on the crucial condition that the number of latent factors is consistently estimated. This in turn requires that the factors be relatively strong, the data be stationary and weakly serially dependent, and the sample size be fairly large. In practical applications, one or several of these conditions may fail, and in such cases it is difficult to analyze the eigenvectors of the data matrix. To address this issue, we propose simple estimators of the latent factors based on cross-sectional projections of the panel data, computed as weighted averages with predetermined weights. These weights are chosen to diversify away the idiosyncratic components, resulting in “diversified factors.” Because the projections are conducted cross-sectionally, they are robust to serial conditions, easy to analyze, and work even when the time series has finite length. We formally prove that this procedure is robust to over-estimating the number of factors, and illustrate it in several applications, including post-selection inference, big data forecasts, large covariance estimation, and factor specification tests. We also recommend several choices for the diversified weights. Supplementary materials for this article are available online.
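The cross-sectional projection described in the abstract can be illustrated with a minimal sketch, assuming the diversified factors take the form of the weighted averages (1/N) W'X for an N x T panel X and an N x R matrix W of predetermined weights. The function names `diversified_factors` and `square_wave_weights`, the square-wave weight design, and the simulated panel are all hypothetical choices made for this illustration and are not taken from the article. The example uses R = 4 working factors when the simulated panel has only 2 true factors, to mimic over-estimating the number of factors.

```python
import numpy as np

def square_wave_weights(N, R):
    """Hypothetical predetermined weights: column k is a +/-1 square wave
    with 2**k equal-length blocks (column 0 is all ones)."""
    idx = np.arange(N)
    cols = [np.where((idx * 2**k // N) % 2 == 0, 1.0, -1.0) for k in range(R)]
    return np.column_stack(cols)

def diversified_factors(X, W):
    """Cross-sectional weighted averages of the panel: returns the
    R x T matrix (1/N) W'X. No eigen-decomposition is involved."""
    N = X.shape[0]
    return W.T @ X / N

# Simulated panel (an assumption for illustration): 2 true factors,
# loadings that are not orthogonal to the first two weight columns.
rng = np.random.default_rng(0)
N, T, r, R = 400, 50, 2, 4
idx = np.arange(N)
Lambda = np.column_stack([np.ones(N), np.where(idx < N // 2, 1.0, -1.0)])
Lambda = Lambda + 0.3 * rng.standard_normal((N, r))
F = rng.standard_normal((r, T))   # latent factors, r x T
U = rng.standard_normal((N, T))   # idiosyncratic components
X = Lambda @ F + U

W = square_wave_weights(N, R)     # fixed before looking at X
F_hat = diversified_factors(X, W) # R x T, with R > r

# Check that the true factors lie (approximately) in the row space of F_hat
# by regressing each true factor on the R diversified factors.
coef, *_ = np.linalg.lstsq(F_hat.T, F.T, rcond=None)
resid = F.T - F_hat.T @ coef
r2 = 1.0 - (resid**2).sum(axis=0) / (F.T**2).sum(axis=0)
print("R^2 of each true factor on the diversified factors:", r2)
```

Averaging over the cross-section shrinks the idiosyncratic part of each weighted average at roughly the rate 1/sqrt(N), while the factor part survives only if the weights are not orthogonal to the loadings. This is why the sketch builds loadings that correlate with the first two weight columns; the remaining two columns mostly pick up noise, which corresponds to the over-estimated case.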
Original language | English (US) |
---|---|
Pages (from-to) | 909-924 |
Number of pages | 16 |
Journal | Journal of the American Statistical Association |
Volume | 117 |
Issue number | 538 |
State | Published - 2022 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty
Keywords
- Factor-augmented regression
- Large dimensions
- Over-estimating the number of factors
- Principal components
- Random projections