TY - JOUR
T1 - Learning Latent Factors From Diversified Projections and Its Applications to Over-Estimated and Weak Factors
AU - Fan, Jianqing
AU - Liao, Yuan
N1 - Funding Information: Jianqing Fan’s research is supported by NSF grants DMS-1662139 and DMS-1712591. Publisher Copyright: © 2020 American Statistical Association.
PY - 2022
Y1 - 2022
AB - Estimations and applications of factor models often rely on the crucial condition that the number of latent factors is consistently estimated, which in turn also requires that factors be relatively strong, data are stationary and weakly serially dependent, and the sample size be fairly large, although in practical applications, one or several of these conditions may fail. In these cases, it is difficult to analyze the eigenvectors of the data matrix. To address this issue, we propose simple estimators of the latent factors using cross-sectional projections of the panel data, by weighted averages with predetermined weights. These weights are chosen to diversify away the idiosyncratic components, resulting in “diversified factors.” Because the projections are conducted cross-sectionally, they are robust to serial conditions, easy to analyze and work even for finite length of time series. We formally prove that this procedure is robust to over-estimating the number of factors, and illustrate it in several applications, including post-selection inference, big data forecasts, large covariance estimation, and factor specification tests. We also recommend several choices for the diversified weights. Supplementary materials for this article are available online.
KW - Factor-augmented regression
KW - Large dimensions
KW - Over-estimating the number of factors
KW - Principal components
KW - Random projections
UR - http://www.scopus.com/inward/record.url?scp=85096446792&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096446792&partnerID=8YFLogxK
U2 - 10.1080/01621459.2020.1831927
DO - 10.1080/01621459.2020.1831927
M3 - Article
AN - SCOPUS:85096446792
SN - 0162-1459
VL - 117
SP - 909
EP - 924
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 538
ER -