75 Scopus citations


Most papers on high-dimensional statistics assume that none of the regressors are correlated with the regression error, that is, that they are exogenous. Yet endogeneity can arise incidentally from a large pool of regressors in a high-dimensional regression. This makes the penalized least-squares method inconsistent and can lead to false scientific discoveries. A necessary condition for model selection consistency of a general class of penalized regression methods is given, which allows us to prove the inconsistency claim formally. To cope with incidental endogeneity, we construct a novel penalized focused generalized method of moments (FGMM) criterion function. The FGMM achieves dimension reduction while applying instrumental variable methods. We show that it possesses the oracle property even in the presence of endogenous predictors, and that the solution is also near the global minimum under the over-identification assumption. Finally, we show how semiparametric efficiency of estimation can be achieved via a two-step approach.
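The inconsistency of least squares under endogeneity, and the way an instrument repairs it, can be illustrated with a minimal one-regressor simulation. This is a sketch under stated assumptions, not the FGMM procedure of the paper: the data-generating process, the coefficient values, and the names `simulate` and `cov` are all hypothetical choices made for illustration. The regressor `x` shares a component `u` with the error `e`, so it is endogenous, while the instrument `z` is correlated with `x` but not with `e`.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists (divides by n)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / n


def simulate(n=20000, beta=2.0, seed=0):
    """Return (ols_slope, iv_slope) for a toy endogenous regression.

    Assumed (hypothetical) design:
        x = z + u            regressor, driven by instrument z and shock u
        e = 0.8*u + 0.6*v    error, correlated with x through u
        y = beta*x + e
    """
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]

    x = [zi + ui for zi, ui in zip(z, u)]
    e = [0.8 * ui + 0.6 * vi for ui, vi in zip(u, v)]
    y = [beta * xi + ei for xi, ei in zip(x, e)]

    # Least squares slope: converges to beta + cov(x, e) / var(x), not beta.
    ols_slope = cov(x, y) / cov(x, x)
    # Instrumental variable slope: uses only the variation in x explained by z.
    iv_slope = cov(z, y) / cov(z, x)
    return ols_slope, iv_slope


if __name__ == "__main__":
    ols_slope, iv_slope = simulate()
    print("least squares slope:", round(ols_slope, 3))
    print("instrumental variable slope:", round(iv_slope, 3))
```

In this assumed design the population least squares slope is beta + 0.8/2 = beta + 0.4, so the bias does not shrink as the sample grows, whereas the instrumental variable slope targets beta itself. The high-dimensional setting of the paper adds variable selection on top of this basic mechanism.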

Original language: English (US)
Pages (from-to): 872-917
Number of pages: 46
Journal: Annals of Statistics
Issue number: 3
State: Published - Jun 2014

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty


Keywords

  • Conditional moment restriction
  • Endogenous variables
  • Estimating equation
  • Focused GMM
  • Global minimization
  • Oracle property
  • Over identification
  • Semiparametric efficiency
  • Sparsity recovery

