## Abstract

Hypothesis testing in the linear regression model is a fundamental statistical problem. We consider linear regression in the high-dimensional regime where the number of parameters exceeds the number of samples (p > n). To make informative inference, we assume that the model is approximately sparse, i.e. the effect of covariates on the response can be well approximated by conditioning on a relatively small number of covariates whose identities are unknown. We develop a framework for testing very general hypotheses regarding the model parameters. Our framework encompasses testing whether the parameter lies in a convex cone, testing the signal strength, and testing arbitrary functionals of the parameter. We show that the proposed procedure controls the type I error, and we also analyse its power. Our numerical experiments confirm our theoretical findings and demonstrate that we control the false positive rate (type I error) near the nominal level and have high power. By the duality between hypothesis testing and confidence intervals, the proposed framework can be used to obtain valid confidence intervals for various functionals of the model parameters. For linear functionals, the length of the confidence intervals is shown to be minimax rate optimal.
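The setting the abstract describes — linear regression with p > n and an approximately sparse parameter — is typically estimated with an ℓ1-penalised (lasso) fit as a first step. The sketch below is an illustrative assumption, not the paper's actual testing procedure: it generates a p > n design with a sparse truth, fits a lasso by coordinate descent in plain NumPy, and recovers the sparse support. The penalty scale `lam` follows the usual sigma * sqrt(2 n log p) heuristic.

```python
import numpy as np

def lasso_cd(X, y, lam, n_sweeps=200):
    """Lasso via cyclic coordinate descent:
    minimise 0.5 * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)   # per-column squared norms
    r = y - X @ beta                # running residual
    for _ in range(n_sweeps):
        for j in range(p):
            r += X[:, j] * beta[j]          # remove coordinate j's contribution
            rho = X[:, j] @ r               # partial correlation with residual
            # soft-thresholding update for coordinate j
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r -= X[:, j] * beta[j]          # restore residual
    return beta

# Illustrative p > n problem with a 5-sparse truth (all values assumed).
rng = np.random.default_rng(0)
n, p, s = 100, 300, 5
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:s] = 3.0
y = X @ beta_true + rng.standard_normal(n)

lam = np.sqrt(2 * n * np.log(p))  # standard noise-level penalty scale
beta_hat = lasso_cd(X, y, lam)
```

Even though p = 300 exceeds n = 100, the ℓ1 penalty makes the problem well posed: the five largest entries of `beta_hat` land on the true support, at the cost of the familiar shrinkage bias (roughly lam / n per active coordinate) that debiasing-style inference procedures then correct for.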

| Original language | English (US) |
|---|---|
| Pages (from-to) | 685-718 |
| Number of pages | 34 |
| Journal | Journal of the Royal Statistical Society. Series B: Statistical Methodology |
| Volume | 82 |
| Issue number | 3 |
| DOIs | |
| State | Published - Jul 1 2020 |

## All Science Journal Classification (ASJC) codes

- Statistics and Probability
- Statistics, Probability and Uncertainty

## Keywords

- Bias
- Confidence intervals
- False positive rate
- High dimensional inference
- Hypothesis testing
- Statistical power