Abstract
In classical statistics, much thought has been put into experimental design and data collection. In the high-dimensional setting, however, experimental design has been less of a focus. In this paper, we stress the importance of collecting multiple replicates for each subject in the high-dimensional setting. We consider learning the structure of a graphical model with latent variables, under the assumption that these variables take a constant value across replicates within each subject. By collecting multiple replicates for each subject, we can estimate the conditional dependence relationships among the observed variables given the latent variables. To test the hypothesis of conditional independence between two observed variables, we propose a pairwise decorrelated score test. Theoretical guarantees are established for parameter estimation and for this test. We show that our method is able to estimate latent variable graphical models more accurately than some existing methods, and we apply it to a brain imaging dataset.
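The role of replicates can be illustrated with a minimal sketch. It is not the paper's semiparametric exponential family estimator or its pairwise decorrelated score test. It is a Gaussian toy model, assumed here for illustration, in which each subject's latent variable shifts the mean of all of that subject's replicates by the same amount. Centering within subjects cancels the shift, so the inverse of the within-subject covariance approximates the precision matrix of the observed variables given the latent variable. The function names, the chain-graph precision matrix, and the loading values are all hypothetical.

```python
import numpy as np


def simulate_replicates(n_subjects, n_reps, precision, loading, rng):
    """Draw x[i, k] = loading @ z[i] + noise[i, k].

    z[i] is a latent variable shared by every replicate k of subject i;
    noise has covariance inv(precision). Returns shape (n_subjects, n_reps, p).
    """
    p = precision.shape[0]
    cov = np.linalg.inv(precision)
    z = rng.normal(size=(n_subjects, loading.shape[1]))  # one latent draw per subject
    shift = z @ loading.T                                # subject-level mean shift
    noise = rng.multivariate_normal(np.zeros(p), cov, size=(n_subjects, n_reps))
    return shift[:, None, :] + noise


def within_subject_covariance(x):
    """Covariance of replicates around their subject mean.

    The latent shift is constant within a subject, so it cancels here.
    Dividing by n * (k - 1) makes the estimate unbiased for the noise covariance.
    """
    n, k, p = x.shape
    centered = x - x.mean(axis=1, keepdims=True)
    flat = centered.reshape(n * k, p)
    return flat.T @ flat / (n * (k - 1))


def pooled_covariance(x):
    """Covariance that ignores the replicate structure (latent variance included)."""
    return np.cov(x.reshape(-1, x.shape[-1]), rowvar=False)


# Hypothetical setup: a chain graph on 5 variables and one latent variable.
rng = np.random.default_rng(0)
p = 5
precision = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
loading = np.full((p, 1), 1.5)
x = simulate_replicates(500, 3, precision, loading, rng)

# Frobenius-norm distance between each estimated precision matrix and the truth.
err_within = np.linalg.norm(np.linalg.inv(within_subject_covariance(x)) - precision)
err_pooled = np.linalg.norm(np.linalg.inv(pooled_covariance(x)) - precision)
print(f"within-subject error: {err_within:.3f}")
print(f"pooled error:         {err_pooled:.3f}")
```

In this toy setting, the pooled estimate absorbs the latent variance and so recovers the marginal dependence structure rather than the conditional one. With a single replicate per subject, the within-subject estimate is not defined at all.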
Original language | English (US) |
---|---|
Pages (from-to) | 761-777 |
Number of pages | 17 |
Journal | Biometrika |
Volume | 103 |
Issue number | 4 |
State | Published - Dec 1 2016 |
All Science Journal Classification (ASJC) codes
- Applied Mathematics
- Agricultural and Biological Sciences (miscellaneous)
- General Agricultural and Biological Sciences
- Statistics and Probability
- Statistics, Probability and Uncertainty
- General Mathematics
Keywords
- Experimental design
- Nuisance parameter
- Pairwise decorrelated score test
- Semiparametric exponential family graphical model