TY - JOUR
T1 - Estimating false discovery proportion under arbitrary covariance dependence
AU - Fan, Jianqing
AU - Han, Xu
AU - Gu, Weijie
N1 - Funding Information:
Jianqing Fan is Frederick L. Moore ’18 Professor, Department of Operations Research & Financial Engineering, Princeton University, Princeton, NJ 08544, USA and Honorary Professor, School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China. Xu Han is Assistant Professor, Department of Statistics, Fox Business School, Temple University, Philadelphia, PA 19122. Weijie Gu is a graduate student, Department of Operations Research & Financial Engineering, Princeton University, Princeton, NJ 08544, USA. The article was completed while Xu Han was a postdoctoral fellow at Princeton University. This research was partly supported by NSF grants DMS-0704337 and DMS-0714554 and NIH grant R01-GM072611. The authors are grateful to the editor, associate editor, and referees for helpful comments.
PY - 2012
Y1 - 2012
N2 - Multiple hypothesis testing is a fundamental problem in high-dimensional inference, with wide applications in many scientific fields. In genome-wide association studies, tens of thousands of tests are performed simultaneously to find if any single-nucleotide polymorphisms (SNPs) are associated with some traits and those tests are correlated. When test statistics are correlated, false discovery control becomes very challenging under arbitrary dependence. In this article, we propose a novel method, based on principal factor approximation, that successfully subtracts the common dependence and weakens significantly the correlation structure, to deal with an arbitrary dependence structure. We derive an approximate expression for false discovery proportion (FDP) in large-scale multiple testing when a common threshold is used and provide a consistent estimate of realized FDP. This result has important applications in controlling false discovery rate and FDP. Our estimate of realized FDP compares favorably with Efron's approach, as demonstrated in the simulated examples. Our approach is further illustrated by some real data applications. We also propose a dependence-adjusted procedure that is more powerful than the fixed-threshold procedure. Supplementary material for this article is available online.
KW - Arbitrary dependence structure
KW - False discovery rate
KW - Genome-wide association studies
KW - High-dimensional inference
KW - Multiple hypothesis testing
UR - http://www.scopus.com/inward/record.url?scp=84870709284&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84870709284&partnerID=8YFLogxK
U2 - 10.1080/01621459.2012.720478
DO - 10.1080/01621459.2012.720478
M3 - Article
C2 - 24729644
AN - SCOPUS:84870709284
SN - 0162-1459
VL - 107
SP - 1019
EP - 1035
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 499
ER -