TY - JOUR

T1 - The optimal discovery procedure

T2 - A new approach to simultaneous significance testing

AU - Storey, John D.

PY - 2007/6

Y1 - 2007/6

N2 - The Neyman-Pearson lemma provides a simple procedure for optimally testing a single hypothesis when the null and alternative distributions are known. This result has played a major role in the development of significance testing strategies that are used in practice. Most of the work extending single-testing strategies to multiple tests has focused on formulating and estimating new types of significance measures, such as the false discovery rate. These methods tend to be based on p-values that are calculated from each test individually, ignoring information from the other tests. I show here that one can improve the overall performance of multiple significance tests by borrowing information across all the tests when assessing the relative significance of each one, rather than calculating p-values for each test individually. The 'optimal discovery procedure' is introduced, which shows how to maximize the number of expected true positive results for each fixed number of expected false positive results. The optimality that is achieved by this procedure is shown to be closely related to optimality in terms of the false discovery rate. The optimal discovery procedure motivates a new approach to testing multiple hypotheses, especially when the tests are related. As a simple example, a new simultaneous procedure for testing several normal means is defined; this is surprisingly demonstrated to outperform the optimal single-test procedure, showing that a method which is optimal for single tests may no longer be optimal for multiple tests. Connections to other concepts in statistics are discussed, including Stein's paradox, shrinkage estimation and the Bayesian approach to hypothesis testing.

AB - The Neyman-Pearson lemma provides a simple procedure for optimally testing a single hypothesis when the null and alternative distributions are known. This result has played a major role in the development of significance testing strategies that are used in practice. Most of the work extending single-testing strategies to multiple tests has focused on formulating and estimating new types of significance measures, such as the false discovery rate. These methods tend to be based on p-values that are calculated from each test individually, ignoring information from the other tests. I show here that one can improve the overall performance of multiple significance tests by borrowing information across all the tests when assessing the relative significance of each one, rather than calculating p-values for each test individually. The 'optimal discovery procedure' is introduced, which shows how to maximize the number of expected true positive results for each fixed number of expected false positive results. The optimality that is achieved by this procedure is shown to be closely related to optimality in terms of the false discovery rate. The optimal discovery procedure motivates a new approach to testing multiple hypotheses, especially when the tests are related. As a simple example, a new simultaneous procedure for testing several normal means is defined; this is surprisingly demonstrated to outperform the optimal single-test procedure, showing that a method which is optimal for single tests may no longer be optimal for multiple tests. Connections to other concepts in statistics are discussed, including Stein's paradox, shrinkage estimation and the Bayesian approach to hypothesis testing.

KW - Classification

KW - False discovery rate

KW - Multiple-hypothesis testing

KW - Optimal discovery procedure

KW - Q-value

KW - Single-thresholding procedure

UR - http://www.scopus.com/inward/record.url?scp=34249111910&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34249111910&partnerID=8YFLogxK

U2 - 10.1111/j.1467-9868.2007.005592.x

DO - 10.1111/j.1467-9868.2007.005592.x

M3 - Article

AN - SCOPUS:34249111910

SN - 1369-7412

VL - 69

SP - 347

EP - 368

JO - Journal of the Royal Statistical Society. Series B: Statistical Methodology

JF - Journal of the Royal Statistical Society. Series B: Statistical Methodology

IS - 3

ER -