TY - JOUR
T1 - Gene and network analysis of common variants reveals novel associations in multiple complex diseases
AU - Nakka, Priyanka
AU - Raphael, Benjamin J.
AU - Ramachandran, Sohini
N1 - Funding Information:
We thank the Psychiatric Genomics Consortium, the Genetic Investigation of Anthropometric Traits Consortium, the International Inflammatory Bowel Disease Genetics Consortium, the Diabetes Genetics Replication and Meta-analysis Consortium, and the Broad Institute for making full genome-wide association (GWA) P-value data sets available for public download. We also thank Heng Xu, Virginia Perez-Andreu, and Jun J. Yang from the St. Jude Children’s Research Hospital for providing full GWA P-values and genotype data from their multiethnic acute lymphoblastic leukemia GWA study (Xu et al. 2013) and for help with curating the raw genotype data. We gratefully acknowledge Max Leiserson and Jonathan Eldridge for assistance with HotNet2 analysis; Matt Reyna, Julia Palacios, and Lauren A. Sugden for helpful discussions; and Genevieve Wojcik for providing software and help with GWA simulations. We also thank Chris Cotsapas for helpful discussions. B.J.R. is supported by a Career Award at the Scientific Interface from the Burroughs Wellcome Fund, an Alfred P. Sloan Research Fellowship, U.S. National Science Foundation (NSF) grant IIS-1016648, an NSF CAREER award (CCF-1053753), and U.S. National Institutes of Health (NIH) grants R01HG007069 and R01CA180776. P.N. is supported by an Oliver Cromwell Gorton Arnold predoctoral fellowship from Brown University and by NSF CAREER award DBI-1452622 (to S.R.). S.R. is also supported by NIH grant R01GM118652, the Pew Charitable Trusts as a Pew Scholar in the Biomedical Sciences, and an Alfred P. Sloan Research Fellowship.
Publisher Copyright:
© 2016 by the Genetics Society of America.
PY - 2016/10
Y1 - 2016/10
AB - Genome-wide association (GWA) studies typically lack power to detect genotypes significantly associated with complex diseases, where different causal mutations of small effect may be present across cases. A common, tractable approach for identifying genomic elements associated with complex traits is to evaluate combinations of variants in known pathways or gene sets with shared biological function. Such gene-set analyses require the computation of gene-level P-values or gene scores; these gene scores are also useful when generating hypotheses for experimental validation. However, commonly used methods for generating GWA gene scores are computationally inefficient, biased by gene length, imprecise, or have low true positive rate (TPR) at low false positive rates (FPR), leading to erroneous hypotheses for functional validation. Here we introduce a new method, PEGASUS, for analytically calculating gene scores. PEGASUS produces gene scores with as much as 10 orders of magnitude higher numerical precision than competing methods. In simulation, PEGASUS outperforms existing methods, achieving up to 30% higher TPR when the FPR is fixed at 1%. We use gene scores from PEGASUS as input to HotNet2 to identify networks of interacting genes associated with multiple complex diseases and traits; this is the first application of HotNet2 to common variation. In ulcerative colitis and waist–hip ratio, we discover networks that include genes previously associated with these phenotypes, as well as novel candidate genes. In contrast, existing methods fail to identify these networks. We also identify networks for attention-deficit/hyperactivity disorder, in which GWA studies have yet to identify any significant SNPs.
KW - Common variants
KW - Complex diseases
KW - GWAS
KW - Pathway analysis
KW - Quantitative traits
UR - http://www.scopus.com/inward/record.url?scp=84991492394&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84991492394&partnerID=8YFLogxK
U2 - 10.1534/genetics.116.188391
DO - 10.1534/genetics.116.188391
M3 - Article
C2 - 27489002
AN - SCOPUS:84991492394
SN - 0016-6731
VL - 204
SP - 783
EP - 798
JO - Genetics
JF - Genetics
IS - 2
ER -