TY - JOUR
T1 - Inferring sparse structure in genotype-phenotype maps
AU - Petti, Samantha
AU - Reddy, Gautam
AU - Desai, Michael M.
N1 - Publisher Copyright:
© 2023 The Author(s). Published by Oxford University Press on behalf of The Genetics Society of America. All rights reserved.
PY - 2023/9
Y1 - 2023/9
N2 - Correlation among multiple phenotypes across related individuals may reflect some pattern of shared genetic architecture: individual genetic loci affect multiple phenotypes (an effect known as pleiotropy), creating observable relationships between phenotypes. A natural hypothesis is that pleiotropic effects reflect a relatively small set of common "core"cellular processes: each genetic locus affects one or a few core processes, and these core processes in turn determine the observed phenotypes. Here, we propose a method to infer such structure in genotype-phenotype data. Our approach, sparse structure discovery (SSD) is based on a penalized matrix decomposition designed to identify latent structure that is low-dimensional (many fewer core processes than phenotypes and genetic loci), locus-sparse (each locus affects few core processes), and/or phenotype-sparse (each phenotype is influenced by few core processes). Our use of sparsity as a guide in the matrix decomposition is motivated by the results of a novel empirical test indicating evidence of sparse structure in several recent genotype-phenotype datasets. First, we use synthetic data to show that our SSD approach can accurately recover core processes if each genetic locus affects few core processes or if each phenotype is affected by few core processes. Next, we apply the method to three datasets spanning adaptive mutations in yeast, genotoxin robustness assay in human cell lines, and genetic loci identified from a yeast cross, and evaluate the biological plausibility of the core process identified. More generally, we propose sparsity as a guiding prior for resolving latent structure in empirical genotype-phenotype maps.
AB - Correlation among multiple phenotypes across related individuals may reflect some pattern of shared genetic architecture: individual genetic loci affect multiple phenotypes (an effect known as pleiotropy), creating observable relationships between phenotypes. A natural hypothesis is that pleiotropic effects reflect a relatively small set of common "core"cellular processes: each genetic locus affects one or a few core processes, and these core processes in turn determine the observed phenotypes. Here, we propose a method to infer such structure in genotype-phenotype data. Our approach, sparse structure discovery (SSD) is based on a penalized matrix decomposition designed to identify latent structure that is low-dimensional (many fewer core processes than phenotypes and genetic loci), locus-sparse (each locus affects few core processes), and/or phenotype-sparse (each phenotype is influenced by few core processes). Our use of sparsity as a guide in the matrix decomposition is motivated by the results of a novel empirical test indicating evidence of sparse structure in several recent genotype-phenotype datasets. First, we use synthetic data to show that our SSD approach can accurately recover core processes if each genetic locus affects few core processes or if each phenotype is affected by few core processes. Next, we apply the method to three datasets spanning adaptive mutations in yeast, genotoxin robustness assay in human cell lines, and genetic loci identified from a yeast cross, and evaluate the biological plausibility of the core process identified. More generally, we propose sparsity as a guiding prior for resolving latent structure in empirical genotype-phenotype maps.
KW - genotype-phenotype map
KW - penalized matrix decomposition
KW - sparsity
KW - structure discovery
UR - http://www.scopus.com/inward/record.url?scp=85169502349&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85169502349&partnerID=8YFLogxK
U2 - 10.1093/genetics/iyad127
DO - 10.1093/genetics/iyad127
M3 - Article
C2 - 37437111
AN - SCOPUS:85169502349
SN - 0016-6731
VL - 225
JO - Genetics
JF - Genetics
IS - 1
M1 - iyad127
ER -