TY - JOUR
T1 - MaLAdapt Reveals Novel Targets of Adaptive Introgression From Neanderthals and Denisovans in Worldwide Human Populations
AU - Zhang, Xinjun
AU - Kim, Bernard
AU - Singh, Armaan
AU - Sankararaman, Sriram
AU - Durvasula, Arun
AU - Lohmueller, Kirk E.
N1 - Publisher Copyright: © 2023 The Author(s). Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution.
PY - 2023/1/1
Y1 - 2023/1/1
N2 - Adaptive introgression (AI) facilitates local adaptation in a wide range of species. Many state-of-the-art methods detect AI with ad hoc approaches that identify summary statistic outliers or intersect scans for positive selection with scans for introgressed genomic regions. Although widely used, approaches intersecting outliers are vulnerable to a high false-negative rate as the power of different methods varies, especially for complex introgression events. Moreover, population genetic processes unrelated to AI, such as background selection or heterosis, may create similar genomic signals to AI, compromising the reliability of methods that rely on neutral null distributions. In recent years, machine learning (ML) methods have been increasingly applied to population genetic questions. Here, we present an ML-based method called MaLAdapt for identifying AI loci from genome-wide sequencing data. Using an Extra-Trees Classifier algorithm, our method combines information from a large number of biologically meaningful summary statistics to capture a powerful composite signature of AI across the genome. In contrast to existing methods, MaLAdapt is especially well-powered to detect AI with mild beneficial effects, including selection on standing archaic variation, and is robust to non-AI selective sweeps, heterosis from deleterious mutations, and demographic misspecification. Furthermore, MaLAdapt outperforms existing methods for detecting AI based on the analysis of simulated data and the validation of empirical signals through visual inspection of haplotype patterns. We apply MaLAdapt to the 1000 Genomes Project human genomic data and discover novel AI candidate regions in non-African populations, including genes that are enriched in functionally important biological pathways regulating metabolism and immune responses.
KW - adaptive introgression
KW - archaic hominins
KW - machine learning
KW - modern humans
KW - population history
UR - http://www.scopus.com/inward/record.url?scp=85147143721&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85147143721&partnerID=8YFLogxK
U2 - 10.1093/molbev/msad001
DO - 10.1093/molbev/msad001
M3 - Article
C2 - 36617238
AN - SCOPUS:85147143721
SN - 0737-4038
VL - 40
JO - Molecular Biology and Evolution
JF - Molecular Biology and Evolution
IS - 1
M1 - msad001
ER -