TY - JOUR
T1 - Designing sensitive viral diagnostics with machine learning
AU - Metsky, Hayden C.
AU - Welch, Nicole L.
AU - Pillai, Priya P.
AU - Haradhvala, Nicholas J.
AU - Rumker, Laurie
AU - Mantena, Sreekar
AU - Zhang, Yibin B.
AU - Yang, David K.
AU - Ackerman, Cheri M.
AU - Weller, Juliane
AU - Blainey, Paul C.
AU - Myhrvold, Cameron
AU - Mitzenmacher, Michael
AU - Sabeti, Pardis C.
N1 - Funding Information:
We thank B. Petros, Y. Singer, M. O’Connell, R. Tuyeras and D. Kassler for discussions and pointers. This project was made possible by DARPA grant no. D18AC00006; HHMI; the Amazon Web Services Diagnostic Development Initiative; Flu Lab; and a cohort of donors through the Audacious Project, a collaborative funding initiative housed at TED, including the ELMA Foundation, MacKenzie Scott, the Skoll Foundation and Open Philanthropy. H.C.M. was supported by NIH/NIAID grant no. K01AI163498. N.J.H. was funded by the Landry Cancer Biology Consortium Fellowship and NIH/NIGMS grant no. T32GM008313. C.M.A. was supported by NIH grant no. F32CA236425. M.M. was funded by NSF grants no. CCF-1535795, no. CCF-1563710, and no. CCF-2101140.
Publisher Copyright:
© 2022, The Author(s).
PY - 2022/7
Y1 - 2022/7
AB - Design of nucleic acid-based viral diagnostics typically follows heuristic rules and, to contend with viral variation, focuses on a genome’s conserved regions. A design process could, instead, directly optimize diagnostic effectiveness using a learned model of sensitivity for targets and their variants. Toward that goal, we screen 19,209 diagnostic–target pairs, concentrated on CRISPR-based diagnostics, and train a deep neural network to accurately predict diagnostic readout. We join this model with combinatorial optimization to maximize sensitivity over the full spectrum of a virus’s genomic variation. We introduce Activity-informed Design with All-inclusive Patrolling of Targets (ADAPT), a system for automated design, and use it to design diagnostics for 1,933 vertebrate-infecting viral species within 2 hours for most species and within 24 hours for all but three. We experimentally show that ADAPT’s designs are sensitive and specific to the lineage level and permit lower limits of detection, across a virus’s variation, than the outputs of standard design techniques. Our strategy could facilitate a proactive resource of assays for detecting pathogens.
UR - http://www.scopus.com/inward/record.url?scp=85125523596&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85125523596&partnerID=8YFLogxK
U2 - 10.1038/s41587-022-01213-5
DO - 10.1038/s41587-022-01213-5
M3 - Article
C2 - 35241837
AN - SCOPUS:85125523596
SN - 1087-0156
VL - 40
SP - 1123
EP - 1131
JO - Nature Biotechnology
JF - Nature Biotechnology
IS - 7
ER -