TY - JOUR
T1 - Machine learning-guided engineering of genetically encoded fluorescent calcium indicators
AU - Wait, Sarah J.
AU - Expòsit, Marc
AU - Lin, Sophia
AU - Rappleye, Michael
AU - Lee, Justin Daho
AU - Colby, Samuel A.
AU - Torp, Lily
AU - Asencio, Anthony
AU - Smith, Annette
AU - Regnier, Michael
AU - Moussavi-Harami, Farid
AU - Baker, David
AU - Kim, Christina K.
AU - Berndt, Andre
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Nature America, Inc. 2024.
PY - 2024/3
Y1 - 2024/3
AB - Here we used machine learning to engineer genetically encoded fluorescent indicators, protein-based sensors critical for real-time monitoring of biological activity. We used machine learning to predict the outcomes of sensor mutagenesis by analyzing established libraries that link sensor sequences to functions. Using the GCaMP calcium indicator as a scaffold, we developed an ensemble of three regression models trained on experimentally derived GCaMP mutation libraries. The trained ensemble performed an in silico functional screen on 1,423 novel, uncharacterized GCaMP variants. As a result, we identified the ensemble-derived GCaMP (eGCaMP) variants, eGCaMP and eGCaMP+, which achieve both faster kinetics and larger ∆F/F0 responses upon stimulation than previously published fast variants. Furthermore, we identified a combinatorial mutation with extraordinary dynamic range, eGCaMP2+, which outperforms the tested sixth-, seventh- and eighth-generation GCaMPs. These findings demonstrate the value of machine learning as a tool to facilitate the efficient engineering of proteins for desired biophysical characteristics.
UR - http://www.scopus.com/inward/record.url?scp=85188739477&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85188739477&partnerID=8YFLogxK
U2 - 10.1038/s43588-024-00611-w
DO - 10.1038/s43588-024-00611-w
M3 - Article
C2 - 38532137
AN - SCOPUS:85188739477
SN - 2662-8457
VL - 4
SP - 224
EP - 236
JO - Nature Computational Science
JF - Nature Computational Science
IS - 3
ER -