Programmable C•G-to-G•C base editors (CGBEs) have broad scientific and therapeutic potential, but their editing outcomes have proved difficult to predict and their editing efficiency and product purity are often low. We describe a suite of engineered CGBEs paired with machine learning models to enable efficient, high-purity C•G-to-G•C base editing. We performed a CRISPR interference (CRISPRi) screen targeting DNA repair genes to identify factors that affect C•G-to-G•C editing outcomes and used these insights to develop CGBEs with diverse editing profiles. We characterized ten promising CGBEs on a library of 10,638 genomically integrated target sites in mammalian cells and trained machine learning models that accurately predict the purity and yield of editing outcomes (R = 0.90) using these data. These CGBEs enable correction to the wild-type coding sequence of 546 disease-related transversion single-nucleotide variants (SNVs) with >90% precision (mean 96%) and up to 70% efficiency (mean 14%). Computational prediction of optimal CGBE–single-guide RNA pairs enables high-purity transversion base editing at over fourfold more target sites than achieved using any single CGBE variant.
All Science Journal Classification (ASJC) codes
- Applied Microbiology and Biotechnology
- Molecular Medicine
- Biomedical Engineering