Machine-learning-optimized Cas12a barcoding enables the recovery of single-cell lineages and transcriptional profiles

Nicholas W. Hughes, Yuanhao Qu, Jiaqi Zhang, Weijing Tang, Justin Pierce, Chengkun Wang, Aditi Agrawal, Maurizio Morri, Norma Neff, Monte M. Winslow, Mengdi Wang, Le Cong

Research output: Contribution to journal › Article › peer-review

11 Scopus citations


The development of CRISPR-based barcoding methods creates an exciting opportunity to understand cellular phylogenies. We present a compact, tunable, high-capacity Cas12a barcoding system called dual acting inverted site array (DAISY). We combined high-throughput screening and machine learning to predict and optimize the 60-bp DAISY barcode sequences. After optimization, top-performing barcodes had ∼10-fold increased capacity relative to the best random-screened designs and performed reliably across diverse cell types. DAISY barcode arrays generated ∼12 bits of entropy and ∼66,000 unique barcodes. Thus, DAISY barcodes—at a fraction of the size of Cas9 barcodes—achieved high-capacity barcoding. We coupled DAISY barcoding with single-cell RNA-seq to recover lineages and gene expression profiles from ∼47,000 human melanoma cells. A single DAISY barcode recovered up to ∼700 lineages from one parental cell. This analysis revealed heritable single-cell gene expression and potential epigenetic modulation of memory gene transcription. Overall, Cas12a DAISY barcoding is an efficient tool for investigating cell-state dynamics.
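The two capacity figures in the abstract measure different things: entropy reflects how evenly barcodes are used, while the unique-barcode count reflects how many distinct sequences were seen at all. The abstract does not say how entropy was computed, so the sketch below assumes plain Shannon entropy over observed barcode frequencies. The function names and the toy data are hypothetical and are not taken from the paper. Under that assumption, 12 bits corresponds to 2^12 = 4,096 equally likely barcodes, which is fewer than ∼66,000 unique barcodes and is what one would expect if some editing outcomes are far more frequent than others.

```python
import math
from collections import Counter


def shannon_entropy_bits(barcodes):
    """Shannon entropy, in bits, of the empirical barcode distribution.

    Assumption: entropy is computed from observed barcode frequencies;
    the abstract does not specify the exact calculation.
    """
    counts = Counter(barcodes)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def effective_barcode_count(entropy_bits):
    """Number of equally likely barcodes that would give this entropy."""
    return 2 ** entropy_bits


# Toy data (hypothetical): both samples contain four unique barcodes.
uniform = ["A", "B", "C", "D"]
skewed = ["A"] * 13 + ["B", "C", "D"]

for name, sample in (("uniform", uniform), ("skewed", skewed)):
    h = shannon_entropy_bits(sample)
    print(name, "unique:", len(set(sample)), "entropy (bits):", round(h, 3))
```

Both toy samples have the same number of unique barcodes, but the skewed one has lower entropy, which is why a unique-barcode count can exceed the effective count implied by the entropy.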

Original language: English (US)
Pages (from-to): 3103-3118.e8
Journal: Molecular Cell
Issue number: 16
State: Published - Aug 18 2022

All Science Journal Classification (ASJC) codes

  • Molecular Biology
  • Cell Biology

Keywords

  • CRISPR barcoding
  • Cas12a
  • PRC2
  • high throughput screening
  • lineage tracking
  • machine learning
  • melanoma
  • online learning optimization
  • single cell genomics
  • transcriptional memory

