TY - GEN
T1 - Maximum Likelihood Inference of Time-Scaled Cell Lineage Trees with Mixed-Type Missing Data
AU - Mai, Uyen
AU - Chu, Gillian
AU - Raphael, Benjamin J.
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024
Y1 - 2024
N2 - Recent dynamic lineage tracing technologies combine CRISPR-based genome editing with single-cell sequencing to track cell divisions during development. A key problem in lineage tracing is to infer a cell lineage tree from the measured CRISPR-induced mutations. Several features of lineage tracing data distinguish this problem from standard phylogenetic tree inference: CRISPR-induced mutations are non-modifiable and can result in distinct sets of possible mutations at each target site; the number of mutations decreases over time due to non-modifiability; and CRISPR-based genome editing and single-cell sequencing result in high rates of both heritable and non-heritable (dropout) missing data. To model these features, we introduce the Probabilistic Mixed-type Missing (PMM) model. We describe an algorithm, LAML (Lineage Analysis via Maximum Likelihood), to compute a maximum likelihood tree under the PMM model. LAML combines an Expectation Maximization (EM) algorithm with a heuristic tree search to jointly estimate tree topology, branch lengths and missing data parameters.
KW - cell phylogeny inference
KW - evolutionary model
KW - maximum likelihood
UR - http://www.scopus.com/inward/record.url?scp=85194290463&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85194290463&partnerID=8YFLogxK
U2 - 10.1007/978-1-0716-3989-4_31
DO - 10.1007/978-1-0716-3989-4_31
M3 - Conference contribution
AN - SCOPUS:85194290463
SN - 9781071639887
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 360
EP - 363
BT - Research in Computational Molecular Biology - 28th Annual International Conference, RECOMB 2024, Proceedings
A2 - Ma, Jian
PB - Springer Science and Business Media Deutschland GmbH
T2 - 28th International Conference on Research in Computational Molecular Biology, RECOMB 2024
Y2 - 29 April 2024 through 2 May 2024
ER -