TY - GEN
T1 - Genome halving and aliquoting under the copy number distance
AU - Zeira, Ron
AU - Mon, Geoffrey
AU - Raphael, Benjamin J.
N1 - Publisher Copyright:
© Ron Zeira, Geoffrey Mon, and Benjamin J. Raphael; licensed under Creative Commons License CC-BY 4.0 21st International Workshop on Algorithms in Bioinformatics (WABI 2021).
PY - 2021/7/1
Y1 - 2021/7/1
N2 - Large-scale genome rearrangements occur frequently in species evolution and cancer evolution. While the computation of evolutionary distances is tractable for balanced rearrangements, such as inversions and translocations, computing distances involving duplications and deletions is much more difficult. In the recently proposed Copy Number Distance (CND) model, a genome is represented as a Copy Number Profile (CNP), a sequence of integers, and the CND between two CNPs is the length of a shortest sequence of deletions and amplifications of contiguous segments that transforms one CNP into the other. In addition to these segmental events, genomes also undergo global events such as Whole Genome Duplication (WGD) or polyploidization that multiply the entire genome content. These global events are common and important in both species and cancer evolution. In this paper, we formulate the genome halving problem of finding a closest preduplication CNP that has undergone a WGD and evolved into a given CNP under the CND model. We also formulate the analogous genome aliquoting problem of finding the closest prepolyploidzation CNP under the CND distance. We give a linear time algorithm for the halving distance and a quadratic time dynamic programming algorithm for the aliquoting distance. We implement these algorithms and show that they produce reasonable solutions on simulated CNPs.
AB - Large-scale genome rearrangements occur frequently in species evolution and cancer evolution. While the computation of evolutionary distances is tractable for balanced rearrangements, such as inversions and translocations, computing distances involving duplications and deletions is much more difficult. In the recently proposed Copy Number Distance (CND) model, a genome is represented as a Copy Number Profile (CNP), a sequence of integers, and the CND between two CNPs is the length of a shortest sequence of deletions and amplifications of contiguous segments that transforms one CNP into the other. In addition to these segmental events, genomes also undergo global events such as Whole Genome Duplication (WGD) or polyploidization that multiply the entire genome content. These global events are common and important in both species and cancer evolution. In this paper, we formulate the genome halving problem of finding a closest preduplication CNP that has undergone a WGD and evolved into a given CNP under the CND model. We also formulate the analogous genome aliquoting problem of finding the closest prepolyploidzation CNP under the CND distance. We give a linear time algorithm for the halving distance and a quadratic time dynamic programming algorithm for the aliquoting distance. We implement these algorithms and show that they produce reasonable solutions on simulated CNPs.
KW - Copy number distance
KW - Genome aliquoting distance
KW - Genome halving distance
KW - Genome rearrangements
KW - Polyploidization
KW - Whole genome duplication
UR - http://www.scopus.com/inward/record.url?scp=85115289117&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85115289117&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.WABI.2021.18
DO - 10.4230/LIPIcs.WABI.2021.18
M3 - Conference contribution
AN - SCOPUS:85115289117
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 21st International Workshop on Algorithms in Bioinformatics, WABI 2021
A2 - Carbone, Alessandra
A2 - El-Kebir, Mohammed
PB - Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing
T2 - 21st International Workshop on Algorithms in Bioinformatics, WABI 2021
Y2 - 2 August 2021 through 4 August 2021
ER -