TY - GEN

T1 - Genome halving and aliquoting under the copy number distance

AU - Zeira, Ron

AU - Mon, Geoffrey

AU - Raphael, Benjamin J.

N1 - Funding Information:
Large-scale genome rearrangements occur frequently in species evolution and cancer evolution. While the computation of evolutionary distances is tractable for balanced rearrangements, such as inversions and translocations, computing distances involving duplications and deletions is much more difficult. In the recently proposed Copy Number Distance (CND) model, a genome is represented as a Copy Number Profile (CNP), a sequence of integers, and the CND between two CNPs is the length of a shortest sequence of deletions and amplifications of contiguous segments that transforms one CNP into the other. In addition to these segmental events, genomes also undergo global events such as Whole Genome Duplication (WGD) or polyploidization that multiply the entire genome content. These global events are common and important in both species and cancer evolution. In this paper, we formulate the genome halving problem of finding a closest preduplication CNP that has undergone a WGD and evolved into a given CNP under the CND model. We also formulate the analogous genome aliquoting problem of finding the closest prepolyploidzation CNP under the CND distance. We give a linear time algorithm for the halving distance and a quadratic time dynamic programming algorithm for the aliquoting distance. We implement these algorithms and show that they produce reasonable solutions on simulated CNPs. 2012 ACM Subject Classification Applied computing → Molecular evolution Keywords and phrases Genome rearrangements, Copy number distance, Whole genome duplication, polyploidization, genome halving distance, genome aliquoting distance Digital Object Identifier 10.4230/LIPIcs.WABI.2021.18 Supplementary Material Software (Source Code): https://github.com/raphael-group/CND-aliquoting Funding Benjamin J. Raphael2]Corresponding author: This work is supported by a US National Institutes of Health (NIH) grants U24CA211000 and U24CA248453.
Publisher Copyright:
© Ron Zeira, Geoffrey Mon, and Benjamin J. Raphael; licensed under Creative Commons License CC-BY 4.0 21st International Workshop on Algorithms in Bioinformatics (WABI 2021).

PY - 2021/7/1

Y1 - 2021/7/1

N2 - Large-scale genome rearrangements occur frequently in species evolution and cancer evolution. While the computation of evolutionary distances is tractable for balanced rearrangements, such as inversions and translocations, computing distances involving duplications and deletions is much more difficult. In the recently proposed Copy Number Distance (CND) model, a genome is represented as a Copy Number Profile (CNP), a sequence of integers, and the CND between two CNPs is the length of a shortest sequence of deletions and amplifications of contiguous segments that transforms one CNP into the other. In addition to these segmental events, genomes also undergo global events such as Whole Genome Duplication (WGD) or polyploidization that multiply the entire genome content. These global events are common and important in both species and cancer evolution. In this paper, we formulate the genome halving problem of finding a closest preduplication CNP that has undergone a WGD and evolved into a given CNP under the CND model. We also formulate the analogous genome aliquoting problem of finding the closest prepolyploidzation CNP under the CND distance. We give a linear time algorithm for the halving distance and a quadratic time dynamic programming algorithm for the aliquoting distance. We implement these algorithms and show that they produce reasonable solutions on simulated CNPs.

AB - Large-scale genome rearrangements occur frequently in species evolution and cancer evolution. While the computation of evolutionary distances is tractable for balanced rearrangements, such as inversions and translocations, computing distances involving duplications and deletions is much more difficult. In the recently proposed Copy Number Distance (CND) model, a genome is represented as a Copy Number Profile (CNP), a sequence of integers, and the CND between two CNPs is the length of a shortest sequence of deletions and amplifications of contiguous segments that transforms one CNP into the other. In addition to these segmental events, genomes also undergo global events such as Whole Genome Duplication (WGD) or polyploidization that multiply the entire genome content. These global events are common and important in both species and cancer evolution. In this paper, we formulate the genome halving problem of finding a closest preduplication CNP that has undergone a WGD and evolved into a given CNP under the CND model. We also formulate the analogous genome aliquoting problem of finding the closest prepolyploidzation CNP under the CND distance. We give a linear time algorithm for the halving distance and a quadratic time dynamic programming algorithm for the aliquoting distance. We implement these algorithms and show that they produce reasonable solutions on simulated CNPs.

KW - Copy number distance

KW - Genome aliquoting distance

KW - Genome halving distance

KW - Genome rearrangements

KW - Polyploidization

KW - Whole genome duplication

UR - http://www.scopus.com/inward/record.url?scp=85115289117&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85115289117&partnerID=8YFLogxK

U2 - 10.4230/LIPIcs.WABI.2021.18

DO - 10.4230/LIPIcs.WABI.2021.18

M3 - Conference contribution

AN - SCOPUS:85115289117

T3 - Leibniz International Proceedings in Informatics, LIPIcs

BT - 21st International Workshop on Algorithms in Bioinformatics, WABI 2021

A2 - Carbone, Alessandra

A2 - El-Kebir, Mohammed

PB - Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing

T2 - 21st International Workshop on Algorithms in Bioinformatics, WABI 2021

Y2 - 2 August 2021 through 4 August 2021

ER -