TY - JOUR
T1 - Copy number evolution with weighted aberrations in cancer
AU - Zeira, Ron
AU - Raphael, Benjamin J.
N1 - Funding Information:
We thank Simone Zaccaria for his help with the single-cell CN data. This work was supported by the US National Institutes of Health (NIH) [U24CA211000]; US National Science Foundation (NSF) CAREER Award [CCF-1053753]; O'Brien Family Fund for Health Research; Wilke Family Fund for Innovation; and Chan Zuckerberg Initiative DAF [2018-182608 to B.J.R.].
Publisher Copyright:
© The Author(s) 2020. Published by Oxford University Press.
PY - 2020
Y1 - 2020
N2 - Motivation: Copy number aberrations (CNAs), which delete or amplify large contiguous segments of the genome, are a common type of somatic mutation in cancer. Copy number profiles, representing the number of copies of each region of a genome, are readily obtained from whole-genome sequencing or microarrays. However, modeling copy number evolution is a substantial challenge, because different CNAs may overlap with one another on the genome. A recent popular model for copy number evolution is the copy number distance (CND), defined as the length of a shortest sequence of deletions and amplifications of contiguous segments that transforms one profile into the other. In the CND, all events contribute equally; however, it is well known that rates of CNAs vary by length, genomic position and type (amplification versus deletion). Results: We introduce a weighted CND that allows events to have varying weights, or probabilities, based on their length, position and type. We derive an efficient algorithm to compute the weighted CND as well as the associated transformation. This algorithm is based on the observation that the constraint matrix of the underlying optimization problem is totally unimodular. We show that the weighted CND improves phylogenetic reconstruction on simulated data where CNAs occur with varying probabilities, aids in the derivation of phylogenies from ultra-low-coverage single-cell DNA sequencing data and helps estimate CNA rates in a large pan-cancer dataset. Availability and implementation: Code is available at https://github.com/raphael-group/WCND.
UR - http://www.scopus.com/inward/record.url?scp=85087922876&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85087922876&partnerID=8YFLogxK
U2 - 10.1093/BIOINFORMATICS/BTAA470
DO - 10.1093/BIOINFORMATICS/BTAA470
M3 - Article
C2 - 32657354
AN - SCOPUS:85087922876
SN - 1367-4803
VL - 36
SP - i344-i352
JO - Bioinformatics
JF - Bioinformatics
ER -