Many cancer genomes are extensively rearranged with aberrant chromosomal karyotypes. Deriving these karyotypes from high-throughput DNA sequencing of bulk tumor samples is complicated because most tumors are a heterogeneous mixture of normal cells and subpopulations of cancer cells, or clones, that harbor distinct somatic mutations. We introduce a new algorithm, Reconstructing Cancer Karyotypes (RCK), to reconstruct haplotype-specific karyotypes of one or more rearranged cancer genomes from DNA sequencing data from a bulk tumor sample. RCK leverages evolutionary constraints on the somatic mutational process in cancer to reduce ambiguity in the deconvolution of admixed sequencing data into multiple haplotype-specific cancer karyotypes. RCK models mixtures containing an arbitrary number of derived genomes and allows the incorporation of information both from short-read and long-read DNA sequencing technologies. We compare RCK to existing approaches on 17 primary and metastatic prostate cancer samples. We find that RCK infers cancer karyotypes that better explain the DNA sequencing data and conform to a reasonable evolutionary model. RCK's reconstructions of clone- and haplotype-specific karyotypes will aid further studies of the role of intra-tumor heterogeneity in cancer development and response to treatment. RCK is freely available as open source software.
All Science Journal Classification (ASJC) codes