Phylogenetic copy-number factorization of multiple tumor samples

Simone Zaccaria, Mohammed El-Kebir, Gunnar W. Klau, Benjamin J. Raphael

Research output: Contribution to journal › Article › peer-review

21 Scopus citations


Cancer is an evolutionary process driven by somatic mutations. This process can be represented as a phylogenetic tree. Constructing such a phylogenetic tree from genome sequencing data is a challenging task due to the many types of mutations in cancer and the fact that nearly all cancer sequencing is of a bulk tumor, measuring a superposition of somatic mutations present in different cells. We study the problem of reconstructing tumor phylogenies from copy-number aberrations (CNAs) measured in bulk-sequencing data. We introduce the Copy-Number Tree Mixture Deconvolution (CNTMD) problem, which aims to find the phylogenetic tree with the fewest CNAs that explains the copy-number data from multiple samples of a tumor. We design an algorithm for solving the CNTMD problem and apply the algorithm to both simulated and real data. On simulated data, we find that our algorithm outperforms existing approaches that either perform deconvolution/factorization of mixed tumor samples or build phylogenetic trees assuming homogeneous tumor samples. On real data, we analyze multiple samples from a prostate cancer patient, identifying clones within these samples and a phylogenetic tree that relates these clones and their differing proportions across samples. This phylogenetic tree provides a higher-resolution view of copy-number evolution of this cancer than published analyses.
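The "superposition" of clones in a bulk sample can be illustrated with a small sketch. The abstract does not state the formal model, so the code below assumes a linear mixture: each sample's fractional copy number in a genomic segment is the proportion-weighted sum of the integer copy numbers of the clones. The function name `mix_samples`, the variable names, and the toy numbers are hypothetical and are not taken from the paper; the sketch shows only the forward (mixing) direction, not the CNTMD algorithm itself.

```python
from typing import List


def mix_samples(
    proportions: List[List[float]],
    clone_profiles: List[List[int]],
) -> List[List[float]]:
    """Mix clone copy-number profiles into bulk-sample profiles.

    Assumed model (not stated in the abstract): F = U * C, where
    C[i][s] is the integer copy number of clone i in segment s and
    U[p][i] is the proportion of clone i in sample p.
    """
    n_clones = len(clone_profiles)
    n_segments = len(clone_profiles[0])
    mixed = []
    for row in proportions:
        if len(row) != n_clones:
            raise ValueError("one proportion per clone is required")
        # Proportions within a sample must be non-negative and sum to 1.
        if any(u < 0 for u in row) or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("proportions must be non-negative and sum to 1")
        mixed.append(
            [
                sum(u * clone[s] for u, clone in zip(row, clone_profiles))
                for s in range(n_segments)
            ]
        )
    return mixed


# Hypothetical toy data: 3 clones over 3 genomic segments.
clone_profiles = [
    [2, 2, 2],  # normal diploid clone
    [2, 3, 1],  # clone with one gain and one loss
    [4, 3, 1],  # descendant clone with a further amplification
]
# Two bulk samples containing the clones in different proportions.
proportions = [
    [0.5, 0.5, 0.0],
    [0.25, 0.25, 0.5],
]

fractional = mix_samples(proportions, clone_profiles)
for sample in fractional:
    print(sample)
```

Under this assumption, deconvolution is the inverse task: given only the fractional values, recover both the clone profiles and the proportions, which the CNTMD problem additionally constrains by requiring the clones to be related by a phylogenetic tree with as few CNAs as possible.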

Original language: English (US)
Pages (from-to): 689-708
Number of pages: 20
Journal: Journal of Computational Biology
Issue number: 7
State: Published - Jul 2018

All Science Journal Classification (ASJC) codes

  • Computational Mathematics
  • Genetics
  • Molecular Biology
  • Computational Theory and Mathematics
  • Modeling and Simulation


Keywords

  • copy-number aberrations
  • factorization
  • integer linear programming
  • intratumor heterogeneity
  • multiple tumor samples
  • tumor phylogeny

