TY - JOUR
T1 - Detecting independent and recurrent copy number aberrations using interval graphs
AU - Wu, Hsin Ta
AU - Hajirasouliha, Iman
AU - Raphael, Benjamin J.
N1 - Funding Information:
Funding: National Science Foundation CAREER Award (CCF-1053753 to B.J.R.); the National Institutes of Health (R01HG5690 to B.J.R.); Career Award at the Scientific Interface from the Burroughs Wellcome Fund (to B.J.R.); an Alfred P. Sloan Research Fellowship (to B.J.R.); Natural Sciences and Engineering Research Council of Canada (NSERC) Postdoctoral Fellowship (to I.H.).
PY - 2014/6/15
Y1 - 2014/6/15
N2 - Motivation: Somatic copy number aberrations (SCNAs) are frequent in cancer genomes, but many of these are random, passenger events. A common strategy to distinguish functional aberrations from passengers is to identify those aberrations that are recurrent across multiple samples. However, the extensive variability in the length and position of SCNAs makes the problem of identifying recurrent aberrations notoriously difficult. Results: We introduce a combinatorial approach to the problem of identifying independent and recurrent SCNAs, focusing on the key challenge of separating the overlaps in aberrations across individuals into independent events. We derive independent and recurrent SCNAs as maximal cliques in an interval graph constructed from overlaps between aberrations. We efficiently enumerate all such cliques, and derive a dynamic programming algorithm to find an optimal selection of non-overlapping cliques, resulting in a very fast algorithm, which we call RAIG (Recurrent Aberrations from Interval Graphs). We show that RAIG outperforms other methods on simulated data and also performs well on data from three cancer types from The Cancer Genome Atlas (TCGA). In contrast to existing approaches that employ various heuristics to select independent aberrations, RAIG optimizes a well-defined objective function. We show that this allows RAIG to identify rare aberrations that are likely functional, but are obscured by overlaps with larger passenger aberrations.
AB - Motivation: Somatic copy number aberrations (SCNAs) are frequent in cancer genomes, but many of these are random, passenger events. A common strategy to distinguish functional aberrations from passengers is to identify those aberrations that are recurrent across multiple samples. However, the extensive variability in the length and position of SCNAs makes the problem of identifying recurrent aberrations notoriously difficult. Results: We introduce a combinatorial approach to the problem of identifying independent and recurrent SCNAs, focusing on the key challenge of separating the overlaps in aberrations across individuals into independent events. We derive independent and recurrent SCNAs as maximal cliques in an interval graph constructed from overlaps between aberrations. We efficiently enumerate all such cliques, and derive a dynamic programming algorithm to find an optimal selection of non-overlapping cliques, resulting in a very fast algorithm, which we call RAIG (Recurrent Aberrations from Interval Graphs). We show that RAIG outperforms other methods on simulated data and also performs well on data from three cancer types from The Cancer Genome Atlas (TCGA). In contrast to existing approaches that employ various heuristics to select independent aberrations, RAIG optimizes a well-defined objective function. We show that this allows RAIG to identify rare aberrations that are likely functional, but are obscured by overlaps with larger passenger aberrations.
UR - http://www.scopus.com/inward/record.url?scp=84902437473&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84902437473&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btu276
DO - 10.1093/bioinformatics/btu276
M3 - Article
C2 - 24931984
AN - SCOPUS:84902437473
SN - 1367-4803
VL - 30
SP - I195-I203
JO - Bioinformatics
JF - Bioinformatics
IS - 12
ER -