TY - JOUR
T1 - Detection of recurrent rearrangement breakpoints from copy number data
AU - Ritz, Anna
AU - Paris, Pamela L.
AU - Ittmann, Michael M.
AU - Collins, Colin
AU - Raphael, Benjamin J.
N1 - Funding Information:
We thank Chip Lawrence, Bill Thompson, and Eric Ruggieri for technical discussions, and Brendan Hickey and Hsin-Ta Wu for their contributions to preliminary analysis of fusion genes. We also thank the anonymous reviewers of an earlier version of the manuscript for helpful suggestions. AR is supported by a National Science Foundation Graduate Research Fellowship. BJR is supported by a Career Award at the Scientific Interface from the Burroughs Wellcome Fund, DOD/CDMRP Breast Cancer Synergy Award W81XWH-07-1-0710, and the Susan G. Komen Breast Cancer Foundation. This work was made possible in part with funding from the ADVANCE Program at Brown University, under NSF Grant No. 0548311. Prostate data sample collection was funded by the National Cancer Institute to the Baylor Prostate Cancer SPORE (P50CA058204).
PY - 2011/4/21
Y1 - 2011/4/21
N2 - Background: Copy number variants (CNVs), including deletions, amplifications, and other rearrangements, are common in human and cancer genomes. Copy number data from array comparative genome hybridization (aCGH) and next-generation DNA sequencing is widely used to measure copy number variants. Comparison of copy number data from multiple individuals reveals recurrent variants. Typically, the interior of a recurrent CNV is examined for genes or other loci associated with a phenotype. However, in some cases, such as gene truncations and fusion genes, the target of the variant lies at the boundary of the variant. Results: We introduce Neighborhood Breakpoint Conservation (NBC), an algorithm for identifying rearrangement breakpoints that are highly conserved at the same locus in multiple individuals. NBC detects recurrent breakpoints at varying levels of resolution, including breakpoints whose location is exactly conserved and breakpoints whose location varies within a gene. NBC also identifies pairs of recurrent breakpoints such as those that result from fusion genes. We apply NBC to aCGH data from 36 primary prostate tumors and identify 12 novel rearrangements, one of which is the well-known TMPRSS2-ERG fusion gene. We also apply NBC to 227 glioblastoma tumors and predict 93 novel rearrangements, which we further classify as gene truncations, germline structural variants, and fusion genes. A number of these variants involve the protein phosphatase PTPN12, suggesting that deregulation of PTPN12, via a variety of rearrangements, is common in glioblastoma. Conclusions: We demonstrate that NBC is useful for detection of recurrent breakpoints resulting from copy number variants or other structural variants, and in particular identifies recurrent breakpoints that result in gene truncations or fusion genes. Software is available at http://cs.brown.edu/people/braphael/software.html.
AB - Background: Copy number variants (CNVs), including deletions, amplifications, and other rearrangements, are common in human and cancer genomes. Copy number data from array comparative genome hybridization (aCGH) and next-generation DNA sequencing is widely used to measure copy number variants. Comparison of copy number data from multiple individuals reveals recurrent variants. Typically, the interior of a recurrent CNV is examined for genes or other loci associated with a phenotype. However, in some cases, such as gene truncations and fusion genes, the target of the variant lies at the boundary of the variant. Results: We introduce Neighborhood Breakpoint Conservation (NBC), an algorithm for identifying rearrangement breakpoints that are highly conserved at the same locus in multiple individuals. NBC detects recurrent breakpoints at varying levels of resolution, including breakpoints whose location is exactly conserved and breakpoints whose location varies within a gene. NBC also identifies pairs of recurrent breakpoints such as those that result from fusion genes. We apply NBC to aCGH data from 36 primary prostate tumors and identify 12 novel rearrangements, one of which is the well-known TMPRSS2-ERG fusion gene. We also apply NBC to 227 glioblastoma tumors and predict 93 novel rearrangements, which we further classify as gene truncations, germline structural variants, and fusion genes. A number of these variants involve the protein phosphatase PTPN12, suggesting that deregulation of PTPN12, via a variety of rearrangements, is common in glioblastoma. Conclusions: We demonstrate that NBC is useful for detection of recurrent breakpoints resulting from copy number variants or other structural variants, and in particular identifies recurrent breakpoints that result in gene truncations or fusion genes. Software is available at http://cs.brown.edu/people/braphael/software.html.
UR - http://www.scopus.com/inward/record.url?scp=79954600754&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79954600754&partnerID=8YFLogxK
U2 - 10.1186/1471-2105-12-114
DO - 10.1186/1471-2105-12-114
M3 - Article
C2 - 21510904
AN - SCOPUS:79954600754
SN - 1471-2105
VL - 12
JO - BMC Bioinformatics
JF - BMC Bioinformatics
M1 - 114
ER -