TY - JOUR
T1 - A geometric approach for classification and comparison of structural variants
AU - Sindi, Suzanne
AU - Helman, Elena
AU - Bashir, Ali
AU - Raphael, Benjamin J.
N1 - Funding: Career Award at the Scientific Interface from the Burroughs Wellcome Fund (to B.J.R.); the Department of Defense Breast Cancer Research Program (to B.J.R.); ADVANCE Program at Brown University, which is funded by the National Science Foundation under grant number 0548311 (to B.J.R.).
PY - 2009
Y1 - 2009
AB - Motivation: Structural variants, including duplications, insertions, deletions and inversions of large blocks of DNA sequence, are an important contributor to human genome variation. Measuring structural variants in a genome sequence is typically more challenging than measuring single nucleotide changes. Current approaches for structural variant identification, including paired-end DNA sequencing/mapping and array comparative genomic hybridization (aCGH), do not identify the boundaries of variants precisely. Consequently, most reported human structural variants are poorly defined and not readily compared across different studies and measurement techniques. Results: We introduce Geometric Analysis of Structural Variants (GASV), a geometric approach for identification, classification and comparison of structural variants. This approach represents the uncertainty in measurement of a structural variant as a polygon in the plane, and identifies measurements supporting the same variant by computing intersections of polygons. We derive a computational geometry algorithm to efficiently identify all such intersections. We apply GASV to sequencing data from nine individual human genomes and several cancer genomes. We obtain better localization of the boundaries of structural variants, distinguish genetic from putative somatic structural variants in cancer genomes, and integrate aCGH and paired-end sequencing measurements of structural variants. This work presents the first general framework for comparing structural variants across multiple samples and measurement techniques, and will be useful for studies of both genetic structural variants and somatic rearrangements in cancer.
UR - http://www.scopus.com/inward/record.url?scp=66349083341&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=66349083341&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btp208
DO - 10.1093/bioinformatics/btp208
M3 - Article
C2 - 19477992
AN - SCOPUS:66349083341
SN - 1367-4803
VL - 25
SP - i222-i230
JO - Bioinformatics
JF - Bioinformatics
IS - 12
ER -