TY - GEN
T1 - Identification and frequency estimation of inversion polymorphisms from haplotype data
AU - Sindi, Suzanne S.
AU - Raphael, Benjamin J.
PY - 2009/7/17
Y1 - 2009/7/17
AB - Structural rearrangements, including copy-number alterations and inversions, are increasingly recognized as an important contributor to human genetic variation. Copy number variants are readily measured via array-based techniques like comparative genomic hybridization, but copy-neutral variants such as inversion polymorphisms remain difficult to identify without whole genome sequencing. We introduce a method to identify inversion polymorphisms and estimate their frequency in a population using readily available single nucleotide polymorphism (SNP) data. Our method uses a probabilistic model to describe a population as a mixture of forward and inverted chromosomes and identifies putative inversions by characteristic differences in haplotype frequencies around inversion breakpoints. On simulated data, our method accurately predicts inversions with frequencies as low as 25% in the population and reliably estimates inversion frequencies over a wide range. On the human HapMap Phase 2 data, we predict between 88 and 142 inversion polymorphisms with frequency ranging from 20 to 92 percent. Many of these correspond to known inversions or have other evidence supporting them, and the predicted inversion frequencies largely agree with the limited information presently available.
UR - http://www.scopus.com/inward/record.url?scp=67650287133&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=67650287133&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-02008-7_30
DO - 10.1007/978-3-642-02008-7_30
M3 - Conference contribution
AN - SCOPUS:67650287133
SN - 9783642020070
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 418
EP - 433
BT - Research in Computational Molecular Biology - 13th Annual International Conference, RECOMB 2009, Proceedings
T2 - 13th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2009
Y2 - 18 May 2009 through 21 May 2009
ER -