TY - GEN
T1 - Identification of deletion polymorphisms from haplotypes
AU - Corona, Erik
AU - Raphael, Benjamin
AU - Eskin, Eleazar
PY - 2007
Y1 - 2007
AB - Numerous efforts are underway to catalog genetic variation in human populations. While the majority of studies of genetic variation have focused on single base pair differences between individuals, i.e., single nucleotide polymorphisms (SNPs), several recent studies have demonstrated that larger-scale structural variation, including copy number polymorphisms and inversion polymorphisms, is also common. However, direct techniques for detection and validation of structural variants are generally much more expensive than detection and validation of SNPs. For some types of structural variation, in particular deletions, the polymorphism produces a distinct signature in the SNP data. In this paper, we describe a new probabilistic method for detecting deletion polymorphisms from SNP data. The key idea in our method is that we estimate the frequency of the haplotypes in a region of the genome both with and without the possibility of a deletion in the region and apply a generalized likelihood ratio test to assess the significance of a deletion. Application of our method to the HapMap Phase I data revealed 319 candidate deletions; 142 of these overlap with variants identified in earlier studies, while 177 are novel. Using HapMap Phase II data, we predict 6730 deletions.
UR - http://www.scopus.com/inward/record.url?scp=34547452119&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34547452119&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-71681-5_25
DO - 10.1007/978-3-540-71681-5_25
M3 - Conference contribution
AN - SCOPUS:34547452119
SN - 3540716807
SN - 9783540716808
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 354
EP - 365
BT - Research in Computational Molecular Biology - 11th Annual International Conference, RECOMB 2007, Proceedings
PB - Springer Verlag
T2 - 11th Annual International Conference on Research in Computational Molecular Biology, RECOMB 2007
Y2 - 21 April 2007 through 25 April 2007
ER -