Efficient algorithms for analyzing segmental duplications with deletions and inversions in genomes

Crystal L. Kahn, Shay Mozes, Benjamin J. Raphael

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


Background: Segmental duplications, or low-copy repeats, are common in mammalian genomes. In the human genome, most segmental duplications are mosaics composed of multiple duplicated fragments. This complex genomic organization complicates analysis of the evolutionary history of these sequences. One model proposed to explain these mosaic patterns is a model of repeated aggregation and subsequent duplication of genomic sequences.

Results: We describe a polynomial-time exact algorithm to compute duplication distance, a genomic distance defined as the most parsimonious way to build a target string by repeatedly copying substrings of a fixed source string. This distance models the process of repeated aggregation and duplication. We also describe extensions of this distance to include certain types of substring deletions and inversions. Finally, we provide a description of a sequence of duplication events as a context-free grammar (CFG).

Conclusion: These new genomic distances will permit more biologically realistic analyses of segmental duplications in genomes.
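To make the definition of duplication distance concrete, the following Python sketch computes it for short strings. It is an illustration only, not the algorithm from the article: it assumes that one duplicate operation copies a contiguous substring of the source and inserts it at any position of the target under construction, starting from an empty target, and it ignores the deletion and inversion extensions. The function name `duplication_distance` and the helpers `build` and `block` are hypothetical.

```python
from functools import lru_cache
import math


def duplication_distance(source: str, target: str) -> float:
    """Minimum number of copy-and-insert operations that build `target`
    from substrings of `source` (math.inf if it cannot be built).

    Illustrative interval recursion; assumes short inputs.
    """
    # Positions in `source` at which each character occurs.
    occurrences = {}
    for a, ch in enumerate(source):
        occurrences.setdefault(ch, []).append(a)

    @lru_cache(maxsize=None)
    def build(i: int, j: int) -> float:
        # Minimum operations to generate target[i:j] (half-open interval).
        if i >= j:
            return 0
        best = math.inf
        for a in occurrences.get(target[i], ()):
            best = min(best, block(i, j, a))
        return best

    @lru_cache(maxsize=None)
    def block(i: int, j: int, a: int) -> float:
        # Same as build(i, j), given that target[i] was copied from
        # source[a]; the operation that copied it is counted here once.
        # Option 1: the copied substring ends at target[i].
        best = 1 + build(i + 1, j)
        # Option 2: the same copy continues with source[a + 1], found at
        # target[k]; everything in between was inserted by later operations.
        if a + 1 < len(source):
            nxt = source[a + 1]
            for k in range(i + 1, j):
                if target[k] == nxt:
                    best = min(best, build(i + 1, k) + block(k, j, a + 1))
        return best

    return build(0, len(target))
```

For example, with source `ab` the target `aabb` needs two operations under these assumptions: one copy of `ab`, then a second copy of `ab` inserted between its two characters. The memoized recursion runs in roughly O(|target|^3 · |source|) time, which is polynomial but not tuned in any way.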

Original language: English (US)
Article number: 11
Journal: Algorithms for Molecular Biology
Issue number: 1
State: Published - Jan 4 2010
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Applied Mathematics
  • Molecular Biology
  • Structural Biology
  • Computational Theory and Mathematics

