Analysis of segmental duplications via duplication distance

Crystal L. Kahn, Benjamin J. Raphael

Research output: Contribution to journal › Article › peer-review

15 Scopus citations


Motivation: Segmental duplications are common in mammalian genomes, but their evolutionary origins remain mysterious. A major difficulty in analyzing segmental duplications is that many duplications are complex mosaics of fragments of numerous other segmental duplications.

Results: We introduce a novel measure called duplication distance that describes the minimum number of duplications necessary to create a target string by repeated insertions of fragments of a source string. We derive an efficient algorithm to compute duplication distance, and we use the algorithm to analyze segmental duplications in the human genome. Our analysis reveals possible ancestral relationships between segmental duplications including numerous examples of duplications that contain multiple, nested insertions of fragments from one or more other duplications. Using duplication distance, we also identify a small number of segmental duplications that appear to have seeded many other duplications in the genome, lending support to a two-step model of segmental duplication in the genome.
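To make the definition of duplication distance concrete, the following is a minimal brute-force sketch in Python. It is not the efficient algorithm derived in the article, which the abstract does not describe. It assumes that the target is built starting from the empty string, and that each duplication copies one contiguous fragment of the source and inserts it at any position of the string built so far. Under those assumptions the last inserted fragment is always contiguous in the finished string, so the search can run backwards, deleting one source fragment per step until nothing is left. The function name `duplication_distance` is hypothetical.

```python
from collections import deque


def duplication_distance(source: str, target: str):
    """Brute-force duplication distance (illustrative sketch only).

    Assumption: the target is built from the empty string, and each step
    inserts one contiguous fragment of `source` at any position.
    Returns the minimum number of steps, or None if the target cannot be built.
    """
    # Every non-empty contiguous fragment of the source string.
    fragments = {
        source[i:j]
        for i in range(len(source))
        for j in range(i + 1, len(source) + 1)
    }

    # Breadth-first search backwards from the target: each move undoes one
    # insertion by deleting a contiguous piece that is a source fragment.
    seen = {target}
    queue = deque([(target, 0)])
    while queue:
        current, steps = queue.popleft()
        if current == "":
            return steps
        for i in range(len(current)):
            for j in range(i + 1, len(current) + 1):
                if current[i:j] not in fragments:
                    # A longer piece starting at i cannot be a fragment either.
                    break
                rest = current[:i] + current[j:]
                if rest not in seen:
                    seen.add(rest)
                    queue.append((rest, steps + 1))
    return None  # Some part of the target never occurs in the source.


if __name__ == "__main__":
    # "abcdbce" can be built as "abcde" with "bc" inserted after "d".
    print(duplication_distance("abcde", "abcdbce"))
```

The search is exponential in the length of the target, so it is only suitable for checking small examples by hand; the nested case in the usage line mirrors the "nested insertions" mentioned in the abstract.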

Original language: English (US)
Pages (from-to): i133-i138
Issue number: 16
State: Published - Aug 2008
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Computational Mathematics
  • Molecular Biology
  • Biochemistry
  • Statistics and Probability
  • Computer Science Applications
  • Computational Theory and Mathematics

