We have explored the distributions of fully conserved ungapped blocks in genome-wide pairwise alignments of the recently completed genomes of six Drosophila species: D. melanogaster, D. yakuba, D. ananassae, D. pseudoobscura, D. virilis, and D. mojavensis. Based on these distributions, we found that nearly every functional sequence category has its own distinctive conservation pattern, sometimes independent of the overall level of sequence conservation. In coding and regulatory regions, the ungapped blocks were longer than in introns, UTRs, and nonfunctional sequences. In addition, the blocks in coding regions carried a 3N + 2 length signature characteristic of synonymous substitutions at the third codon position. The larger block sizes in transcriptional regulatory regions can be explained by the presence of conserved arrays of transcription factor binding sites. We also show that the longest ungapped blocks, or "ultraconserved" sequences, are associated with specific gene groups, including those encoding ion channels and components of the cytoskeleton. We discuss how restraining conservation patterns may help in mapping functional sequence categories and improving genome annotation.
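To make the block-length idea concrete, the following is a minimal Python sketch of how fully conserved ungapped blocks could be extracted from a single pairwise alignment and checked for the 3N + 2 signature. It is an illustration under stated assumptions, not the pipeline used in the study: the function names (`conserved_block_lengths`, `fraction_3n_plus_2`), the `-` gap character, and the toy sequences are all hypothetical. The reasoning behind the signature is that if substitutions fall only at third codon positions, the identical run between two such substitutions spans 3N + 2 bases.

```python
from collections import Counter


def conserved_block_lengths(seq_a, seq_b, gap="-"):
    """Return lengths of maximal runs of identical, ungapped columns.

    Assumes seq_a and seq_b are two rows of the same pairwise alignment
    (equal length, gaps written as `gap`). Any mismatch or gap ends a block.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    lengths = []
    run = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b and a != gap:
            run += 1  # column is identical and ungapped: extend the block
        else:
            if run:
                lengths.append(run)  # close the current block
            run = 0
    if run:
        lengths.append(run)  # block that reaches the end of the alignment
    return lengths


def fraction_3n_plus_2(lengths):
    """Fraction of blocks whose length is of the form 3N + 2."""
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n % 3 == 2) / len(lengths)


# Toy alignment (hypothetical data, for illustration only).
row_a = "ATGGCTAAAGG-TTC"
row_b = "ATGGCAAAGGGATTC"

blocks = conserved_block_lengths(row_a, row_b)
print(blocks)                      # [5, 2, 2, 3]
print(Counter(blocks))             # block-length distribution
print(fraction_3n_plus_2(blocks))  # 0.75
```

In practice, such block-length distributions would be accumulated separately for each annotated sequence category (coding, intronic, UTR, regulatory) before being compared.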
Keywords
- Conservation pattern
- Developmental enhancer
- Vista alignment