TY - JOUR
T1 - Conservation patterns in different functional sequence categories of divergent Drosophila species
AU - Papatsenko, Dmitri
AU - Kislyuk, Andrey
AU - Levine, Michael
AU - Dubchak, Inna
N1 - Funding Information:
The authors are grateful to Michael Brudno and Alexander Poliakov for their extensive work on the Drosophila alignments analyzed in the paper and to Michael Cipriano for help with the manuscript. This work was supported by National Heart, Lung, and Blood Institute, National Institutes of Health, Grant U1HL66681B; the U.S. Department of Energy’s Office of Science, Biological and Environmental Research Program, Lawrence Berkeley National Laboratory, Contract DE-AC03-76SF00098 to I.D.; and National Institutes of Health Grant GM 46638 to M.L.
PY - 2006/10
Y1 - 2006/10
N2 - We have explored the distributions of fully conserved ungapped blocks in genome-wide pairwise alignments of recently completed genomes of Drosophila species: D. melanogaster, D. yakuba, D. ananassae, D. pseudoobscura, D. virilis, and D. mojavensis. Based on these distributions, we have found that nearly every functional sequence category possesses its own distinctive conservation pattern, sometimes independent of the overall sequence conservation level. In the coding and regulatory regions, the ungapped blocks were longer than in introns, UTRs, and nonfunctional sequences. At the same time, the blocks in the coding regions carried a 3N + 2 signature characteristic of synonymous substitutions in the third codon position. Larger block sizes in transcription regulatory regions can be explained by the presence of conserved arrays of binding sites for transcription factors. We have also shown that the longest ungapped blocks, or "ultraconserved" sequences, are associated with specific gene groups, including those encoding ion channels and components of the cytoskeleton. We discuss how restraining conservation patterns may help in mapping functional sequence categories and improve genome annotation.
KW - Conservation pattern
KW - Developmental enhancer
KW - Drosophila
KW - Vista alignment
UR - http://www.scopus.com/inward/record.url?scp=33748705337&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33748705337&partnerID=8YFLogxK
U2 - 10.1016/j.ygeno.2006.03.012
DO - 10.1016/j.ygeno.2006.03.012
M3 - Article
C2 - 16697139
AN - SCOPUS:33748705337
SN - 0888-7543
VL - 88
SP - 431
EP - 442
JO - Genomics
JF - Genomics
IS - 4
ER -