We develop methods to infer levels of evolutionary constraint in the genome by comparing rates of nucleotide substitution in noncoding DNA with rates predicted from synonymous site evolution in adjacent genes or at other putatively neutrally evolving sites, while accounting for differences in base composition. We apply the methods to estimate levels of constraint in noncoding DNA of Drosophila. In introns, constraint (the estimated fraction of mutations that are selectively eliminated) is absolute at the 5′ and 3′ splice junction dinucleotides, and averages 72% in base pairs 3-6 at the 5′ end. Constraint at the 5′ base pairs 3-6 is significantly lower in the lineage leading to Drosophila melanogaster than in the lineage leading to Drosophila simulans, a finding that agrees with other features of genome evolution in Drosophila and indicates that the effect of selection on intron function has been weaker in the melanogaster lineage. Elsewhere in intron sequences, the rate of nucleotide substitution is significantly higher than at synonymous sites. Using intronic sites outside splice control regions as a putatively neutrally evolving standard, constraint in the 500 bp of intergenic DNA upstream and downstream of protein-coding genes averages ∼44%. Although the estimated level of constraint in intergenic regions close to genes is only about one-half of that at amino acid sites, selection against single-nucleotide mutations in intergenic DNA makes a substantial contribution to the mutation load in Drosophila.
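The definition of constraint given above (the fraction of mutations that are selectively eliminated) can be illustrated with a minimal sketch. It assumes the simplest form of the comparison: one minus the ratio of observed substitutions to the number expected under neutrality. The function name `constraint`, its parameters, and the counts are hypothetical and are not taken from the study, and the sketch omits the correction for base composition that the methods apply.

```python
def constraint(observed_subs: float, expected_subs: float) -> float:
    """Estimate constraint as 1 - observed/expected substitutions.

    Assumption: expected_subs is the number of substitutions predicted for
    the same sites from a putatively neutral standard (for example,
    synonymous sites). Base-composition correction is not modelled here.
    """
    if expected_subs <= 0:
        raise ValueError("expected_subs must be positive")
    return 1.0 - observed_subs / expected_subs


# Hypothetical counts for illustration only, not data from the study.
print(round(constraint(30, 100), 2))   # constrained region: positive value
print(round(constraint(120, 100), 2))  # evolving faster than the standard: negative value
```

Under this simplified form, a negative estimate means the sites evolve faster than the neutral standard, which is the situation described for intron sequences outside the splice control regions relative to synonymous sites.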