Motivation: Until ab initio structure prediction methods are perfected, the estimation of structure for protein molecules will depend on combining multiple sources of experimental and theoretical data. Secondary structure predictions are a particularly useful source of structural information, but are currently only ~ 70% correct, on average. Structure computation algorithms which incorporate secondary structure information must therefore have methods for dealing with predictions that are imperfect. Experiments performed: We have modified our algorithm for probabilistic least squares structural computations to accept 'disjunctive' constraints, in which a constraint is provided as a set of possible values, each weighted with a probability. Thus, when a helix is predicted, the distances associated with a helix are given most of the weight, but some weights can be allocated to the other possibilities (strand and coil). We have tested a variety of strategies for this weighting scheme in conjunction with a baseline synthetic set of sparse distance data, and compared it with strategies which do not use disjunctive constraints. Results: Naive interpretations in which predictions were taken as 100% correct led to poor-quality structures. Interpretations that allow disjunctive constraints are quite robust, and even relatively poor predictions (58% correct) can significantly increase the quality of computed structures (almost halving the RMS error from the known structure). Conclusions: Secondary structure predictions can be used to improve the quality of three-dimensional structure computations. In fact, when interpreted appropriately, imperfect predictions can provide almost as much improvement as perfect predictions in three-dimensional structure calculations. Contact: rba@@@smi.stanford.edu.
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Molecular Biology
- Computer Science Applications
- Computational Theory and Mathematics
- Computational Mathematics