## Abstract

Computing molecular structure from combinations of experimental and theoretical constraints is expensive because of the large number of estimated parameters (the 3D coordinates of each atom) and the rugged landscape of many objective functions. For large molecular ensembles with multiple protein and nucleic acid components, keeping structural computations tractable becomes critical. A well-known strategy for solving difficult problems is divide-and-conquer. For molecular computations, there are two ways in which problems can be divided: (1) using the natural hierarchy within biological macromolecules (taking advantage of primary sequence, secondary structural subunits and tertiary structural motifs, when they are known); and (2) using the hierarchy that results from analyzing the distribution of structural constraints (providing information about which substructures are constrained to one another). In this paper, we show that these two hierarchies can be complementary and can provide information for efficient decomposition of structural computations. We demonstrate five methods for building such hierarchies (two automated heuristics that use both natural and empirical hierarchies, one knowledge-based process using both hierarchies, one method based on the natural hierarchy alone, and, for completeness, one random hierarchy oblivious to auxiliary information) and apply them to a data set for the procaryotic 30S ribosomal subunit using our probabilistic least squares structure estimation algorithm. We show that the three methods that combine natural hierarchies with empirical hierarchies create decompositions which increase the efficiency of computations by as much as 50-fold. The natural decomposition alone yields only half this gain, while the random hierarchy suggests that a speedup of about fivefold can be expected just by virtue of having a decomposition. Although the knowledge-based method performs marginally better, the automatic heuristics are easier to use, scale more reliably to larger problems, and can match the performance of knowledge-based methods if provided with basic structural information.
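
To make the idea of an empirical hierarchy concrete, the following is a minimal sketch of one way such a hierarchy could be derived from the distribution of constraints: substructures that share the most constraints are merged first. This is a generic greedy agglomeration written for illustration, not the heuristics evaluated in the paper; the function name `build_hierarchy`, the input format, and the toy data are all assumptions.

```python
from itertools import combinations


def build_hierarchy(leaves, constraints):
    """Greedily merge the two clusters linked by the most constraints.

    leaves: dict mapping a substructure name to a set of atom ids
            (hypothetical input format, chosen for this sketch).
    constraints: iterable of (atom_a, atom_b) pairs.
    Returns a nested tuple describing the binary merge tree.
    """
    # Each cluster is (tree, atoms); a tree is a leaf name or a (left, right) tuple.
    clusters = [(name, frozenset(atoms)) for name, atoms in leaves.items()]
    constraints = list(constraints)

    def shared(a, b):
        # Count constraints with one end in cluster a and the other in cluster b.
        return sum(
            1 for p, q in constraints
            if (p in a and q in b) or (p in b and q in a)
        )

    while len(clusters) > 1:
        # Pick the pair of clusters with the most cross-constraints.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: shared(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        (tree_i, atoms_i), (tree_j, atoms_j) = clusters[i], clusters[j]
        merged = ((tree_i, tree_j), atoms_i | atoms_j)
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    return clusters[0][0]


# Toy data (illustrative only): two helices tightly constrained to each other,
# and a protein loosely constrained to one of them.
leaves = {"helix1": {1, 2}, "helix2": {3, 4}, "protein": {5, 6}}
constraints = [(1, 3), (2, 4), (2, 3), (4, 5)]
tree = build_hierarchy(leaves, constraints)
print(tree)
```

On the toy data, the two helices are merged first because three constraints link them, and the protein joins at the root. In a divide-and-conquer computation, each internal node of such a tree would correspond to a subproblem solved after its children.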

Original language | English (US) |
---|---|
Pages (from-to) | 409-422 |
Number of pages | 14 |
Journal | Journal of Computational Biology |
Volume | 5 |
Issue number | 3 |
State | Published - 1998 |

## All Science Journal Classification (ASJC) codes

- Modeling and Simulation
- Molecular Biology
- Genetics
- Computational Mathematics
- Computational Theory and Mathematics

## Keywords

- Distance geometry
- Hierarchical decomposition
- Molecular structure
- Parallelism
- Probabilistic least squares