TY - JOUR

T1 - Solving and analyzing side-chain positioning problems using linear and integer programming

AU - Kingsford, Carleton L.

AU - Chazelle, Bernard

AU - Singh, Mona

N1 - Funding Information:
The authors would like to thank Amy Keating, Jessica Fong, Gevorg Grigoryan, Elena Nabieva, Robert Osada, Elena Zaslavsky and the reviewers for insightful comments on this manuscript. M.S. thanks the NSF for PECASE grant MCB-0093399 and DARPA for grant MDA972-00-1-0031. B.C. thanks the NSF for grant CCR-998817, DARPA for ARO grant DAAH04-96-1-0181 and the NEC Research Institute.

PY - 2005/4/1

Y1 - 2005/4/1

N2 - Motivation: Side-chain positioning is a central component of homology modeling and protein design. In a common formulation of the problem, the backbone is fixed, side-chain conformations come from a rotamer library, and a pairwise energy function is optimized. It is NP-complete to find even a reasonable approximate solution to this problem. We seek to put this hardness result into practical context. Results: We present an integer linear programming (ILP) formulation of side-chain positioning that allows us to tackle large problem sizes. We relax the integrality constraint to give a polynomial-time linear programming (LP) heuristic. We apply LP to position side chains on native and homologous backbones and to choose side chains for protein design. Surprisingly, when positioning side chains on native and homologous backbones, optimal solutions using a simple, biologically relevant energy function can usually be found using LP. On the other hand, the design problem often cannot be solved using LP directly; however, optimal solutions for large instances can still be found using the computationally more expensive ILP procedure. While different energy functions also affect the difficulty of the problem, the LP/ILP approach is able to find optimal solutions. Our analysis is the first large-scale demonstration that LP-based approaches are highly effective in finding optimal (and successive near-optimal) solutions for the side-chain positioning problem.

AB - Motivation: Side-chain positioning is a central component of homology modeling and protein design. In a common formulation of the problem, the backbone is fixed, side-chain conformations come from a rotamer library, and a pairwise energy function is optimized. It is NP-complete to find even a reasonable approximate solution to this problem. We seek to put this hardness result into practical context. Results: We present an integer linear programming (ILP) formulation of side-chain positioning that allows us to tackle large problem sizes. We relax the integrality constraint to give a polynomial-time linear programming (LP) heuristic. We apply LP to position side chains on native and homologous backbones and to choose side chains for protein design. Surprisingly, when positioning side chains on native and homologous backbones, optimal solutions using a simple, biologically relevant energy function can usually be found using LP. On the other hand, the design problem often cannot be solved using LP directly; however, optimal solutions for large instances can still be found using the computationally more expensive ILP procedure. While different energy functions also affect the difficulty of the problem, the LP/ILP approach is able to find optimal solutions. Our analysis is the first large-scale demonstration that LP-based approaches are highly effective in finding optimal (and successive near-optimal) solutions for the side-chain positioning problem.

UR - http://www.scopus.com/inward/record.url?scp=16344382546&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=16344382546&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/bti144

DO - 10.1093/bioinformatics/bti144

M3 - Article

C2 - 15546935

AN - SCOPUS:16344382546

SN - 1367-4803

VL - 21

SP - 1028

EP - 1036

JO - Bioinformatics

JF - Bioinformatics

IS - 7

ER -