Statistical characterization of protein ensembles

Diego Rother, Guillermo Sapiro, Vijay Pande

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

When accounting for structural fluctuations or measurement errors, a single rigid structure may not be sufficient to represent a protein. One approach to solving this problem is to represent the possible conformations as a discrete set of observed conformations, an ensemble. In this work, we follow a different, richer approach: we introduce a framework for estimating probability density functions in very high dimensions and then apply it to represent ensembles of folded proteins. The proposed approach combines techniques such as kernel density estimation, maximum likelihood, cross validation, and bootstrapping. We present the underlying theoretical and computational framework and apply it to artificial data and protein ensembles obtained from molecular dynamics simulations. We compare the results with those obtained experimentally, illustrating the potential and advantages of this representation.
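
As a rough illustration of how kernel density estimation, maximum likelihood, and cross validation can fit together, the sketch below picks the bandwidth of a Gaussian kernel density estimate by maximizing the leave-one-out log-likelihood over a small grid of candidates. It is not the authors' framework: the isotropic Gaussian kernel, the grid search, and the function names `loo_log_likelihood` and `select_bandwidth` are all assumptions made for this example, and bootstrapping is left out.

```python
import math


def loo_log_likelihood(samples, bandwidth):
    """Mean leave-one-out log-likelihood of an isotropic Gaussian KDE.

    `samples` is a list of equal-length coordinate lists (at least two
    samples); `bandwidth` is the kernel standard deviation (hypothetical
    parameterization chosen for this sketch).
    """
    n = len(samples)
    d = len(samples[0])
    # Log of the Gaussian normalizing constant in d dimensions.
    log_norm = -0.5 * d * math.log(2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for i, x in enumerate(samples):
        log_terms = []
        for j, y in enumerate(samples):
            if i == j:
                continue  # leave the evaluated sample out of its own estimate
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
            log_terms.append(log_norm - 0.5 * sq_dist / bandwidth ** 2)
        # Log-sum-exp keeps the sum stable when kernel values are tiny.
        peak = max(log_terms)
        log_sum = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
        total += log_sum - math.log(n - 1)
    return total / n


def select_bandwidth(samples, candidates):
    """Return the candidate bandwidth with the highest leave-one-out score."""
    scores = [loo_log_likelihood(samples, h) for h in candidates]
    best_index = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best_index], scores
```

Calling `select_bandwidth(samples, [0.1, 0.2, 0.4, 0.8])` returns the chosen bandwidth together with the score of every candidate. The pairwise loop costs quadratic time in the number of samples, so it only suits small ensembles; in very high dimensions a single isotropic bandwidth is also a strong simplification.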

Original language: English (US)
Pages (from-to): 42-55
Number of pages: 14
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 5
Issue number: 1
State: Published - Jan 2008
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Biotechnology
  • Genetics
  • Applied Mathematics

Keywords

  • Bayesian networks
  • Bootstrapping
  • Cross validation
  • Density estimation
  • Graphical models
  • Maximum likelihood
  • Protein ensembles
