Precise physical models of protein-DNA interaction from high-throughput data

Justin B. Kinney, Gašper Tkačik, Curtis G. Callan

Research output: Contribution to journal › Article › peer-review

49 Scopus citations


A cell's ability to regulate gene transcription depends in large part on the energy with which transcription factors (TFs) bind their DNA regulatory sites. Obtaining accurate models of this binding energy is therefore an important goal for quantitative biology. In this article, we present a principled likelihood-based approach for inferring physical models of TF-DNA binding energy from the data produced by modern high-throughput binding assays. Central to our analysis is the ability to assess the relative likelihood of different model parameters given experimental observations. We take a unique approach to this problem and show how to compute likelihood without any explicit assumptions about the noise that inevitably corrupts such measurements. Sampling possible choices for model parameters according to this likelihood function, we can then make probabilistic predictions for the identities of binding sites and their physical binding energies. Applying this procedure to previously published data on the Saccharomyces cerevisiae TF Abf1p, we find models of TF binding whose parameters are determined with remarkable precision. Evidence for the accuracy of these models is provided by an astonishing level of phylogenetic conservation in the predicted energies of putative binding sites. Results from in vivo and in vitro experiments also provide highly consistent characterizations of Abf1p, a result that contrasts with a previous analysis of the same data.

Original language: English (US)
Pages (from-to): 501-506
Number of pages: 6
Journal: Proceedings of the National Academy of Sciences of the United States of America
Issue number: 2
State: Published - Jan 9 2007

All Science Journal Classification (ASJC) codes

  • General

Keywords


  • Binding energy
  • Likelihood
  • Mutual information
  • Transcription factor

