Optimal methods for re-ordering data matrices in systems biology and drug discovery applications

Peter A. DiMaggio, Scott R. McAllister, Christodoulos A. Floudas, Xiao Jiang Feng, Joshua D. Rabinowitz, Herschel A. Rabitz

Research output: Contribution to journal › Article

Abstract

The analysis of large-scale data sets via clustering techniques is utilized in a number of applications. Many of the methods developed employ local search or heuristic strategies for identifying the "best" arrangement of features according to some metric. In this article, we present rigorous clustering methods based on the optimal re-ordering of data matrices. Distinct mixed-integer linear programming (MILP) models are utilized for the clustering of (a) dense data matrices, such as gene expression data, and (b) sparse data matrices, which are commonly encountered in the field of drug discovery. Both methods can be used in an iterative framework to bicluster data and assist in the synthesis of drug compounds, respectively. We demonstrate the capability of the proposed optimal re-ordering methods on several data sets from both systems biology and molecular discovery studies and compare our results to other clustering techniques when applicable.
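As a rough illustration of what "optimal re-ordering" of a data matrix means, the sketch below finds the row permutation that minimizes the summed squared difference between adjacent rows. It is not the authors' MILP formulation: the objective function, the exhaustive search, and the names `adjacent_cost` and `optimal_row_order` are assumptions made for this example, and the toy data are invented. Brute force is only feasible for a handful of rows, which is why a rigorous method needs a mathematical programming model instead.

```python
from itertools import permutations


def adjacent_cost(matrix, order):
    """Sum of squared differences between consecutive rows in `order`."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(matrix[i], matrix[j]))
        for i, j in zip(order, order[1:])
    )


def optimal_row_order(matrix):
    """Exhaustively search every row permutation (tiny inputs only).

    Hypothetical helper for illustration; a real solver would replace
    this factorial-time enumeration with an optimization model.
    """
    return min(
        permutations(range(len(matrix))),
        key=lambda order: adjacent_cost(matrix, order),
    )


# Invented toy data: rows 0 and 2 are similar, as are rows 1 and 3.
data = [
    [0.0, 0.1, 0.9],
    [5.0, 5.2, 4.8],
    [0.2, 0.0, 1.0],
    [5.1, 5.0, 5.0],
]

order = optimal_row_order(data)
reordered = [data[i] for i in order]
```

In the re-ordered matrix, similar rows end up next to each other, so cluster boundaries can be read off where the difference between adjacent rows is large.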

Original language: English (US)
Pages (from-to): 19-42
Number of pages: 24
Journal: Biophysical Reviews and Letters
Volume: 3
Issue number: 1-2
State: Published - 2008

All Science Journal Classification (ASJC) codes

  • Biophysics
  • Structural Biology
  • Molecular Biology

Keywords

  • "In silico" synthesis
  • Data clustering
  • Molecular scaffold
