TY - JOUR
T1 - Optimal methods for re-ordering data matrices in systems biology and drug discovery applications
AU - DiMaggio, Peter A.
AU - McAllister, Scott R.
AU - Floudas, Christodoulos A.
AU - Feng, Xiao Jiang
AU - Rabinowitz, Joshua D.
AU - Rabitz, Herschel A.
PY - 2008
Y1 - 2008
N2 - The analysis of large-scale data sets via clustering techniques is utilized in a number of applications. Many of the methods developed employ local search or heuristic strategies for identifying the "best" arrangement of features according to some metric. In this article, we present rigorous clustering methods based on the optimal re-ordering of data matrices. Distinct mixed-integer linear programming (MILP) models are utilized for the clustering of (a) dense data matrices, such as gene expression data, and (b) sparse data matrices, which are commonly encountered in the field of drug discovery. Both methods can be used in an iterative framework to bicluster data and assist in the synthesis of drug compounds, respectively. We demonstrate the capability of the proposed optimal re-ordering methods on several data sets from both systems biology and molecular discovery studies and compare our results to other clustering techniques when applicable.
AB - The analysis of large-scale data sets via clustering techniques is utilized in a number of applications. Many of the methods developed employ local search or heuristic strategies for identifying the "best" arrangement of features according to some metric. In this article, we present rigorous clustering methods based on the optimal re-ordering of data matrices. Distinct mixed-integer linear programming (MILP) models are utilized for the clustering of (a) dense data matrices, such as gene expression data, and (b) sparse data matrices, which are commonly encountered in the field of drug discovery. Both methods can be used in an iterative framework to bicluster data and assist in the synthesis of drug compounds, respectively. We demonstrate the capability of the proposed optimal re-ordering methods on several data sets from both systems biology and molecular discovery studies and compare our results to other clustering techniques when applicable.
KW - "In silico" synthesis
KW - Data clustering
KW - Molecular scaffold
UR - http://www.scopus.com/inward/record.url?scp=51349166621&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=51349166621&partnerID=8YFLogxK
U2 - 10.1142/s1793048008000605
DO - 10.1142/s1793048008000605
M3 - Article
AN - SCOPUS:51349166621
SN - 1793-0480
VL - 3
SP - 19
EP - 42
JO - Biophysical Reviews and Letters
JF - Biophysical Reviews and Letters
IS - 1-2
ER -