We test our proposed algorithm on two large datasets of gene expression patterns from the yeast Saccharomyces cerevisiae. The first dataset comes from experiments designed to examine the role of Ras2 and Gpa2 in effecting transcriptional changes in the response of yeast cells to glucose, and consists of 5652 genes, each with 24 feature points. Glucose, however, affects yeast well beyond the Ras signaling pathway: besides stimulating the Ras and Gpa2 pathways, glucose addition affects at least two additional pathways. The second dataset is derived from experiments designed to study the roles of the Ras, Snf1, and Sch9 proteins in effecting transcriptional changes in the yeast cellular response to glucose. It consists of 5657 genes, each with 75 feature points. We show that our method saturates after a certain number of iterations, yielding a higher level of biological coherence overall and a significantly improved proportion of genes placed into quality clusters. As such, we believe our work to be valuable to cluster analysis, in particular to the detailed study of cluster members for the purpose of identifying gene regulatory modules and predicting common motif structure.
-Tan, M. P.; Broach, J. R.; Floudas, C. A.; A Novel Mixed-Integer Nonlinear Optimization-Based Clustering Approach: Global Optimum Search with Enhanced Positioning (EP_GOS_Clust) and Determination of Optimal Number of Clusters; 2006; In Preparation
-Floudas, C. A.; Nonlinear and Mixed-Integer Optimization: Fundamentals and Applications; Oxford University Press; 1995
-Floudas, C. A.; Aggarwal, A.; Ciric, A. R.; Global Optimum Search for Non Convex NLP and MINLP Problems; Comp. & Chem. Eng.; 13(10); 1989; pp. 1117-1132
-Tan, M. P.; Broach, J. R.; Floudas, C. A.; Microarray Data Mining: A Novel Optimization-Based Iterative Approach to Uncover Biologically Coherent Structures; 2006; In Preparation