Thursday, November 8, 2007 - 2:35 PM
598f

Relative Expression Reversals For Disease Diagnosis And Biological Discovery

Nathan D. Price, Chemical and Biomolecular Engineering, University of Illinois, Urbana-Champaign, 600 S. Mathews Ave., 114 RAL, Box C-3, Urbana, IL 61801

Identifying stable, predictive patterns of relative expression reversals in global data sets offers a simple yet powerful approach to disease diagnostics and biological discovery. We have used this approach to identify a highly accurate two-gene classifier that differentiates gastrointestinal stromal tumor from leiomyosarcoma, two cancers that have very similar histopathology but require very different treatment. The classifier has an estimated accuracy of 98% by leave-one-out cross validation (LOOCV) on a set of 68 patient tumors, and it also performed perfectly on an independent test set of 19 additional patients.

We have also developed a method for identifying relative expression reversals between a priori defined gene sets, which we term “Gene Set Expression Reversal Analysis.” Most pathway-level comparisons look at the enrichment of individual pathways between two sample sets (e.g., cancer vs. non-cancer, or responsive vs. non-responsive to treatment). Common examples of this approach include GO enrichment and Gene Set Enrichment Analysis. We add to these approaches by also comparing all pairs of gene sets in order to uncover switches in their relative expression between classes. As a first example case, we used a set of publicly available data comparing mesothelioma and adenocarcinoma. We found 7 pathway expression reversals between these diseases, each of which differentiated the two diseases over 90% of the time; a majority vote of the 7 gave an estimated accuracy on future data of 99% using LOOCV. Importantly, some of these pathway expression reversals revealed differences that could not be seen by evaluating pathways individually. For example, the EGFR -> MYC signaling pathway was not highly enriched in either mesothelioma or adenocarcinoma, but the ratio between the expression of this pathway and that of the ERBB2 -> FOXO3A signaling pathway was very effective at differentiating the two diseases (>90%). Our work thus far indicates that assessing relative changes between pairs of pathways will yield significant additional insights not found using methods that consider genes or gene sets individually.
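
The idea of a relative expression reversal, and of estimating its accuracy by LOOCV, can be illustrated with a small Python sketch. This is not the authors' implementation: the function names, the scoring rule (difference between classes in how often feature i is lower than feature j), the tie handling, and the toy data are all assumptions made for illustration. A "feature" here could be a single gene or, hypothetically, a summary expression value for a gene set.

```python
from itertools import combinations

def reversal_score(class0, class1, i, j):
    """P(x_i < x_j | class 0) - P(x_i < x_j | class 1) for one feature pair."""
    p0 = sum(s[i] < s[j] for s in class0) / len(class0)
    p1 = sum(s[i] < s[j] for s in class1) / len(class1)
    return p0 - p1

def top_pairs(samples, labels, k=1):
    """Return the k pairs (score, i, j) with the largest absolute score."""
    class0 = [s for s, y in zip(samples, labels) if y == 0]
    class1 = [s for s, y in zip(samples, labels) if y == 1]
    n_features = len(samples[0])
    scored = [(reversal_score(class0, class1, i, j), i, j)
              for i, j in combinations(range(n_features), 2)]
    scored.sort(key=lambda t: -abs(t[0]))
    return scored[:k]

def predict(sample, pairs):
    """Majority vote over pairs; ties go to class 1 (an arbitrary choice)."""
    votes = 0
    for score, i, j in pairs:
        is_lower = sample[i] < sample[j]
        # score > 0 means the ordering x_i < x_j is more typical of class 0
        votes_for_class0 = is_lower if score > 0 else not is_lower
        votes += 1 if votes_for_class0 else -1
    return 0 if votes > 0 else 1

def loocv_accuracy(samples, labels, k=1):
    """Hold out each sample in turn, reselect pairs on the rest, then test."""
    correct = 0
    for h in range(len(samples)):
        train_x = samples[:h] + samples[h + 1:]
        train_y = labels[:h] + labels[h + 1:]
        pairs = top_pairs(train_x, train_y, k)
        correct += predict(samples[h], pairs) == labels[h]
    return correct / len(samples)

# Toy data (made up): feature 0 < feature 1 in class 0, reversed in class 1;
# feature 2 is uninformative.
samples = [[1, 5, 3], [2, 6, 1], [1, 4, 9],
           [7, 2, 3], [8, 3, 1], [6, 1, 9]]
labels = [0, 0, 0, 1, 1, 1]

best = top_pairs(samples, labels, k=1)[0]
accuracy = loocv_accuracy(samples, labels, k=1)
```

With k=1 the sketch behaves like a two-feature classifier; an odd k greater than 1 gives a majority vote over several reversals. Note that the pair selection is repeated inside every cross-validation fold, so the held-out sample never influences which pairs are chosen.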


Web Page: www.pnas.org/cgi/content/full/104/9/3414