- 9:30 AM
505d

Systematic Identification of Conserved Metabolites in GC-MS Data for Metabolomics and Biomarker Discovery

Mark Styczynski (1), Joel Moxley (1), Lily V. Tong (2), Jason L. Walther (3), and Gregory N. Stephanopoulos (3). (1) Chemical Engineering, MIT, 66-264 MIT 77 Massachusetts Ave, Cambridge, MA 02139, (2) Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave., 56-439, Cambridge, MA 02139, (3) Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave., 56-439, Cambridge, MA 02139

The goal of metabolomics, the metabolite analog of genomics and proteomics, is to measure the concentrations (or “metabolite profiles”) of as many cellular metabolites as possible, usually with applications to functional genomics. Certain aspects of metabolomics suggest that exhaustive metabolite profiling may be possible: the number of known metabolites present in many organisms (e.g., yeast) is ten to one hundred times smaller than the number of genes or proteins [1-3], and the cost of measuring these metabolites is significantly lower by comparison. To date, the coupling of metabolomic data with other cell-wide data has yielded valuable insight into underlying biochemical processes and has contributed to numerous advances in functional genomics [4-6]. However, obstacles to exhaustive metabolite profiling persist, one of the most significant being the chemical diversity of metabolites. Unlike DNA or proteins, metabolites do not adhere to a subunit-based chemistry, so assaying for many metabolites (with many chemistries) simultaneously is difficult. Gas chromatography-mass spectrometry (GC-MS) is one method frequently used to assay for a variety of metabolites, and the aim of this work is to improve the downstream analysis of GC-MS data independent of upstream experimental protocols.

Analysis of metabolomic profiling data from GC-MS measurements usually relies upon reference libraries of metabolite mass spectra to structurally identify and track metabolites. In general, techniques to enumerate and track unidentified metabolites are non-systematic and require manual curation. Here we present SpectConnect, a method and software implementation freely available at http://spectconnect.mit.edu, that can systematically detect components that are conserved across samples without the need for a reference library or manual curation. We validate this approach by correctly identifying the components in a known mixture and the discriminating components in a spiked mixture. We demonstrate an application of this approach with a brief analysis of the Escherichia coli metabolome. We also present recent results of our efforts to better characterize the metabolome of Saccharomyces cerevisiae using SpectConnect.
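To make the idea of library-free detection of conserved components concrete, the following is a minimal illustrative sketch; it is not SpectConnect's actual algorithm, which this abstract does not describe. It assumes each sample has already been deconvoluted into components, each given as a (retention time, mass spectrum) pair, and it treats two components as the same when their retention times are close and their spectra have high cosine similarity. The function names, the thresholds `min_sim` and `max_rt_diff`, and the toy data are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two spectra given as {m/z: intensity} dicts."""
    dot = sum(v * b.get(mz, 0.0) for mz, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def conserved_components(samples, min_sim=0.9, max_rt_diff=0.1):
    """Return groups of matching components that appear in every sample.

    samples: list of samples, each a list of (retention_time, spectrum).
    Each component of the first sample seeds a group; the group is kept
    only if every other sample has a close-enough match (greedy matching,
    assumed here for simplicity).
    """
    groups = []
    for rt0, spec0 in samples[0]:
        group = [(rt0, spec0)]
        for other in samples[1:]:
            # Candidates must elute within max_rt_diff of the seed.
            candidates = [(cosine(spec0, s), rt, s)
                          for rt, s in other if abs(rt - rt0) <= max_rt_diff]
            best = max(candidates, key=lambda c: c[0], default=None)
            if best is None or best[0] < min_sim:
                break  # not conserved in this sample
            group.append((best[1], best[2]))
        else:
            groups.append(group)
    return groups

# Toy data (hypothetical): one component shared by all three samples.
samples = [
    [(5.00, {73: 100, 147: 50}), (7.20, {57: 80, 85: 40})],
    [(5.03, {73: 95, 147: 55}), (9.00, {91: 100})],
    [(4.98, {73: 102, 147: 48})],
]
groups = conserved_components(samples)
for group in groups:
    print([rt for rt, _ in group])
```

No reference library is consulted at any point: components are compared only with one another, so a conserved but unidentified metabolite is still tracked across samples. A practical implementation would also need to handle retention-time drift and ambiguous matches, which this sketch ignores.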

1. Forster, J., Famili, I., Fu, P., Palsson, B.O. & Nielsen, J. Genome-scale reconstruction of the Saccharomyces cerevisiae metabolic network. Genome Res 13, 244-253 (2003).

2. Kanehisa, M. & Goto, S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res 28, 27-30 (2000).

3. Mewes, H.W. et al. Overview of the yeast genome. Nature 387, 7-65 (1997).

4. Raamsdonk, L.M. et al. A functional genomics strategy that uses metabolome data to reveal the phenotype of silent mutations. Nat Biotechnol 19, 45-50 (2001).

5. Weckwerth, W., Loureiro, M.E., Wenzel, K. & Fiehn, O. Differential metabolic networks unravel the effects of silent plant phenotypes. Proc Natl Acad Sci U S A 101, 7809-7814 (2004).

6. Hirai, M.Y. et al. Integration of transcriptomics and metabolomics for understanding of global responses to nutritional stresses in Arabidopsis thaliana. Proc Natl Acad Sci U S A 101, 10205-10210 (2004).



Web Page: spectconnect.mit.edu