Mixture Model On Graph: A Bayesian Approach to Understanding Metabolic Effects of Bioengineering Processes

Tuesday, November 10, 2009: 1:10 PM
Cheekwood H (Gaylord Opryland Hotel)

Josselin Noirel, The University of Sheffield, Sheffield, United Kingdom
Sy Ow, ChELSI, BESG, Chemical and Process Engineering, The University of Sheffield, Sheffield, United Kingdom
J Pandhal, ChELSI, BESG, Chemical and Process Engineering, The University of Sheffield, Sheffield, United Kingdom
G Sanguinetti, ChELSI, BESG, Chemical and Process Engineering, The University of Sheffield, Sheffield, United Kingdom
PC Wright, ChELSI, BESG, Chemical and Process Engineering, The University of Sheffield, Sheffield, United Kingdom

Systems-level understanding in systems biology chiefly rests upon the generation of high-throughput data and the system-scale network description of elements connected by functional relationships (protein-protein interactions, reaction chains, regulation). A particularly useful layer in the systems biology framework is the proteome, the expressed protein content of the cell.

Proteomics, the study of the proteome, has matured into a source of high-throughput data. Interpreting and organising such large volumes of data requires purpose-built techniques. Although mass-spectrometry-based proteomics has greatly improved over recent years, it usually generates sparse data (typically 10-20% of the theoretical proteome versus near 100% for the transcriptome), which are therefore more complex to analyse. iTRAQ (isobaric tags for relative and absolute quantitation) uses isobaric labels whose reporter ions are observed in the low-mass region of a tandem (MS/MS) fragmentation mass spectrum (m/z 113-121) to quantify the peptides from trypsin-digested proteomes extracted from cells grown in different experimental conditions. Ion trap mass spectrometry (MS) is typically not the method of choice for examining low-mass ions in MS/MS spectra because of the "below 1/3rd cut-off rule", which normally means that fragment ions below several hundred Da are not measured. The development of enhanced fragmentation modes on 3-dimensional quadrupole ion trap mass spectrometers nevertheless makes it possible to use an ion trap in MS/MS mode for relative quantification of the peptides, albeit with more missing peaks than one would typically expect from time-of-flight tandem mass spectrometers.
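To make the cut-off rule concrete, here is a minimal sketch of the arithmetic. It assumes the cut-off is exactly one third of the precursor m/z, whereas the real fraction depends on instrument settings. The function names and the default reporter window (taken from the 113-121 range quoted above) are illustrative only.

```python
def low_mass_cutoff(precursor_mz, fraction=1.0 / 3.0):
    """Approximate lowest fragment m/z retained by the trap.

    Assumption: the cut-off is a fixed fraction (here 1/3) of the
    precursor m/z; the true value depends on instrument settings.
    """
    return precursor_mz * fraction


def reporters_visible(precursor_mz, reporter_window=(113.0, 121.0), fraction=1.0 / 3.0):
    """True if the whole reporter-ion window lies at or above the cut-off."""
    return low_mass_cutoff(precursor_mz, fraction) <= min(reporter_window)


if __name__ == "__main__":
    # Hypothetical precursor m/z values, chosen only for illustration.
    for mz in (300.0, 450.0, 600.0, 900.0):
        print(mz, round(low_mass_cutoff(mz), 1), reporters_visible(mz))
```

Under this assumption, only precursors with fairly low m/z would leave the reporter region observable in a conventional ion trap scan. This is the limitation that the enhanced fragmentation modes are meant to relax.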

The interpretation of the quantitations at the protein level calls for a least-squares minimisation procedure that can handle missing peaks in the ion trap spectra. At the systems level, network-based techniques have only recently started to attract attention from the proteomics community and are still under development. Our approach, "Mixture model on graphs" (MMG), is an attempt to tackle this problem and to help integrate the typically sparse proteomic datasets with biological-network information, such as that provided by KEGG or MetaCyc. MMG is based on a Bayesian model of down- and up-regulation that is informed by the topology of biological networks through a conditional prior. This conditional prior relies on the coherent behaviour of enzyme expression along metabolic pathways, which has resulted from natural selection. The coherent response of the enzymes along a pathway can be seen as a solution to the problem of optimising fluxes and responses. We shall explain the details of the method and the assumptions upon which it relies, and how it can help to devise hypotheses in the context of quantitative proteomics. An experiment carried out on an Escherichia coli synthetic biology construct expressing a light-responsive circuit allows us to show that the systems approach manages to extract meaningful information from the proteomic data that cannot be recovered by naive thresholding of the data. We also present a validation of MMG through bootstrapping on Saccharomyces cerevisiae.
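The idea of a conditional prior informed by network topology can be illustrated with a toy sketch. This is not the MMG implementation: it assumes two classes (up and down) with Gaussian likelihoods centred at +mu and -mu, a prior for each protein that blends its neighbours' current beliefs with a flat 0.5, and a simple fixed-point iteration. The function name, parameters and example data are all hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def graph_mixture_posteriors(log_ratios, edges, mu=1.0, sd=0.5,
                             coupling=0.8, n_iter=50):
    """Return P(up-regulated) for each node of a toy two-class model.

    log_ratios: dict mapping node -> observed log ratio, or None when
                the protein was not quantified (sparse data).
    edges:      iterable of (node, node) pairs from the network.
    coupling:   weight given to neighbours in the conditional prior
                (an illustrative parameter, not taken from the abstract).
    """
    neighbours = {node: set() for node in log_ratios}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    p_up = {node: 0.5 for node in log_ratios}
    for _ in range(n_iter):
        updated = {}
        for node, x in log_ratios.items():
            nb = neighbours[node]
            if nb:
                nb_mean = sum(p_up[m] for m in nb) / len(nb)
                prior_up = coupling * nb_mean + (1.0 - coupling) * 0.5
            else:
                prior_up = 0.5
            if x is None:
                # Unobserved protein: belief comes from the prior alone.
                updated[node] = prior_up
            else:
                up = prior_up * normal_pdf(x, mu, sd)
                down = (1.0 - prior_up) * normal_pdf(x, -mu, sd)
                updated[node] = up / (up + down)
        p_up = updated
    return p_up


if __name__ == "__main__":
    # Hypothetical pathway fragments; values are made-up log ratios.
    data = {"A": 0.8, "B": 1.1, "C": None, "D": 0.9,
            "F": -1.0, "E": 0.1, "G": -1.2}
    links = [("A", "B"), ("B", "C"), ("C", "D"), ("F", "E"), ("E", "G")]
    for node, p in sorted(graph_mixture_posteriors(data, links).items()):
        print(node, round(p, 3))
```

In this toy setting, an unobserved protein inherits a belief from its pathway neighbours. A weakly positive measurement that sits between strongly down-regulated neighbours is pulled towards the down class, whereas thresholding the log ratio at zero would call it up.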
