- 4:45 PM
582f

Optimization Based Automated Curation of Metabolic Reconstructions

Vinay S. Kumar (1), Madhukar S. Dasika (2), Priti Pharkya, Anthony Burgard (3), and Costas D. Maranas (4). (1) Department of Industrial Engineering, The Pennsylvania State University, 147A, Fenske Lab, University Park, PA 16802, (2) Department of Chemical Engineering, Penn State University, 147A Fenske Lab, University Park, PA 16802, (3) Genomatica, Inc., 5405 Morehouse Drive Suite 210, San Diego, CA 92121, (4) Department of Chemical Engineering, The Pennsylvania State University, 112, Fenske Lab, University Park, PA 16802

Currently, there exist tens of different microbial and eukaryotic metabolic reconstructions (e.g., Escherichia coli, Saccharomyces cerevisiae, Bacillus subtilis), with many more under development. All of these reconstructions are inherently incomplete, with some functionalities missing due to the lack of experimental and/or homology information. A key challenge in the automated generation of genome-scale reconstructions is the elucidation of these gaps and the subsequent generation of hypotheses to bridge them. In this work, we use two different types of information to identify and eliminate network gaps.

The first is the identification of “unreachable” reactions that are disconnected from the rest of the network and thus incapable of carrying flux under any uptake conditions. These reactions, which we refer to as blocked, cannot carry flux because either one of the reactants cannot be formed or one of the products cannot be further converted or transported out of the cell. We developed an optimization framework that identifies how to fix the maximum number of these inconsistencies with a minimal number of additions, by appending to the organism's stoichiometric reconstruction reactions from a multi-species universal database (~3,900 reactions) developed previously in our group. Reactions whose corresponding genes show homology to one or more genes on the genome of the curated model are preferentially selected. Preliminary results from our model refinement pipeline reveal that 49% of the blocked reactions in Escherichia coli can become “reachable” following the incorporation of 72 additional reactions from the database. We also find that the number of added reactions required to unblock a reaction varies from as few as one to as many as 16.
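The notion of a blocked reaction and of a minimal fix can be illustrated with a small sketch. It is not the optimization framework described above: it swaps in a simpler dead-end propagation test (a metabolite with no producer or no consumer blocks every reaction that touches it) and a brute-force search over candidate additions. The toy network, the reaction names, and the functions `find_blocked` and `minimal_fill` are all hypothetical.

```python
from itertools import combinations


def find_blocked(reactions):
    """Return reactions that fail a dead-end propagation test.

    reactions: name -> (stoichiometry dict, reversible flag); negative
    coefficients are consumed, positive coefficients are produced.
    This is a topological heuristic, not an LP-based flux test.
    """
    active = set(reactions)
    changed = True
    while changed:
        changed = False
        producers, consumers = {}, {}
        for name in active:
            stoich, reversible = reactions[name]
            for met, coef in stoich.items():
                if coef > 0 or reversible:
                    producers.setdefault(met, set()).add(name)
                if coef < 0 or reversible:
                    consumers.setdefault(met, set()).add(name)
        for met in set(producers) | set(consumers):
            made = producers.get(met, set())
            used = consumers.get(met, set())
            # A metabolite needs a producer and a distinct consumer.
            if not made or not used or len(made | used) < 2:
                dead = (made | used) & active
                if dead:
                    active -= dead
                    changed = True
    return set(reactions) - active


def minimal_fill(model, database, target, max_size=3):
    """Smallest set of database reactions that unblocks `target` (brute force)."""
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(database), size):
            trial = dict(model)
            trial.update({name: database[name] for name in combo})
            if target not in find_blocked(trial):
                return combo
    return None


# Hypothetical toy model: uptake of A, secretion of C.
model = {
    "EX_A": ({"A": 1}, False),
    "R1": ({"A": -1, "B": 1}, False),
    "R2": ({"B": -1, "C": 1}, False),
    "EX_C": ({"C": -1}, False),
    "R3": ({"B": -1, "D": 1}, False),  # D is never consumed
    "R4": ({"E": -1, "C": 1}, False),  # E is never produced
}
# Hypothetical stand-in for the universal reaction database.
database = {
    "U1": ({"D": -1, "C": 1}, False),
    "U2": ({"F": -1, "E": 1}, False),
    "U3": ({"F": 1}, False),
}

blocked = find_blocked(model)
fix_r3 = minimal_fill(model, database, "R3")
fix_r4 = minimal_fill(model, database, "R4")
```

In this toy case `R3` is unblocked by one added reaction and `R4` needs two, mirroring the observation that the number of additions required per blocked reaction varies. A genome-scale version would replace the enumeration with an optimization formulation and could weight candidates by homology evidence.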

Subsequently, the gap-filled model is further refined by comparing in silico predictions to experimentally observed growth phenotypes. Inconsistencies arise when the model predicts growth in silico but the environmental and/or genetic perturbation is lethal in vivo, or vice versa. To resolve cases where the model predicts no growth but experiment shows growth, we identify which reactions must be added to the model while preserving all no-growth predictions that agree with experiment. Similarly, to resolve cases where the model predicts growth but experiment shows lethality, we identify which reactions to eliminate under certain conditions (e.g., anaerobic/aerobic, presence of certain substrates) while preserving all growth predictions that agree with experiment. The proposed frameworks are demonstrated on stoichiometric reconstructions for E. coli, S. cerevisiae and B. subtilis.
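The two kinds of inconsistency can be separated with a simple bookkeeping step before any model change is attempted. The sketch below assumes growth predictions and observations are already available as booleans; the condition names and the function `triage` are hypothetical, and the sketch does not perform the reaction addition or removal itself.

```python
def triage(phenotypes):
    """Sort conditions by how prediction and experiment disagree.

    phenotypes: condition -> (predicted_growth, observed_growth).
    """
    result = {"add_reactions": [], "remove_reactions": [], "consistent": []}
    for condition in sorted(phenotypes):
        predicted, observed = phenotypes[condition]
        if observed and not predicted:
            # Model predicts no growth, experiment shows growth.
            result["add_reactions"].append(condition)
        elif predicted and not observed:
            # Model predicts growth, experiment shows lethality.
            result["remove_reactions"].append(condition)
        else:
            # Agreement; any fix must leave this prediction unchanged.
            result["consistent"].append(condition)
    return result


# Hypothetical knockout/medium conditions.
phenotypes = {
    "geneA_knockout_aerobic": (False, True),
    "geneB_knockout_anaerobic": (True, False),
    "wild_type_glucose": (True, True),
    "geneC_knockout_aerobic": (False, False),
}
report = triage(phenotypes)
```

The `consistent` list matters as much as the other two: it holds the predictions that any proposed addition or removal has to preserve.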