274126 Incremental Parameter Estimation and Ensemble Kinetic Modeling of Metabolic Networks

Wednesday, October 31, 2012: 3:15 PM
Somerset East (Westin)
Gengjie Jia, Chemical and Biomolecular Engineering, Singapore MIT Alliance, Singapore, Singapore, Gregory N. Stephanopoulos, Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA and Rudiyanto Gunawan, Institute for Chemical and Bioengineering, ETH Zurich, Zurich, Switzerland

The estimation of unknown kinetic parameters from time-series measurements of biological molecules is a major bottleneck in the ODE model building process in systems biology and metabolic engineering (Chou and Voit, 2009). The majority of current estimation methods involve simultaneous (single-step) parameter identification, where model prediction errors are minimized over the entire parameter space. These methods often rely on global optimization methods, such as simulated annealing, genetic algorithms and other evolutionary approaches (Daisuke and Horton, 2006; Gonzalez, et al., 2007; Kikuchi, et al., 2003; Kimura, et al., 2005; Noman and Iba, 2007). The problem of obtaining the best-fit parameter estimates, however, is typically ill-posed due to issues related to data informativeness, problem formulation and parameter correlation, all of which contribute to the lack of complete parameter identifiability (Srinath and Gunawan, 2010). Moreover, finding the global minimum of model residuals over a high-dimensional parameter space is challenging and can become prohibitively expensive to perform on a computer workstation, even for tens of parameters.

Here, we consider the modeling of cellular metabolism using the canonical power-law formalism, specifically the generalized mass action (GMA) systems (Savageau, 1969a; Savageau, 1969b). The power-law formalism has many advantages, which have been detailed elsewhere (Chou and Voit, 2009; Voit, 2000). Notably, power laws have a relatively simple structure that permits algebraic manipulation in the logarithmic scale, yet they are capable of describing essentially any nonlinearity. Regulatory interactions among metabolites can also be described straightforwardly through the kinetic order parameters, establishing an equivalence between structural identification and parametric estimation. However, the number of parameters increases proportionally with the number of metabolites and fluxes, leading to a large-scale parameter identification problem, one where single-step estimation methods often struggle to converge.
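To make the power-law structure concrete, the following is a minimal sketch of GMA fluxes, assuming the standard form v_j = gamma_j * prod_i X_i^f_ji with dX/dt = S v. The names `gma_fluxes`, `gma_rhs`, `gamma`, `f` and `S` are illustrative choices, not taken from the abstract.

```python
import numpy as np

def gma_fluxes(x, gamma, f):
    """Power-law fluxes v_j = gamma_j * prod_i x_i ** f[j, i].

    x     : (m,) positive metabolite concentrations
    gamma : (n,) rate constants (assumed positive)
    f     : (n, m) kinetic orders; f[j, i] = 0 means metabolite i
            does not influence flux j
    """
    x = np.asarray(x, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    f = np.asarray(f, dtype=float)
    # x ** f broadcasts x across the rows of f, giving x[i] ** f[j, i]
    return gamma * np.prod(x ** f, axis=1)

def gma_rhs(x, S, gamma, f):
    """Right-hand side of the GMA model, dX/dt = S @ v(X)."""
    return np.asarray(S, dtype=float) @ gma_fluxes(x, gamma, f)
```

Taking logarithms gives log v_j = log gamma_j + sum_i f_ji log X_i, which is linear in the parameters. A zero kinetic order removes a regulatory interaction, which is why estimating kinetic orders also amounts to identifying structure.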

As ODE integrations constitute a major part of the computational cost in typical ODE parameter estimation (Voit and Almeida, 2004), alternative formulations have been proposed that avoid these integrations either completely (Tsai and Wang, 2005; Voit and Almeida, 2004) or partially (Jia, et al., 2011; Kimura, et al., 2005; Maki, et al., 2002). In particular, computational cost can be significantly reduced by decomposing the estimation problem into two phases, starting with the calculation of dynamic reaction rates or fluxes from the slopes of concentration data, followed by least-squares regression of the kinetic parameters (Bardow and Marquardt, 2004; Goel, et al., 2008; Marquardt, et al., 2006). In this case, the final parameter estimation is done one flux at a time, each involving only a handful of parameters, and thus the global minimum solution can be either computed analytically (for example, when using log-linear power-law flux functions) or determined efficiently. Moreover, as the first estimation phase (flux estimation) depends only on the assumed topology of the metabolic network, the flux estimates can subsequently be used to guide the selection of the most appropriate flux functions for the second phase, or to detect inconsistencies in the assumed topology of the network separately from the flux equations (Goel, et al., 2008). However, the application of this method requires the number of metabolites to be equal to or larger than that of fluxes, so that the flux estimation can result in a unique solution. Since the reverse situation is more commonly encountered in typical metabolic networks, a generalization of this incremental estimation approach is the main focus of this study.
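The two-phase decomposition described above can be sketched as follows for the uniquely solvable case. This is an illustration under stated assumptions, not the authors' implementation: slopes are approximated by finite differences, all fluxes and concentrations are assumed positive so logarithms exist, and the function names are hypothetical.

```python
import numpy as np

def slopes_from_data(t, X):
    """Approximate dX/dt at each time point by finite differences.

    t : (K,) time points, X : (K, m) concentrations. Noisy data would
    normally be smoothed first; that step is omitted here.
    """
    return np.gradient(X, t, axis=0)

def estimate_fluxes(S, slopes):
    """Phase 1: solve S v(t_k) = dX/dt(t_k) at every time point.

    S : (m, n) stoichiometric matrix, slopes : (K, m). Returns (K, n).
    The solution is unique only if S has full column rank (n <= m).
    """
    S = np.asarray(S, dtype=float)
    if np.linalg.matrix_rank(S) < S.shape[1]:
        raise ValueError("S lacks full column rank: fluxes are not unique")
    V, *_ = np.linalg.lstsq(S, np.asarray(slopes, dtype=float).T, rcond=None)
    return V.T

def fit_power_law(X, v):
    """Phase 2: fit one flux by linear least squares in log space.

    Model: log v = log gamma + sum_i f_i * log X_i.
    X : (K, m) positive concentrations, v : (K,) positive flux values.
    Returns (gamma, f) with f of shape (m,).
    """
    A = np.column_stack([np.ones(len(X)), np.log(X)])
    coef, *_ = np.linalg.lstsq(A, np.log(v), rcond=None)
    return np.exp(coef[0]), coef[1:]
```

Each call to `fit_power_law` handles a single flux, so the regression stays small regardless of network size. The rank check in `estimate_fluxes` marks where this sketch stops applying, namely when fluxes outnumber metabolites.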

As noted above, the new parameter estimation method in this work is built on the concept of incremental identification (Bardow and Marquardt, 2004; Marquardt, et al., 2006) or the dynamical flux estimation (DFE) method (Goel, et al., 2008; Voit, et al., 2009), and is designed to handle the degrees of freedom associated with flux estimation from time-course concentration data. Specifically, two parameter estimation formulations are proposed for two different purposes. The first follows the most common formulation, in which model prediction errors are minimized and a single “optimal” parameter set is produced. The second formulation is aimed at generating an ensemble of parameter estimates and models, where each parameter combination fits the noisy concentration data equally well. The application of the first formulation to GMA models of a generic branched metabolic pathway and the L. lactis glycolytic pathway (Voit, et al., 2006) demonstrates the significant improvement in numerical efficiency offered by the proposed incremental approach: computational times were reduced by over two orders of magnitude in comparison with simultaneous (single-step) estimation methods. Meanwhile, in the second formulation, the ensemble of parameter estimates for the two GMA models above is constructed efficiently using adaptive Monte Carlo and multiple ellipsoid sampling (Zamora-Sillero, et al., 2011). During this construction, the time-varying concentration simulations corresponding to the ensemble of models (or parameter estimates) are also produced. The ability to create such an ensemble will necessitate the development of a new framework for its use in metabolic engineering, which is a topic of our future work.
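The degrees of freedom mentioned above can be illustrated with a short sketch. The abstract does not spell out how they are handled, so the split into "independent" and "dependent" fluxes below is an assumption made for illustration, and the function names are hypothetical.

```python
import numpy as np

def flux_degrees_of_freedom(S):
    """Number of fluxes left undetermined by S v = dX/dt."""
    S = np.asarray(S, dtype=float)
    return S.shape[1] - np.linalg.matrix_rank(S)

def dependent_fluxes(S, slopes, indep_idx, v_indep):
    """Solve for the remaining fluxes once some fluxes are fixed.

    S         : (m, n) stoichiometric matrix
    slopes    : (K, m) concentration slopes dX/dt
    indep_idx : column indices of the fluxes whose values are supplied
    v_indep   : (K, len(indep_idx)) values of those fluxes
    Returns (dep_idx, V_dep) with V_dep of shape (K, len(dep_idx)).
    """
    S = np.asarray(S, dtype=float)
    indep = list(indep_idx)
    chosen = set(indep)
    dep_idx = [j for j in range(S.shape[1]) if j not in chosen]
    S_I, S_D = S[:, indep], S[:, dep_idx]
    if np.linalg.matrix_rank(S_D) < len(dep_idx):
        raise ValueError("dependent fluxes are not uniquely determined")
    # Move the known fluxes to the right-hand side: S_D v_D = dX/dt - S_I v_I
    rhs = np.asarray(slopes, dtype=float).T - S_I @ np.asarray(v_indep, dtype=float).T
    V_dep, *_ = np.linalg.lstsq(S_D, rhs, rcond=None)
    return dep_idx, V_dep.T
```

For a branched pathway with two metabolites and four fluxes, `flux_degrees_of_freedom` returns 2: two fluxes must be specified by other means (for instance from candidate kinetic parameters) before the rest follow from the slope data.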

Bardow, A. and Marquardt, W. (2004) Incremental and simultaneous identification of reaction kinetics: methods and comparison, Chemical Engineering Science, 59, 2673-2684.

Chou, I.C. and Voit, E.O. (2009) Recent developments in parameter estimation and structure identification of biochemical and genomic systems, Math Biosci, 219, 57-83.

Daisuke, T. and Horton, P. (2006) Inference of scale-free networks from gene expression time series, Journal of bioinformatics and computational biology, 4, 503-514.

Goel, G., Chou, I.C. and Voit, E.O. (2008) System estimation from metabolic time-series data, Bioinformatics, 24, 2505-2511.

Gonzalez, O.R., et al. (2007) Parameter estimation using Simulated Annealing for S-system models of biochemical networks, Bioinformatics, 23, 480-48. 

Jia, G., Stephanopoulos, G.N. and Gunawan, R. (2011) Parameter estimation of kinetic models from metabolic profiles: two-phase dynamic decoupling method, Bioinformatics, 27, 1964-1970.

Kikuchi, S., et al. (2003) Dynamic modeling of genetic networks using genetic algorithm and S-system, Bioinformatics, 19, 643-650.

Kimura, S., et al. (2005) Inference of S-system models of genetic networks using a cooperative coevolutionary algorithm, Bioinformatics, 21, 1154-1163. 

Maki, Y., et al. (2002) Inference of genetic network using the expression profile time course data of mouse P19 cells, Genome Informatics, 13, 382-383. 

Marquardt, W., Brendel, M. and Bonvin, D. (2006) Incremental identification of kinetic models for homogeneous reaction systems, Chemical Engineering Science, 61, 5404-5420. 

Noman, N. and Iba, H. (2007) Inferring gene regulatory networks using differential evolution with local search heuristics, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 4, 634-647.

Savageau, M.A. (1969a) Biochemical systems analysis. I. Some mathematical properties of the rate law for the component enzymatic reactions, J Theor Biol, 25, 365-369.

Savageau, M.A. (1969b) Biochemical systems analysis. II. The steady-state solutions for an n-pool system using a power-law approximation, J Theor Biol, 25, 370-379.

Srinath, S. and Gunawan, R. (2010) Parameter identifiability of power-law biochemical system models, J Biotechnol, 149, 132-140. 

Tsai, K.Y. and Wang, F.S. (2005) Evolutionary optimization with data collocation for reverse engineering of biological networks, Bioinformatics, 21, 1180-1188.

Voit, E.O. (2000) Computational analysis of biochemical systems: a practical guide for biochemists and molecular biologists. Cambridge University Press, New York.

Voit, E.O. and Almeida, J. (2004) Decoupling dynamical systems for pathway identification from metabolic profiles, Bioinformatics, 20, 1670-1681.

Voit, E.O., et al. (2006) Regulation of glycolysis in Lactococcus lactis: an unfinished systems biological case study, Syst Biol (Stevenage), 153, 286-298.

Voit, E.O., et al. (2009) Estimation of metabolic pathway systems from different data sources, IET Syst Biol, 3, 513-522.

Zamora-Sillero, E., et al. (2011) Efficient characterization of high-dimensional parameter spaces for systems biology, BMC Syst Biol, 5, 142.
