A Bayesian Approach to Mathematical Model Building
Shou-Huan Hsu1, James M. Caruthers2, W. Nicholas Delgass3, Venkat Venkatasubramanian3, Gary E. Blau4, Michael E. Lasinski5, and Orcun Seza6. (1) Department of Chemical Engineering, Purdue University, 480 Stadium Mall Drive, West Lafayette, IN 47906, (2) Chemical Engineering, Purdue University, 480 Stadium Mall Drive, West Lafayette, IN 47907, (3) School of Chemical Engineering, Purdue University, 480 Stadium Mall Drive, West Lafayette, IN 47906, (4) e-Enterprise Center, Discovery Park, Purdue University, West Lafayette, IN 47906, (5) Purdue University, West Lafayette, IN 47906, (6) e-Enterprise, Discovery Park, Purdue University, West Lafayette, IN 47906
Mathematical models are important tools for understanding and predicting the behavior of physicochemical systems. Until recently, the frequentist approach to statistical model building was the only practical way to deal with scarce experimental data and with the computational challenge of both identifying adequate nonlinear models and characterizing their parameters. The advent of high-throughput experimentation, coupled with breakthroughs in Monte Carlo sampling, has made it possible to cast the model building problem in a Bayesian framework that mimics the thinking process of experienced engineers. This paper presents a comprehensive and rigorous approach to fundamental model building that combines the best of the Bayesian and frequentist approaches. Initially, Design of Experiments and Exploratory Data Analysis procedures are combined with literature data and the subjective knowledge of the experimentalist to postulate mathematical models, error models, and their associated prior probability distributions. Monte Carlo or Markov Chain Monte Carlo sampling procedures are then used to calculate the posterior model probabilities, which serve to discriminate among the postulated models or to prompt the choice of new ones. If adequate discrimination cannot be obtained, a novel nonlinear design of experiments strategy is used to generate additional experimental data that improve discrimination and suggest alternative models. Once discrimination is achieved, a nonlinear lack-of-fit test is introduced to determine model adequacy by simultaneously sampling from the joint posterior distribution of the model parameters and from the error model distribution. Using the best model selected, highest probability density (HPD) intervals are determined for the individual parameters, and HPD regions are determined for all pairs of model parameters. A Genetic Algorithm based approach is introduced to design experiments that reduce the uncertainty in the joint posterior probability HPD regions. 
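The model discrimination step can be illustrated with a minimal sketch. It is not the authors' implementation: the two rate expressions, the uniform prior ranges, the known Gaussian error standard deviation `sigma`, the equal prior model probabilities, and the synthetic data are all assumptions chosen for illustration. The marginal likelihood of each model is estimated by plain Monte Carlo, averaging the likelihood over draws from the prior, and the posterior model probabilities follow from normalizing those estimates.

```python
import numpy as np

# Two hypothetical Langmuir-Hinshelwood-type rate forms (illustrative only).
def rate_single_site(p, k, K):
    return k * K * p / (1.0 + K * p)

def rate_dual_site(p, k, K):
    return k * K * p / (1.0 + K * p) ** 2

def log_likelihood(model, theta, p, r_obs, sigma):
    """Gaussian log-likelihood for each row (k, K) of theta."""
    k, K = theta[:, 0:1], theta[:, 1:2]
    resid = r_obs[None, :] - model(p[None, :], k, K)
    const = r_obs.size * np.log(sigma * np.sqrt(2.0 * np.pi))
    return -0.5 * np.sum((resid / sigma) ** 2, axis=1) - const

def log_marginal_likelihood(model, p, r_obs, sigma, rng, n_draws=20000):
    """Monte Carlo estimate: average the likelihood over prior draws."""
    # Assumed prior: k and K independent and uniform on [0.1, 5.0].
    theta = rng.uniform([0.1, 0.1], [5.0, 5.0], size=(n_draws, 2))
    ll = log_likelihood(model, theta, p, r_obs, sigma)
    m = ll.max()  # log-sum-exp shift for numerical stability
    return m + np.log(np.mean(np.exp(ll - m)))

def posterior_model_probabilities(models, p, r_obs, sigma, seed=0):
    """Posterior model probabilities under equal prior model probabilities."""
    rng = np.random.default_rng(seed)
    log_ev = np.array([log_marginal_likelihood(f, p, r_obs, sigma, rng)
                       for f in models])
    w = np.exp(log_ev - log_ev.max())
    return w / w.sum()

# Synthetic data generated from the single-site form (assumed values).
data_rng = np.random.default_rng(1)
p = np.linspace(0.2, 4.0, 12)
sigma = 0.05
r_obs = rate_single_site(p, 2.0, 1.5) + data_rng.normal(0.0, sigma, p.size)

probs = posterior_model_probabilities(
    [rate_single_site, rate_dual_site], p, r_obs, sigma)
print(probs)
```

Averaging over prior draws is the simplest estimator and becomes inefficient when the posterior is much narrower than the prior; in that regime a Markov Chain Monte Carlo based estimator is the usual alternative.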
Finally, a sampling procedure is described to properly represent the uncertainties in predictions made from the model. The entire procedure is illustrated with a heterogeneous catalytic model building exercise in which three Langmuir-Hinshelwood models are postulated to describe reaction rate data generated in a differential reactor with four sets of reactor conditions and three input feeds. Prior probabilities are generated from literature references and from fractional factorial experiments. New experiments are then designed and analyzed using Bayesian methods to discriminate among the models and improve the parameter estimates. The example illustrates the model building procedure in detail. Comparisons will be made with other approaches to justify the significant computational burden required to obtain rigorous mathematical models with confidence-region-based parameter estimates.
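As a rough illustration of the HPD intervals and of the sampling-based treatment of prediction uncertainty, the sketch below computes the shortest interval containing a given fraction of a set of samples and applies it to predictive draws. The posterior draws here are a made-up multivariate normal stand-in for MCMC output, and the single-site rate form, the draws of the error standard deviation, and the condition `p_new` are likewise assumptions, not results from the paper.

```python
import numpy as np

def hpd_interval(samples, mass=0.95):
    """Shortest interval containing `mass` of the samples (unimodal case)."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    m = int(np.ceil(mass * n))            # samples the interval must contain
    widths = x[m - 1:] - x[: n - m + 1]   # width of every window of m samples
    i = int(np.argmin(widths))
    return x[i], x[i + m - 1]

def predictive_draws(theta_draws, sigma_draws, p_new, rng):
    """One predicted rate per posterior draw: model value plus sampled error."""
    k, K = theta_draws[:, 0], theta_draws[:, 1]
    mean_rate = k * K * p_new / (1.0 + K * p_new)  # assumed single-site form
    return mean_rate + rng.normal(0.0, sigma_draws)

rng = np.random.default_rng(0)

# Hypothetical stand-in for posterior draws of (k, K) and the error std. dev.
theta_draws = rng.multivariate_normal(
    [2.0, 1.5], [[0.010, -0.008], [-0.008, 0.020]], size=5000)
sigma_draws = np.abs(rng.normal(0.05, 0.005, size=5000))

k_lo, k_hi = hpd_interval(theta_draws[:, 0])   # HPD interval for parameter k
y = predictive_draws(theta_draws, sigma_draws, p_new=3.0, rng=rng)
y_lo, y_hi = hpd_interval(y)                   # HPD interval for a prediction
print((k_lo, k_hi), (y_lo, y_hi))
```

Because each predictive draw pairs one parameter draw with one error draw, the resulting interval reflects both parameter uncertainty and experimental error, which a prediction at the point estimate alone would understate.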