ABSTRACT for AIChE 2016 Spring Meeting
Stochastic Modeling using Virtual Training Sets
James C Cross III
The MathWorks, Inc.
In a typical process plant, instrumentation data are archived, providing a rich set of structured data that can be exploited for system identification, efficiency optimization, and anomaly detection. By combining these data with process simulation results, plant operating heuristics, and economic parameters, one can build a detailed operating model that predicts output and associated expenses to guide planning decisions. This is fairly common practice.
A more complicated situation arises when an organization must decide how to operate a portfolio of assets producing the same products, for example, to maximize profit over a known time horizon for a specified total output level. Examples include resource extraction from oil and gas wells, production of electricity from a pool of power generators, and water resources management. While most plants and other process equipment are designed to operate consistently at or near design capacity, market fluctuations, such as the recent downturn in oil prices, dictate the need for part-load strategies and, accordingly, for these types of analyses.
To support planning, a single answer from a predictive model is inadequate. It can stand as a reference scenario; however, diligent risk management demands that the probabilities of alternative answers, such as would arise from uncertainties in exogenous variables, be computed and compared to the reference. One could conceptually define a collection of scenarios and use the detailed models for all of the plants in the asset pool to compute the global optimum for each scenario. While this is appealing, it is invariably impractical: the level of complexity of accurate plant models, the duration of planning horizons, and the multitude of operating constraints combine to make this an exceptionally burdensome computational challenge. Is there a way around this?
The application of machine learning methodologies to real data sets has seen tremendous growth in the past decade. In this work a machine learning approach is used to extract important relationships from the optimized operating scenarios calculated by the detailed optimization model. The simplified model that results from using these virtual training sets captures the most influential factors embedded in the detailed model, and can be used to run simulations across a statistically significant number of scenarios in a reasonable time.
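The idea can be illustrated with a minimal sketch, which is not the author's implementation. Everything specific in it is an assumption made for illustration: a hypothetical pool of two assets with quadratic costs, a brute-force grid search standing in for the expensive detailed optimization model, an ordinary least-squares fit on quadratic features standing in for the machine learning step, and arbitrary distributions for the exogenous variables (fuel price factor and total demand).

```python
import numpy as np

# Hypothetical two-asset pool (all numbers are illustrative assumptions):
# cost_1(q) = A[0] * fuel * q + B[0] * q**2,  cost_2(q) = A[1] * q + B[1] * q**2
A = np.array([20.0, 25.0])
B = np.array([0.05, 0.02])
CAP = 100.0  # assumed capacity of each asset


def detailed_optimum(fuel_price, demand, n_grid=2001):
    """Stand-in for the detailed model: brute-force search for the
    minimum-cost split of total demand between the two assets."""
    q1 = np.linspace(max(0.0, demand - CAP), min(CAP, demand), n_grid)
    q2 = demand - q1
    cost = A[0] * fuel_price * q1 + B[0] * q1**2 + A[1] * q2 + B[1] * q2**2
    return cost.min()


def features(fuel_price, demand):
    """Quadratic feature map in the two exogenous variables."""
    f = np.atleast_1d(np.asarray(fuel_price, dtype=float))
    d = np.atleast_1d(np.asarray(demand, dtype=float))
    return np.column_stack([np.ones_like(f), f, d, f * d, f**2, d**2])


rng = np.random.default_rng(0)

# Virtual training set: scenarios sampled over the planning range, each
# labeled with the optimized cost returned by the detailed model.
f_train = rng.uniform(0.5, 1.5, 200)
d_train = rng.uniform(50.0, 150.0, 200)
y_train = np.array([detailed_optimum(f, d) for f, d in zip(f_train, d_train)])

# Simplified model: least-squares fit of the optimized cost.
coef, *_ = np.linalg.lstsq(features(f_train, d_train), y_train, rcond=None)


def surrogate_cost(fuel_price, demand):
    """Cheap approximation of the optimized cost."""
    return features(fuel_price, demand) @ coef


# Holdout check against the detailed model on unseen scenarios.
f_test = rng.uniform(0.5, 1.5, 50)
d_test = rng.uniform(50.0, 150.0, 50)
y_test = np.array([detailed_optimum(f, d) for f, d in zip(f_test, d_test)])
rel_err = np.mean(np.abs(surrogate_cost(f_test, d_test) - y_test) / y_test)

# Monte Carlo over many scenarios using only the surrogate.
f_mc = rng.normal(1.0, 0.15, 100_000).clip(0.5, 1.5)
d_mc = rng.normal(100.0, 15.0, 100_000).clip(50.0, 150.0)
cost_mc = surrogate_cost(f_mc, d_mc)
reference = surrogate_cost(1.0, 100.0)[0]
prob_exceed = np.mean(cost_mc > 1.1 * reference)

print(f"mean relative holdout error: {rel_err:.4f}")
print(f"P(cost > 110% of reference): {prob_exceed:.4f}")
```

In this toy problem the capacity limits become active in parts of the scenario range, so the optimized cost is only piecewise quadratic and the fitted model carries some error there. The holdout comparison against the detailed model is one simple way to quantify that limitation before trusting the Monte Carlo results.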
This talk elaborates on the approach and offers a framework. The methodology is motivated by a simple illustrative example, which reveals both the advantages and the limitations of the approach. Selected open questions and opportunities for further work are highlighted.