455980 Systems Analysis of Ensemble of Decision Trees for Modeling and Process Control

Monday, November 14, 2016: 3:20 PM
Monterey I (Hotel Nikko San Francisco)
Zelimir Kurtanjek, Faculty of Food Technology and Biotechnology, University of Zagreb, Zagreb, Croatia

This work considers methodologies for systems analysis of processes based on ensembles (forests) of decision tree models. The ensembles are derived from multivariate records of numerical and categorical data from plant monitoring systems and can be regarded as an application of big data analysis to production plants. The ensembles are optimized by the gradient boosting algorithm. The structure and dimensions of the individual trees are minimized by regularization of the objective function, which combines the model prediction error with a penalty function for the number of trees, tree complexity, and crispness of classification. The dimensions of the ensembles are minimized by cross-validation on training sets and validated on an independent test set. System properties are evaluated by a systematic random search over the process variables, using the numerical procedure for global Fourier Amplitude Sensitivity Test (FAST) simulation. Derived from this analysis are the variable importance ranks, confidence intervals of the variables for the ensemble models, global sensitivities, and inferences of process variable synergism. The methods are demonstrated on a data set from a municipal wastewater treatment plant (WWTP), in which 37 chemical, biological, and physical variables were monitored continuously over a period of two years. With a view to process control, prediction accuracies for the output control variables are evaluated as a function of the length of the prediction horizon and of the inclusion of the decision trees in a model predictive control (MPC) structure.
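
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind gradient boosting of decision trees, not the author's method. It assumes a squared-error loss, one-split trees (stumps), a fixed learning rate, and no regularization penalty. All names (fit_stump, fit_boosting, predict, split_counts, X_demo, y_demo) and the toy data are hypothetical and chosen for illustration; the split count is a crude stand-in for a variable importance rank.

```python
from typing import List, Sequence, Tuple

# (feature index, threshold, value if x <= threshold, value if x > threshold)
Stump = Tuple[int, float, float, float]
# (initial constant prediction, learning rate, fitted stumps)
Model = Tuple[float, float, List[Stump]]


def fit_stump(X: Sequence[Sequence[float]], r: Sequence[float]) -> Stump:
    """Return the single split that minimizes the squared error of r."""
    best: Stump = (0, float("inf"), sum(r) / len(r), 0.0)  # fallback: no split
    best_sse = float("inf")
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between distinct feature values.
        for lo, hi in zip(values, values[1:]):
            t = 0.5 * (lo + hi)
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if sse < best_sse:
                best_sse, best = sse, (j, t, lm, rm)
    return best


def stump_value(stump: Stump, row: Sequence[float]) -> float:
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value


def fit_boosting(X, y, n_trees: int = 50, learning_rate: float = 0.1) -> Model:
    """Fit stumps sequentially to the residuals of the current ensemble."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_trees):
        # For squared-error loss the negative gradient is the residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * stump_value(stump, row) for p, row in zip(pred, X)]
    return base, learning_rate, stumps


def predict(model: Model, row: Sequence[float]) -> float:
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_value(s, row) for s in stumps)


def split_counts(model: Model, n_features: int) -> List[int]:
    """Count how often each feature is used for a split (crude importance)."""
    counts = [0] * n_features
    for j, t, _, _ in model[2]:
        if t != float("inf"):
            counts[j] += 1
    return counts


# Toy data (assumed): the output steps up when feature 0 exceeds 0.45;
# feature 1 is unrelated to the output.
X_demo = [[i / 10, ((7 * i) % 10) / 10] for i in range(10)]
y_demo = [2.0 if row[0] > 0.45 else 0.0 for row in X_demo]
model_demo = fit_boosting(X_demo, y_demo)
print(predict(model_demo, [0.1, 0.3]), predict(model_demo, [0.8, 0.3]))
print(split_counts(model_demo, 2))
```

In a study of the kind described above, the number of trees and the learning rate would presumably be chosen by cross-validation rather than fixed as here, and a penalty on tree complexity would be added to the split criterion.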

Session: CAST Rapid Fire Session: III
Group/Topical: Computing and Systems Technology Division