280082 Intelligent Recursive Soft Sensor Adaptation Via Bayesian Outlier Detection and Classification

Tuesday, October 30, 2012: 4:20 PM
325 (Convention Center)
Hector Galicia, Department of Chemical Engineering, Auburn University, Auburn, AL, Qinghua He, Department of Chemical Engineering, Tuskegee University, Tuskegee, AL and Jin Wang, Auburn University, Auburn, AL

Data-driven soft sensors, which predict the primary variables of a process from secondary measurements, have drawn increasing research interest in recent years. Among them, the partial least squares (PLS) based soft sensor is the most commonly used approach in industrial applications. Because industrial processes often experience time-varying changes, it is desirable to update the soft sensor model with new process data once the soft sensor is implemented online. In our previous work [1-2], a recursive reduced-order dynamic PLS (RO-DPLS) soft sensor was developed to provide quality estimates of primary process variables in the presence of large transport delays and time-varying process conditions. Both simulated and industrial case studies of a continuous Kamyr digester showed that the recursive RO-DPLS soft sensor can provide accurate estimates of the extent of reaction (i.e., the Kappa number) and can cope with time-varying process behavior.

Since the focus in [2] was to investigate the properties of different recursive updating schemes and data scaling methods, the industrial datasets were pre-processed to remove all outliers before they were used in the experiments. This step was taken because both static and recursive PLS algorithms are sensitive to outliers in the dataset [3]. Outlier detection and handling therefore play a critical role in the development of PLS-based soft sensors. Although there are extensive studies on outlier detection for off-line model building [3-7], outlier detection remains a challenging problem, and detecting outliers online poses additional challenges. First, the resources dedicated to developing the soft sensor off-line (e.g., expert knowledge) are not available during the online phase. Second, if erroneous readings are used to update the soft sensor model online, future predictions from the updated model may deteriorate significantly. Furthermore, online outliers are not only erroneous readings; they can also be normal samples from a new process state that represent new process behavior. Such samples must be used in the update, because they give the model accurate information about the current process behavior and thus improve the soft sensor adaptation.

In this work, a multivariate approach for online outlier detection based on the squared prediction error (SPE) statistic is developed to improve the automatic adaptation of the soft sensor model. In addition, to differentiate outliers caused by erroneous readings from those caused by process changes, a Bayesian supervisory approach is proposed to analyze and classify the detected outliers. A challenging simulation case study of a continuous Kamyr digester [8] is used to assess the performance of the recursive soft sensor with outlier detection and classification. In this case study, a major process disturbance, a wood type change, is simulated. It is worth noting that after this major process change occurs, the process settles in a completely new state, and the normal monitoring indices (i.e., the SPE indices) may shift to a different level. The thresholds of the online monitoring SPE indices therefore need to be updated as well; otherwise, the performance of outlier detection will deteriorate considerably. In this work, a robust way to update the monitoring thresholds based on an exponentially weighted moving average (EWMA) filter is proposed. The update strategy is shown in Eqn. (1).



$$\delta_{new} = \lambda\,\delta_{old} + (1 - \lambda)\,\delta_{est} \qquad (1)$$

where $\delta_{old}$ denotes the monitoring threshold for outlier detection before the update, $\delta_{new}$ the threshold after the update, and $\delta_{est}$ the threshold estimated using the reconstructed SPE indices of new measurements. The initial thresholds are determined using historical data under normal operating conditions. The parameter $\lambda$ is a tuning parameter that controls how fast the thresholds are updated. In this work, under normal operation (i.e., when no outliers are detected), the thresholds are updated every 20 samples. For this case a relatively conservative setting of $\lambda$ is used ($0.7 < \lambda < 0.9$). When a process change is detected, i.e., detected outliers are classified as part of a process change, a more conservative setting is used ($0.95 < \lambda$). It is also worth noting that different settings are usually used for monitoring the independent and dependent variable spaces via the SPEx and SPEy indices, because the variability of the corresponding monitoring indices differs. Because only a limited number of samples is available for a regular update (i.e., 20 samples) and for a process change update (i.e., 5 samples), $\delta_{est}$ is estimated empirically as shown in Eqn. (2).



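$$\delta_{est} = \mu_{SPE} + k\,\sigma_{SPE} \qquad (2)$$

where $\mu_{SPE}$ and $\sigma_{SPE}$ are the mean and standard deviation of the SPE of the samples used for the update, and $k$ is a tuning parameter, usually around 2 to 3.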
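The threshold update described above can be illustrated with a minimal Python sketch. It assumes that Eqn. (1) is the usual EWMA filter (new threshold = weight × previous threshold + (1 − weight) × estimated threshold) and that Eqn. (2) has the form mean + k × standard deviation. The function names, the default values, and the use of the sample standard deviation are hypothetical choices made for illustration, not details taken from the work itself.

```python
import statistics


def spe(residuals):
    """Squared prediction error of one sample: the sum of squared residuals."""
    return sum(e * e for e in residuals)


def estimate_threshold(spe_values, k=3.0):
    """Empirical threshold, assumed form of Eqn. (2): mean + k * std.

    The sample standard deviation is used here; which estimator the
    original work uses is not stated, so this is an assumption.
    """
    mu = statistics.fmean(spe_values)
    sigma = statistics.stdev(spe_values)
    return mu + k * sigma


def update_threshold(previous, estimated, lam=0.8):
    """EWMA update, assumed form of Eqn. (1).

    A value of lam close to 1 keeps the threshold near its previous
    value, i.e., a more conservative (slower) update.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * previous + (1.0 - lam) * estimated


# Illustrative usage with made-up residuals for a short update window.
window = [spe(r) for r in ([0.1, -0.2], [0.3, 0.1], [-0.2, 0.2])]
new_limit = update_threshold(
    previous=0.5,
    estimated=estimate_threshold(window, k=2.0),
    lam=0.8,
)
```

In such a sketch, separate thresholds (and separate weights) would be kept for the SPEx and SPEy indices, and a larger weight would be chosen when the update follows a detected process change.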

The results obtained from the challenging simulation case study indicate that the recursive soft sensor with outlier detection, classification, and threshold update provides a robust way to address the time-varying nature of industrial processes and to adapt the soft sensor model intelligently. In addition, the good performance of the proposed approaches in the simulation case study is confirmed by their application to a more challenging case study of an industrial continuous Kamyr digester.


[1] Galicia, H. J.; He, Q. P. & Wang, J. A reduced order soft sensor approach and its application to a continuous digester. Journal of Process Control, 2011, 21, 489-500.

[2] Galicia, H. J.; He, Q. P. & Wang, J. Comparison of the performance of a reduced-order dynamic PLS soft sensor with different updating schemes for digester control. Control Engineering Practice, 2012. Accepted

[3] Hubert, M. and Vanden Branden, K. Robust methods for partial least squares regression. Journal of Chemometrics, vol. 17, pp. 537-549, 2003.

[4] Hodge, V. and Austin, J. A survey of outlier detection methodologies. Artificial Intelligence Review, vol. 22, pp. 85-126, 2004.

[5] Pearson, R. K. Outliers in process modeling and identification. IEEE Transactions on Control Systems Technology, vol. 10, pp. 55-63, 2002.

[6] Davies, L. and Gather, U. The identification of multiple outliers. Journal of the American Statistical Association, vol. 88, pp. 782-792, 1993.

[7] Jolliffe, I. T. Principal Component Analysis. Springer, 2002.

[8] Wisnewski, P. A., Doyle, F. and Kayihan, F. Fundamental continuous-pulp-digester model for simulation and control. AIChE Journal, vol. 43, no. 12, pp. 3175-3192, 1997.


Session: Process Monitoring and Fault Detection II
Group/Topical: Computing and Systems Technology Division