A data-driven multidimensional visualization technique for process fault detection and diagnosis
Shriram Gajjar, Ahmet Palazoglu
University of California, Davis
Background: Chemical process operations are typically subject to process or operational disturbances. Fault detection and diagnosis are critical to ensuring safety and process stability and to maintaining optimal operation. Monitoring techniques based on first-principles models have been studied for more than two decades, but their contribution to industrial practice has not been pervasive because of the substantial cost and time required to develop a sufficiently accurate model of a complex chemical plant. On the other hand, in a large-scale unit, a Distributed Control System (DCS) collects data from sensor arrays distributed throughout the plant and stores the data at high sampling rates. This data contains information about the underlying process characteristics and can be used for process monitoring. In practice, a plant operator monitors several variables on screens and, drawing on experience and domain knowledge, focuses on critical process variables to anticipate and prevent abnormal process operations. In the absence of such experience or domain knowledge, however, more automated techniques are required to inform and advise plant operators. The control chart is one of the primary techniques for statistical process monitoring of real-time data. However, monitoring hundreds of variables simultaneously using univariate control charts is not practical. Moreover, 2-D charts limit our ability to visualize and interpret high-dimensional data.
Prior Work: To overcome this challenge, Inselberg established the concept of parallel coordinates in 1985 []. In the plane, parallel coordinates induce a point-line duality and, by laying out multidimensional data in a 2-D display, they make cluster identification and pattern recognition easier. Albazzaz et al. [] proposed the use of parallel coordinates for multidimensional visualization of process variables and of the independent components obtained from independent component analysis (ICA) of the process dataset. They used the Box-Cox transformation and the percentile approach to define the upper and lower control limits of the independent components. The Box-Cox transformation applies only to positive data values, so a constant must be added when the dataset contains negative values. The percentile approach simply sorts the data vector of an independent component from the lowest to the largest value and then takes the 99.9th and 0.1th percentiles of the data as the upper and lower limits, respectively. Dunia et al. proposed the use of parallel coordinates along with principal component analysis (PCA) to derive empirical control limits for fault detection []. In addition to parallel coordinates, they also used Hotelling's T² and Q statistics to detect out-of-control situations. However, this method appears to be fault specific.
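The percentile approach described above can be illustrated with a short sketch. This is not the implementation of Albazzaz et al.; the function name `percentile_limits`, the synthetic data, and the use of NumPy's default linear interpolation between order statistics are all assumptions made for illustration.

```python
import numpy as np

def percentile_limits(component, lower_pct=0.1, upper_pct=99.9):
    """Empirical (lower, upper) control limits for one component.

    Hypothetical helper: sorts the normal-operation values and reads off
    the requested percentiles, as in the percentile approach.
    """
    values = np.sort(np.asarray(component, dtype=float))
    lower, upper = np.percentile(values, [lower_pct, upper_pct])
    return float(lower), float(upper)

# Synthetic stand-in for one independent component under normal operation.
rng = np.random.default_rng(0)
normal_scores = rng.standard_normal(5000)

lcl, ucl = percentile_limits(normal_scores)

# New observations are flagged when they fall outside the limits.
new_samples = np.array([0.2, -0.5, 6.0])
out_of_control = (new_samples < lcl) | (new_samples > ucl)
print(out_of_control)
```

Because the limits are read directly from the empirical distribution, no normality assumption is needed, but the limits are only as reliable as the amount of normal operating data available in the tails.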
Once an abnormality in the process is detected, it is imperative to determine its root cause so that the fault can be repaired. The most widely used method for fault isolation is the contribution plot (Miller et al., 1998), which depicts the contribution of each process variable to the monitored statistics. However, its effectiveness is limited to simple faults, e.g. sensor and actuator faults (Yoon and MacGregor, 2001; Qin, 2003). Dunia and Qin (1998) proposed a fault identification index based on the fault reconstruction square prediction error (FRSPE). The smallest FRSPE is obtained for the reconstructed fault. Raich and Cinar (1997) proposed distance and angle metrics to diagnose process disturbances. Some researchers have also worked on data-driven techniques such as qualitative trend analysis (QTA). QTA is a powerful method that provides quick and accurate fault diagnosis. Maurya et al. applied QTA to the principal components instead of the original sensors and showed that the computation time was substantially reduced [].
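As a rough illustration of a contribution plot, the sketch below computes each variable's contribution to the Q statistic (squared prediction error) as its squared PCA residual, which is one common definition. The helper names `fit_pca` and `spe_contributions`, the synthetic six-variable dataset, and the injected sensor bias are assumptions for illustration and are not taken from the cited works.

```python
import numpy as np

def fit_pca(X_normal, n_components):
    """Fit PCA on autoscaled normal-operation data (hypothetical helper)."""
    mean = X_normal.mean(axis=0)
    std = X_normal.std(axis=0, ddof=1)
    Z = (X_normal - mean) / std
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_components].T          # loadings, shape (n_vars, n_components)
    return mean, std, P

def spe_contributions(x, mean, std, P):
    """Per-variable contribution to Q: the squared residual of each variable."""
    z = (x - mean) / std
    residual = z - P @ (P.T @ z)     # part of z not explained by the PC subspace
    return residual ** 2

# Synthetic data: two latent factors, each driving three correlated sensors.
rng = np.random.default_rng(1)
latent = rng.standard_normal((500, 2))
X = np.repeat(latent, 3, axis=1) + 0.1 * rng.standard_normal((500, 6))

mean, std, P = fit_pca(X, n_components=2)

# Inject a bias on sensor 0 (a simple sensor fault) into one sample.
x_fault = X[0].copy()
x_fault[0] += 5.0

contrib = spe_contributions(x_fault, mean, std, P)
suspect = int(np.argmax(contrib))    # variable with the largest bar in the plot
print(contrib.round(2), suspect)
```

Even in this simple case the residual is smeared onto the sensors correlated with the faulty one, which hints at why contribution plots become harder to read for more complex faults.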
Proposed method and preliminary results: Motivated by these limitations of prior work, we propose a detection algorithm that uses all the measurements available at the plant, thus bypassing the need for prior knowledge of the fault or for a priori selection of a particular set of measurements. Traditionally, once the data is projected onto the principal component (PC) subspace, the variance and the residual errors are each lumped into a single statistic, namely Hotelling's T² and the Q statistic, respectively. Instead, we have developed control limits for each PC based on the normal operating dataset. Such control limits are not fault specific and can be used for fault detection in real time. For visual process monitoring, our method represents each PC in parallel coordinates along with its control limits. With this approach, we reduced fault detection time and improved fault detection rates for data obtained from the Tennessee Eastman benchmark process. Moreover, we have observed that each fault has a unique signature in the parallel coordinate space, and such patterns can be used for fault diagnosis. We have investigated machine learning methods for classifying faulty data, which can then be used for fault diagnosis in real time. Modern industrial processes often involve a large number of highly correlated process variables; moreover, the process is manipulated by an intricate network of controllers that provides feedback to the input variables. Thus the impact of a disturbance (or a fault) propagates through to both the input and manipulated variables. In real-time monitoring, not only detecting the fault but also observing how it has propagated and will propagate through the process is important for taking corrective action. PC scores and loadings are complementary and superimposable. Each variable in the original dataset loads onto different PCs, which is also reflected in the scores on those PCs.
From the score and loading plots, our preliminary results show that the propagation of a fault through the system can be traced using only the information obtained from the data, without any domain knowledge.
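The idea of monitoring each PC against its own control limits, rather than lumping the scores into T² and Q, can be sketched as follows. The abstract does not state how the limits are computed, so the empirical percentile limits used here are an assumption, as are the choice to monitor all PCs, the helper names `fit_pc_monitor` and `violated_pcs`, and the synthetic data.

```python
import numpy as np

def fit_pc_monitor(X_normal, lower_pct=0.1, upper_pct=99.9):
    """Fit PCA on normal operating data and derive limits for every PC.

    Assumption: limits are empirical percentiles of the normal scores.
    """
    mean = X_normal.mean(axis=0)
    std = X_normal.std(axis=0, ddof=1)
    Z = (X_normal - mean) / std
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt.T                                   # loadings for all PCs
    T = Z @ P                                  # scores of the normal data
    lcl, ucl = np.percentile(T, [lower_pct, upper_pct], axis=0)
    return {"mean": mean, "std": std, "P": P, "lcl": lcl, "ucl": ucl}

def violated_pcs(x, model):
    """Indices of the PCs whose score for sample x lies outside its limits."""
    t = ((x - model["mean"]) / model["std"]) @ model["P"]
    return np.flatnonzero((t < model["lcl"]) | (t > model["ucl"]))

# Synthetic normal operating data: two latent factors, six correlated sensors.
rng = np.random.default_rng(1)
latent = rng.standard_normal((500, 2))
X = np.repeat(latent, 3, axis=1) + 0.1 * rng.standard_normal((500, 6))

model = fit_pc_monitor(X)

x_fault = X[0].copy()
x_fault[0] += 5.0                              # bias on sensor 0

normal_flags = violated_pcs(model["mean"], model)   # sample at the operating mean
fault_flags = violated_pcs(x_fault, model)
print(normal_flags, fault_flags)
```

In a parallel-coordinates display, each sample's score vector would be drawn as a polyline across one vertical axis per PC, with the lower and upper limits drawn as two bounding polylines; the set of axes on which a sample leaves the band is the kind of pattern that could serve as a fault signature.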
In summary, this work focuses on the use of parallel coordinates for multidimensional visualization based on PCA and discusses its accuracy for fault detection, fault diagnosis and the analysis of fault propagation. The feasibility and validity of the proposed multidimensional PCA visualization are demonstrated on the benchmark Tennessee Eastman process (Downs and Vogel, 1993).