463221 M.R.Q.P.: Prediction of Final Batch Quality Using a Multi-Resolution Framework

Tuesday, November 15, 2016: 1:21 PM
Monterey I (Hotel Nikko San Francisco)
Geert Gins, Dept. of Chemical Engineering, KU Leuven, Ghent, Belgium, Jan Van Impe, Chemical Engineering, Katholieke Universiteit Leuven, Leuven, Belgium and Marco Reis, Department of Chemical Engineering, University of Coimbra, Coimbra, Portugal

Introduction

Global competition, legislation, and consumer demands impose ever stricter targets for efficiency, safety, profitability, and reduction of environmental impact on the process industry. In this context, accurate and quick assessment of product quality is a critical task: it speeds up production by reducing the delay incurred by quality analysis, enabling either a quick response to deviating product quality or faster product release.

The issue of quality assessment is most pronounced for batch processes, which are widely used for the production of goods with a high added value such as pharmaceuticals and specialty chemicals [1]. Batch processes are non-stationary by design, commonly owing to physical and/or (bio)chemical limitations on the processes. Online measurement of instantaneous quality is difficult or even impossible [2], as quality of a batch can often only be sensibly defined after batch completion. Because all batches are independent from each other, quality must also be assessed after completion of each run. As a result, slow quality assessment can represent a substantial delay in production, as it must be performed after every batch. This is in large contrast with continuous processes, where process and quality variations are much smaller and quality measurements need not be carried out as often.

Mathematical models provide a cheap and fast alternative for quality estimation compared to lab analysis [3]. For batch processes, models based on Multiway Partial Least Squares (MPLS) [2] are the most popular [4]. These PLS models use profiles of online measurements to predict the final quality. Typically, the selected sensors are included in the PLS model at their highest available sampling rate, which is determined during commissioning of the control system and is not guaranteed to be optimal for quality prediction, as different variables carry information on different time scales [5]. Although PLS models are capable of efficiently dealing with measurement noise as well as the data redundancy present in batch data [6,7], a priori removal of uncorrelated and/or noisy model inputs yields better prediction accuracy [8].
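As an illustration of how measurement profiles enter an MPLS model, the sketch below unfolds a three-way batch data set (batches x time samples x variables) into one row per batch, which is the matrix a PLS regression would then be fitted on. The function name, the variable-wise column ordering, and the toy values are assumptions made for this example; the regression step itself is not shown.

```python
def unfold_batchwise(batches):
    """Unfold I batches of K time samples x J variables into I rows.

    Each row holds the full profile of variable 0, then variable 1, etc.
    (column ordering is an assumption; any fixed ordering works for PLS).
    """
    unfolded = []
    for batch in batches:
        n_vars = len(batch[0])
        # Concatenate the time profile of every variable into a single row.
        row = [sample[j] for j in range(n_vars) for sample in batch]
        unfolded.append(row)
    return unfolded


# Two hypothetical batches, three time samples, two sensors each.
batches = [
    [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]],
    [[4.0, 40.0], [5.0, 50.0], [6.0, 60.0]],
]
X = unfold_batchwise(batches)
print(X[0])  # [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
```

With J sensors sampled K times, every batch contributes J x K inputs, which is why including all sensors at the highest sampling rate quickly inflates the input dimensionality.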

This work presents a Multi-Resolution Quality Prediction (MRQP) framework for batch quality estimation. Based on the insight that batch data inherently contains structured correlation (namely the time evolution of each sensor), MRQP searches for the optimal resolution for each variable’s measurement profile with respect to batch-end quality.

Methodology

The presented Multi-Resolution Quality Prediction (MRQP) framework for batch quality first employs a multi-resolution wavelet decomposition [9] to obtain a set of approximations at different resolutions of each variable’s measurement profile.
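A minimal sketch of such a set of approximations is given below. It assumes the Haar wavelet, for which each coarser approximation is obtained by averaging neighbouring pairs of samples (the Haar scaling coefficients up to a constant factor per level). The abstract does not state which wavelet family is used, so the choice of Haar, the function name, and the example profile are assumptions for illustration only.

```python
def haar_approximations(profile, max_level):
    """Return {level: approximation} for levels 0..max_level.

    Level 0 is the original profile. Each further level halves the number
    of samples by averaging neighbouring pairs (Haar approximation).
    """
    approximations = {0: list(profile)}
    current = list(profile)
    for level in range(1, max_level + 1):
        # Stop when the profile can no longer be split into pairs.
        if len(current) < 2 or len(current) % 2:
            break
        current = [
            (current[i] + current[i + 1]) / 2.0
            for i in range(0, len(current), 2)
        ]
        approximations[level] = current
    return approximations


# Hypothetical sensor profile with 8 samples.
profile = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]
approx = haar_approximations(profile, max_level=3)
for level, values in approx.items():
    print(level, len(values), values)
```

Each level is a candidate representation of the same sensor: level 0 keeps all 8 samples, whereas level 3 summarizes the profile in a single value.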

Next, an adapted (robust) version of Sequential Forward Floating Selection (SFFS) [10] is used to select the set of variable/resolution combinations that yields the best prediction of the batch quality. In this step, each different resolution of each variable is treated as a separate candidate input for an MPLS model. To improve the robustness of optimal variable/resolution selection, multiple crossvalidations are performed to assess model performance, and variability over these repetitions is exploited.
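The selection step can be sketched with a plain SFFS loop over (variable, resolution) pairs, as below. This is the textbook floating search, not the adapted robust version used in this work, whose details are not given here. The `error` callable is a hypothetical stand-in for the crossvalidation error of an MPLS model built on the given inputs (for instance, a median over repeated crossvalidations); the toy error function and candidate names are invented for the example.

```python
def sffs(candidates, error, max_size):
    """Sequential Forward Floating Selection minimising error(subset).

    Returns {subset size: (error, subset)} with the best subset found
    for every size visited.
    """
    selected = []
    best_by_size = {}
    while len(selected) < max_size:
        remaining = [c for c in candidates if c not in selected]
        if not remaining:
            break
        # Forward step: add the candidate giving the lowest error.
        best_add = min(remaining, key=lambda c: error(selected + [c]))
        selected = selected + [best_add]
        err = error(selected)
        size = len(selected)
        if size not in best_by_size or err < best_by_size[size][0]:
            best_by_size[size] = (err, list(selected))
        # Floating step: drop inputs while that beats the best smaller subset.
        while len(selected) > 2:
            drop = min(
                selected,
                key=lambda c: error([s for s in selected if s != c]),
            )
            reduced = [s for s in selected if s != drop]
            reduced_err = error(reduced)
            if reduced_err < best_by_size[len(reduced)][0]:
                selected = reduced
                best_by_size[len(reduced)] = (reduced_err, list(reduced))
            else:
                break
    return best_by_size


# Toy error (hypothetical): two informative inputs, small penalty per input.
useful = {("temperature", 2): 0.5, ("pH", 0): 0.3}


def toy_error(subset):
    gain = sum(useful.get(c, 0.0) for c in subset)
    return 1.0 - gain + 0.01 * len(subset)


candidates = [("temperature", 0), ("temperature", 2), ("pH", 0), ("pH", 3)]
result = sffs(candidates, toy_error, max_size=3)
best_error, best_subset = min(result.values(), key=lambda item: item[0])
print(sorted(best_subset))
```

Because every resolution of every sensor is a separate candidate, the same search decides both which variables to use and at which resolution to include them.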

This combination of simultaneous optimal variable and resolution selection results in a multi-resolution MPLS batch quality prediction model that is theoretically guaranteed to perform at least as well as the single-resolution MPLS prediction model (i.e., the model employing all sensors at the highest sample rate). The MRQP approach yields a model with improved accuracy and stability.

Case Studies

The proposed MRQP is tested in three case studies.

The first case study consists of a simulated batch process for growth of Saccharomyces cerevisiae [11]. When comparing the performance of the “minimally optimal” models (i.e., models for which the addition of an extra variable/resolution combination to the input set does not improve the crossvalidation performance significantly), the multi-resolution model achieves a median crossvalidation error 3.2% lower than the single-resolution model. When the overall lowest crossvalidation error is compared, the multi-resolution model outperforms the single-resolution model by 3.5%. In both situations, model complexity was similar.
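For reference, a comparison of this kind can be computed as in the sketch below, assuming the reported percentages are relative reductions of the median crossvalidation error with the single-resolution model as the baseline. That interpretation, the function name, and the error values are assumptions; the numbers are made up and are not taken from the case studies.

```python
from statistics import median


def relative_improvement(single_errors, multi_errors):
    """Percentage by which the median multi-resolution error is lower."""
    reference = median(single_errors)
    return 100.0 * (reference - median(multi_errors)) / reference


# Made-up crossvalidation errors over five repetitions (illustration only).
single = [0.50, 0.52, 0.48, 0.51, 0.49]
multi = [0.45, 0.47, 0.44, 0.46, 0.43]
improvement = relative_improvement(single, multi)
print(round(improvement, 1))
```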

The second case study is a simulated fermentation process for the production of penicillin [12,13], a widely used benchmark for Statistical Process Control. Here, the median error of the minimally optimal multi-resolution model is 2.2% lower than that of the corresponding single-resolution model. The overall lowest median crossvalidation error for the MRQP model is 1.9% lower than for the single-resolution approach. In both situations, the better MRQP performance requires more latent variables in the MPLS model, but the dimensionality of the input matrix is substantially lower (4 vs. 2048, and 5 vs. 3072), making the multi-resolution model much easier to interpret.

While the first two case studies were simulated, the final case study is taken from an industrial polymerization process [7] to demonstrate the practical applicability of the MRQP. The minimally optimal MRQP model outperforms its single-resolution equivalent by 7.6%, but requires a slightly larger input space dimensionality. The overall best crossvalidation performance of the multi-resolution model is 5.2% better than that of the best single-resolution MPLS model. In this situation, the multi-resolution model requires fewer latent variables and a lower input space dimensionality than the single-resolution model.

In all case studies, it was observed that less significant variables are typically included in the model at coarser resolutions. This has the benefit that the dimensionality of the input space is only increased a little, reducing the potential for spurious correlations to influence the model. Hence, the MRQP framework results in models that are inherently more robust towards the selection of too many predictors.

Conclusions

This work introduced Multi-Resolution Quality Prediction (MRQP) for batch processes. Instead of employing all available sensors at their maximal available sample rate, MRQP exploits the presence of structured correlation in the batch data, namely between subsequent time points.

Hereto, a multi-resolution framework is first used to compute approximations of varying resolution for each variable, followed by an optimal input selection approach to identify the optimal set of variable/resolution combinations.

The MRQP model for quality prediction is theoretically guaranteed to be at least as accurate as the single-resolution model. In three case studies, however, performance improvements of 1.9%-7.6% were observed, often in combination with more parsimonious model structures. In addition, multi-resolution models were found to be more robust with respect to the selection of too many predictors. The presented MRQP framework is readily extendable towards multi-phase processes, as well as towards other regression techniques such as multiway, dynamic, or kernel models [14,15,16].

In conclusion, MRQP for multi-resolution batch-end quality prediction offers significant benefits over traditional single-resolution approaches.

Acknowledgements

Work supported in part by Project PFV/10/002 (OPTEC Optimization in Engineering Center) of the Research Council of the KU Leuven, Project IAP VII/19 (DYSCO Dynamical Systems, Control and Optimization) of the Belgian Program on Interuniversity Poles of Attraction initiated by the Belgian Federal Science Policy Office, and the SCORES4CHEM knowledge platform (www.scores4chem.be). Marco S. Reis acknowledges financial support through project PTDC/QEQ-EPS/1323/2014 co-financed by the Portuguese FCT and European Union’s FEDER through the program “COMPETE 2020”. The authors assume all scientific responsibility.

References

  1. S. Stubbs, J. Zhang, J. Morris (2013) Multiway Interval Partial Least Squares for batch process performance monitoring. Ind Eng Chem Res 52:12399-12407
  2. P. Nomikos, J.F. MacGregor (1995) Multi-way partial least squares in monitoring batch processes. Chemom Intell Lab Syst 30:97-108
  3. S.J. Qin, Y. Zheng (2013) Quality-relevant and process-relevant fault monitoring with concurrent Projection to Latent Structures. AIChE J 59(2):496-504
  4. N. Lu, F. Gao (2005) Stage-based online quality control for batch processes. Ind Eng Chem Res 45:2272-2280
  5. M.S. Reis, P.M. Saraiva (2006) Multiscale statistical process control using wavelet packets. AIChE J 54(9):2366-2378
  6. R. Bro (1996) Multiway calibration. Multilinear PLS. J Chemometr 10:47-61
  7. G. Gins, B. Pluymers, I.Y. Smets, J. Espinosa, J.F.M. Van Impe (2011) Prediction of batch-end quality for an industrial polymerization process. LNCS 6870:314-328
  8. J. Trygg, S. Wold (2002) Orthogonal projection to latent structures (O-PLS). J Chemometr 16:119-128
  9. I. Daubechies (1992) Ten lectures on wavelets. SIAM, Philadelphia (PA, USA)
  10. P. Pudil, J. Novovicová, J. Kittler (1994) Floating search methods in feature selection. Pattern Recogn Lett 15:1119-1125
  11. F. Lei, M. Rotbøll, S.B. Jørgensen (2001) A biochemically structured model for Saccharomyces cerevisiae. J Biotechnol 88:205-221
  12. G. Birol, C. Ündey, A. Cinar (2002) A modular simulation package for fed-batch fermentation: Penicillin production. Comput Chem Eng 26:1152-1565
  13. J. Van Impe, G. Gins (2015) An extensive reference dataset for fault detection and identification in batch processes. Chemom Intell Lab Syst 148:20-31
  14. D.J. Louwerse, A.K. Smilde (2000) Multivariate statistical process control of batch processes based on three-way models. Chem Eng Sci 55(7):1225-1235
  15. J. Chen, K.-C. Liu (2002) On-line batch process monitoring using dynamic PCA and dynamic PLS models. Chem Eng Sci 75:63-75
  16. R. Rosipal, L.J. Trejo (2001) Kernel Partial Least Squares regression in Reproducing Kernel Hilbert Space. J Mach Learn Res 2:97-123

See more of this Session: Big Data Analytics in Chemical Engineering
See more of this Group/Topical: Computing and Systems Technology Division