**Title:** Using Archived Process Data to Improve First
Principal Models

**Author/Presenter:** Peter J. Ryan, Ph.D., P.E.

**Author/Presenter email:** peter.ryan@responsepc.com

**Company:** Response Process Consulting LLC

**Motivation**

While most continuous processes have “first-principles” models to support activities such as developing mass and energy balances and preparing economic analyses for debottlenecking projects, these models typically lack the granularity to describe the Key Process Indicators (KPIs) that define product specifications. For example, product color is a specification that is difficult to include in a steady-state simulation because it is not known a priori which process variables influence color, or by what mechanism. A novel approach is presented that uses process models built from archived process data to find relationships between process variables and product specifications. Once these relationships are established, the first-principles models are supplemented to include these new process insights.

**Approach**

All manufacturing sectors, continuous or batch, gather process data for archiving and analysis purposes. Often the amount of data gathered, and its quality, make it difficult to use this resource effectively. Major drawbacks in the quality of the archived data include:

· missing data

· noise, poor signal-to-noise ratios

· correlation in the data

· limited accuracy

· limited precision

Latent variable methods are capable of developing process models based on this resource. Latent variable models such as Principal Component Analysis (PCA) and Partial Least Squares (PLS) reduce the dimensionality of the data space so that relationships between process variables and KPIs can be established even in the presence of correlated data. This type of machine learning can be used to examine large historical process data sets and determine which process variables influence a KPI, and what the pattern of this interaction is. This information is then added to the first-principles model as a mechanistic process. If, for example, the product color is found to be influenced by the temperature of a specific heat exchanger, a mechanistic process representing this relationship is added to the steady-state model.
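As an illustration of how a latent variable model can relate correlated process variables to a KPI, the following is a minimal sketch of single-response PLS fitted with the NIPALS algorithm. It is not the model used in this work: the data are synthetic, and the function name `pls1_nipals`, the six generic variables, and the choice of two latent components are assumptions made only for the example.

```python
import numpy as np

def pls1_nipals(X, y, n_components):
    """Fit a single-response PLS model with NIPALS.

    X and y are assumed to be mean-centered (X is usually also scaled).
    Returns weights W, X-loadings P, y-loadings q and regression
    coefficients b, so that y_hat = X @ b.
    """
    X = X.copy()
    y = y.copy()
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))
    P = np.zeros((n_vars, n_components))
    q = np.zeros(n_components)
    for a in range(n_components):
        w = X.T @ y                  # direction of maximum covariance with y
        w /= np.linalg.norm(w)
        t = X @ w                    # latent variable scores
        tt = t @ t
        p = X.T @ t / tt             # X-loadings
        q[a] = y @ t / tt            # y-loading
        X -= np.outer(t, p)          # deflate X and y before the next component
        y -= q[a] * t
        W[:, a], P[:, a] = w, p
    b = W @ np.linalg.solve(P.T @ W, q)
    return W, P, q, b

# Synthetic stand-in for archived data: six correlated process variables
# driven by two hidden factors, and a KPI driven by the first factor.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=(n, 2))
mixing = rng.normal(size=(2, 6))
X_raw = latent @ mixing + 0.1 * rng.normal(size=(n, 6))
y_raw = 2.0 * latent[:, 0] + 0.1 * rng.normal(size=n)

# Autoscale X and center y, as is customary before fitting PLS.
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)
y = y_raw - y_raw.mean()

W, P, q, b = pls1_nipals(X, y, n_components=2)
y_hat = X @ b
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum(y ** 2)

# Rank variables by the magnitude of their regression coefficient.
ranking = np.argsort(-np.abs(b))
print("R^2:", round(r2, 3), "variables ranked by |coefficient|:", ranking)
```

Because the model works on a small number of latent components rather than on the individual variables, the strong correlation among the columns of `X` does not destabilize the fit the way it would in ordinary least squares.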


**Results**

Archived process data from a commodity chemical process were used to develop a PLS model (Figure 1). The quality vector used by the PLS model was the product color specification. Outliers in the archived process data were detected using the Hotelling's T² and Squared Prediction Error (SPE) metrics. The outliers were examined using contribution charts that showed the individual contributions of the upstream process variables to the KPI. Both T² and SPE contribution charts were examined. Based on a scores contribution analysis, the results suggest that seven upstream process variables influence the color of the product. These upstream process variables are the average catalyst bed temperature, reactor coolant temperature, cooler temperature, absorber recirculation rate, a column feed rate, and a composite process variable. The mechanism proposed to capture this interaction was a linear combination of the seven process variables, weighted by the relative average contribution of each process variable to the total score of the observations used to establish the interaction mechanism.
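The outlier metrics and contribution charts mentioned above can be sketched as follows. This is an illustrative example only: it uses a PCA model rather than the PLS model of this work, the data are synthetic, and the function name `monitor` and the injected fault are assumptions. The T² contribution formula shown is one common convention among several, and no statistical control limits are computed.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "normal operation" data: two blocks of three correlated variables.
mixing = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
X_train = rng.normal(size=(300, 2)) @ mixing + 0.1 * rng.normal(size=(300, 6))

# Autoscale, then fit a two-component PCA model via the SVD.
mean, std = X_train.mean(axis=0), X_train.std(axis=0)
Z = (X_train - mean) / std
n_components = 2
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
P = Vt[:n_components].T               # loadings, shape (variables, components)
T = Z @ P                             # training scores
score_var = T.var(axis=0, ddof=1)     # variance of each score

def monitor(x):
    """Return T^2, SPE and per-variable contributions for one observation."""
    z = (x - mean) / std
    t = z @ P
    residual = z - t @ P.T
    t2 = np.sum(t ** 2 / score_var)         # Hotelling's T^2
    spe = np.sum(residual ** 2)             # squared prediction error
    t2_contrib = z * (P @ (t / score_var))  # sums to T^2 over variables
    spe_contrib = residual ** 2             # sums to SPE over variables
    return t2, spe, t2_contrib, spe_contrib

# SPE of the training observations, for comparison.
spe_train = np.sum((Z - T @ P.T) ** 2, axis=1)

# A new observation in which variable 3 breaks the usual correlation pattern.
x_new = np.array([0.5, -0.3]) @ mixing
x_new[3] += 5.0
t2, spe, t2_contrib, spe_contrib = monitor(x_new)

print("SPE of new observation:", round(spe, 2))
print("largest training SPE:", round(spe_train.max(), 2))
print("variable with largest SPE contribution:", int(np.argmax(spe_contrib)))
```

In this sketch the SPE flags the new observation because it violates the correlation structure learned from normal operation, and the SPE contributions point to the variable responsible.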

With the enhanced steady-state model, it is now possible to include KPI terms that depend on upstream process variables. The process interactions captured in the enhanced steady-state model make it possible to include impacts on product specifications in the economic analysis of debottlenecking and optimization projects. A similar methodology can also be used to improve the granularity of dynamic batch models.
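One way to read the KPI term described above, a linear combination of process variables weighted by their relative average contributions, is sketched below. The exact formulation used in this work is not given here, so this is an assumed interpretation: the function names `contribution_weights` and `kpi_term` are hypothetical, and the contribution values are made-up numbers for three generic variables, not results.

```python
import numpy as np

def contribution_weights(contributions):
    """Average each variable's contribution over the selected observations
    and normalize so that the weights sum to one."""
    avg = np.asarray(contributions, dtype=float).mean(axis=0)
    return avg / avg.sum()

def kpi_term(z, weights):
    """Linear KPI surrogate: weighted sum of scaled process variables."""
    return float(np.asarray(z, dtype=float) @ weights)

# Made-up score contributions: rows are observations, columns are variables.
contrib = np.array([[0.6, 0.3, 0.1],
                    [0.5, 0.4, 0.1],
                    [0.7, 0.2, 0.1]])
weights = contribution_weights(contrib)

# Scaled values of the three process variables at one operating point.
z = np.array([1.0, -0.5, 2.0])
value = kpi_term(z, weights)
print("weights:", weights, "KPI term:", value)
```

A term of this form could be evaluated inside a steady-state simulation from the simulated upstream variables. Note that contributions can be negative in practice, in which case normalizing by their sum would need more care than this sketch takes.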

**Extended Abstract:** File Uploaded

See more of this Group/Topical: Topical A: 2nd Big Data Analytics