Performance Monitoring of Industrial Plant Alarm Systems Using Event Correlation Analysis
Tsutomu Takai and Masaru Noda
Abstract
Alarm systems are essential for ensuring plant safety and effective operations, and
management efforts aimed at maintaining and improving alarm systems have
recently intensified in the process industries. The alarm management currently in place for
existing plants is basically a CAPDo (Check-Act-Plan-Do) approach that
begins by evaluating alarm system performance, and the result of this evaluation is
important for identifying the issues that must be addressed to maintain and improve the system effectively. Conventional methods, which quantify
alarm rates, alarm and event distributions, standing alarm times, and similar statistics,
are still far from evaluating alarm systems effectively,
because they do not evaluate each alarm as a signal requiring operator
attention. The Engineering
Equipment & Materials Users' Association (EEMUA, 2007) states that every
alarm presented to an operator should be useful and relevant to the
operator. Thus, the relationship
between an alarm and the operator response can be regarded as a new key
performance indicator (KPI) for an alarm system. Takai et al. (2010) proposed an evaluation
method focusing on this relationship using an operator
questionnaire. A questionnaire is reasonable at the working level of plant
operations, but questionnaire-based evaluation is often subjective and may
contain bias arising from the format of the questionnaire or from the individual
respondents. We propose a new KPI for
evaluating alarm system performance, together with a method for calculating it using event
correlation analysis (Nishiguchi and Takai, 2010). Event correlation analysis is a data mining method
that uses the cross-correlation function to quantify the degree of similarity and the time lag
between two events in event log data, which consist of
discrete alarms and operator actions recorded at the times they occur. Event
pairs separated by consistent time intervals are considered related in
event correlation analysis, since the length of the time lag is determined by
factors such as the process dynamics and operator reaction time. A similarity
measure for each event pair is calculated from the event log data, along
with the probability distribution of the correlation expected for independent
event pairs. Once the similarities and time lags have been calculated for all
event pairs, groups of highly related events are
identified from the pairwise similarities using the hierarchical clustering
method. Then, each alarm is assessed
as to whether it initiates the required corrective actions after it
occurs within its group; alarms that do are counted as relevant. The relevant
alarm rate is defined as the ratio of the number of relevant alarms to the total number of
alarms. The effectiveness of the proposed method was validated with actual
plant event data and simulation results.
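As a rough illustration of the calculation described above, the following Python sketch bins event timestamps into binary series, searches for the time lag at which an alarm and an operator action coincide most often, and reports the fraction of alarms judged relevant. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the bin width, the lag window, the threshold, and the sample tags (A1, A2, OP1) are all hypothetical, the similarity is taken simply as the peak co-occurrence count divided by the number of alarm occurrences, and the significance test against independent event pairs and the hierarchical clustering step are omitted.

```python
import numpy as np


def to_binary_series(times, bin_width, n_bins):
    """Convert event timestamps into a 0/1 series with one entry per time bin."""
    series = np.zeros(n_bins, dtype=int)
    idx = (np.asarray(times, dtype=float) // bin_width).astype(int)
    series[idx[(idx >= 0) & (idx < n_bins)]] = 1
    return series


def peak_cross_correlation(a, b, max_lag):
    """Search lags 0..max_lag (b occurring after a).

    Returns (count, lag): the largest number of bins in which a[t] and
    b[t + lag] are both 1, and the lag at which that peak is found.
    """
    n = len(a)
    counts = [int(np.dot(a[: n - lag], b[lag:])) for lag in range(max_lag + 1)]
    best_lag = int(np.argmax(counts))
    return counts[best_lag], best_lag


def relevant_alarm_rate(alarms, actions, bin_width, n_bins, max_lag, threshold):
    """Fraction of alarm tags followed by some operator action at a consistent lag.

    alarms, actions: dicts mapping a tag name to a list of timestamps.
    An alarm counts as relevant (an assumption of this sketch) when its peak
    co-occurrence count with any action, divided by the number of alarm
    occurrences, reaches the threshold.
    """
    action_series = [to_binary_series(t, bin_width, n_bins) for t in actions.values()]
    relevant = 0
    for times in alarms.values():
        a = to_binary_series(times, bin_width, n_bins)
        occurrences = a.sum()
        if occurrences == 0:
            continue
        score = max(
            (peak_cross_correlation(a, s, max_lag)[0] / occurrences for s in action_series),
            default=0.0,
        )
        if score >= threshold:
            relevant += 1
    return relevant / len(alarms) if alarms else 0.0


# Hypothetical event log: timestamps in seconds.
alarms = {"A1": [10, 110, 210], "A2": [50, 160, 300]}
actions = {"OP1": [40, 140, 240]}

rate = relevant_alarm_rate(alarms, actions, bin_width=10, n_bins=40, max_lag=5, threshold=0.8)
print(rate)
```

In this made-up log, OP1 follows A1 at the same lag every time, whereas A2 has no operator action at a consistent lag within the search window, so only A1 is counted as relevant. A fuller treatment would replace the fixed threshold with a test against the correlation distribution of independent event pairs, as the abstract describes.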
Group/Topical: Computing and Systems Technology Division