Using recurrent neural networks to detect changes in autocorrelated processes for quality monitoring


Computers & Industrial Engineering 52 (2007) 502–520

www.elsevier.com/locate/dsw


Massimo Pacella a,*,1, Quirico Semeraro b,2

a Dipartimento di Ingegneria dell'Innovazione, Universita' degli Studi di Lecce, Via per Monteroni, 73100 Lecce, Italy
b Dipartimento di Meccanica, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133 Milano, Italy

Received 17 January 2006; received in revised form 19 December 2006; accepted 23 March 2007; available online 30 March 2007

Abstract

With the growth of automation in manufacturing, process quality characteristics are being measured at higher rates and data are more likely to be autocorrelated. A widely used approach for statistical process monitoring in the case of autocorrelated data is the residual chart. This chart requires that a suitable model be identified for the time series of process observations before residuals can be obtained. In this work, a new neural-based procedure, which does not require building a time series model, is introduced for quality control in the case of serially correlated data. In particular, Elman's recurrent neural network is proposed for manufacturing process quality control. Performance comparisons between the neural-based algorithm and several control charts are also presented in the paper in order to validate the approach. Different magnitudes of the process mean shift, in the presence of various levels of autocorrelation, are considered. The simulation results indicate that the neural-based procedure may perform better than other control charting schemes in several instances for both small and large shifts. Given the simplicity of the proposed neural network and its adaptability, simulation experiments show this approach to be a feasible alternative for quality monitoring in the case of autocorrelated process data.
© 2007 Elsevier Ltd. All rights reserved.

Keywords: Manufacturing; Quality monitoring; ARMA models; Recurrent neural network

1. Introduction

One of the major techniques of Statistical Process Control (SPC) is the control chart introduced by Shewhart. In its basic form, a control chart compares process observations (or a function of such observations) to a pair of control limits. Two fundamental assumptions for the development of a control chart are: (1) the distribution function underlying process data is normal and (2) process data are independently distributed.

0360-8352/$ - see front matter © 2007 Elsevier Ltd. All rights reserved.
doi:10.1016/j.cie.2007.03.003

* Corresponding author. Tel.: +39 0832 297253; fax: +39 0832 297279.
E-mail addresses: [email protected] (M. Pacella), [email protected] (Q. Semeraro).

1 Dr. Pacella is an Assistant Professor in the Department of "Ingegneria dell'Innovazione" at the Universita' degli Studi di Lecce, Italy.
2 Dr. Semeraro is a Professor in the Department of Mechanical Engineering and Dean of the Faculty of "Ingegneria Industriale" at the Politecnico di Milano, Italy. He is a Member of ASQ.


The most frequently reported effect of violating such assumptions is the erroneous assignment of the control limits. Alwan and Roberts (1995) considered a sample of 235 control chart applications and showed that about 85% displayed incorrect control limits. More than half of these displacements were due to violation of the independence assumption, that is, to serial correlation (autocorrelation) in the data.

Autocorrelated data are quite prevalent in process industries (Montgomery, 2000). Also in manufacturing, process data may present various types of temporal dependencies; significant examples can be found in forging operations or extruding processes (Wardell, Moskowitz, & Plante, 1994). When data exhibit autocorrelation, control methods that allow for violations of the independence assumption must be used.

A common approach is to filter out autocorrelation by an autoregressive integrated moving average (ARIMA) model, following the techniques of Box et al. (Box, Jenkins, & Reinsel, 1994). If the time series model is accurate enough, the residuals (i.e., the prediction errors) are statistically uncorrelated with each other, and common control charts can be applied to them. However, time series modeling is often awkward in actual applications (Zhang, 1998; Jiang, Tsui, & Woodall, 2000). For example, Stone and Taylor (1995) reported several industrial processes that exhibit temporal dependencies in their natural output which were not adequately handled by standard time series models.
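To make the residual-chart approach concrete, the sketch below (our illustration, not code from the paper; it assumes the numpy and statsmodels packages) fits an ARMA(1,1) model to in-control data and applies 3-sigma Shewhart limits to the one-step-ahead residuals.

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Minimal residual (SCC) chart sketch: fit an ARMA model to in-control
# data, then flag observations whose one-step-ahead residuals fall
# outside 3-sigma Shewhart limits. Parameter values are illustrative.
rng = np.random.default_rng(0)
phi, theta = 0.475, 0.45
e = rng.normal(size=1000)
x = np.zeros(1000)
for t in range(1, 1000):
    x[t] = phi * x[t - 1] - theta * e[t - 1] + e[t]   # in-control ARMA(1,1)

fit = ARIMA(x, order=(1, 0, 1)).fit()   # estimate phi, theta and sigma_e
resid = fit.resid                       # one-step-ahead prediction errors
sigma = resid.std(ddof=1)
alarms = np.flatnonzero(np.abs(resid) > 3 * sigma)   # Shewhart rule on residuals
print("alarm indices:", alarms[:10])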

In this work, autocorrelated data obtained from stationary processes are considered, since stationary processes are the more commonly encountered in the manufacturing environment (Wardell et al., 1994; Zhang, 1998; Jiang et al., 2000). For statistical process monitoring, a stationary process, which has constant mean and constant variance, may undergo shifts in the mean, and such shifts are to be detected as quickly as possible. The first-order autoregressive, first-order moving average ARMA(1,1) stationary model is used as the reference test case in the present research (Box et al., 1994).

Witnessing the increasing capability of artificial neural networks (NNs) in modeling real systems, there is great interest in NNs for quality monitoring. Among the various NN models, recurrent neural networks, exemplified by Elman's NN (ENN), have been shown to be useful for time series modeling. The ENN employs feedback connections and addresses the temporal relationship of its inputs by maintaining an internal state (Elman, 1990). In this paper, the ENN is investigated for the problem of process monitoring in the case of autocorrelated data. Performance comparisons between the neural-based algorithm and several control charts are also presented in order to validate the approach. Different magnitudes of mean shift are considered in the presence of various levels of autocorrelation.

The paper is structured as follows. In Section 2, statistics-based control charts for autocorrelated data are reviewed, and several neural-based control schemes are also analyzed. In Section 3, the stationary ARMA(1,1) model, which is used as reference throughout the paper, is discussed. In Section 4, the ENN is presented, while in Section 5 the ENN training is discussed in the case of autocorrelated data. In Section 6, the performance of the proposed approach is analyzed and compared to that of the major statistics-based control charts for autocorrelated data. Section 7 concludes the paper with a summary and remarks.

2. Review of control techniques

2.1. Statistics-based charting techniques

Three statistics-based charting techniques may be adopted to handle autocorrelation:

1. Time series model to fit the autocorrelated data, combined with a common Shewhart control chart to monitor the residuals.

2. Control chart with adjusted control limits, which account for the autocorrelation of observations, to monitor process data.

3. Control chart of specialized statistics computed from the autocorrelated data.

The first technique was proposed by Alwan and Roberts (1988), who presented the use of time series modeling to signal assignable causes of variation by using the special-cause control (SCC) chart. The SCC chart is the Shewhart chart of the residuals. The basic idea of the SCC chart is that if the common-cause process variation can be modeled by an adequate autocorrelation structure, then the residuals of the fitted model will reveal any special cause acting on the process.

A practical limitation of the SCC chart is that it requires some skill in time series analysis (Box et al., 1994). Even if the natural autocorrelation of the process has been properly modeled, however, the SCC chart may not have a satisfactory capability in signaling departures from natural variation that may be attributed to assignable causes. On the one hand, a residual chart will have poor properties in signaling mean shifts when the time series is positively autocorrelated, even when the autocorrelation is only moderate. On the other hand, a residual chart will have good properties in signaling mean shifts when the process is negatively autocorrelated. This result was demonstrated by Ryan (1991) for a first-order autoregressive process. The reason behind this result is that, when the process is positively (negatively) autocorrelated and a shift in the process mean occurs, the one-step-ahead forecast moves in the same (opposite) direction as the shift. This causes the residuals to be very small (large), and hence the shift is detected later (earlier).

In the second category of charting techniques for autocorrelated processes, Wardell et al. (1994) suggested that k-sigma limits on a common Shewhart control chart could be used with k slightly different from 3, so as to adjust for the autocorrelation. Zhang (1997) used simulation to find the best choice of k for certain time series models, so as to make the in-control performance as close as possible to the normal-theory value of a control chart for independent data.

In the third category, the exponentially weighted moving average (EWMA) statistic applied to the original autocorrelated observations has been recommended by several authors. Zhang (1998) developed a chart using an EWMA statistic to monitor a stationary (ST) process: the EWMAST chart. This chart is superior to the SCC and Shewhart charts when the process autocorrelation is not very strong and the mean change is not large. Jiang et al. (2000) proposed a generalized charting method based on the autoregressive moving average (ARMA) statistic to monitor an ST process: the ARMAST chart. Although explicit model fitting is not required, the use of the ARMAST chart still requires parameter estimation. Selecting proper charting parameters can be cumbersome, and it becomes more complicated when higher-order autoregressive models are considered.
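As a deliberately simplified illustration of charting an EWMA statistic on the raw autocorrelated observations, the sketch below sets the control limit empirically from in-control simulation rather than from Zhang's analytical variance of the EWMA statistic; the smoothing constant lambda = 0.2 and the limit-setting procedure are our assumptions, not the EWMAST design itself.

import numpy as np

def ewma(x, lam=0.2):
    """EWMA statistic w_t = lam*x_t + (1 - lam)*w_{t-1}, with w_0 = 0."""
    w, prev = np.empty(len(x)), 0.0
    for t, xt in enumerate(x):
        prev = lam * xt + (1.0 - lam) * prev
        w[t] = prev
    return w

def simulate_arma11(n, phi, theta, rng):
    """Unit-variance ARMA(1,1) series (sigma_e chosen as in Eq. (4), Section 3)."""
    sigma_e = np.sqrt((1 - phi**2) / (1 + theta**2 - 2 * phi * theta))
    e = rng.normal(scale=sigma_e, size=n)
    z = np.zeros(n)
    for t in range(1, n):
        z[t] = phi * z[t - 1] - theta * e[t - 1] + e[t]
    return z

rng = np.random.default_rng(1)
phi, theta = 0.475, 0.45

# Empirical control limit: 99.73rd percentile of |w_t| on in-control data.
limit = np.quantile(np.abs(ewma(simulate_arma11(200_000, phi, theta, rng))), 0.9973)

# Monitor a stream with a 1-sigma mean shift introduced at t = 100.
x = simulate_arma11(300, phi, theta, rng)
x[100:] += 1.0
alarms = np.flatnonzero(np.abs(ewma(x)) > limit)
print("first alarm at t =", alarms[0] if alarms.size else None)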

Wardell et al. (1994) applied three statistics-based charting techniques (i.e., SCC, EWMA and Shewhart charts) to monitoring autocorrelated processes. By comparing the performance of the SCC chart to that of the Shewhart chart and the EWMA chart, they concluded that there is no unique best chart to use for every type of autocorrelated process.

2.2. NN-based control techniques

A neural network is an approach to data processing that does not require model or rule development. The three major features of an NN are the processing elements (the so-called neurons), the connections between neurons, and the training algorithm used to find values of the network parameters. NNs are extensively exploited in the automation of SPC implementation (Zorriassantine & Tannock, 1998). The application of NNs to SPC can be commonly classified into two categories: (1) control chart pattern recognition and (2) detection of unnatural behavior.

Pattern recognition provides a mechanism for identifying different types of predefined patterns in real time on the series of process quality measurements. The recognized patterns then serve as the primary information for identifying the causes of unnatural process behavior. Hwarng and Hubele (1993) reported the first application of an NN for detecting unnatural patterns on a control chart. Guh and Tannock (1999) investigated the feasibility of an NN to recognize concurrent control chart patterns (where more than one pattern exists together, which may be associated with different causes). Guh and Hsieh (1999) presented a control system composed of several interconnected NNs both to recognize unnatural control chart patterns and to estimate their parameters. NN approaches to control chart pattern recognition have been reported to have industrial applications. Jang, Yang, and Kang (2003) developed a monitoring and diagnostic system using an NN that can automatically detect control chart patterns and rapidly classify the corrective action for a particular automotive assembly process.

In the other category, detection of unnatural behavior, Pugh (1991) reported the first application of an NN for signaling shifts in the mean of a manufacturing process. More extensive studies on mean shift detection by a neural-based approach were published subsequently (Chang & Aw, 1996; Cheng & Cheng, 2001). Al-Ghanim (1997) proposed an unsupervised neural-based system that is capable of detecting any unnatural change in the behavior of a manufacturing process. Recent research extends Al-Ghanim's methodology and presents outperforming approaches for monitoring unnatural behaviors (Pacella, Semeraro, & Anglani, 2004a, 2004b).

There has also been research extending the study to monitoring autocorrelated processes. Cook and Chiu (1998) used NNs to classify data generated from a first-order autoregressive model. Chiu, Chen, and Lee (2001) proposed an NN to signal mean shifts in a first-order autoregressive process. Similarly, Cook, Zobel, and Nottingham (2001) adopted an NN to signal variance shifts for manufacturing processes with correlated process parameters. More recently, Hwarng (2004) presented an NN-based methodology for monitoring process shift in the presence of autocorrelation. The comparative study on first-order autoregressive processes showed that this neural-based scheme outperforms statistics-based control charts. The author further provided a neural-based identification system for both mean shift and correlation parameter change in a first-order autoregressive process (Hwarng, 2005).

In this paper, a recurrent ENN is investigated for monitoring process shifts in the presence of autocorrelation. A recurrent ENN, where the 'recurrency' allows the network to remember cues from the recent past, is potentially a suitable tool for quality monitoring in the case of autocorrelated processes. The simplest multilayered architecture for the ENN, which is easily implemented and trained, is analyzed throughout this paper.

3. Process model

In this section, the stationary time series model, which is used as the reference test case throughout the study, is briefly presented. Let {X_t}, t = 1, 2, ..., be the random time series of the quality characteristic measurements. Suppose X_t is given as follows:

X_t = Z_t + s,   (1)

where t is an index of time, {Z_t}, t = 1, 2, ..., is a time series of natural deviations (assumed, without loss of generality, to have zero mean), and s is a shift due to special causes. The aim of quality monitoring is to test the null hypothesis H0: s = 0 (in-control state of the process) against the alternative hypothesis H1: s ≠ 0 (out-of-control state of the process).

The Box–Jenkins ARMA model (Box et al., 1994) may incorporate both autoregressive (AR) and moving average (MA) terms. The AR terms express a time series as a linear function of its past values plus a noise term. The general class of ARMA(p,q) models is characterized by the number p of past values combined with the present one in the model, and the number q of random shocks in the moving average component of the model. Among these models, the first-order autoregressive, first-order moving average model, ARMA(1,1), is defined as follows:

Z_t = φZ_{t-1} - θε_{t-1} + ε_t,   ε_t ~ NID(0, σ_ε),   (2)

where φ is the AR coefficient and θ is the MA coefficient. The random error terms ε_t are assumed independently and normally distributed (NID) with mean 0 and common variance σ_ε² (Gaussian white noise). The autocovariances of the observations depend on the model parameters and can be written as follows (Box et al., 1994):

γ_0 = σ_ε² (1 + θ² - 2φθ)/(1 - φ²),   γ_1 = γ_0 φ - σ_ε² θ,   γ_i = γ_{i-1} φ,  i > 1.   (3)

Without loss of generality, it is assumed that σ_ε² is as follows:

σ_ε² = (1 - φ²)/(1 + θ² - 2φθ).   (4)

Therefore, the mean of the stationary time series {X_t}, t = 1, 2, ..., is equal to s and its variance is 1.
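A short simulation of the process model (a sketch in Python with numpy; the parameter values are illustrative, not from the paper) confirms the unit-variance scaling of Eqs. (1)-(4):

import numpy as np

def simulate_process(n, phi, theta, shift=0.0, seed=None):
    """Simulate X_t = Z_t + s with Z_t an ARMA(1,1) process, Eqs. (1)-(2)."""
    rng = np.random.default_rng(seed)
    sigma_e = np.sqrt((1 - phi**2) / (1 + theta**2 - 2 * phi * theta))  # Eq. (4)
    e = rng.normal(scale=sigma_e, size=n)
    z = np.zeros(n)
    for t in range(1, n):
        z[t] = phi * z[t - 1] - theta * e[t - 1] + e[t]   # Eq. (2)
    return z + shift                                       # Eq. (1)

x = simulate_process(100_000, phi=0.475, theta=0.45, shift=0.0, seed=0)
# Expected: sample mean near 0 (= s) and sample variance near 1.
print("mean:", x.mean().round(3), "variance:", x.var().round(3))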


4. Feedforward and recurrent NNs

Time series modeling and monitoring is one of the most valuable tasks for statistical process control. When the underlying phenomenon generating the observed data is not known, solutions may be provided by neural-based approaches. In this section, such approaches are briefly discussed.

The architecture of an NN can be classified as either feedforward or recurrent, with respect to the direction of its connections. The connections between the neurons in a feedforward network are such that information can flow in one direction only: from input neurons to output neurons. In a recurrent network, two types of connections, feedforward and feedback, allow information to propagate in two directions, from input neurons to output neurons and vice versa.

Several factors affect the performance of a feedforward NN for time series modeling and monitoring (Barghash & Santarisi, 2004). The most important ones are: (1) the training data set, (2) the window size and (3) the size of the network.

The implementation of the training data set may be quite difficult, as many magnitudes of process change should be considered simultaneously, i.e., the network should be trained across a wide range of possible process changes. A feedforward NN trained at a single magnitude might not work well for other magnitudes of change.

The second issue concerns the window size. Since a feedforward network can perform only static mappings between input and output, when it is used for time series monitoring it is necessary to prepare the data in a special way: an input sample for the network consists of a window containing a chosen number m of successive values of the time series. The number m of data in a sequence provided to the NN is usually referred to as the window size. Since rapid computation is of primary importance for process control, the window size should be minimized in order to obtain an efficient computation. However, too small a window size might cause a higher modeling error due to insufficient information to represent the features of the data.
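For illustration, the windowing step that a feedforward network requires can be written in a few lines (a hypothetical sketch; the recurrent ENN of Section 5 avoids this step entirely):

import numpy as np

def make_windows(x, m):
    """Return an (n - m + 1, m) array of sliding windows of size m over x."""
    x = np.asarray(x, dtype=float)
    return np.lib.stride_tricks.sliding_window_view(x, m)

x = np.arange(8.0)
print(make_windows(x, m=3))
# [[0. 1. 2.]
#  [1. 2. 3.]
#  ...
#  [5. 6. 7.]]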

The third issue is related to the size of the network, i.e., the number of nodes in each layer. The topology of the NN directly affects two of the most important factors of its training: generalization and computational time. In general, a large network requires more computational time than a smaller one. A smaller network may also be more desirable because it is easier to understand. Moreover, using a network with a smaller number of neurons usually results in better generalization capabilities.

The time series modeling capacity of a feedforward network may be improved if the relationships between observations of the time series at different lags can be incorporated directly into the learning process. Recurrent networks may be useful for time series modeling because they have internal states. These states work as a short-term memory and are able to represent information about the preceding inputs.

Recurrent NNs (also known as Elman–Jordan networks) are multilayered networks augmented with one or more additional context units storing output values of the network layers. These values are then used for activating the same layer, or different layers, at the next time step. Jordan's network has context units that store delayed output values and present these as additional inputs to the network, while Elman's neural network (ENN) has connections from a hidden layer to itself (Elman, 1990): the context units feed back into all of the hidden layer units of the network. A comprehensive review of recurrent neural networks can be found in Tsoi and Back (1997).

The use of internal states in the non-linear model of the NN is significant for a number of reasons: the training becomes more effective, the size of the network can be reduced, and it is not necessary to organize the time series data by a window of appropriate size. This not only gives the user confidence in the monitoring system but also facilitates the use of the model and can lead to a deeper understanding of the process.

5. ENN implementation

5.1. Training

As in a feedforward network, the strengths of all connections between neurons in an ENN are indicated by weights. Initially, all weight values are chosen randomly and then they are optimized during the training stage.


The ENN can be trained with the gradient-descent back-propagation optimization method, similar to a conventional feedforward NN (Pham & Liu, 1996). Although the error-gradient information computed using the back-propagation algorithm has been shown to be an effective and efficient tool for learning complex functions with a feedforward NN, the same does not hold for an ENN (Cohen, Saad, & Marom, 1997). Training is known to be difficult because a gradient-descent algorithm may suffer from two kinds of problems when applied to an ENN. First, the computational effort required by the training algorithm may be inflated when the number of context neurons is high. Second, the algorithm is not guaranteed to find the global minimum of the error function, since gradient descent may get stuck in local minima, where it may remain indefinitely. Algorithms that extend the back-propagation method to ENNs have been proposed (Blanco, Delgado, & Pegalajar, 2001), but the optimal training of a recurrent NN using conventional gradient-descent methods is still a complicated task.

An alternative way to obtain effective and efficient training of the ENN using the back-propagation algorithm is to reduce the size of the network and the range of its input values. For this reason, the simplest network structure (i.e., single input node, single context neuron and single output node) has been implemented in this work.

The implemented ENN allows for a window size of one (m = 1), which implies that the resulting control scheme can signal even when a single observation is collected. Although sample observations are natural candidates for the neural network input, different functions can be used to reduce the range of the input space. Here, the absolute value of the current process outcome at time index t is used: the absolute value function reduces the range of possible input values while preserving the information the ENN needs for monitoring process shifts.

The ENN was trained in the conventional supervised manner by minimizing the mean squared error (MSE) function with respect to the network weights. Both in-control and out-of-control process data were used during training. The back-propagation algorithm, with momentum and adaptive learning rate (which attempts to keep the learning-rate step size as large as possible while keeping learning stable), was used to adjust the weights of the ENN (Haykin, 1994).

The mathematical model of the proposed ENN can be summarized as follows, where x_t and y_t are the ENN input and output values, respectively:

y_t = b_2 + w_3 z_t,
z_t = tanh[b_1 + w_1 x_t + w_2 z_{t-1}],   (5)

where b_1 and b_2 are the biases of the hidden and output layer, respectively; w_1 is the weight from the input to the hidden layer; w_2 is the recurrent weight of the hidden layer (from the context unit back to the hidden neuron); and w_3 is the weight from the hidden to the output layer. The activation function adopted in this work is as follows:

tanh(x) = (1 - e^{-2x})/(1 + e^{-2x}).   (6)

5.2. Tuning (choice of cutoff parameter C)

The desired output of the neural network is 0 if no change is present and 1 otherwise. Due to random noise and to the different values of the actual inputs, the ENN output is a number ranging approximately between 0 and 1. Therefore, an activation cutoff value must be defined: if the network output is greater than the cutoff, an alarm is released.

The goal of tuning is to select a value of the cutoff that keeps the in-control performance of the ENN about equal to a predefined value. This provides an unbiased comparison of the proposed technique with any other technique when the process drifts to unnatural states. In particular, during the tuning phase, learning is disengaged (i.e., no more weight adaptations are allowed) and data from a tuning series are presented to the NN in order to check the performance of different settings of the cutoff. The tuning sequence is obtained either from a series of real process data (measurements of the quality parameter of interest when only natural causes of variation are in effect) or from a series of simulated data.

Table 1. Weights (w1, w2, w3), biases (b1, b2) and output cutoff values (C) of the trained ENN for 25 ARMA(1,1) models. Each cutoff C was estimated by tuning in order to obtain an in-control ARL value about equal to 370.

θ = .9
        φ = .95    φ = .475   φ = 0      φ = -.475   φ = -.95
b1      -0.3629    -1.2675    -0.0486    0.0670      -0.9163
b2      0.4937     0.4995     0.4362     0.4663      0.5160
w1      0.2369     0.7567     0.0325     -0.0416     0.4925
w2      1.6915     2.2607     0.9809     0.9189      0.9173
w3      0.5294     0.5058     1.2139     -1.3251     0.5526
C       0.1442     0.1995     0.1578     0.1422      0.0230

θ = .45
        φ = .95    φ = .475   φ = 0      φ = -.475   φ = -.95
b1      0.0131     0.0200     0.4697     -0.4685     0.0387
b2      0.0646     0.1224     0.4935     0.4872      0.0288
w1      -0.0148    -0.0170    -0.5737    0.5529      -0.1616
w2      0.9942     1.0035     2.1966     1.7116      0.5250
w3      -1.7869    -1.6224    -0.5091    0.5184      -1.4689
C       0.6030     0.1907     0.2436     0.1753      0.1539

θ = 0
        φ = .95    φ = .475   φ = 0      φ = -.475   φ = -.95
b1      -0.1033    -0.0174    1.0319     0.8336      -1.2903
b2      0.4982     -0.0118    0.5040     0.4974      0.4644
w1      0.2794     0.0186     -0.5813    -0.6996     0.9689
w2      2.4979     1.0053     2.2433     2.1331      0.1073
w3      0.5086     1.7388     -0.5164    -0.5055     0.5771
C       0.1169     0.6084     0.2224     0.1936      0.2176

θ = -.45
        φ = .95    φ = .475   φ = 0      φ = -.475   φ = -.95
b1      0.0110     0.0158     0.0188     0.9342      0.0297
b2      -0.0097    0.0493     0.0884     0.5115      0.0779
w1      -0.0133    -0.0155    -0.0186    -0.5134     -0.0493
w2      1.0074     1.0087     1.0091     2.1288      0.8922
w3      -1.8052    -1.7446    -1.5887    -0.5173     -1.7027
C       0.5392     0.4515     0.6299     0.1960      0.1829

θ = -.9
        φ = .95    φ = .475   φ = 0      φ = -.475   φ = -.95
b1      -0.8374    -1.0467    -1.0938    -0.0246     -0.8125
b2      0.5082     0.5040     0.5062     -0.1921     0.5085
w1      0.4615     0.6036     0.6278     0.0224      0.5063
w2      2.2544     2.8191     2.2602     1.0191      2.1150
w3      0.5176     0.5055     0.5186     1.9503      0.5188
C       0.5608     0.3470     1.0119     0.1478      0.2525


It is worth noting that there is a monotonically decreasing relation between the cutoff value and the alarm rate (higher cutoff values cause longer run lengths, while lower cutoff values cause shorter run lengths). In this work, a computer program was implemented in order to estimate iteratively the in-control alarm rate of a trained ENN for several values of the cutoff. Given a set of natural data, a binary search method, which iteratively divides the space of admissible values for the cutoff into sub-intervals of half length, was exploited in order to find a value of the cutoff that yields a specific in-control performance.
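A sketch of this tuning loop is given below (our illustration: the in-control tuning data are taken as i.i.d. N(0,1) for brevity, whereas in practice they would come from the actual in-control process, and the number of runs and the tolerance are assumptions). The interval [0, 1] of admissible cutoffs is halved repeatedly until the simulated in-control ARL is close to the target of 370.

import math
import random

def run_length(params, cutoff, rng, max_n=20_000):
    """Observations until the ENN of Eq. (5) signals on in-control data."""
    b1, b2, w1, w2, w3 = params
    z = 0.0
    for t in range(1, max_n + 1):
        x = abs(rng.gauss(0.0, 1.0))          # |x_t| input mapping
        z = math.tanh(b1 + w1 * x + w2 * z)   # Eq. (5)
        if b2 + w3 * z > cutoff:
            return t
    return max_n

def tune_cutoff(params, target_arl=370.0, n_runs=2000, tol=10.0, seed=0):
    rng = random.Random(seed)
    lo, hi = 0.0, 1.0
    while True:
        c = 0.5 * (lo + hi)
        arl = sum(run_length(params, c, rng) for _ in range(n_runs)) / n_runs
        if abs(arl - target_arl) < tol or hi - lo < 1e-4:
            return c, arl
        if arl > target_arl:
            hi = c   # cutoff too high: run lengths too long, lower it
        else:
            lo = c   # run lengths too short: raise the cutoff

# Table 1 weights for (phi, theta) = (.475, .45): (b1, b2, w1, w2, w3).
params = (0.0200, 0.1224, -0.0170, 1.0035, -1.6224)
print(tune_cutoff(params))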

6. Experimental results

In this section, the proposed ENN is analyzed and its performance is compared to that of statistics-based control charts for autocorrelated data. Several levels of autocorrelation are considered, represented by different values of φ and θ in the ARMA(1,1) model of Eq. (2). The experimental design reported by Wardell et al. (1994), in which values for the two ARMA coefficients were selected so as to completely cover the region over which the time series is stationary, was also implemented in this work (in particular, φ ∈ {.95, .475, 0, -.475, -.95} and θ ∈ {.9, .45, 0, -.45, -.9}).

Table 2. ENN simulation results for 25 ARMA(1,1) models: ARL values and 95% confidence intervals of the ARL (simulation results).

θ = .9
Shift   φ = .95                  φ = .475                 φ = 0                    φ = -.475                φ = -.95
0.0     375.38 [368.18,382.58]   371.06 [363.22,378.89]   371.51 [364.05,378.96]   375.38 [367.38,383.37]   370.02 [362.36,377.69]
0.5     86.18 [84.48,87.88]      55.68 [54.70,56.65]      27.40 [27.03,27.77]      10.43 [10.34,10.51]      1.83 [1.82,1.83]
1.0     23.94 [23.44,24.43]      8.82 [8.68,8.96]         5.12 [5.09,5.15]         3.23 [3.21,3.24]         1.08 [1.08,1.09]
1.5     8.58 [8.42,8.74]         3.54 [3.51,3.57]         3.19 [3.17,3.20]         2.27 [2.26,2.28]         1.00 [1.00,1.00]
2.0     4.21 [4.13,4.30]         2.42 [2.40,2.43]         2.48 [2.47,2.49]         1.91 [1.91,1.92]         1.00 [1.00,1.00]
2.5     2.62 [2.58,2.66]         1.96 [1.95,1.97]         2.11 [2.11,2.12]         1.79 [1.78,1.80]         1.00 [1.00,1.00]
3.0     2.05 [2.03,2.07]         1.73 [1.73,1.74]         1.93 [1.92,1.94]         1.62 [1.61,1.63]         1.00 [1.00,1.00]

θ = .45
Shift   φ = .95                  φ = .475                 φ = 0                    φ = -.475                φ = -.95
0.0     375.54 [368.42,382.65]   369.96 [361.66,378.25]   373.28 [366.83,379.73]   369.84 [363.66,376.02]   367.73 [361.71,373.75]
0.5     172.93 [168.45,177.41]   44.56 [43.92,45.20]      42.04 [41.23,42.86]      17.55 [17.28,17.83]      1.96 [1.95,1.97]
1.0     75.49 [73.94,77.05]      16.38 [16.18,16.58]      6.34 [6.25,6.43]         3.04 [3.02,3.06]         1.24 [1.23,1.25]
1.5     37.42 [36.65,38.19]      9.54 [9.43,9.65]         3.04 [3.01,3.07]         2.03 [2.02,2.04]         1.00 [1.00,1.00]
2.0     21.94 [21.62,22.25]      6.54 [6.46,6.63]         2.20 [2.19,2.22]         1.72 [1.71,1.73]         1.00 [1.00,1.00]
2.5     14.17 [13.94,14.40]      4.96 [4.90,5.01]         1.88 [1.87,1.89]         1.48 [1.47,1.49]         1.00 [1.00,1.00]
3.0     10.18 [10.03,10.33]      3.98 [3.94,4.02]         1.69 [1.68,1.70]         1.25 [1.24,1.26]         1.00 [1.00,1.00]

θ = 0
Shift   φ = .95                  φ = .475                 φ = 0                    φ = -.475                φ = -.95
0.0     374.52 [366.98,382.07]   370.93 [363.27,378.60]   371.87 [365.00,378.74]   373.08 [366.73,379.44]   369.45 [360.42,378.48]
0.5     193.95 [189.68,198.23]   64.29 [63.16,65.42]      82.53 [80.85,84.22]      37.17 [36.40,37.95]      2.59 [2.57,2.61]
1.0     88.65 [86.92,90.37]      22.38 [22.04,22.72]      20.26 [19.89,20.62]      6.17 [6.08,6.25]         1.43 [1.42,1.44]
1.5     42.29 [41.35,43.24]      11.99 [11.82,12.16]      7.41 [7.29,7.53]         2.84 [2.81,2.87]         1.04 [1.04,1.04]
2.0     20.64 [20.17,21.11]      7.88 [7.78,7.99]         3.77 [3.72,3.82]         2.05 [2.04,2.07]         1.00 [1.00,1.00]
2.5     10.14 [9.85,10.44]       5.76 [5.68,5.84]         2.47 [2.45,2.50]         1.77 [1.76,1.78]         1.00 [1.00,1.00]
3.0     5.28 [5.13,5.43]         4.54 [4.49,4.59]         1.94 [1.92,1.95]         1.56 [1.55,1.57]         1.00 [1.00,1.00]

θ = -.45
Shift   φ = .95                  φ = .475                 φ = 0                    φ = -.475                φ = -.95
0.0     369.98 [363.99,375.98]   370.60 [363.19,378.00]   372.16 [365.24,379.08]   370.60 [363.19,378.00]   372.51 [365.93,379.08]
0.5     189.52 [185.84,193.21]   71.30 [70.29,72.31]      55.51 [54.78,56.24]      71.30 [70.29,72.31]      7.54 [7.46,7.62]
1.0     86.91 [85.39,88.42]      29.60 [29.22,29.97]      21.00 [20.75,21.26]      29.60 [29.22,29.97]      2.98 [2.96,3.01]
1.5     45.55 [44.66,46.44]      16.99 [16.78,17.20]      11.95 [11.81,12.09]      16.99 [16.78,17.20]      2.13 [2.12,2.14]
2.0     28.06 [27.51,28.61]      11.74 [11.63,11.85]      8.03 [7.92,8.14]         11.74 [11.63,11.85]      1.86 [1.85,1.87]
2.5     18.91 [18.58,19.23]      8.84 [8.75,8.94]         6.10 [6.03,6.18]         8.84 [8.75,8.94]         1.69 [1.67,1.70]
3.0     14.07 [13.86,14.28]      7.05 [6.98,7.12]         4.86 [4.79,4.92]         7.05 [6.98,7.12]         1.48 [1.47,1.49]

θ = -.9
Shift   φ = .95                  φ = .475                 φ = 0                    φ = -.475                φ = -.95
0.0     368.03 [359.86,376.20]   371.78 [362.71,380.85]   377.29 [367.85,386.73]   372.41 [364.10,380.72]   368.57 [361.80,375.33]
0.5     209.67 [205.49,213.85]   98.38 [96.50,100.26]     79.15 [77.50,80.80]      48.21 [47.52,48.90]      90.27 [88.56,91.98]
1.0     97.86 [96.05,99.66]      33.30 [32.59,34.00]      25.79 [25.30,26.28]      20.01 [19.78,20.25]      21.54 [21.07,22.00]
1.5     46.71 [45.61,47.81]      14.88 [14.61,15.14]      12.14 [11.97,12.31]      12.12 [11.97,12.28]      7.47 [7.35,7.58]
2.0     23.80 [23.21,24.39]      7.86 [7.71,8.01]         6.93 [6.84,7.02]         8.54 [8.46,8.62]         3.69 [3.64,3.74]
2.5     12.23 [11.91,12.55]      4.73 [4.65,4.80]         4.84 [4.78,4.90]         6.54 [6.48,6.60]         2.45 [2.42,2.47]
3.0     6.88 [6.72,7.04]         3.29 [3.25,3.32]         3.71 [3.67,3.75]         5.31 [5.27,5.36]         1.93 [1.92,1.95]


[Fig. 1. Uncorrelated process. ARL curves vs. shift of the mean: ENN (bold line); SCC chart, X chart, EWMAST and ARMAST charts (Table 4 of Wardell et al., 1994; Table 5 of Zhang, 1998; Table 4 of Jiang et al., 2000).]


For each combination of parameters φ and θ, an individual ENN was trained.

Based on a preliminary investigation, no evident improvement in performance was attained by extending the training set beyond 1000 examples. Therefore, each training set consisted of 500 examples of non-shifted data and 500 of shifted data (s = 3.5) simulated by the model of Eq. (2). The input values of the ENN were mapped into the range [0,1] by the absolute value function. During training, the adaptive gradient rule was implemented. The CPU time required for network training averaged 20 min on a 1300-MHz machine. Training continued until the goal MSE < 0.001 for the training error was reached, where the number of epochs needed to reach the goal differs from model to model. Given the simple structure of the networks, with just five parameters (i.e., three weights and two biases), no overfitting of the 1000-example data set was observed during training in any of the implemented ENNs.
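A simplified version of this training stage is sketched below (our illustration: gradients are truncated at the context unit, i.e., z_{t-1} is treated as a constant at each step, and a fixed learning rate replaces the adaptive rule; the momentum and learning-rate values are assumptions).

import math
import random

rng = random.Random(0)
phi, theta = 0.475, 0.45
sigma_e = math.sqrt((1 - phi**2) / (1 + theta**2 - 2 * phi * theta))

# Training series: 500 natural observations followed by 500 shifted ones
# (s = 3.5), with the absolute-value input mapping and targets 0/1.
z_nat, e_prev, series, targets = 0.0, 0.0, [], []
for t in range(1000):
    e = rng.gauss(0.0, sigma_e)
    z_nat = phi * z_nat - theta * e_prev + e
    e_prev = e
    shift = 3.5 if t >= 500 else 0.0
    series.append(abs(z_nat + shift))
    targets.append(0.0 if t < 500 else 1.0)

# Network parameters, momentum buffers, and (assumed) training constants.
w = {k: rng.uniform(-0.5, 0.5) for k in ("b1", "b2", "w1", "w2", "w3")}
v = {k: 0.0 for k in w}
lr, mom = 0.05, 0.9

for epoch in range(200):
    z_prev, sse = 0.0, 0.0
    for x, d in zip(series, targets):
        z = math.tanh(w["b1"] + w["w1"] * x + w["w2"] * z_prev)   # Eq. (5)
        y = w["b2"] + w["w3"] * z
        err = y - d
        sse += err * err
        dz = err * w["w3"] * (1.0 - z * z)        # back-prop through tanh
        grads = {"b2": err, "w3": err * z,
                 "b1": dz, "w1": dz * x, "w2": dz * z_prev}
        for k, g in grads.items():                # momentum update
            v[k] = mom * v[k] - lr * g
            w[k] += v[k]
        z_prev = z
    if sse / len(series) < 0.001:                 # MSE training goal
        break

print("epochs:", epoch + 1, "MSE:", round(sse / len(series), 5))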

The desired output value was assigned to indicate the presence of a shift in the mean, with 0 indicating no shift and 1 indicating a large shift. In this work, performance evaluation and comparison are based on the average run length (ARL). The run length is defined as the number of observations needed until an out-of-control signal is released (Montgomery, 2000). An appropriate value of the cutoff was selected for each of the ENNs in order to obtain an actual in-control ARL statistically equal to a reference value (an in-control ARL of 370 was used as the reference for performance evaluation throughout this study). For each ENN, Table 1 summarizes the weights and biases; in the same table, the values of the cutoff (labeled 'C') are also reported.

6.1. Testing phase

Computer simulation was implemented in order to estimate the out-of-control ARL values of the proposed approach. Testing data were generated considering several levels of shift in the process mean. For each value of the shift, 10,000 independent runs were simulated. Each run consisted of a stream of subsequent process outcomes needed until an out-of-control signal was released by the ENN. Hence, each ARL value is the mean of 10,000 run lengths obtained from simulation.

In computing each run length, the process was simulated in-control from start-up for a period of 300 observations. In fact, a process may remain in control for a period and then experience a change at a time index (say s) where one or more assignable causes are introduced to the process between sampled observations s and s + 1. Any alarm occasionally released by the ENN during the first 300 observations of each run was neglected. From our experience, a warm-up period greater than 300 observations does not have any influence on the NN performance; on the contrary, a value smaller than 300 can significantly reduce the average run lengths.

[Fig. 2. ARMA(1,1) φ > 0, θ > 0. ARL curves vs. shift of the mean: ENN (bold line); SCC chart, X chart, EWMAST and ARMAST charts (Table 4 of Wardell et al., 1994; Table 5 of Zhang, 1998; Table 4 of Jiang et al., 2000). (a) (φ,θ) = (.95, .9); (b) (φ,θ) = (.475, .9); (c) (φ,θ) = (.95, .45); (d) (φ,θ) = (.475, .45).]

The simulation error is described by the confidence intervals of the ARL. The method of batch means (where each batch contained 200 simulation results) was implemented in order to estimate the confidence intervals; each interval was based on the t statistic with 199 degrees of freedom. The out-of-control ARL values and the corresponding 95% confidence intervals are listed in Table 2. Also listed in Table 2 are the in-control ARL values and the corresponding 95% confidence intervals for a shift equal to zero (in-control process). Simulation results are available from the authors on request.
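The batch-means computation itself is straightforward; a sketch (assuming numpy and scipy, with the degrees of freedom taken as the number of batches minus one, and toy geometric run lengths standing in for simulation output) is:

import numpy as np
from scipy import stats

def batch_means_ci(run_lengths, batch_size=200, conf=0.95):
    """ARL and t-based confidence interval via the method of batch means."""
    rl = np.asarray(run_lengths, dtype=float)
    n_batches = rl.size // batch_size
    # Group run lengths into batches and average within each batch.
    means = rl[: n_batches * batch_size].reshape(n_batches, batch_size).mean(axis=1)
    arl = means.mean()
    half = (stats.t.ppf(0.5 + conf / 2, df=n_batches - 1)
            * means.std(ddof=1) / np.sqrt(n_batches))
    return arl, (arl - half, arl + half)

# Example: 10,000 toy in-control run lengths with mean about 370.
rng = np.random.default_rng(0)
rl = rng.geometric(p=1 / 370, size=10_000)
print(batch_means_ci(rl))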

[Fig. 3. ARMA(1,1) φ > 0, θ < 0. ARL curves vs. shift of the mean: ENN (bold line); SCC chart, X chart, EWMAST and ARMAST charts (Table 4 of Wardell et al., 1994; Table 5 of Zhang, 1998; Table 4 of Jiang et al., 2000). (a) (φ,θ) = (.95, -.45); (b) (φ,θ) = (.475, -.45); (c) (φ,θ) = (.95, -.9); (d) (φ,θ) = (.475, -.9).]

The ARL curve of the ENN in the case of independently distributed observations (white noise) is depicted graphically in Fig. 1 by a continuous bold line; a logarithmic scale has been adopted for the run-length axis. Three additional ARL curves are also depicted in the same figure, each representing the performance of a statistics-based control chart:

• SCC chart (thin continuous line with circle markers).
• Shewhart X chart (thin dotted line with triangle markers).
• EWMAST chart (thin dashed line with square markers).

The ARL values for the X chart were estimated by simulation in Wardell et al. (1994, Table 4) (where the simulation error is negligible). The ARL values of the SCC chart were theoretically derived in Wardell et al. (1994, Table 4). The ARL values for the EWMAST chart were estimated in Zhang (1998, Table 5).

[Fig. 4. ARMA(1,1) φ < 0, θ > 0. ARL curves vs. shift of the mean: ENN (bold line); SCC chart, X chart, EWMAST and ARMAST charts (Table 4 of Wardell et al., 1994; Table 5 of Zhang, 1998; Table 4 of Jiang et al., 2000). (a) (φ,θ) = (-.475, .9); (b) (φ,θ) = (-.95, .9); (c) (φ,θ) = (-.475, .45); (d) (φ,θ) = (-.95, .45).]

Similar graphs are also reported in Figs. 2–7. In particular, Fig. 2 depicts the performance for the four ARMA(1,1) models obtained by combining positive values of parameter φ with positive values of parameter θ. Simulation results for the cases of φ positive and θ negative, φ negative and θ positive, and both parameters negative are depicted in Figs. 3–5, respectively. Finally, simulation results for pure AR and pure MA models are reported in Figs. 6 and 7, respectively.

An additional ARL curve (thin dotted line with asterisk markers) is also depicted for the cases of parameters (φ,θ) = (-.95, 0), (-.475, 0), (.475, 0), (.95, 0), (.475, -.9), (.95, .45), (.95, -.9). This curve represents the ARL values for the ARMAST chart estimated in Jiang et al. (2000, Table 4). Table 3 also provides a numerical comparison of the proposed NN-based scheme with the ARMAST chart.

[Fig. 5. ARMA(1,1) φ < 0, θ < 0. ARL curves vs. shift of the mean: ENN (bold line); SCC chart, X chart, EWMAST and ARMAST charts (Table 4 of Wardell et al., 1994; Table 5 of Zhang, 1998; Table 4 of Jiang et al., 2000). (a) (φ,θ) = (-.475, -.45); (b) (φ,θ) = (-.95, -.45); (c) (φ,θ) = (-.475, -.9); (d) (φ,θ) = (-.95, -.9).]

In the case of an uncorrelated time series, it can be observed from Fig. 1 that the proposed NN outperforms both the Shewhart X chart and the SCC chart applied to independent data (note that the X and SCC charts are equivalent in the case of uncorrelated observations, and hence the corresponding ARL curves appear superimposed). On the other hand, the EWMAST control chart has superior performance in signaling both small and moderate shifts of the process mean, while the ENN has better performance in signaling shifts of the process mean equal to 2.5 and 3 units of standard deviation.

Table 4 summarizes the results of the performance comparison. In the case of large and moderate positive autocorrelation, where γ1 > 0.3, e.g., for the ARMA(1,1) models with (φ,θ) = (.95, 0), (.95, .45), (.95, -.45), (.475, 0), (.475, -.45), it can be noted that the ENN may outperform the charting techniques for small and medium shifts of the process mean. For moderate autocorrelation, e.g., (φ,θ) = (.95, .9), (-.95, -.9), (.475, .45), (-.475, -.45), the neural network may outperform the charting techniques, or performs comparably, in signaling large shifts of the process mean.

[Fig. 6. Autoregressive processes. ARL curves vs. shift of the mean: ENN (bold line); SCC chart, X chart, EWMAST and ARMAST charts (Table 4 of Wardell et al., 1994; Table 5 of Zhang, 1998; Table 4 of Jiang et al., 2000). (a) (φ,θ) = (.95, 0); (b) (φ,θ) = (.475, 0); (c) (φ,θ) = (-.475, 0); (d) (φ,θ) = (-.95, 0).]

In the case of large negative autocorrelation, where γ1 < -0.3, e.g., for the autocorrelated processes with (φ,θ) = (-.95, 0), (-.95, -.45), (-.95, .45), (-.95, .9), (-.475, 0), it can be noted that the neural network outperforms the charting techniques for large shifts in the process mean.

Table 5 provides a further comparison of the proposed NN-based technique with another NN-based procedure recently proposed in the literature (Hwarng, 2004). That method is a feedforward NN-based approach (the Extended-Delta-Bar-Delta, EDBD, neural network). The ARL values of the EDBD neural network (Table 3 in Hwarng, 2004) are simulation results (where the simulation error is negligible). The ENN may outperform the EDBD neural network for large shifts of the process mean (s = 2, 3) in the case of a first-order autoregressive process with parameter φ = .95.

[Fig. 7. Moving average processes. ARL curves vs. shift of the mean: ENN (bold line); SCC chart, X chart, EWMAST and ARMAST charts (Table 4 of Wardell et al., 1994; Table 5 of Zhang, 1998; Table 4 of Jiang et al., 2000). (a) (φ,θ) = (0, .9); (b) (φ,θ) = (0, .45); (c) (φ,θ) = (0, -.45); (d) (φ,θ) = (0, -.9).]

7. Conclusions

Recently, researchers' interest has focused on using neural-based approaches to signal process changes also in the case of autocorrelated data. In fact, with the widespread use of automated production and inspection, various control tasks traditionally performed by quality practitioners have been automated. In these cases, process data collected at high frequencies by automatic sensors may be autocorrelated and should be elaborated on-line by computer-based systems.

Models based on feedforward NNs have been shown to outperform several statistics-based charting techniques. However, the static mapping accomplished by this kind of network requires a large number of input neurons and complex structures and training phases, and thus necessitates a long computation time.

Table 3. ENN simulation results (summarized from Table 2) and comparisons to the ARMAST chart (mean values obtained from Jiang et al., 2000, Table 4). In the original table, bold values mark the cases in which the ENN outperforms the ARMAST chart.

φ = -.95, θ = 0
Shift   ENN ARL [95% CI]          ARMAST
0       369.45 [360.42,378.48]    370
.5      2.59 [2.57,2.61]          2.65
1       1.43 [1.42,1.44]          1.42
2       1.00 [1.00,1.00]          1.00
3       1.00 [1.00,1.00]          1.00

φ = -.475, θ = 0
0       373.08 [366.73,379.44]    370
.5      37.17 [36.40,37.95]       13.2
1       6.17 [6.08,6.25]          4.78
2       2.05 [2.04,2.07]          2.31
3       1.56 [1.55,1.57]          1.64

φ = .475, θ = 0
0       370.93 [363.27,378.60]    370
.5      64.29 [63.16,65.42]       65.6
1       22.38 [22.04,22.72]       20.3
2       7.88 [7.78,7.99]          6.61
3       4.54 [4.49,4.59]          3.67

φ = .95, θ = 0
0       374.52 [366.98,382.07]    370
.5      193.95 [189.68,198.23]    226
1       88.65 [86.92,90.37]       102
2       20.64 [20.17,21.11]       25.8
3       5.28 [5.13,5.43]          8.65

φ = .475, θ = -.9
0       371.78 [362.71,380.85]    380
.5      98.38 [96.50,100.26]      84.7
1       33.30 [32.59,34.00]       25.4
2       7.86 [7.71,8.01]          7.94
3       3.29 [3.25,3.32]          4.29

φ = .95, θ = .45
0       375.54 [368.42,382.65]    378
.5      172.93 [168.45,177.41]    224
1       75.49 [73.94,77.05]       95.4
2       21.94 [21.62,22.25]       23.6
3       10.18 [10.03,10.33]       5.14

φ = .95, θ = -.9
0       368.03 [359.86,376.20]    370
.5      209.67 [205.49,213.85]    42.8
1       97.86 [96.05,99.66]       1.00
2       23.80 [23.21,24.39]       1.00
3       6.88 [6.72,7.04]          1.00


In this work, an innovative NN has been investigated: the simplest ENN has been proposed. The principal advantage of this approach, when compared to other neural-based schemes, is the simplicity of training and implementation. When compared to statistics-based charting techniques, the main advantage of the proposed control scheme is that there is no need to build a time series model. The 'recurrency' allows such a network to remember cues from the recent past. Additionally, it has been demonstrated that it is not necessary to organize the time series data by a window of appropriate size. This facilitates the use of the model, can lead to a deeper understanding of the process, and gives the user confidence in the monitoring system.

A simulation study has been conducted in which several levels of autocorrelation were selected in order to cover the region of stationary time series models, including positively and negatively autocorrelated processes.

Table 4. Performance comparisons amongst ENN, SCC, Shewhart X, EWMAST and ARMAST: best-performing scheme by shift magnitude. For each model, γ1 is the lag-one autocovariance of the process. In the original table, bold values mark the cases in which the ENN outperforms the control charts.

θ = .9
        φ = .95   φ = .475   φ = 0     φ = -.475   φ = -.95
γ1      0.0725    -0.2548    -0.4972   -0.7365     -0.9749
0.5     EWMAST    EWMAST     SCC       SCC         SCC
1.0     EWMAST    EWMAST     SCC       SCC         SCC
2.0     ENN       EWMAST     SCC       SCC         ENN
3.0     X         SCC        SCC       SCC         ENN

θ = .45
        φ = .95   φ = .475   φ = 0     φ = -.475   φ = -.95
γ1      0.8237    0.0254     -0.3742   -0.6888     -0.9713
0.5     ENN       EWMAST     EWMAST    EWMAST      SCC
1.0     ENN       EWMAST     EWMAST    SCC         SCC
2.0     X         EWMAST     ENN       SCC         ENN
3.0     SCC       X          SCC       SCC         ENN

θ = 0
        φ = .95   φ = .475   φ = 0     φ = -.475   φ = -.95
γ1      0.9500    0.4750     0.0000    -0.4750     -0.9500
0.5     ENN       ENN        EWMAST    ARMAST      ENN
1.0     ENN       ARMAST     EWMAST    ARMAST      SCC
2.0     SCC       EWMAST     ENN       ENN         ENN
3.0     SCC       X          ENN       SCC         ENN

θ = -.45
        φ = .95   φ = .475   φ = 0     φ = -.475   φ = -.95
γ1      0.9713    0.6888     0.3742    -0.0254     -0.8237
0.5     ENN       ENN        ENN       EWMAST      ENN
1.0     SCC       EWMAST     EWMAST    EWMAST      ENN
2.0     SCC       EWMAST     EWMAST    EWMAST      SCC
3.0     SCC       SCC        X         ENN         SCC

θ = -.9
        φ = .95   φ = .475   φ = 0     φ = -.475   φ = -.95
γ1      0.9749    0.7365     0.4972    0.2548      -0.0725
0.5     SCC       ARMAST     EWMAST    EWMAST      EWMAST
1.0     SCC       ARMAST     EWMAST    EWMAST      EWMAST
2.0     SCC       SCC        EWMAST    EWMAST      EWMAST
3.0     SCC       SCC        SCC       SCC         SCC

Table 5. ENN simulation results (summarized from Table 2) and comparisons to the EDBD neural network (mean values obtained from Hwarng, 2004, Table 3). In the original table, bold values mark the cases in which the ENN outperforms the EDBD neural network.

φ = 0, θ = 0
Shift   ENN ARL [95% CI]          EDBD
0       371.87 [365.00,378.74]    372.96
.5      82.53 [80.85,84.22]       25.38
1       20.26 [19.89,20.62]       8.29
2       3.77 [3.72,3.82]          2.47
3       1.94 [1.92,1.95]          1.29

φ = .475, θ = 0
0       370.93 [363.27,378.60]    370.34
.5      64.29 [63.16,65.42]       47.36
1       22.38 [22.04,22.72]       15.93
2       7.88 [7.78,7.99]          4.55
3       4.54 [4.49,4.59]          2.09

φ = .95, θ = 0
0       374.52 [366.98,382.07]    370.37
.5      193.95 [189.68,198.23]    152.09
1       88.65 [86.92,90.37]       77.00
2       20.64 [20.17,21.11]       32.07
3       5.28 [5.13,5.43]          10.17


Table 6. Five-stage ENN implementation procedure.

Stage 1 – Configuration of the ENN
(i) Collect a sequence of adequate length of observations from the in-control process (500 measurements are recommended).
(ii) Implement the ENN, which is modeled by Eqs. (5) and (6).

Stage 2 – Training data set
(i) The training data set is built by joining two series: (1) the data collected in Stage 1; (2) the data collected in Stage 1 shifted by a fixed value s (3.5 units of process standard deviation is the recommended value for s).
Remark: the network is trained on two temporal patterns: (1) natural process data (training target y_t = 0); (2) unnatural data (training target y_t = 1) obtained by shifting the original data by the value s.

Stage 3 – ENN training
(i) Use the back-propagation learning algorithm (with momentum and adaptive learning rate) to train the ENN, minimizing the MSE with respect to the network weights.
Remark: training goal MSE < 0.001.

Stage 4 – Tuning data set
(i) Disengage learning: no more weight adaptations are allowed in this stage.
(ii) Based on the process and the reference in-control ARL value, collect several independent sequences (at least 500 are recommended) of adequate length (at least 2000 data are recommended for an ARL equal to 370) of in-control process observations.
(iii) Collect the sequence of ENN outcomes for each tuning series.
Remark: each tuning sequence is obtained either from a series of real process data (when only natural causes of variation are in effect) or from a series of simulated data (when a parametric model is available). Such data are collected in order to select the most suitable value of the cutoff C that yields the reference in-control ARL value.

Stage 5 – ENN tuning
(i) Set an initial value for the cutoff C and estimate the ENN run length for each tuning series.
(ii) Estimate the actual in-control ARL of the ENN as the mean value of such run lengths.
(iii) If the actual in-control ARL value is statistically greater (lesser) than the reference value, decrease (increase) the cutoff value and repeat steps (i)–(iii). Otherwise, use C as the reference cutoff value.
Remark: a binary search method, which iteratively divides the space of admissible cutoff values (i.e., the interval of real numbers between 0 and 1) into sub-intervals of half length, can be implemented in order to find a cutoff C that yields the specified in-control ARL value.


Although the ENN does not outperform the control charts in signaling mean shifts for every autocorrelation model, this simple NN may perform better than other specialized charting schemes in several instances for both small and large shifts. Such a simple recurrent NN (single input node, single context neuron and single output node) may be easily implemented and trained. This can be a practical result for actual applications. Examples of potential applications of the proposed ENN include the manufacture of wood, paper, cold-rolled steel and chemical products, as well as discrete processes in the electronics assembly manufacturing industry.

In order to facilitate the ENN implementation in actual industrial cases, a five-stage procedure is summarized in Table 6, where the parameter values are determined for a desired in-control ARL value about equal to 370.

References

Al-Ghanim, A. (1997). An unsupervised learning neural algorithm for identifying process behavior on control charts and a comparison with supervised learning approaches. Computers & Industrial Engineering, 32, 627–639.

Alwan, L. C., & Roberts, H. V. (1988). Time-series modeling for statistical process control. Journal of Business & Economic Statistics, 6(1), 87–95.

Alwan, L. C., & Roberts, H. V. (1995). The problem of misplaced control limits. Journal of the Royal Statistical Society, Series C, 44(3), 269–306, with discussion and reply.

Barghash, M. A., & Santarisi, N. S. (2004). Pattern recognition of control chart using artificial neural networks – analyzing the effect of the training parameters. Journal of Intelligent Manufacturing, 15, 635–644.

Blanco, A., Delgado, M., & Pegalajar, M. C. (2001). A real-coded genetic algorithm for training recurrent neural networks. Neural Networks, 14, 93–105.

Box, G. E. P., Jenkins, G. M., & Reinsel, G. C. (1994). Time series analysis: Forecasting and control. Englewood Cliffs, NJ: Prentice-Hall.

Chang, S. I., & Aw, C. A. (1996). A neural fuzzy control chart for detecting and classifying process mean shifts. International Journal of Production Research, 34(8), 2265–2278.

Cheng, C.-S., & Cheng, S.-S. (2001). A neural network-based procedure for the monitoring of exponential mean. Computers & Industrial Engineering, 40, 309–321.

Chiu, C.-C., Chen, M.-K., & Lee, K.-M. (2001). Shifts recognition in correlated process data using a neural network. International Journal of Systems Science, 32(2), 137–143.

Cohen, B., Saad, D., & Marom, E. (1997). Efficient training of recurrent neural networks with time delays. Neural Networks, 10(1), 51–59.

Cook, D. F., & Chiu, C.-C. (1998). Using radial basis function neural networks to recognize shifts in correlated manufacturing process parameters. IIE Transactions, 30, 227–234.

Cook, D. F., Zobel, C. W., & Nottingham, Q. J. (2001). Utilization of neural networks for the recognition of variance shifts in correlated manufacturing process parameters. International Journal of Production Research, 39(17), 3881–3887.

Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179–211.

Guh, R. S., & Hsieh, Y. C. (1999). A neural network based model for abnormal pattern recognition of control charts. Computers & Industrial Engineering, 36, 97–108.

Guh, R. S., & Tannock, J. D. T. (1999). Recognition of control chart concurrent patterns using a neural network approach. International Journal of Production Research, 37, 1743–1765.

Haykin, S. (1994). Neural networks, a comprehensive foundation. New York: Macmillan College.

Hwarng, H. B., & Hubele, N. F. (1993). Back-propagation pattern recognizers for X̄ control charts: methodology and performance. Computers & Industrial Engineering, 24(2), 219–235.

Hwarng, H. B. (2004). Detecting process mean shift in the presence of autocorrelation: a neural network based monitoring scheme. International Journal of Production Research, 42(3), 573–595.

Hwarng, H. B. (2005). Simultaneous identification of mean shift and correlation change in AR(1) processes. International Journal of Production Research, 43(9), 1761–1783.

Jang, K.-Y., Yang, K., & Kang, C. (2003). Application of artificial neural network to identify non-random variation patterns on the run chart in automotive assembly process. International Journal of Production Research, 41(6), 1239–1254.

Jiang, W., Tsui, K.-L., & Woodall, W. H. (2000). A new SPC monitoring method: the ARMA chart. Technometrics, 42(4), 399–410.

Montgomery, D. C. (2000). Introduction to statistical quality control (2nd ed.). New York: Wiley.

Pacella, M., Semeraro, Q., & Anglani, A. (2004a). Manufacturing quality control by means of a Fuzzy ART network trained on natural process data. Engineering Applications of Artificial Intelligence, 17(1), 83–96.

Pacella, M., Semeraro, Q., & Anglani, A. (2004b). Adaptive resonance theory-based neural algorithms for manufacturing process quality control. International Journal of Production Research, 40(21), 4581–4607.

Pham, D. T., & Liu, X. (1996). Training of Elman networks and dynamic system modelling. International Journal of Systems Science, 27(2), 221–226.

Pugh, G. A. (1991). A comparison of neural networks to SPC charts. Computers & Industrial Engineering, 21, 253–255.

Ryan, T. P. (1991). Discussion (of "Some statistical process control methods for autocorrelated data" by D. C. Montgomery and C. M. Mastrangelo). Journal of Quality Technology, 23(3), 200–202.

Stone, R., & Taylor, M. (1995). Time series models in statistical process control: considerations of applicability. The Statistician, 44(2), 227–234.

Tsoi, A. C., & Back, A. (1997). Discrete time recurrent neural network architectures: A unifying review. Neurocomputing, 15, 183–223.

Wardell, D. G., Moskowitz, H., & Plante, R. D. (1994). Run-length distributions of special-cause control charts for correlated processes. Technometrics, 36(1), 3–17.

Zhang, N. F. (1997). Detection capability of residual control chart for stationary process data. Journal of Applied Statistics, 24(4), 475–492.

Zhang, N. F. (1998). A statistical control chart for stationary process data. Technometrics, 40(1), 24–38.

Zorriassantine, F., & Tannock, J. D. T. (1998). A review of neural networks for statistical process control. Journal of Intelligent Manufacturing, 9, 209–224.