
IFE/HR/E – 2008/005

Genetic Algorithms for Signal Grouping in Sensor Validation: a Comparison of the Filter and Wrapper Approaches

Address:
Kjeller: NO-2027 Kjeller, Norway. Telephone: +47 63 80 60 00. Telefax: +47 63 81 63 56
Halden: NO-1751 Halden, Norway. Telephone: +47 69 21 22 00. Telefax: +47 69 21 22 01

Report number: IFE/HR/E-2008/005
Date: 2007-10-15
Report title and subtitle: Genetic Algorithms for Signal Grouping in Sensor Validation: a Comparison of the Filter and Wrapper Approaches
Number of pages: 18
Project/Contract no. and name: HRP
ISSN: 0807-5514
Client/Sponsor Organisation and reference: HRP
ISBN: 978-82-7017-708-0 (printed); 978-82-7017-709-7 (electronic)

Abstract

Sensor validation is aimed at detecting anomalies in sensor operation and reconstructing the correct signals of failed sensors, e.g. by exploiting the information coming from other measured signals. In field applications, the number of signals to be monitored can often become too large to be handled by a single validation and reconstruction model. To overcome this problem, the signals can be subdivided into groups according to specific requirements and a number of validation and reconstruction models can be developed to handle the individual groups. In this paper, multi-objective genetic algorithms (MOGAs) are devised for finding groups of signals bearing the required characteristics for constructing signal validation and reconstruction models based on principal component analysis (PCA). Two approaches are considered for the MOGA search of the signal groups: the filter and wrapper approaches. The former assesses the merits of the groups only from the characteristics of their signals, whereas the latter looks for those groups optimal for building the models actually used to validate and reconstruct the signals. The two approaches are compared with respect to a real case study concerning the validation of 84 signals collected from a Swedish boiling water nuclear power plant.

Keywords: sensor monitoring, signal validation, multi-objective genetic algorithms, filter/wrapper, principal component analysis

Author(s): P. Baraldi, E. Zio, G. Gola, D. Roverso, and M. Hoffmann. Date: 2007-10-15
Reviewed by: J.E. Farbrot. Date: 2007-10-15
Approved by: Ø. Berg. Date: 2007-10-15

Genetic algorithms for signal grouping in sensor validation: a comparison of the filter and wrapper approaches

P Baraldi1, E Zio1*, G Gola1, D Roverso2, and M Hoffmann2

1 Department of Nuclear Engineering, Polytechnic of Milan, Milano, Italy
2 OECD Halden Reactor Project, Institutt for energiteknikk, Halden, Norway

The manuscript was received on 15 October 2007 and was accepted after revision for publication on 18 December 2007.

DOI: 10.1243/1748006XJRR137

Abstract: Sensor validation is aimed at detecting anomalies in sensor operation and reconstructing the correct signals of failed sensors, e.g. by exploiting the information coming from other measured signals. In field applications, the number of signals to be monitored can often become too large to be handled by a single validation and reconstruction model. To overcome this problem, the signals can be subdivided into groups according to specific requirements and a number of validation and reconstruction models can be developed to handle the individual groups. In this paper, multi-objective genetic algorithms (MOGAs) are devised for finding groups of signals bearing the required characteristics for constructing signal validation and reconstruction models based on principal component analysis (PCA). Two approaches are considered for the MOGA search of the signal groups: the filter and wrapper approaches. The former assesses the merits of the groups only from the characteristics of their signals, whereas the latter looks for those groups optimal for building the models actually used to validate and reconstruct the signals. The two approaches are compared with respect to a real case study concerning the validation of 84 signals collected from a Swedish boiling water nuclear power plant.

Keywords: sensor monitoring, signal validation, multi-objective genetic algorithms, filter/wrapper, principal component analysis

1 INTRODUCTION

Traditional approaches to sensor validation involve periodic instrument calibration by procedures that may be expensive both in labour and process downtime, since many procedures require shutting down the process, taking the instrument out of service, then loading and calibrating it. These latter tasks may also lead to damage of the equipment and incorrect calibrations owing to adjustments made under non-service conditions.

On the other hand, the current quest for increased economic competitiveness by all industries introduces the need for streamlining all plant operations, including instrument calibration. This motivates a shift towards condition-based calibration (rather than periodic or, worse yet, corrective) for recalibrating an instrument only when its performance is degraded [1, 2]. This entails continuous monitoring of the instrument performance. During plant operation, sensors may experience anomalies, which might lead them to convey inaccurate or misleading information on the actual plant state to the automated controls and the operators. Hence, it is important to develop accurate and robust systems capable of detecting such anomalies and correctly reconstructing the signals of the failed sensors.

Benefits of this approach include the reduction of unnecessary maintenance and more confidence in the actual values of the monitored parameters, with important feedbacks on system operation, production, and accident management [1, 2].

In many field applications, the number of measured signals is too large to be handled effectively with one single model for signal validation and

*Corresponding author: Department of Nuclear Engineering, Polytechnic of Milan, Via Ponzio 34/3, Milan 20133, Italy. email: [email protected]

Extended version of a paper originally presented at ESREL 2007.


JRR137 © IMechE 2008. Proc. IMechE Vol. 222 Part O: J. Risk and Reliability

reconstruction [2, 3]. One approach to address the problem amounts to subdividing the set of signals into smaller overlapping groups, developing a validation and reconstruction model for each group of signals, and then combining the outcomes of the models within an ensemble approach [4–6] (Fig. 1).

The multi-group approach to sensor monitoring entails investigating the following two issues: (a) how to group the signals optimally and (b) how to combine appropriately the outcomes of the models developed using the groups.

The first issue of the multi-group approach (i.e. sensor grouping) is investigated in this work by resorting to genetic algorithms (GAs) [7–12] and principal component analysis (PCA) as the signal validation and reconstruction model [13–17] (other possible models are the neural network partial least squares algorithm [18, 19], auto-associative neural networks [20–24], or neuro-fuzzy systems [25]).

Given n ≫ 1 sensors' signals f_i, i = 1, 2, ..., n, to be validated, the aim of the GA search is to group them in K groups, k = 1, 2, ..., K, each one constituted by m_k < n signals with some required characteristics. More specifically, the objective functions of the genetic search should capture both the individual properties of the groups (i.e. the mutual information content of the group signals, the accuracy of the validation model built using the group signals, and the group size) and the global properties related to the ensemble of groups (i.e. the diversity between the groups, the redundancy of the signals for ensuring robust validation, and the inclusion of the majority of the signals in the groups) [2, 3]. In the present work, the focus is on the individual characteristics of the groups. Two search approaches are considered: the filter [26, 27] and wrapper [28, 29] approaches. These methods differ in the way the solutions iteratively proposed during the search are evaluated: in wrapper approaches the performance of the model effectively used to validate and reconstruct the signals (in this work, the PCA) is one of the objective functions driving the search, whereas in filter methods the groups are evaluated with respect to the characteristics of the signals independently of the specific algorithm used for signal validation (in the current paper, their correlation). In order to explore the potentialities of groups of different sizes in the validation task, the maximization of the number of signals in the group is added to both searches as the second objective.

The paper is organized as follows. In section 2, the main characteristics of the filter and wrapper approaches are illustrated. Section 3 explains in detail the objective functions adopted in the two methods. In section 4, the approaches are applied to a real case study concerning the validation of a data set of 84 signals measured at a Swedish Boiling Water Reactor located in Oskarshamn. Some conclusions on the advantages and limitations of the proposed methods are drawn in the last section. Finally, in an attempt to make the paper self-contained, Appendices A and B report a brief synthesis of the basic concepts of GAs and PCA, respectively, and Appendix C provides the list of the monitored signals.

2 THE FILTER AND WRAPPER APPROACHES TO SIGNAL GROUPING

The aim of this study is to develop a method for generating groups of signals optimal for their use in signal validation and reconstruction models. Two different techniques for generating optimal groups of signals have been investigated for comparison: the filter and wrapper approaches [26–29]. In the filter approach (Fig. 2), the algorithm for searching the optimal groups functions as a filter which includes/discards the signals in the groups. The decision of including or discarding a signal in a group is based on characteristics judged to be (indirectly) favourable for signal validation and reconstruction, independently of the specific model which is then used to

Fig. 1 The multi-group approach to sensor monitoring

Fig. 2 Scheme of the filter approach

Fig. 3 Scheme of the wrapper approach


perform these tasks. The correlation between the signals in the group is typically used as an indirect measure for comparing the goodness of the groups selected by the search engine [30].

Contrary to filter methods, in wrapper approaches (Fig. 3) the search algorithm behaves as a 'wrapper' around the specific model used for the validation and reconstruction of the signals of the groups; during the optimization search, the performance of the validation and reconstruction model itself is directly used as the evaluation function to compare the different groups selected by the search engine [28, 29].

With respect to the search engine, three techniques are commonly used: complete [26], heuristic [27, 28], and probabilistic [29]. The first two approaches have shown limitations in scanning effectively the search space and finding the optimal solutions. In particular, in the complete approach, the properties of a pre-defined evaluation function must be used to prune the search space to a manageable size. Only some evaluation functions give rise to a search that guarantees to find optimal solutions without being exhaustive. Heuristic techniques such as sequential forward selection (SFS) [31] or sequential backward elimination (SBE) [31, 32] are iterative, experience-based methods in which, at every step, a number of solutions is obtained on the basis of those of the previous step and evaluated in terms of an evaluation function. Although not exhaustive, such approaches have proved unable to obtain optimal solutions, because they easily lead the search towards local optima.

On the other hand, probabilistic approaches such as GAs, or methods such as simulated annealing and tabu search algorithms [33], are founded on population-based techniques in which the solutions iteratively proposed during the search are selected from those of the previous population with probabilistic criteria accounting for their goodness in terms of pre-defined objective functions. This way of proceeding has been proved capable of finding the optimal solutions by efficiently scanning the search space within an acceptable computational time.

In the current paper, both the filter and wrapper signal grouping approaches have been performed by multi-objective genetic algorithm (MOGA) optimization searches, as illustrated in the next section.

A priori, the filter approach to signal grouping is expected to be computationally more efficient than the wrapper one because, for each trial set of signals considered by the search algorithm during the optimization, the computation of indirect evaluation functions like the signal correlation is less time consuming than the development and evaluation of the validation and reconstruction model directly used, as required by the wrapper approach. In practical applications, the wrapper approach may be implemented provided that a fast-computing validation and reconstruction model is used, e.g. PCA, auto-associative kernel regression methods [34, 35], and so on. On the other hand, wrapper approaches are expected to perform better than the filter ones, since in the former the groups of signals found are optimal for the specific validation and reconstruction model used, whereas in the latter the indirect evaluation of the groups' goodness totally ignores the performance of the model actually used.

Table 1 summarizes the main features of the filter and wrapper approaches. The decision in practice on whether to adopt a filter or a wrapper approach for a particular application should be based on an experimental verification of the actual computational costs and performances involved in the two approaches, and on the associated preferential compromise between the two conflicting objectives of computational speed and performance.

3 MULTI-OBJECTIVE GENETIC ALGORITHMS FOR SIGNAL GROUPING

The problem of signal grouping is here framed as a MOGA optimization search [8–12, 29, 30] (see Appendix A for further details). Within this scheme, the inclusion or not of a signal in a group can be encoded in terms of a binary variable which takes value 1 or 0, respectively. With respect to the problem of finding K groups of signals, the probabilistic search is performed on a population of K chromosomes, each one constituted by n bits representing all the

Table 1 Main features of the filter and wrapper approaches

Filter
  Evaluation function: indirect signal characteristics (e.g. correlation)
  Computational cost: low (fast-computing evaluation functions)
  Performance: in general possibly good, although the groups are selected based on indirect signal characteristics, independently of the validation and reconstruction model actually used

Wrapper
  Evaluation function: direct performance of the validation and reconstruction model (e.g. reconstruction error)
  Computational cost: high (training and testing of the validation and reconstruction model)
  Performance: in general better than filter, because the groups are selected as optimal directly with respect to the validation and reconstruction model used


possible signals included in a group. In the generic kth chromosome, coding group k, the ith bit encodes the presence (1) or absence (0) of the ith signal in the kth group, i = 1, 2, ..., n, k = 1, 2, ..., K (Fig. 4).
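As an illustrative sketch (not the authors' code; the variable names are hypothetical), the binary chromosome encoding described above can be expressed in Python:

```python
import numpy as np

n = 84  # total number of monitored signals

# A chromosome is a binary vector of n bits: the ith bit encodes the
# presence (1) or absence (0) of the ith signal in the group.
rng = np.random.default_rng(0)
chromosome = rng.integers(0, 2, size=n)

# Decoding: the indices of the signals included in the group, and the
# group size m_k.
group_signals = np.flatnonzero(chromosome)
group_size = int(chromosome.sum())
```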

The filter and wrapper MOGA optimizations devised in this work both consist of a two-objective genetic search which at convergence leads to a Pareto front constituted by optimal groups of different sizes which optimize the defined objectives. In the following, the features of the filter and wrapper techniques here developed are illustrated in further detail.

3.1 The filter MOGA optimization

As explained in section 2, in the filter approach the signal groups, i.e. the chromosomes, are evaluated with respect to their intrinsic characteristics, regardless of the specific model used to carry out the validation and reconstruction task. In this respect, the two objective functions here used to capture the relevant signal group characteristics are the maximization of the signal correlation and the maximization of the group size [30].

The first objective function is intuitively motivated by the fact that the signals in the group will then be used to build a model for their validation and reconstruction, and by the conjecture that strongly positively or negatively correlated signals are capable of regressing one another. In fact, the information content of strongly negatively correlated signals is also very high and comparable to the one derived from strongly positively correlated signals.

The measure herein used to quantify these characteristics is Pearson's correlation coefficient [36, 37]. Considering N measurements of two signals f_p(t) and f_q(t), t = 1, 2, ..., N, Pearson's coefficient is defined as

\mathrm{corr}_{p,q} = \frac{1}{N-1} \sum_{t=1}^{N} \left( \frac{f_p(t) - \bar{f}_p}{S_{f_p}} \right) \left( \frac{f_q(t) - \bar{f}_q}{S_{f_q}} \right) \qquad (1)

where \bar{f}_p, S_{f_p}, \bar{f}_q, and S_{f_q} are the mean values and standard deviations of the signals f_p and f_q, respectively.

By definition, the value of corr_{p,q} varies from −1 to 1, being 0 for statistically independent quantities. Signals that have the same trend (both increasing or both decreasing with respect to the mean) will have positive values of corr_{p,q}, whereas those which vary in opposite ways will render corr_{p,q} negative.
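Equation (1) and the sign convention just described can be checked with a small Python sketch (`pearson_corr` is an illustrative helper, not part of the paper's code):

```python
import numpy as np

def pearson_corr(fp, fq):
    """Pearson's correlation coefficient of equation (1)."""
    fp, fq = np.asarray(fp, float), np.asarray(fq, float)
    N = len(fp)
    zp = (fp - fp.mean()) / fp.std(ddof=1)  # (f_p(t) - mean) / S_fp
    zq = (fq - fq.mean()) / fq.std(ddof=1)
    return float(np.sum(zp * zq) / (N - 1))

t = np.linspace(0.0, 1.0, 100)
up = 2.0 * t + 1.0        # increasing trend
down = -3.0 * t + 5.0     # opposite trend: correlation close to -1
r = pearson_corr(up, down)
```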

To associate a degree of correlation to the generic kth group of m_k signals, k = 1, 2, ..., K, the average absolute correlation of each pth signal, p = 1, 2, ..., m_k, is first computed as the mean of the absolute values of the correlation between the pth and the remaining m_k − 1 signals, viz.

\langle \mathrm{corr}_p \rangle = \frac{1}{m_k - 1} \sum_{\substack{q=1 \\ q \neq p}}^{m_k} |\mathrm{corr}_{p,q}| \qquad (2)

Finally, the group correlation r_k, computed as the geometric mean of the average correlations of the m_k signals in the group [19], is taken as the first objective function for the group optimization

r_k = \sqrt[m_k]{\prod_{p=1}^{m_k} \langle \mathrm{corr}_p \rangle} \qquad (3)

This measure allows assigning a low correlation to those groups in which at least one signal has a low average correlation with respect to the others in the group [19].
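A minimal Python sketch of equations (2) and (3) (a hypothetical helper, not the authors' implementation):

```python
import numpy as np

def group_correlation(X):
    """Group correlation r_k of equations (2)-(3).

    X is an (N, m_k) array with one column per signal in the group.
    """
    m = X.shape[1]
    C = np.abs(np.corrcoef(X, rowvar=False))  # |corr_{p,q}|, an m x m matrix
    # Equation (2): average absolute correlation of each signal with the
    # other m_k - 1 signals (the diagonal term |corr_{p,p}| = 1 is removed).
    avg = (C.sum(axis=1) - 1.0) / (m - 1)
    # Equation (3): geometric mean of the m_k averages.
    return float(np.prod(avg) ** (1.0 / m))

# Three perfectly linearly related signals: r_k is close to 1.
t = np.linspace(0.0, 1.0, 200)
r_demo = group_correlation(np.column_stack([t, 2.0 * t + 1.0, -3.0 * t]))
```

Replacing one column with uncorrelated noise lowers the geometric mean sharply, which is exactly the penalizing behaviour the text attributes to this measure.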

The choice of the second objective, i.e. the maximization of the group size m_k, is related to the purpose of obtaining a number of optimal groups of different sizes. In fact, given a group k of m_k signals with the highest possible correlation r_k associated to that number of signals, adding one signal to the group would result in a group with lower correlation. If this were not true and a group k′ of m_k + 1 signals with correlation r_{k′} higher than r_k was obtained, it would be possible to remove from k′ the signal with the lowest average correlation and obtain a group k″ of m_{k″} = m_k signals with correlation r_{k″} > r_{k′} > r_k, which would be in contrast with the assumption that the group k of m_k signals has the highest possible group correlation. Therefore, if group k is the best possible solution corresponding to a number m_k of signals, the group k′ constituted by m_k + 1 signals will be characterized either by the same or a lower group correlation. Since the group correlation is generally higher when groups are constituted by fewer signals, the chosen objective functions are in conflict and, according to the concepts of Pareto optimality and dominance [8–10, 29, 30], at convergence of the search, we expect to find a Pareto front constituted by a large number K_f of highly correlated groups of different sizes, m_k, k = 1, 2, ..., K_f.

3.2 The wrapper MOGA optimization

The wrapper approach aims at obtaining groups which are optimal for the model that is used for validating and reconstructing the signals, in this case the PCA. The first objective function is then the minimization of the reconstruction error of the PCA

Fig. 4 The structure of the generic kth chromosome


validation and reconstruction model, to which the maximization of the group size is added as second objective, in analogy with the filter approach.

Concerning the first objective function, for each group k considered during the search, a PCA-based validation and reconstruction model is trained on a set of N_Trn samples of the group signals f_i(t), i = 1, 2, ..., m_k, t = 1, 2, ..., N_Trn, and evaluated on a test set of N_Tst samples. For the evaluation, the average absolute reconstruction error ε_i of signal f_i, i = 1, 2, ..., m_k, is computed on the N_Tst samples

\varepsilon_i = \frac{1}{N_{Tst}} \sum_{t=1}^{N_{Tst}} |f_i(t) - \hat{f}_i(t)| \qquad (4)

where f_i(t) and \hat{f}_i(t) are the signal real and PCA-reconstructed values, respectively.

The reconstruction performance of the PCA-based model built with the signals of group k is the mean of the average absolute reconstruction errors ε_i, i = 1, 2, ..., m_k, viz.

\varepsilon_k = \frac{1}{m_k} \sum_{i=1}^{m_k} \varepsilon_i \qquad (5)

The chosen arithmetic mean form of the group reconstruction error ε_k allows associating high errors to those groups in which at least one signal is not

Fig. 5 The process diagram of the nuclear power plant in Oskarshamn. The 84 signals are identified by an alphanumeric code (e.g. 312 KA301, 423 KB509, etc.); see Appendix C.

Table 2 Partition of the total data set for the MOGA optimization and model validation

MOGA set: N_M = 14 966 samples, time samplings t = 1 to 14 966, patterns f_1(t), f_2(t), ..., f_84(t)
Validation set: N_V = 15 114 samples, time samplings t = 14 967 to 30 080

Table 3 Main parameters used for the MOGA searches. For further details on the FIT–FIT selection procedure and the FITTEST replacement procedure the interested reader may consult [7, 8]

Selection: FIT–FIT — the population, rank-ordered on the basis of the Pareto dominance criterion, is scanned and each individual is parent-paired with an individual of the next fittest Pareto rank class
Replacement: FITTEST — out of the four individuals (two parents and two children) involved in the crossover procedure, the fittest two replace the parents
Mutation probability: 10^-2
Population size: 100
Number of generations: 30 000


properly reconstructed, i.e. it has a high reconstruction error ε_i.
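The PCA-based evaluation of equations (4) and (5) can be sketched as follows (an illustrative implementation, with PCA computed via the singular value decomposition of the centred training data; not the authors' code):

```python
import numpy as np

def pca_reconstruction_error(X_train, X_test, l):
    """Group reconstruction error of equations (4)-(5).

    A PCA model with l retained principal components is fitted on
    X_train (N_Trn x m_k) and used to reconstruct X_test (N_Tst x m_k).
    """
    mu = X_train.mean(axis=0)
    # Principal directions: right singular vectors of the centred data.
    _, _, Vt = np.linalg.svd(X_train - mu, full_matrices=False)
    P = Vt[:l].T                              # m_k x l loading matrix
    X_hat = (X_test - mu) @ P @ P.T + mu      # project and back-project
    eps_i = np.mean(np.abs(X_test - X_hat), axis=0)  # equation (4), per signal
    return float(eps_i.mean())                        # equation (5), group error
```

With data lying exactly in an l-dimensional subspace the error vanishes, while retaining fewer components leaves a residual; this residual is the quantity that drives the wrapper search.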

As in the filter approach, the second objective is the maximization of the number of signals m_k. This is motivated by the conjecture that small groups of signals provide more accurate models [2, 3, 30]. The conflict between the two chosen objectives is expected to lead at convergence to a Pareto front constituted by K_w groups of different sizes, optimal for the PCA-based validation and reconstruction.

In the application illustrated in the next section, the groups found by the filter and wrapper approaches are compared to assess the effectiveness of the two developed methods.

4 APPLICATION

The filter and wrapper grouping optimizations have been applied to a real case study concerning the validation of n = 84 signals which have been collected from a nuclear Boiling Water Reactor (BWR) located in Oskarshamn, Sweden (Fig. 5).

The MOGA code used for the calculations has been developed by the Laboratorio di Analisi di Segnale e di Analisi di Rischio (LASAR, Laboratory of Analysis of Signals and Analysis of Risk) of the Department of Nuclear Engineering of the Polytechnic of Milan (http://lasar.cesnef.polimi.it). The PCA code has been taken from http://lib.stat.cmu.edu/multi/pca and adapted to perform the signal validation and reconstruction tasks.

The 84 signals have been sampled every 10 min from 31 May 2005 to 5 January 2006 from a corresponding number of sensors (see Appendix C), providing a total amount of N = 30 080 time samplings. Each recording instant t provides an 84-dimensional pattern f_1(t), f_2(t), ..., f_84(t), t = 1, ..., N, identified by the values of the n = 84 signals at the time instant t, as illustrated in Table 2.

The total number of available patterns is divided into a MOGA set X_M of N_M = 14 966 samples used to perform the genetic optimization, i.e. to calculate the group correlation of equation (3) and the reconstruction error of equation (5), and a validation set X_V of N_V = 15 114 samples used to validate the models built using the signals of the groups.

The MOGA set is further divided into a training set (X_M^Trn) and a test set (X_M^Tst) of N_M^Trn = 12 000 and N_M^Tst = 2966 signal samples, respectively, to train and test the PCA validation models during the wrapper search. In order to simplify the development of the PCA models, the training and test subsets have been reduced by sampling one pattern every 100 for the training set and one every 10 for the test set. The MOGA reduced training (X′_M^Trn) and test (X′_M^Tst) sets are thus constituted by N′_M^Trn = 120 and N′_M^Tst = 296 signal samples, respectively. Note that previous tests have shown that training the models on these reduced sets does not appreciably affect their performances in terms of reconstruction error, while it considerably speeds up the model construction and thus the entire wrapper search.
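The sample counts above can be reproduced with a simple keep-every-kth sub-sampling; this is only a sketch, since the exact offsets used by the authors are not stated, and the slicing below is one choice that matches the reported counts:

```python
import numpy as np

idx = np.arange(30080)                 # all N = 30 080 time samplings

moga, val = idx[:14966], idx[14966:]   # N_M = 14 966, N_V = 15 114
trn, tst = moga[:12000], moga[12000:]  # 12 000 training, 2 966 test samples

# Reduced sets: one pattern every 100 (training) and one every 10 (test).
trn_red = trn[99::100]                 # 120 samples
tst_red = tst[9::10]                   # 296 samples
```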

Cross-validation has not been applied either in the wrapper or in the filter MOGA search, so as not to increase the computational effort; on the contrary, when validating the models (section 4.1), a 10-fold cross-validation has been performed using the signal samples of the validation set.

In the following, the results of the filter and wrapper searches are illustrated and compared. Table 3 shows the MOGA settings here adopted in both approaches, with a synthetic explanation of the terms used. During the wrapper search, the number l of principal components retained to reconstruct the group signals has been set equal to half the size of the group under evaluation, for this represents a good compromise between the accuracy of the model and its computational cost (see Appendix B for further details).

In the wrapper search, the PCA models built using the signals of the groups are trained on the N′_M^Trn = 120 samples, whereas the group reconstruction error is computed by equation (5) on the N′_M^Tst = 296 samples of the test set. For a fair comparison basis, also the correlation of the groups selected during the filter search is computed according to equation (3) using only the N′_M^Trn = 120 patterns of the MOGA training set.

Owing to the size of the search space (2^n possible solutions), the devised wrapper MOGA search was not able to converge to a Pareto front of solutions containing also small-sized groups. To overcome this problem, the wrapper MOGA search has been split into two sub-searches by imposing the maximum size of the groups to be considered during the

Table 4 Results of the filter and wrapper MOGA searches

MOGA search          Number of optimal groups   Minimum size   Maximum size   Computational time (h)
Filter               69                         16             84             0.53
Wrapper W_SMALL      17                         10             42             14.2
Wrapper W_BIG        45                         18             84             35.5
W_BIG ∪ W_SMALL      49                         10             84             49.7


optimization. Hence, one MOGA search (W_BIG) has been devised to scan the solution space for finding groups with sizes up to 84 signals, and one (W_SMALL) has been constrained to search the small-group region for solutions up to 42 signals.

Table 4 summarizes the results of the filter and wrapper MOGA searches, and Figs 6(a) and (b) show the corresponding Pareto fronts at convergence. The filter search has found K_f = 69 optimal groups in a very short computational time. Concerning the wrapper approach, the search W_BIG has also found some groups with few signals, but these solutions are dominated by those with the same number of signals obtained by the constrained search W_SMALL, thus confirming the usefulness of splitting the search into two. In this case, for each pair of solutions overlapping in terms of number of signals, the one with the smaller reconstruction error has been retained in the final combined wrapper Pareto front, of K_w = 49 groups. As expected, owing to the training and testing of many validation models, the wrapper MOGA search has required a large computational effort.

In Figs 7(a) and (b) the Pareto fronts obtained by the two MOGA searches are compared. In particular, in Fig. 7(a) the wrapper groups are evaluated with respect to the group correlation r_k of equation (3) using the N′_M^Trn = 120 patterns of the MOGA training set, whereas in Fig. 7(b) the filter groups are evaluated in terms of the reconstruction error ε_k of equation (5) using the MOGA training and test sets previously adopted in the wrapper search. By doing so, the comparison of the filter and wrapper groups becomes straightforward.

Although the wrapper groups have a low correlation (Fig. 7(a)), they provide a higher accuracy in reconstructing the signals of the MOGA test set

Fig. 6(a) The filter Pareto front: each solution, k = 1, 2, ..., K_f, is identified by the value of the group correlation r_k of equation (3) and the group size m_k

Fig. 6(b) The combined wrapper Pareto front (W_BIG ∪ W_SMALL): each solution is identified by the value of the group reconstruction error ε_k of equation (5) and the group size m_k

Fig. 7(a) Comparison of the filter and wrapper groups using the MOGA training set. Each group is identified by the group correlation r_k and the group size m_k

Fig. 7(b) Comparison of the filter and wrapper groups using the MOGA training and test sets. Each group is identified by the reconstruction error ε_k and the group size m_k


(Fig. 7(b)), thus showing that increasing the correlation among the signals in the group does not necessarily guarantee the best performance of the validation model trained and tested on those signals. Furthermore, in Fig. 7(b) it is shown that models built using fewer signals are generally more accurate, since the information conveyed by some signals can be useless or detrimental for the reconstruction of the others, thus supporting the validity of the idea of the multi-group approach to signal validation. Finally, notice that useful mutual information can be related, on one side, to high signal correlations (as in the filter groups), but, on the other, it can also be intended as the correct amount of information carried by the signals for reconstructing one another with respect to the adopted validation and reconstruction model, as for the poorly correlated (Fig. 7(a)) but more accurate (Fig. 7(b)) wrapper groups.

4.1 Validating and testing the robustness of the PCA models

The PCA models built using the signals of the groups found by the MOGA must be validated on a different data set (X_V). As anticipated in the previous section, the validation procedure is performed over a 10-fold cross-validation. In analogy with the partition of the MOGA set, the validation set has been divided into a training set (X_V^Trn) and a test set (X_V^Tst) of N_V^Trn = 12 000 and N_V^Tst = 3114 signal samples, respectively. By sampling one pattern every 100 from the training set and one every 10 from the test set, a reduced validation set (X′_V) of N′_V = 431 signal samples has been obtained. Then, for each cross-validation, N′_V^Trn = 120 patterns are randomly sampled from X′_V to form the reduced training set X′_V^Trn upon which the PCA models are built; the remaining N′_V^Tst = 311 patterns constitute the reduced test set X′_V^Tst used to test the robustness of the trained models.

During plant operation, a model must be fault tolerant, i.e. able to reconstruct the signals in the presence of sensor failures, e.g. drifts, by providing a good estimate of the correct output corresponding to the faulty sensor without deteriorating too much the other outputs of the network associated with correct signals, thus avoiding the so-called spillover effect [20–22]. Within the multi-group, PCA-based modelling scheme proposed here, when one of the sensors of a group is faulty, it sends in input to the corresponding PCA validation model a faulty signal; in this situation, the model should provide as output a good estimate of the true value of the faulty signal, thanks to the information provided by the other signals coming from the non-faulty sensors of the group.

To verify fault tolerance, at each cross-validation, after training the PCA model with the N′_V^Trn samples of the m_k signals of the generic group k, a drift (simulating a sensor failure) is imposed on the N′_V^Tst samples of one signal of the group. The disturbance is set so as to decrease linearly the value of the signal down to 75 per cent of its real value. The PCA model is then fed with the whole test set X′_V^Tst in which one signal is linearly drifted and the others are correct. The corresponding group reconstruction error when the ith signal is faulty, ε_k^(i), is then computed as in equation (5). By drifting one group test signal at a time and averaging the group reconstruction errors ε_k^(i), i = 1, 2, ..., m_k, one obtains the average drift group reconstruction error, ε̄_k.
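The drift disturbance described above can be sketched as follows (function and variable names are ours; the 75 per cent endpoint is from the paper):

```python
import numpy as np

def apply_linear_drift(X_tst: np.ndarray, i: int, final_fraction: float = 0.75) -> np.ndarray:
    """Return a copy of the test set in which signal i drifts linearly
    from its true value down to final_fraction of it, simulating a
    sensor failure (a sketch of the disturbance described above)."""
    X_drift = X_tst.copy()
    # Multiplicative factor decreasing linearly from 1.0 to final_fraction.
    factor = np.linspace(1.0, final_fraction, len(X_drift))
    X_drift[:, i] *= factor
    return X_drift
```

Feeding the drifted test set to the trained PCA model and comparing its output with the undrifted signals gives the group reconstruction error ε_k^(i) of equation (5).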

Two components can be distinguished in ε̄_k: the drift error ε̄_k^d committed on the drifted signal and the spillover effect error ε̄_k^s affecting the undrifted signals. To obtain ε̄_k^d, we compute the average of the reconstruction errors ε̄_{i|i} on the ith drifted signal in group k, i = 1, 2, ..., m_k, viz.

\bar{\varepsilon}_k^d = \frac{1}{m_k} \sum_{\substack{i=1 \\ i \in k}}^{m_k} \bar{\varepsilon}_{i|i}    (6)

The error ε̄_k^s is instead calculated by averaging the reconstruction errors spilling onto the undrifted signals when the ith signal is drifted, i = 1, 2, ..., m_k, viz.

\bar{\varepsilon}_k^s = \frac{1}{m_k} \sum_{\substack{i=1 \\ i \in k}}^{m_k} \left( \frac{1}{m_k - 1} \sum_{\substack{p=1 \\ p \in k,\; p \neq i}}^{m_k} \bar{\varepsilon}_{p|i} \right)    (7)
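Equations (6) and (7) can be computed directly from a matrix of per-signal errors (a sketch; the matrix layout, with entry [p, i] holding the error on signal p when signal i is drifted, is our assumption):

```python
import numpy as np

def drift_and_spillover_errors(E: np.ndarray) -> tuple[float, float]:
    """Decompose the average drift group reconstruction error per
    equations (6) and (7). E[p, i] holds the reconstruction error on
    signal p when signal i is drifted. Returns (drift, spillover)."""
    m_k = E.shape[0]
    # Equation (6): mean of the diagonal terms (error on the drifted signal itself).
    eps_d = float(np.mean(np.diag(E)))
    # Equation (7): for each drifted signal i, average the off-diagonal
    # entries of column i (errors spilled onto the m_k - 1 undrifted signals).
    spill_per_drift = (E.sum(axis=0) - np.diag(E)) / (m_k - 1)
    eps_s = float(spill_per_drift.mean())
    return eps_d, eps_s
```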

In Fig. 8, the results of the validation and robustness tests for the filter and wrapper groups are illustrated.

Fig. 8 Validation of the filter and wrapper groups without drifts (stars and dots, respectively) and with drifts (diamonds and circles, respectively). The procedure is cross-validated using the validation set

196 P Baraldi, E Zio, G Gola, D Roverso, and M Hoffmann

Proc. IMechE Vol. 222 Part O: J. Risk and Reliability, JRR137 © IMechE 2008

For visual clarity, the cross-validation statistical errors are not reported on the figure. In general, the cross-validated performances of the filter and wrapper groups are either comparable or superior to those achieved on the MOGA set, without cross-validation. This is attributable to the fact that, by randomly sampling the training and test patterns, many test points are likely to be near the training ones, thus easing the reconstruction of their value. On the contrary, if the data set is linearly divided, the test points are placed progressively farther from the training set, their reconstruction becoming more difficult. Nevertheless, the cross-validation procedure equally affects the two methods, since both the filter and wrapper group-based PCA models are built on the same randomly sampled training and test points.

Comparing the performances of the filter and wrapper PCA models tested on new signals, the accuracy of those based on the filter and wrapper groups becomes comparable. This suggests that the superior performance of the wrapper groups on the MOGA test set (Fig. 7(b)) might be caused by an over-fitting of the data and that highly correlated filter groups are also capable of providing reliable models. Concerning the reconstruction capabilities on drifted signals, although wrapper groups are in general slightly more robust than the filter ones, their reconstruction errors are still comparable.

Notice that the smaller the group size, the less robust the models; in fact, small groups suffer more from the effect of drifts, since less information is available to correctly reconstruct the failed signal. This aspect entails a trade-off between having less complex models trained on fewer signals and more robust models capable of providing efficient reconstructions of the disturbed signals.

4.2 Single group analysis

The filter and wrapper approaches are further compared by choosing two groups of the same size from the corresponding Pareto fronts and investigating in detail the signal reconstruction capabilities of the respective PCA models. As explained in the previous sections, the choice of the group size is critical for the ensembling of groups, since too many signals can degrade the reliability of the models while, on the other hand, the model robustness worsens with too few signals. On this basis and considering the results in Fig. 8, the 30-signal filter and wrapper groups are chosen as a good compromise for developing reliable and robust validation and reconstruction models. Table 5 reports the features and performances of the two groups, and Figs 9(a) and 9(b) show the N′_M^{Trn+Tst} = 416 samples of the 30 signals of the filter and wrapper groups, respectively.

The groups have 16 signals in common (identified in Table 5). The signals in the filter group are quite similar among themselves (Fig. 9(a)) and thus very correlated. The performance of the filter group-based PCA model on the MOGA set X′_M is poorly accurate. On the other hand, the error on the validation set X′_V becomes comparable to that of the wrapper group-based PCA model in both cases of undrifted and drifted signals, the wrapper group performing slightly better than the filter one. Note that, despite the highly correlated signals, the filter group commits larger errors in reconstructing the drifted signals (as shown by the higher value of the drift error), while the wrapper group ensures smaller spillover effects, thus providing a more stable validation and reconstruction model.

Table 5 Features and performances of the filter and wrapper groups (* marks the 16 signals common to both groups)

Filter group (m_k = 30): signals 7* 8* 9* 10 11* 12 13* 22* 31 32* 34* 35* 41 42 43* 44 45* 47 63 64 65 69 70 71 73* 74 76* 77* 78* 79*
  Group correlation r_k = 0.9975
  Group reconstruction error (×10⁻¹): MOGA set X′_M: ε_k = 0.1219; validation set X′_V: ε_k = 0.0519 – 0.0618, ε̄_k = 0.6721 – 0.3039, ε̄_k^d = 5.4398, ε̄_k^s = 0.5076

Wrapper group (m_k = 30): signals 1 4 5 7* 8* 9* 11* 13* 15 22* 23 32* 33 34* 35* 43* 45* 49 50 54 58 60 66 72 73* 76* 77* 78* 79* 82
  Group correlation r_k = 0.5163
  Group reconstruction error (×10⁻¹): MOGA set X′_M: ε_k = 0.0379; validation set X′_V: ε_k = 0.0451 – 0.0159, ε̄_k = 0.4859 – 0.1524, ε̄_k^d = 4.2265, ε̄_k^s = 0.3570

A possible explanation of the higher reconstruction capabilities of the less correlated signals of the wrapper group may be found within the internal structure of the PCA validation models. In fact, as shown in Fig. 10, although both models use 15 principal components for performing the mapping-demapping process (as explained in Appendix B), the filter model mainly exploits one component (identified by the first eigenvalue, which cumulates more than 95 per cent of the total variance of the data set), whereas in the wrapper model four components are necessary to cumulate the same variance. Having many diverse and relevant elements in the base by which signals are reconstructed increases the flexibility of the model, which results in a superior ability to capture the trends of the unseen signal samples.
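The component count behind this comparison can be reproduced with a simple cumulative-variance check (a sketch on synthetic data, not the plant data; the function name and the correlated example are ours):

```python
import numpy as np

def components_for_variance(X: np.ndarray, threshold: float = 0.95) -> int:
    """Number of principal components needed to cumulate `threshold`
    of the total variance of data set X (rows = patterns)."""
    V = np.cov(X, rowvar=False)               # covariance matrix of the signals
    eigvals = np.linalg.eigvalsh(V)[::-1]     # eigenvalues, largest first
    cum = np.cumsum(eigvals) / eigvals.sum()  # cumulative variance fraction
    return int(np.searchsorted(cum, threshold) + 1)

# Highly correlated signals: one component dominates, as in the filter group.
rng = np.random.default_rng(1)
base = rng.normal(size=(500, 1))
X_corr = base + 0.01 * rng.normal(size=(500, 8))
print(components_for_variance(X_corr))   # 1
```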

Figures 11 and 12 illustrate in detail the reconstruction performances of the two approaches. In particular, Fig. 11 shows the reconstruction errors of the 16 signals common to the two groups, and Fig. 12 the group errors ε̄_k^(i) obtained when drifting the ith common signal.

Despite a larger group error owing to high errors on the different signals (Fig. 8), the filter model performs slightly better on the common undrifted signals (Fig. 11). Indeed, the high mutual information between these signals and the remaining ones allows a more accurate reconstruction of the signals. On the contrary, when signals are drifted, the lack of diverse information within the filter model leads to a more imprecise reconstruction of the signals (Fig. 12).

Fig. 9(a) The N′_M^{Trn+Tst} samples of the 30 signals of the filter group

Fig. 9(b) The N′_M^{Trn+Tst} samples of the 30 signals of the wrapper group

5 CONCLUSIONS

A practical issue in sensor diagnostics on large-scale systems is the number of signals monitored. Currently, it is not possible to manipulate efficiently all these signals as a whole by one single validation and reconstruction model. One way to overcome this problem is to group the signals into smaller overlapping sets and then develop a number of tailored models for signal validation and reconstruction, whose outcomes are combined within an ensemble approach.

An effective multi-group approach to signal validation must consider both the individual properties of the groups (i.e. the mutual information between the signals, the performances of the models built using the group signals, and the number of signals) and those related to the global grouping structure, such as having diverse groups, ensuring a good redundancy of the signals, and including the majority of the signals in the ensemble.

This work is focused on the task of building the required properties into the signal groups. Two different MOGA approaches have been investigated for comparison: a filter search based on the maximization of the correlation between the group signals and a wrapper search based on the minimization of the reconstruction error of the adopted validation model (in this work, principal component analysis). A second objective function has been considered regarding the maximization of the group size, for obtaining many optimal groups of different sizes. By so doing, at convergence of the searches, a number of optimal groups with different characteristics constitute the Pareto fronts: in these, the filter groups are highly correlated, whereas the wrapper groups ensure small reconstruction errors.

The comparison of the two approaches has been carried out on a real case study concerning the validation of signals collected from a nuclear boiling water reactor located in Sweden.

The PCA models built using the groups found by the filter and wrapper MOGA approaches have been compared on a new validation data set in terms of the accuracy in reconstructing the signals and the robustness ensured when anomalies or disturbances (such as drifts) affect the sensor measurements. Finally, one filter and one wrapper group of the same size have been selected from the corresponding Pareto fronts for further comparison.

The quantitative comparison confirms that the filter approach is computationally more efficient than the wrapper one and that the latter selects groups which ensure slightly better performances, also on drifted signals, but at the expense of high computational costs. In any case, the small reconstruction errors, in an absolute sense, of the fast-running filter approach seem to make it more feasible for a full-fledged application. On the other hand, it is expected that as the complexity of the problem increases, i.e. more signals are considered, the wrapper approach would provide significantly better reconstruction performances than the filter one, although at even higher computational costs owing to the development of models based on more signals. In practice, a good compromise needs to be sought between the two conflicting objectives of low computational costs and high validation and reconstruction performances, which depend on the particular application.

Fig. 10 Cumulative variance associated to the eigenvalues of the filter and wrapper validation models, ordered with respect to their values

Fig. 11 Reconstruction errors of the common signals, undrifted and drifted, using the validation set

Fig. 12 Group reconstruction errors when drifting one common signal at a time, using the validation set

The scope of this work does not address the overall grouping of signals for validation and reconstruction. The proper ensembling of the groups to effectively cover all signals for which validation and reconstruction is required is the subject of future research. Yet, the groups found here could be used for the 'crude' development of an ensemble, by manually choosing either a few large groups so as to include all signals, thus developing a small number of complex validation models, or many small groups which include the majority of the signals with a high redundancy, thus training a larger number of less parameterized models.

Alternatively, one may run as many (filter or wrapper) genetic optimizations as the number of signals to validate and reconstruct, imposing during each search the presence of one specific signal in the groups. At the end of each search, only one group characterized by the optimal features according to the specifics of the problem (size, reconstruction error, or signal correlation) and including the specified signal is retained in the ensemble. By doing so, the ensemble will be constituted by a number of groups equal to the number of signals (including each signal in at least one group), which can be used to develop a corresponding number of validation and reconstruction models, although a homogeneous signal redundancy and the diversity between the groups are not guaranteed a priori.

REFERENCES

1 Hoffmann, M. On-line monitoring for calibration reduction. HWR-784, OECD Halden Reactor Project, October 2005.

2 Hoffmann, M. Signal grouping algorithm for an improved on-line calibration monitoring system. Proceedings of FLINS, Genova, Italy, 2006.

3 Roverso, D. Solutions for plant-wide on-line calibration monitoring. Proceedings of ESREL 2007, Stavanger, Norway, 2007, Vol. 1, pp. 827–832.

4 Perrone, M. P. and Cooper, L. N. When networks disagree: ensemble methods for hybrid neural networks. 1992 (National Science Foundation, USA).

5 Krogh, A. and Vedelsby, J. Neural network ensembles, cross-validation and active learning. Advances in Neural Information Processing Systems (Eds G. Tesauro, D. S. Touretzky and T. K. Leen), 1995, Vol. 7, pp. 231–238 (MIT Press, Cambridge, MA, USA).

6 Sharkey, A. J. C. On combining artificial neural nets. Connection Sci., 1996, 8(3), 299–314.

7 Holland, J. H. Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control and artificial intelligence, 4th edition, 1975 (MIT Press, Cambridge, MA, USA).

8 Goldberg, D. E. Genetic algorithms in search, optimization, and machine learning, 1989 (Addison-Wesley, New York).

9 Chambers, L. Practical handbook of genetic algorithms: applications Vol. I; new frontiers Vol. II, 1995 (CRC Press, London).

10 Sawaragi, Y., Nakayama, H., and Tanino, T. Theory of multiobjective optimization, 1985 (Academic Press, Orlando, Florida).

11 Raymer, M. L., Punch, W. F., Goodman, E. D., Kuhn, L. A., and Jain, A. K. Dimensionality reduction using genetic algorithms. IEEE Trans. Evolutionary Computation, 2000, 4(2), 164–171.

12 Bozdogan, H. Statistical data mining with informational complexity and genetic algorithm. In Statistical data mining and knowledge discovery (Ed. H. Bozdogan), 2003 (CRC Press, London).

13 Jolliffe, I. T. Principal component analysis, 2002 (Springer).

14 Diamantaras, K. I. and Kung, S. Y. Principal component neural networks: theory and applications, 1996 (John Wiley, New York).

15 Schölkopf, B., Smola, A., and Müller, K. R. Kernel principal component analysis. In Advances in Kernel Methods – Support Vector Learning, 1999, pp. 327–352 (MIT Press, Cambridge, USA).

16 Moore, B. Principal component analysis in linear systems: controllability, observability, and model reduction. IEEE Trans. Automatic Control, 1981, 26(1).

17 Baldi, P. and Hornik, K. Neural networks and principal component analysis: learning from examples without local minima. Neural Networks, 1989, 2(1), 53–58.

18 Hoffmann, M. and Kirschner, A. PEANO – findings from using the NNPLS algorithm and HAMMLAB applications. HWR-690, OECD Halden Reactor Project, August 2004.

19 Kirschner, A. and Hoffmann, M. PEANO NNPLS: advancements in 2002–03. HWR-741, OECD Halden Reactor Project, April 2004.

20 Fantoni, P. F. and Mazzola, A. Multiple-failure signal validation in nuclear power plants using artificial neural networks. Nuclear Technol., 1996, 113(3), 368–374.

21 Fantoni, P. F. Experiences and applications of PEANO for online monitoring in power plants. Progress in Nuclear Energy, 2005, Vol. 46, pp. 206–225 (Elsevier).

22 Marseguerra, M., Zio, E., and Marcucci, F. A soft-computing based classification procedure for the identification of transients in the steam generator of a pressurized water reactor. Annals of Nuclear Energy, 2004, 31, 1429–1446.

23 Kramer, M. A. Autoassociative neural networks. Computers Chem. Engng, 1992, 16(4), 313–328.

24 Hines, J. W., Uhrig, R. E., and Wrest, D. J. Use of autoassociative neural networks for signal validation. J. Intelligent and Robotic Systems, 1998, 21(2), 143–154.

25 Fantoni, P. F. A neuro-fuzzy model applied to full range signal validation of PWR nuclear power plant data. Int. J. General Systems, 2000, 29(2), 305–320.

26 Duran, B. and Odell, P. Cluster analysis: a survey, 1974 (Springer-Verlag, New York).

27 Zio, E., Baraldi, P., and Roverso, D. An extended classifiability index for feature selection in nuclear transients. Annals of Nuclear Energy, 2005, 32, 1632–1649.

28 Kohavi, R. and John, G. Wrappers for feature subset selection. Artif. Intelligence, 1997, 97, 273–324.

29 Zio, E., Baraldi, P., and Pedroni, N. Selecting features for nuclear transients classification by means of genetic algorithms. IEEE Trans. Nuclear Sci., 2006, 53(3), 1479–1493.

30 Gola, G., Zio, E., Baraldi, P., Roverso, D., and Hoffmann, M. Signal grouping for sensor validation: a multi-objective genetic algorithm approach. HWR-852, OECD Halden Reactor Project, February 2007.

31 Cotter, S. F., Adler, R., Rao, R. D., and Kreutz-Delgado, K. Forward sequential algorithms for best basis selection. IEE Proc. Vision, Image and Signal Processing, 1999, 146(5), 235–244.

32 Mao, K. Z. Orthogonal forward selection and backward elimination algorithms for feature subset selection. IEEE Trans. Systems, Man, and Cybernetics, Part B, 2004, 34(1), 629–634.

33 Zhang, H. and Sun, G. Feature selection using tabu search method. Pattern Recognition, 2002, 35, 701–711.

34 Girard, S. and Lovleff, S. Auto-associative models and generalized principal component analysis. J. Multivariate Analysis, 2005, 93(1), 21–39.

35 Nowicki, D. and Dekhtyarenko, O. Kernel-based associative memory. Proceedings of the International Joint Conference on Neural Networks, 2004, Vol. 1, pp. 741–744.

36 Lawrence, I. and Kuei, L. A concordance correlation coefficient to evaluate reproducibility. Biometrics, 1989, 45(1), 255–268.

37 Hunt, R. J. Percent agreement, Pearson's correlation, and kappa as measures of inter-examiner reliability. J. Dental Res., 1986, 65, 128–130.

38 Marseguerra, M. Lecture notes on principal components analysis (PCA), Polytechnic of Milan.

APPENDIX A. MULTIPLE-OBJECTIVE GENETIC ALGORITHMS (MOGAs) FOR SIGNAL GROUPING

Genetic algorithms (GAs) [7–9] are probabilistic optimization methods aimed at finding the global optimum of a set of real objective functions, F ≡ {w(·)}, of one or more decision variables, Z ≡ {z}, possibly subject to various linear or non-linear constraints.

GAs owe their name to their operational similarities with the biological and behavioural phenomena of living beings. They operate on a set of (artificial) chromosomes, which are strings of numbers, generally sequences of binary digits 0 and 1. When the objective function is evaluated in correspondence of a set of values encoded in a chromosome, its values are called the fitness of that chromosome. Thus, each chromosome gives rise to a trial solution to the problem.

The GA search is performed by constructing a sequence of populations of chromosomes, the individuals of each population being the children of those of the previous population and the parents of those of the successive population. The initial population is generated by randomly sampling the bits of all the strings. At each step, the new population is then obtained by manipulating the strings of the old population by repeatedly performing the four fundamental operations of reproduction, crossover, replacement, and mutation (all based on random sampling), in order to arrive at a new population hopefully characterized by an increased mean fitness. This way of proceeding enables efficient convergence to optimal or near-optimal solutions.

In a multi-objective optimization problem, several possibly conflicting objective functions w_l(·), l = 1, 2, ..., n_w, must be evaluated in correspondence of each decision variable vector Z in the search space. The goal is to identify the solution vector Z′ which gives rise to the best compromise among the various objective functions. The comparison of solutions is achieved in terms of the concepts of Pareto optimality and dominance [8–10, 29, 30]. The decision variable vectors (i.e. the chromosomes) that are non-dominated within the entire search space are said to be Pareto optimal and constitute the so-called Pareto optimal front.

In Pareto-based methods, once a population of chromosomes has been created, these are ranked according to the Pareto dominance criterion by looking at the n_w-dimensional space of the fitness functions w_l(Z), l = 1, 2, ..., n_w. At the end of the search procedure, the result of the optimization is constituted by an archive made up of the chromosomes of the Pareto optimal front.
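As an illustration of the dominance criterion, the following sketch extracts the non-dominated front from a set of fitness vectors (assuming, for concreteness, that every objective is to be minimized; function and variable names are ours):

```python
import numpy as np

def pareto_front(F: np.ndarray) -> np.ndarray:
    """Indices of the non-dominated rows of F, where F[k] is the
    n_w-dimensional fitness vector of chromosome k and every
    objective is minimized."""
    n = len(F)
    keep = []
    for k in range(n):
        # k is dominated if some other point is <= in all objectives
        # and strictly < in at least one.
        dominated = any(
            np.all(F[j] <= F[k]) and np.any(F[j] < F[k])
            for j in range(n) if j != k
        )
        if not dominated:
            keep.append(k)
    return np.array(keep)

F = np.array([[1.0, 4.0], [2.0, 2.0], [4.0, 1.0], [3.0, 3.0]])
print(pareto_front(F))   # [0 1 2] – the point [3, 3] is dominated by [2, 2]
```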

The problem of signal grouping regards the task of discerning, out of the several signals of the monitored process or system, those to be grouped together to train an efficient model for signal validation and reconstruction.

Mathematically, the problem of signal grouping can be formulated as an optimization problem aimed at finding the optimal signal groups with respect to a


set of objectives, e.g. the correlation between the group signals, the performance of the validation and reconstruction model built using the group signals, the group size, the diversity among groups, the inclusion of the majority of the signals, and the signal redundancy.

Operatively, the total number of available n-dimensional data patterns is partitioned into a set (hereafter denoted by X_M) used for the signal grouping task and a separate set of approximately the same size (hereafter denoted by X_V), used for signal validation. Then, a GA can be devised to find an optimal set of binary transformation vectors W*, of dimension n, which operate on X_M to maximize/minimize the chosen set of optimization criteria regarding the group of signals.

In this respect, let m be the number of 1s in W* and n − m the number of 0s: then, a modified set of patterns Y_M = W*(X_M) is obtained in an m-dimensional space (m < n). The set Y_M of modified patterns thereby obtained is used to evaluate the group performances in terms of the objective functions. The GA creates a population of competing transformation vectors (chromosomes) W_k, k = 1, 2, ..., which are evaluated [8–12, 29, 30]. Each bit of the kth chromosome is associated with a signal and interpreted such that if the ith bit equals 1, then the ith signal is included in the kth group, or vice versa if the bit is 0. The quality of the kth group, encoded as the transformation vector W_k, is measured by the set of objective functions and, on the basis of this feedback, the GA conducts its probabilistic search for a vector or a set of vectors which give rise to the best compromise among the objective functions.
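The bit-vector encoding can be sketched as follows (a minimal illustration; the data and names are ours):

```python
import numpy as np

def decode_group(W_k: np.ndarray, X_M: np.ndarray) -> np.ndarray:
    """Apply a binary transformation vector (chromosome) to the
    grouping data set: keep only the signals whose bit equals 1,
    yielding the m-dimensional modified patterns Y_M."""
    mask = W_k.astype(bool)
    return X_M[:, mask]

# 6 signals; a chromosome selecting signals 0, 2 and 5 (so m = 3).
X_M = np.arange(24).reshape(4, 6)
W_k = np.array([1, 0, 1, 0, 0, 1])
Y_M = decode_group(W_k, X_M)
print(Y_M.shape)   # (4, 3)
```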

APPENDIX B. PRINCIPAL COMPONENT ANALYSIS (PCA) FOR SIGNAL VALIDATION AND RECONSTRUCTION

In this Appendix, a brief outline is given of the procedural steps of principal component analysis (PCA) as presented in reference [38]. The central idea of PCA is to reduce the dimensionality of a data set consisting of a large number of interrelated variables, while retaining as much as possible of the variation present in the data set. This is achieved by transforming to a new set of variables, the principal components (PCs), which are uncorrelated and ordered so that the first few retain most of the variation present in all of the original variables [13].

In this view, let {f_i(t), t = 1, 2, ..., N, i = 1, 2, ..., m} be a set of N observations in an m-dimensional space R^m. The purpose of the PCA is to identify an l-dimensional (l < m) subspace R^l ⊂ R^m in which most of the data set variation is retained and the least information is lost.

From a mathematical point of view, let X ≡ (N, m) be the data set matrix whose rows f_t ≡ (1, m), t = 1, 2, ..., N, are the patterns of the m observations, i.e. the m signal values at the time instant t, viz.

X = \begin{pmatrix} f_{11} & f_{12} & \cdots & f_{1m} \\ f_{21} & f_{22} & \cdots & f_{2m} \\ & & \cdots & \\ f_{N1} & f_{N2} & \cdots & f_{Nm} \end{pmatrix} = \begin{pmatrix} f_1 \\ f_2 \\ \cdots \\ f_N \end{pmatrix} \equiv (N, m)    (B1)

so that X_{ti} = f_{ti}, t = 1, 2, ..., N, i = 1, 2, ..., m, is the ith component of f_t in the original basis.

Let P ≡ (m, m) ∈ R^m be a matrix constituted by m orthonormal column vectors p_i ≡ (1, m), i = 1, 2, ..., m, built from the data set X according to an optimality criterion to be defined later and representing an orthonormal basis for the data set

P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1m} \\ p_{21} & p_{22} & \cdots & p_{2m} \\ & & \vdots & \\ p_{m1} & p_{m2} & \cdots & p_{mm} \end{pmatrix} = ( p_1 \; p_2 \; \cdots \; p_m ) \equiv (m, m)    (B2)

so that P_{ij} = p_{ij}, i = 1, 2, ..., m, j = 1, 2, ..., m, is the jth component of p_i in the original basis and, for the orthonormality of the basis vectors, p_i^T \cdot p_j = \delta_{ij}, or P^T \cdot P = I_m, where I_m is the unit matrix of order m.

In the orthonormal basis, let u_{ti} be the component of f_t along p_i^T, so that

f_{tj} = \sum_{i=1}^{m} u_{ti} \, p_{ij}    (B3)

and

f_t = \sum_{i=1}^{m} u_{ti} \, p_i^T = (u_{t1} \; u_{t2} \; \cdots \; u_{tm}) \begin{pmatrix} p_1^T \\ p_2^T \\ \cdots \\ p_m^T \end{pmatrix} = u_t P^T    (B4)

where u_t ≡ (1, m), t = 1, 2, ..., N, is the tth m-dimensional pattern constituted by the m signal values at time t in the orthonormal basis.

Right multiplying equation (B4) by P yields

u_t = f_t P    (B5)

and in matrix form

U = X P    (B6)

where U ≡ (N, m) is the matrix whose N rows u_t ≡ (1, m) are the coordinates of f_t, t = 1, 2, ..., N, in the orthonormal basis.
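The change of basis in equations (B5) and (B6) is lossless for any orthonormal P; a quick numerical sketch (with a randomly generated orthonormal basis, since the optimality criterion for P is only fixed later in the Appendix):

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 50, 6

X = rng.normal(size=(N, m))             # data set, rows = patterns f_t
# An orthonormal basis P: the QR decomposition of a random matrix gives P^T P = I_m.
P, _ = np.linalg.qr(rng.normal(size=(m, m)))

U = X @ P                               # equation (B6): coordinates in the new basis
X_back = U @ P.T                        # inverse relation: back to the original basis

print(np.allclose(P.T @ P, np.eye(m)))  # True – orthonormality
print(np.allclose(X_back, X))           # True – no information lost
```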

The data set has now two representations: when intended in the original basis, the tth pattern is the vector f_t with components f_{ti}; when intended in the orthonormal P basis, the same tth pattern is a vector u_t with component u_{ti} along p_i. In this view, once the orthonormal P basis has been fixed, equation (B5) provides u_t as a function of f_t. To calculate the inverse relation, right multiply by P^T and obtain the data set X in the original basis, viz.

X = \begin{pmatrix} f_1 \\ f_2 \\ \cdots \\ f_N \end{pmatrix} = \begin{pmatrix} u_1 \\ u_2 \\ \cdots \\ u_N \end{pmatrix} \cdot P^T = U \cdot P^T \equiv (N, m)(m, m)    (B7)

In this view, equations (B6) and (B7) represent the transformation laws of the observation patterns X between the original reference system and the orthonormal basis. Notice that up to this point the equations are exact and the data values are transformed in both senses without any loss of information.

The PCA approximation consists in mapping the observation vectors f_t into a subspace R^l ⊂ R^m identified by l < m vectors chosen, according to a criterion explained later, among the p_i, i = 1, 2, ..., m.

Without loss of generality, assume that the basis vectors are ordered in such a way that the selected l vectors are the first ones in P, i.e. (p_1 p_2 ... p_l). Correspondingly, the matrices P and U are partitioned as follows

P = (P_l \; P_{m-l}) \quad \text{and} \quad U = (U_l \; U_{m-l})

where P_l ≡ (m, l) and U_l ≡ (N, l) are the submatrices constituted by the first l columns of P and U, respectively, and P_{m−l} ≡ (m, m−l) and U_{m−l} ≡ (N, m−l) are the submatrices constituted by the last m−l columns of P and U, respectively. The column vectors in P_l and P_{m−l} constitute the bases of the two mutually orthogonal subspaces R^l and R^{m−l} into which R^m has been divided. In terms of the above submatrices, equation (B7) can be rewritten as

X = (U_l \; U_{m-l}) \begin{pmatrix} P_l^T \\ P_{m-l}^T \end{pmatrix} = U_l \cdot P_l^T + U_{m-l} \cdot P_{m-l}^T = \tilde{X} + U_{m-l} \cdot P_{m-l}^T    (B8)

ðB8Þwhere ~X is defined as

~X¼Ul·PTl ¼

u11u12 :::u1l

u21u22 :::u2l

:::uN1uN2 :::uNl

0BB@

1CCA

pT1

pT2

:::pTl

0BB@

1CCA¼

~f 1~f 2

:::~f N

0BB@

1CCA

�ðN ;mÞ ðB9Þ

The tth row of X̃, namely f̃_t, is the orthonormal projection of f_t onto R^l and may then be expressed as a linear combination of the vectors of the P_l basis, viz.

\tilde{f}_t = (u_{t1} \; u_{t2} \; \cdots \; u_{tl}) \begin{pmatrix} p_1^T \\ p_2^T \\ \cdots \\ p_l^T \end{pmatrix} = \sum_{i=1}^{l} u_{ti} \cdot p_i^T    (B10)

and

\tilde{f}_{tj} = \sum_{i=1}^{l} u_{ti} \cdot p_{ij}    (B11)

is the jth component, j = 1, 2, ..., m, of f̃_t in the original basis in R^m, expressed through the components u_{ti} and p_{ij}, i = 1, 2, ..., l, of u_t and p_j in R^l.

If all the information about the data set X essentially lies in an l-dimensional space R^l (apart from small components in R^{m−l} given by U_{m−l} \cdot P_{m-l}^T, as stated in equation (B8)), then the data analysis can be performed in R^l, reducing the dimension of the data set to handle by a factor l/m. To this aim, each observation vector f_t ∈ R^m is approximated by its orthonormal projection f̃_t ∈ R^l plus a residual vector in R^{m−l} which is postulated to be independent of t, viz.

\tilde{f}_t^{appx} = \tilde{f}_t + \sum_{i=l+1}^{m} b_i \, p_i^T    (B12)

The best residual vector is that which, on average, minimizes the absolute value of the square error between the real {f_t} and approximated {f̃_t^{appx}} data patterns, i.e.

E = \frac{1}{2} \sum_{t=1}^{N} \| f_t - \tilde{f}_t^{appx} \|^2    (B13)

By combining equations (B4), (B10), and (B12), the error between the two vectors can be written as

f_t - \tilde{f}_t^{appx} = \sum_{i=l+1}^{m} (u_{ti} - b_i) \cdot p_i^T    (B14)

and

\| f_t - \tilde{f}_t^{appx} \|^2 = (f_t - \tilde{f}_t^{appx})(f_t - \tilde{f}_t^{appx})^T = \sum_{i=l+1}^{m} (u_{ti} - b_i) \, p_i^T \sum_{j=l+1}^{m} (u_{tj} - b_j) \, p_j    (B15)

The expression for the error then becomes

E = \frac{1}{2} \sum_{t=1}^{N} \sum_{i=l+1}^{m} (u_{ti} - b_i)^2    (B16)

The best constants are those that minimize the error and are determined by the conditions

\frac{\partial E}{\partial b_i} = - \sum_{t=1}^{N} (u_{ti} - b_i) = 0    (B17)

Since the constants b_i, i = l+1, ..., m, do not depend on t, using equation (B4)

b_i = \frac{1}{N} \sum_{t=1}^{N} u_{ti} = \left( \frac{1}{N} \sum_{t=1}^{N} f_t \right) p_i = \bar{f} \, p_i    (B18)


where the vector f̄ is the arithmetic average of the observation vectors, i.e. the average value of the signals. In particular, the ith component of f̄ is the arithmetic average of the ith column of X.

Then, from equation (B12), the expression for the PCA approximation of the data pattern f_t is given by

\tilde{f}_t^{appx} = \tilde{f}_t + \bar{f} \sum_{i=l+1}^{m} p_i \, p_i^T    (B19)

or in matrix form

\tilde{X}^{appx} = \tilde{X} + \bar{X} \, P_{m-l} \, P_{m-l}^T    (B20)

The problem to tackle at this point is how to choose an orthonormal basis P in R^m and how to select, among the m columns p_i of P, the l vectors which constitute the basis of R^l. As demonstrated and explained in detail in references [13, 14, 38], by substituting equations (B4) and (B18) into equation (B16), the minimum error corresponding to the coefficients b_i, i = l+1, ..., m, can be written as

m�l ðB20ÞThe problem to tackle at this point is how to choosean orthonormal basis P in Rm and how to selectamong the m columns pi of P the l vectors whichconstitute the basis of Rl. As demonstrated andexplained in detail in references [13, 14, 38], by sub-stituting equations (B4) and (B18) into equation(B16), the minimum error corresponding to the coef-ficients bi, i¼ lþ1,. . ., m can be written as

Emin ¼ 1

2Tr½ PT

m�lV Pm�l� ðB21Þ

where V represents the covariance matrix of X even-tually positive definite (so that its eigenvalues arereal and positive) [13] and can be written as

V ¼XNt¼1

ðft�f ÞT ðft�f Þ ¼ ðX�XÞT ðX�XÞ� ðm;NÞðN ;mÞ ðB22Þ

In order to find the l vectors which will constitute P_l, minimize E_{min} with respect to P_{m−l} by resorting to the Lagrange multiplier approach [13]. The purpose is to find those m−l vectors which minimize E_{min} subject to the constraint of being orthonormal. The Lagrange function in terms of the submatrix P_{m−l} can be written as

L = \frac{1}{2} \, \mathrm{Tr} \left[ P_{m-l}^T V P_{m-l} \right] - \frac{1}{2} \, \mathrm{Tr} \left[ \Lambda_{m-l} \left( P_{m-l}^T P_{m-l} - I_{m-l} \right) \right]    (B23)

where Λ_{m−l} ≡ (m−l, m−l) is the matrix of the Lagrange coefficients, namely Λ_{ij}, and I_{m−l} is the unit matrix of order m−l.

Differentiating L with respect to P_{m−l} and setting the result to zero [13]

V \, P_{m-l} = P_{m-l} \, \Lambda_{m-l}    (B24)

One solution to this equation is to choose Λ_{m−l} to be diagonal, i.e. (Λ_{m−l})_{ij} = Λ_i \cdot δ_{ij}, so that the columns of P_{m−l} are the eigenvectors of V corresponding to the eigenvalues Λ_i, i = l+1, ..., m. Notice that, since the eigenvalues have been supposed simple, the eigenvectors are orthogonal and may be normalized.

By substituting equation (B24) into (B23), it is deduced that the required minimum of the Lagrangian is

$L = E_{min} = \frac{1}{2} \mathrm{Tr}\left[ \Lambda_{m-l} \right] = \frac{1}{2} \sum_{i=l+1}^{m} \lambda_i$    (B25)

In principle, any set of $m-l$ eigenvectors can constitute the orthonormal basis in $\mathbb{R}^{m-l}$, but from equation (B25) it appears that the best choice is to select the smallest $m-l$ of the m eigenvalues of V. To this aim, the m eigenvalues are ranked in decreasing order, so that

$\lambda_1 > \lambda_2 > \cdots > \lambda_m$

The eigenvectors are correspondingly ranked, and the first l eigenvectors are chosen for the basis $P_l$ of $\mathbb{R}^l$ and the remaining $m-l$ for the basis $P_{m-l}$ of $\mathbb{R}^{m-l}$. The amount of information lost by considering $\tilde{X}^{appx}$ instead of X may be quantified for the individual observations by the differences $f_t - \tilde{f}^{\,appx}_t$ or, globally, by the fraction of neglected eigenvalues, namely $2 E_{min} / \sum_{i=1}^{m} \lambda_i$.
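As a concrete illustration (not taken from the paper's implementation, and using a synthetic signal matrix), the eigenvalue ranking and the lost-information fraction above can be sketched in a few lines of NumPy:

```python
# Minimal sketch (synthetic data, not the plant signals): choose the PCA
# basis by ranking the eigenvalues of the covariance matrix V of eq. (B22)
# and quantify the neglected information as in eq. (B25).
import numpy as np

rng = np.random.default_rng(0)
N, m, l = 200, 6, 2                       # observations, signals, retained directions
X = rng.standard_normal((N, m))
X[:, 3] = 0.9 * X[:, 0] + 0.1 * X[:, 3]   # make two signals strongly correlated

Xc = X - X.mean(axis=0)                   # residuals f_t - f_bar
V = Xc.T @ Xc                             # covariance matrix, (m x m)

lam, P = np.linalg.eigh(V)                # eigh returns eigenvalues in ascending order
order = np.argsort(lam)[::-1]             # rank lambda_1 > lambda_2 > ... > lambda_m
lam, P = lam[order], P[:, order]

P_l = P[:, :l]                            # basis of R^l: the first l eigenvectors
E_min = 0.5 * lam[l:].sum()               # minimum reconstruction error, eq. (B25)
lost = 2.0 * E_min / lam.sum()            # fraction of neglected eigenvalues
print(f"fraction of information lost: {lost:.3f}")
```

The correlated pair of columns concentrates variance in the leading eigenvectors, so the lost fraction stays well below $1$ even for small l.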

Finally, coming to the problem of signal validation and reconstruction by means of a PCA-based model, in order to simplify the calculations the time trends of the signals have previously been normalized so that their mean is zero and their standard deviation equals 1. This allows the computation of the residuals to be skipped, since $\bar{f} = 0$ and, according to equation (B19), $\tilde{f}^{\,appx}_t = \tilde{f}_t$. Furthermore, for each group k constituted by $m_k$ signals, the eigenvectors constituting the orthonormal basis $P_l$ have been obtained from equation (B24), applied to the covariance matrix V of the pairwise signal correlations computed as in equation (1) for the $m_k$ signals.
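A sketch of the resulting group model, under the same assumed setup (normalized synthetic signals standing in for one group of $m_k$ plant signals): with zero-mean, unit-variance signals the reconstruction reduces to a projection onto the l dominant eigenvectors, and the residuals between measured and reconstructed signals are what a failed sensor would inflate.

```python
# Sketch of PCA-based reconstruction for one group of normalized signals.
# The data are synthetic: a low-rank "process" plus small measurement noise.
import numpy as np

rng = np.random.default_rng(1)
N, m, l = 300, 5, 2
S = rng.standard_normal((N, l)) @ rng.standard_normal((l, m))  # correlated signals
S += 0.01 * rng.standard_normal((N, m))                        # measurement noise

X = (S - S.mean(axis=0)) / S.std(axis=0)   # normalize: mean 0, std 1, so f_bar = 0
V = (X.T @ X) / N                          # matrix of pairwise signal correlations
lam, P = np.linalg.eigh(V)
P_l = P[:, np.argsort(lam)[::-1][:l]]      # l dominant eigenvectors (basis P_l)

X_rec = X @ P_l @ P_l.T                    # reconstructed signals
residuals = X - X_rec                      # large residuals flag a faulty sensor
print("max |residual|:", float(np.abs(residuals).max()))
```

In a validation setting, thresholding these residuals per signal is one simple way to detect an anomalous sensor within the group; the paper's grouping search is precisely about choosing the $m_k$ signals so that such a model reconstructs them well.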

APPENDIX C. LIST OF THE MONITORED SIGNALS

No.  Tag           Sub-system                      Measurement     Unit     Min    Max
1    112KB502      Cooling water channels          Temperature     °C       -0.5   40
2    112KB503      Cooling water channels          Temperature     °C       0      50
3    211KW560      Reactor pressure vessel         Temperature     °C       0      300
4    260KW012      Fuel comp. calc.                Carry-under     %        -2     5
5    260KW205      Fuel comp. calc.                Diff. Pressure  MPa      -0.05  0.2
6    260KW302      Fuel comp. calc.                Flow            kg/s     0      14 000
7    260KW316      Fuel comp. calc.                Flow            kg/s     0      200
8    260KW907      Fuel comp. calc.                Power           %        0      140
9    260KW951      Fuel comp. calc.                Power           MW       0      4000
10   312KA301      Feed water lines                Flow            kg/s     0      1100
11   312KA502      Feed water lines                Temperature     °C       0      250
12   312KC301      Feed water lines                Flow            kg/s     0      1100
13   312KC502      Feed water lines                Temperature     °C       0      250
14   321KB301      Residual heat removal           Flow            kg/s     0      130
15   321KB501      Residual heat removal           Temperature     °C       0      300
16   321KB506      Residual heat removal           Temperature     °C       0      300
17   331KB301      Reactor water clean-up          Flow            kg/s     0      50
18   331KB302      Reactor water clean-up          Flow            kg/s     0      50
19   331KB502      Reactor water clean-up          Temperature     °C       0      150
20   332KB221      Condensate clean-up             Diff. Pressure  MPa      0      0.6
21   354KB301      Hydraulic scram system          Flow            kg/s     0      15
22   403KA101      Turbine                         Pressure        MPa      0      8.5
23   421KA101      Turbine plant main steam syst.  Pressure        MPa      0      9
24   421KA102      Turbine plant main steam syst.  Pressure        MPa      0      9
25   421KA508      Turbine plant main steam syst.  Temperature     °C       0      300
26   421KB101      Turbine plant main steam syst.  Pressure        MPa      0      9
27   421KB102      Turbine plant main steam syst.  Pressure        MPa      0      9
28   421KB508      Turbine plant main steam syst.  Temperature     °C       0      300
29   421VA001      Turbine plant main steam syst.  Valve           %        0      100
30   421VB006      Turbine plant main steam syst.  Valve           %        0      100
31   422KA103      Steam reheating                 Pressure        MPa      0      1
32   422KA109      Steam reheating                 Pressure        MPa      0      1
33   422KA110      Steam reheating                 Pressure        MPa      0      1
34   422KA111      Steam reheating                 Pressure        MPa      0      1
35   422KA501      Steam reheating                 Temperature     °C       0      300
36   422KA504      Steam reheating                 Temperature     °C       0      300
37   422KA505      Steam reheating                 Temperature     °C       0      300
38   422KB504      Steam reheating                 Temperature     °C       0      300
39   422KB505      Steam reheating                 Temperature     °C       0      300
40   423KA506      Steam extraction                Temperature     °C       0      100
41   423KB101      Steam extraction                Pressure        MPa      0      3.5
42   423KB102      Steam extraction                Pressure        MPa      0      3.5
43   423KB501      Steam extraction                Temperature     °C       0      300
44   423KB502      Steam extraction                Temperature     °C       0      300
45   423KB503      Steam extraction                Temperature     °C       0      300
46   423KB509      Steam extraction                Temperature     °C       0      300
47   423KB512      Steam extraction                Temperature     °C       0      300
48   424KA501      Gland seal and leakage steam    Temperature     °C       0      150
49   441KA509      Main cooling water              Temperature     °C       0      60
50   441KB509      Main cooling water              Temperature     °C       0      60
51   441KC511      Main cooling water              Temperature     °C       0      60
52   441PC001      Main cooling water              Pump            %        0      120
53   461KA105      Condenser and vacuum            Pressure        MPa abs  0      0.1
54   462KA102      Condensate                      Pressure        MPa      0      4
55   462KA109      Condensate                      Pressure        MPa      0      4
56   462KA111      Condensate                      Pressure        MPa      -0.1   1.4
57   462KA114      Condensate                      Pressure        MPa      0      4
58   462KA503      Condensate                      Temperature     °C       0      60
59   462KA504      Condensate                      Temperature     °C       0      60
60   462KA505      Condensate                      Temperature     °C       0      60
61   462KA506      Condensate                      Temperature     °C       0      100
62   462KA507      Condensate                      Temperature     °C       0      100
63   462KB115      Condensate                      Pressure        MPa      -0.1   1.5
64   462KB116      Condensate                      Pressure        MPa      -0.1   0.3
65   462KB301      Condensate                      Flow            kg/s     0      1350
66   462KB305      Condensate                      Flow            kg/s     0      167.8
67   462KB511      Condensate                      Temperature     °C       0      150
68   462KB512      Condensate                      Temperature     °C       0      150
69   462KB515      Condensate                      Temperature     °C       0      150
70   462KB516      Condensate                      Temperature     °C       0      150
71   462KD302      Condensate                      Flow            kg/s     0      600
72   462VB134D1.1  Condensate                      Pressure        Bool     0      1
73   463KA503      Turbine plant feedwater         Temperature     °C       0      200
74   463KA504      Turbine plant feedwater         Temperature     °C       0      300
75   463KB104      Turbine plant feedwater         Pressure        MPa      0      12.5
76   463KB501      Turbine plant feedwater         Temperature     °C       0      300
77   463KB503      Turbine plant feedwater         Temperature     °C       0      200
78   463KB504      Turbine plant feedwater         Temperature     °C       0      300
79   463KB507      Turbine plant feedwater         Temperature     °C       0      300
80   689KA901      Energy measuring syst.          Power           MWh      0      1300
81   689KA911      Energy measuring syst.          Power           MWe      0      1441.5
82   722KB301      Secondary cooling water syst.   Flow            kg/s     0      100
83   722KB503      Secondary cooling water syst.   Temperature     °C       0      150
84   722KB508      Secondary cooling water syst.   Temperature     °C       0      60

P Baraldi, E Zio, G Gola, D Roverso, and M Hoffmann, Proc. IMechE Vol. 222 Part O: J. Risk and Reliability, JRR137 © IMechE 2008
