
XML Template (2014) [15.5.2014–10:21am] [1–16]//blrnas3/cenpro/ApplicationFiles/Journals/SAGE/3B2/JVCJ/Vol00000/140079/APPFile/SG-JVCJ140079.3d (JVC) [PREPRINTER stage]

Article

Fault detection in rolling element bearings using wavelet-based variance analysis and novelty detection

Aleksandra Ziaja 1, Ifigeneia Antoniadou 2, Tomasz Barszcz 1, Wieslaw J Staszewski 1 and Keith Worden 2

Abstract

Fractal signal processing and novelty detection are used for fault detection in rolling element bearings. The former applies the concept of self-similarity based on wavelet variance, and the latter is based on machine learning and utilises artificial neural networks. The method is demonstrated using simulated and experimental vibration data. The work presented involves validation both on laboratory test rig data and industrial wind turbine data. The results show that the method can be used successfully for automated fault detection in ball bearings under real operational conditions.

Keywords

Fault detection, fractal theory, novelty detection, rolling element bearing, self-similarity, wavelet-based variance

1. Introduction

Vibration-based methods are prevalent in industrial applications for machine condition monitoring. This is particularly relevant in the field of rotating machinery, where methods for a broad range of fault types, using various approaches in the time, frequency and combined time-frequency domains, have been developed over the last 40 years, as discussed in the literature (Carden and Fanning, 2004; Randall, 2011). A wide range of these studies focuses on fault detection in rolling element bearings, because these components are used in the vast majority of rotating machinery. Also, bearing failures are one of the most common causes of breakdown in rotating machines.

The most basic techniques, often used in industrial applications, assess the condition of monitored bearing elements using statistical parameters that are calculated globally, i.e. from entire vibration characteristics and/or their power spectra. Commonly used parameters include root mean square amplitude, peak-to-peak amplitude, crest factor and kurtosis. An increase in the value of these parameters at constant operating conditions is frequently associated with fault development, hence they are used as fault indicators. Frequency-domain methods analyze bearing characteristic frequencies related to certain fault types. These frequencies can be estimated theoretically from bearing geometry (Randall and Antoni, 2011). However, any change of simple time- and frequency-domain features at an early stage of defect development is typically too small to be uniquely and reliably identified. This is particularly relevant when machines in non-stationary operations are monitored (Fakhfakh et al., 2012). Various theoretical models have been developed for rolling element bearing faults (McFadden and Smith, 1984a; Wang and Kootsookos, 1998). These models can be used for model-based approaches in bearing fault detection (Lou et al., 2004).
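As a minimal sketch of the global indicators listed above, the following Python snippet computes them for a sampled signal. The function name `condition_indicators` and the sine test signal are illustrative assumptions and are not taken from the study; kurtosis is taken here as the normalised fourth central moment, which is one common convention.

```python
import math

def condition_indicators(signal):
    """Return RMS, peak-to-peak, crest factor and kurtosis of a sampled signal."""
    n = len(signal)
    mean = sum(signal) / n
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak_to_peak = max(signal) - min(signal)
    crest_factor = max(abs(x) for x in signal) / rms
    # Kurtosis as the normalised fourth central moment (equal to 3 for Gaussian data).
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    kurtosis = m4 / (m2 * m2)
    return {"rms": rms, "peak_to_peak": peak_to_peak,
            "crest_factor": crest_factor, "kurtosis": kurtosis}

# Illustrative input: ten full periods of a unit-amplitude sine wave.
sine = [math.sin(2 * math.pi * k / 100) for k in range(1000)]
features = condition_indicators(sine)
```

For the sine wave the crest factor is close to the square root of two and the kurtosis is close to 1.5; impulsive content, such as that produced by a localised bearing defect, would be expected to raise both values.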

1 Department of Robotics and Mechatronics, AGH University of Science and Technology, Krakow, Poland
2 Department of Mechanical Engineering, University of Sheffield, Sheffield, UK

Corresponding author:
Wieslaw J Staszewski, Department of Robotics and Mechatronics, AGH University of Science and Technology, Al. Mickiewicza 30, 30-059 Krakow, Poland.
Email: [email protected]

Received: 16 September 2013; accepted: 15 March 2014

Journal of Vibration and Control
1–16
© The Author(s) 2014
Reprints and permissions: sagepub.co.uk/journalsPermissions.nav
DOI: 10.1177/1077546314532859
jvc.sagepub.com

Downloaded from jvc.sagepub.com at University of Sheffield on June 12, 2015

More fruitful fault detection approaches are based on the analysis of amplitude modulations of high-frequency resonances of bearings. These modulations arise from periodic excitations caused by bearing element defects. The classical instantaneous frequency and envelope analysis (McFadden and Smith, 1984b), based on the Hilbert transform, is the best example in this area. Recent developments related to demodulation analysis include work on optimal filtering (Barszcz and Jablonski, 2011).

The arrival of time-frequency (McFadden and Wang, 1991; Forrester, 1992; Staszewski and Tomlinson, 1993; Staszewski et al., 1997) and time-scale (Staszewski and Tomlinson, 1994; McFadden and Wang, 1996; Mori et al., 1996) analysis in condition monitoring in the 1990s brought new solutions for fault detection in rotating machinery. Wavelet analysis in particular has been established as a widespread tool for fault detection since that time. A large number of wavelet-based methods for fault detection in ball bearings can be found in the literature (Luo et al., 2003; Chiementin et al., 2008; Li et al., 2008). The major advantage of these methods is that most bearing faults produce signal features that are local in time and can be effectively analyzed in certain frequency (or scale) ranges.

Noisy vibration characteristics from bearings often exhibit chaotic behavior (Brown et al., 1994) and local self-similarity (Staszewski et al., 1999; Staszewski, 2000). The latter behavior, associated with fractals, has also been exploited for fault detection in ball bearings. Wavelet analysis (Staszewski, 2000), correlation dimension (Logan and Mathew, 1996; Craig et al., 2000; Yang et al., 2007), box counting (Nelwamondo et al., 2006) and the morphological cover technique (Li et al., 2012) have been used for fault detection in rolling element bearings. Although the work in Staszewski et al. (1999) and Yang et al. (2007) demonstrates that real measurements often do not exhibit constant fractal dimensions at all analyzed scales (or frequency bands), signal processing with fractals can provide valuable information on embedded fault features. It is well known that contact stresses at the interface between rolling elements and races are usually very high; therefore any abrupt change of these stresses, e.g. when a rolling element passes over a spall, applies an impulse force at the spall. This in turn may excite resonances, which often fall in the several-kilohertz frequency range, in the bearing races and/or the machine. The wideband nature of vibration responses from bearings, noise and periodic components resulting from the operation tend to hide bearing defect signatures. It is well known that a wavelet is a high-pass filter that can smooth out the noise at large wavelet levels. By using different scales, it is possible to estimate the signal-to-noise ratio (SNR) and use the fact that large values of SNR at different scales indicate the existence of transients in the signals analyzed, as demonstrated by Staszewski et al. (1999). Therefore the fractal nature of noisy, wideband bearing vibration data can also provide useful information on energy redistribution from low to high frequencies. The wavelet variance characteristics have been used in this context for structural damage detection based on machine learning (Staszewski, 2000). Although statistical pattern recognition has been used in this approach, various other methodologies based on soft computing techniques can be applied, as reviewed by Worden et al. (2011).

The major objective of this paper is to explore the self-similarity of bearing vibration data for novelty detection based on artificial neural networks. The wavelet variance characteristics, based on the orthogonal wavelet transform and used for feature extraction and compression, are combined with a novelty detector for fault detection in bearings. The paper is organized as follows. Section 2 introduces the basic concept behind wavelet variance analysis. The novelty detection algorithm is described in Section 3. Section 4 formulates the procedure of the proposed method. The results from simulated and real bearing data are given in Section 5. Finally, the paper is concluded in Section 6.

2. Wavelet analysis

This section very briefly introduces the ideas of multiresolution analysis. Although the background theory is well known, the material presented is necessary to introduce the wavelet-based fractal analysis given in the next section.

It is commonly known that the wavelet transform can be treated as an extension of the traditional Fourier transform with an adjustable window location and size. This transform decomposes the signal $x(t)$ into a sum of elementary functions $\psi_{a,b}(t)$, obtained from a so-called mother wavelet $\psi(t)$ by the operations of scaling $a$ and translation $b$, i.e.

$$\psi_{a,b}(t) = \psi\left(\frac{t-b}{a}\right) \qquad (1)$$

The admissibility condition of the elementary function $\psi(t)$ needs to be satisfied to ensure that the inverse transform exists. Two basic transformations can be distinguished, i.e. the continuous wavelet transform and the discrete wavelet transform. In general, the former is used for time-frequency analysis and the latter is more suitable for decomposition, compression and feature selection (Staszewski, 2000). The continuous Grossmann-Morlet wavelet transform can be defined as

$$W_{a,b}[x(t)] = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}_{a,b}(t)\, dt \qquad (2)$$


where $\psi^{*}$ is the complex conjugate of $\psi$ and the factor $1/\sqrt{a}$ ensures that the energy of each wavelet is independent of the scale parameter. The transition from the continuous to the discrete wavelet transform can be made using the discretization $a = a_0^m$, $b = n a_0^m b_0$, where $m$, $n$ are integers and the values $a_0$, $b_0$ are different from zero. A further restriction of these parameters, i.e. $a_0 = 2$ and $b_0 = 1$, leads to the dyadic wavelet transform. Both the continuous and discrete transforms are redundant. This problem can be solved by the orthogonal wavelet transform. A set of wavelets is called orthogonal if, for the wavelet functions defined as

$$\psi_{m,k}(t) = 2^{m/2}\, \psi(2^m t - k), \qquad m, k \in \mathbb{Z} \qquad (3)$$

the following inner product is obtained:

$$\langle \psi_{m,k}, \psi_{n,l} \rangle = \delta_{mn}\, \delta_{kl} \qquad (4)$$

where $\delta_{mn}$ is the Kronecker symbol, which is equal to 1 if $m = n$ and to zero otherwise.

The wavelet transform can be interpreted as a filter bank. To describe this approach two basic equations need to be introduced, namely the dilation and scaling equations given respectively as

$$\phi(t) = \sum_{k=0}^{N} h(k)\, \sqrt{2}\, \phi(2t - k) \qquad (5)$$

$$\psi(t) = \sum_{k=0}^{N} g(k)\, \sqrt{2}\, \phi(2t - k) \qquad (6)$$

where $h(k)$ and $g(k)$ are coefficients of the low-pass and high-pass filters respectively. The signal $x(t)$ can be decomposed into an infinite number of details at an infinite number of resolution levels, i.e.

$$x(t) = \sum_{m} \sum_{n} x^m_n\, \psi_{m,n}(t) \qquad (7)$$

However, this decomposition is impractical. Alternatively, the analysis can stop at any pre-defined level $M$ and the signal can be reconstructed as a sum of the approximation at level $M$ and a sum of details at level $M$ and lower levels. This can be expressed as

$$x(t) = \sum_{n=-\infty}^{\infty} a^M_n\, \phi_{M,n}(t) + \sum_{m=1}^{M} \sum_{n=-\infty}^{\infty} x^m_n\, \psi_{m,n}(t) \qquad (8)$$

where

$$a^m_n = \int x(t)\, 2^{m/2}\, \phi^{*}(2^m t - n)\, dt \qquad (9)$$

$$x^m_n = \int x(t)\, 2^{m/2}\, \psi^{*}(2^m t - n)\, dt \qquad (10)$$

and $a^m_n$, $x^m_n$ are the coefficients of the relevant approximations and details. By the substitutions $l = 2n + k$ and $t \to 2^m t - n$ in equations (5) and (6), and insertion of these equations into equations (9) and (10), the following relations can be obtained (Mallat, 1998):

$$a^m_n = \sum_{l} h(l - 2n)\, a^{m+1}_l \qquad (11)$$

$$x^m_n = \sum_{l} g(l - 2n)\, a^{m+1}_l \qquad (12)$$

These equations imply that the approximation and detail coefficients at level $m$ can be obtained recursively by applying the filter-downsample algorithm to the approximation coefficients at level $m + 1$. This approach is commonly known as multiresolution analysis. The entire procedure is illustrated in Figure 1.
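The recursion in equations (11) and (12) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the implementation used in the study: the function names are hypothetical, the signal is extended periodically at its ends, and the two-tap Haar filter pair is used only because it keeps the arithmetic easy to follow.

```python
import math

def analysis_step(approx, h, g):
    """One filter-downsample step: level m+1 approximations -> level m."""
    size = len(approx)
    coarse, detail = [], []
    for n in range(size // 2):
        # a^m_n = sum_l h(l - 2n) a^{m+1}_l, written with k = l - 2n.
        # Indices wrap around (periodic extension is an assumption here).
        coarse.append(sum(h[k] * approx[(2 * n + k) % size] for k in range(len(h))))
        detail.append(sum(g[k] * approx[(2 * n + k) % size] for k in range(len(g))))
    return coarse, detail

def multiresolution(signal, h, g):
    """Apply the step recursively; details are returned finest level first."""
    approx, details = list(signal), []
    while len(approx) > 1:
        approx, detail = analysis_step(approx, h, g)
        details.append(detail)
    return approx, details

s = 1.0 / math.sqrt(2.0)
haar_h, haar_g = [s, s], [s, -s]  # two-tap low-pass and high-pass filters
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]  # length must be a power of two
approx, details = multiresolution(signal, haar_h, haar_g)
```

Because the filter pair is orthonormal, the sum of squared coefficients over all levels equals the sum of squared signal samples, which is a convenient sanity check on any implementation of the recursion.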

Figure 1. Multiresolution wavelet decomposition.

The filter coefficients and scaling functions must satisfy certain conditions, which are discussed in Mallat (1998); for an orthogonal set of wavelets these conditions lead to the following set of equations:

$$\begin{cases} \displaystyle\sum_{n=0}^{N-1} h(n) = \sqrt{2} \\[2ex] \displaystyle\sum_{n=0}^{N-1} h(n)\, h(n - 2k) = \delta(k), \quad \text{for } k = 0, 1, 2, \ldots, \frac{N}{2} - 1 \end{cases} \qquad (13)$$

If $N = 2$, the coefficients $\{1/\sqrt{2},\ 1/\sqrt{2}\}$ of the well-known second-order Daubechies wavelet are obtained. In numerical implementations the filter coefficients can be presented in the form of matrices HF and GF called quadrature mirror filters. The HF matrix is the smoothing filter and the GF matrix is the detail filter. A detailed explanation can be found in Press et al. (1992).
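The conditions in equation (13) are easy to check numerically. The sketch below does so for the two-tap filter mentioned above and for the standard four-tap Daubechies filter; the function names are hypothetical, and the alternating-sign rule used to build the high-pass filter is one common quadrature mirror convention, assumed here rather than taken from the paper.

```python
import math

def check_orthogonality_conditions(h, tol=1e-12):
    """Check both conditions of equation (13) for low-pass coefficients h."""
    n_taps = len(h)
    sum_ok = abs(sum(h) - math.sqrt(2.0)) < tol
    shifts_ok = True
    for k in range(n_taps // 2):
        # sum_n h(n) h(n - 2k), with h(n) taken as zero outside 0..N-1
        product = sum(h[n] * h[n - 2 * k] for n in range(2 * k, n_taps))
        target = 1.0 if k == 0 else 0.0
        shifts_ok = shifts_ok and abs(product - target) < tol
    return sum_ok, shifts_ok

def quadrature_mirror(h):
    """High-pass filter from the low-pass one: g(k) = (-1)^k h(N - 1 - k)."""
    n_taps = len(h)
    return [(-1) ** k * h[n_taps - 1 - k] for k in range(n_taps)]

root2, root3 = math.sqrt(2.0), math.sqrt(3.0)
haar = [1.0 / root2, 1.0 / root2]
daub4 = [(1 + root3) / (4 * root2), (3 + root3) / (4 * root2),
         (3 - root3) / (4 * root2), (1 - root3) / (4 * root2)]

haar_result = check_orthogonality_conditions(haar)
daub4_result = check_orthogonality_conditions(daub4)
daub4_highpass = quadrature_mirror(daub4)
```

Both filters pass the two checks, and the derived high-pass coefficients sum to zero, as expected of a detail filter.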

2.1. Wavelets and fractals

Many physical signals, such as geophysical records or economic time series, show invariance to scale rather than to translation. One of the most important groups of such signals is formed by self-similar random processes known as $1/f$ processes. A random process $x(t)$ is statistically self-similar if for any real $a > 0$ the scaling relation is obeyed, i.e.

$$m_i[x(t)] = m_i\left[a^{-H} x(at)\right] \qquad (14)$$

where $H$ is a constant (the self-similarity parameter) and $m_i$ are the statistical moments. In practice self-similarity means that signals are embedded within themselves. The self-similarity parameter $H$ of a $1/f$ process depends only on a spectral parameter $\gamma$, characteristic of the given process. For instance, processes corresponding to $1 < \gamma < 3$, which exhibit infinite low-frequency power, are known as fractional Brownian motions. The classical Brownian motion is the special case $\gamma = 2$. Processes corresponding to $-1 < \gamma < 1$ show infinite high-frequency power and are termed fractional Gaussian noises. The classical stationary Gaussian white noise is the special case $\gamma = 0$. It can be demonstrated that the relation between the self-similarity parameter and the spectral parameter is $\gamma = 2H + 1$ (Wornell, 1993). The sampling grid of a $1/f$ process is a fractal, therefore there is a strong relationship between the fractal dimension $D$ and the self-similarity parameter $H$ of such a process.

The wavelet transform and multiresolution analysis have emerged as good tools for studying the self-similarity of signals, due to their ability to examine a signal at different scales. Suppose the signal $x(t)$ satisfies the self-similarity property, i.e. its statistics are invariant to dilations and compressions of $x(t)$:

$$x(t) = \lambda^{\beta}\, x[\lambda(t - r)] \qquad (15)$$

for some parameters $\lambda$, $\beta$ and $r$. It then follows from equation (2) that

$$W_{a,b}[x(t)] = \lambda^{\beta}\, W_{a,b}[\lambda a, \lambda(t - r)] \qquad (16)$$

This shows that self-similarity of a signal $x(t)$ implies self-similarity of its wavelet transform in the time-scale domain. As a consequence, for $1/f$ processes the variance of the orthogonal wavelet coefficients $x^m_n$ of the signal $x(t)$ is of the form (Wornell, 1993):

$$\operatorname{var} x^m_n = \sigma^2\, 2^{-\gamma m} \qquad (17)$$

where $\sigma^2$ is defined as

$$\sigma^2 = \frac{\sigma_x^2}{2\pi} \int_{-\infty}^{+\infty} \frac{|\Psi(\omega)|^2}{|\omega|^{\gamma}}\, d\omega \qquad (18)$$

and $\Psi(\omega)$ is the Fourier transform of the mother wavelet $\psi$. Equation (18) shows that the parameter $\sigma^2$ is a constant depending on the selected wavelet. Equation (17) can be further rewritten to describe the dependence of the variance of the wavelet coefficients $x^m_n$ on the wavelet level $m$ in the form

$$\log_2 \operatorname{var} x^m_n = -\gamma m + b \qquad (19)$$

where $b = 2 \log_2 \sigma$ is a constant. The consequence of this equation is that, for $1/f$ processes, the logarithm of the variance of the wavelet coefficients plotted against the wavelet level is a straight line with slope equal to the negated spectral parameter $\gamma$ (Wornell, 1993). Figure 2 demonstrates this dependence using an example of a numerically simulated Gaussian white noise signal.

Inspection of the detail coefficients in Figure 2(a) shows that the values for the different decomposition levels are distributed from −1 to 1. From the corresponding wavelet variance plot the spectral parameter $\gamma$ can be evaluated as equal to zero. It should be noted that some fluctuation in the plot can be observed at lower decomposition levels (i.e. for lower values of $m$) because level $m$ is composed of only $2^m$ wavelet basis functions. Hence, the gradient should be estimated using values from higher levels only (Wornell, 1993).
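The white-noise case can be reproduced with a short Python sketch. It is illustrative only and rests on several assumptions: the Haar wavelet replaces the Daubechies wavelet for brevity, the signal length is a power of two, level $m$ is taken as the level holding $2^m$ detail coefficients, and levels with fewer than 64 coefficients are excluded from the fit, in line with the remark above about fluctuation at low levels. All function names are hypothetical.

```python
import math
import random

def haar_details_by_level(signal):
    """Map level m to its detail coefficients, where level m holds 2**m of them."""
    s = 1.0 / math.sqrt(2.0)
    approx, levels = list(signal), {}
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [s * (a - b) for a, b in pairs]
        approx = [s * (a + b) for a, b in pairs]
        levels[len(detail).bit_length() - 1] = detail
    return levels

def log2_wavelet_variance(signal, min_coeffs=64):
    """log2 of the coefficient variance per level, skipping sparsely populated levels."""
    result = {}
    for m, detail in haar_details_by_level(signal).items():
        if len(detail) >= min_coeffs:
            mean = sum(detail) / len(detail)
            variance = sum((c - mean) ** 2 for c in detail) / len(detail)
            result[m] = math.log2(variance)
    return result

def fit_slope(points):
    """Least-squares slope of the values in `points` against their keys."""
    xs = sorted(points)
    ys = [points[x] for x in xs]
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(2 ** 14)]
log_variance = log2_wavelet_variance(noise)
gamma_estimate = -fit_slope(log_variance)  # slope of equation (19) is -gamma
```

For Gaussian white noise the estimate should lie close to zero, so the fitted line is nearly horizontal.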

Figure 2. Self-similarity analysis of the Gaussian white noise: (a) Gaussian white noise and example wavelet decomposition levels; (b) wavelet coefficient variance plot.

The wavelet-based self-similarity analysis and variance characteristics given by equation (19) have gained interest not only for the analysis of self-similar signals (Flandrin, 1999), coherent structures (Staszewski and Worden, 1999) and gravity wave events in atmospheric layers (Rees et al., 2001), but also in damage detection applications (Staszewski, 2000), where different energy-concentration features (e.g. signal impulses) and energy distribution features have been analyzed.

The latter application does not involve self-similarity analysis in a strict sense; the wavelet-based variance characteristic is used to represent the signal energy distribution. Higher values of variance at given scales correspond to the presence of a greater number of peaks, a greater signal intensity, or both, as pointed out in Bradshaw and Spies (1992). For the vibration signals in the current investigations this observation corresponds to modulations of the high-frequency carrier with characteristic damage frequencies. Figure 3 gives an example of bandpass noise modulated with a 10 Hz sine wave with a DC component of 2.

The wavelet variance plots for this signal are given in Figure 3(c) and 3(d) as solid lines. For comparison, the same characteristics for the bandpass noise are presented using dashed lines. The visible peaks at level 16 correspond to the 3–6 kHz frequency range of the noise signal. An increased peak amplitude is observed for the modulated signal. In the logarithmic scale a translation of the plot is visible for all decomposition levels.
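The size of this translation can be estimated with a short worked calculation. It is a rough sketch, assuming that the modulation is slow compared with the support of the wavelets at the levels of interest and that it is independent of the noise $n(t)$; the symbols $A$, $f_m$, $y$ and $n$ are introduced here for illustration only.

```latex
% Modulated signal: bandpass noise n(t) times a 10 Hz sine with DC component A = 2
y(t) = \bigl(A + \sin 2\pi f_m t\bigr)\, n(t), \qquad A = 2, \quad f_m = 10\ \mathrm{Hz}

% Time-averaged scaling of the coefficient variance at level m
\operatorname{var} y^m_n \approx \overline{\bigl(A + \sin 2\pi f_m t\bigr)^2}\; \operatorname{var} n^m_n
  = \Bigl(A^2 + \tfrac{1}{2}\Bigr) \operatorname{var} n^m_n
  = 4.5\, \operatorname{var} n^m_n

% Resulting vertical shift of the log-variance plot
\log_2 \operatorname{var} y^m_n \approx \log_2 \operatorname{var} n^m_n + \log_2 4.5
  \approx \log_2 \operatorname{var} n^m_n + 2.17
```

Under these assumptions the factor is the same at every level, which is consistent with a uniform upward shift of the logarithmic plot rather than a change of its shape.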

3. Novelty detection based on auto-associative neural networks

The novelty detection task is to determine whether a feature measured continuously is consistent with previous measurements for the analyzed system. In condition monitoring this would refer to a comparison of a current (possibly faulty) condition with a template (or reference) representing an unfaulty condition. There exist several different methods of novelty detection, which can be generally divided into statistical and neural network-based approaches. Statistical methods are mostly model-based approaches that rely on statistical properties of the analyzed data. Neural networks are often used for condition monitoring applications to allow for automated fault detection. Probably the first applications of a combined wavelet-neural approach can be dated back to the mid-1990s (Staszewski and Worden, 1997).

Figure 3. Self-similarity analysis of the modulated bandpass noise: (a) original signal; (b) power spectrum; (c) wavelet coefficient variance plot, linear scale; (d) wavelet coefficient variance plot.

A more recent approach, based on auto-associative neural networks (AANN), has been proposed in Worden (1997). In this approach a multilayer perceptron (MLP) neural network is trained to reproduce the input pattern at the output. The MLP is a feed-forward network. The signal is passed into the input layer, then progresses through the hidden layer, and the results appear at the output layer. Each node from the previous layer is connected to each node of the following layer through a weighted connection. The neuron output can take any value in the range from −1 to 1. The "bottleneck" structure of the network is used in order to ensure that the task of input-to-output reproduction is nontrivial. Figure 4 illustrates a typical "bottleneck" structure of the AANN.

Here, patterns are passed through hidden layers which have fewer nodes than the input layer, therefore the network is forced to learn the significant features of the patterns. During the training stage, appropriate values for the connection weights are established based on the known inputs and outputs (supervised learning). For each step a set of inputs is passed through the network, producing trial output values, which are then compared with the desired results. In the case of a significant error, the error is passed backward through the network and the connection weights are adjusted. The described algorithm is usually referred to as "backpropagation". For novelty detection purposes only the datasets from the "no fault" structure are used during the training process. The network is thus trained to be able to correctly reconstruct the "no fault" patterns. Data with fault signs will cause an increase of the network error and thus indicate abnormality, which is the basic idea of this method.
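The bottleneck idea can be illustrated with a very small network written in plain Python. This is a sketch under loud assumptions, not the network used in the study: the class `TinyAANN` is hypothetical, the hidden nodes use the hyperbolic tangent (so their outputs lie between −1 and 1), the output nodes are linear, training uses simple per-pattern gradient descent with backpropagated errors, and the "no fault" patterns are synthetic points on a line in four dimensions.

```python
import math
import random

class TinyAANN:
    """n_in -> n_hidden (tanh) -> n_in (linear) auto-associative network."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.3, 0.3) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.3, 0.3) for _ in range(n_hidden)] for _ in range(n_in)]
        self.b2 = [0.0] * n_in

    def forward(self, z):
        hidden = [math.tanh(sum(w * x for w, x in zip(row, z)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = [sum(w * h for w, h in zip(row, hidden)) + b
                  for row, b in zip(self.w2, self.b2)]
        return hidden, output

    def train_step(self, z, rate):
        """One backpropagation update for the loss 0.5 * ||output - z||^2."""
        hidden, output = self.forward(z)
        err = [o - t for o, t in zip(output, z)]
        # Propagate the error back to the hidden layer before changing w2.
        delta = [(1.0 - h * h) * sum(err[i] * self.w2[i][j] for i in range(len(err)))
                 for j, h in enumerate(hidden)]
        for i, e in enumerate(err):
            for j, h in enumerate(hidden):
                self.w2[i][j] -= rate * e * h
            self.b2[i] -= rate * e
        for j, d in enumerate(delta):
            for k, x in enumerate(z):
                self.w1[j][k] -= rate * d * x
            self.b1[j] -= rate * d

    def reconstruction_error(self, z):
        _, output = self.forward(z)
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(z, output)))

# Synthetic "no fault" patterns: points on a line in 4-D (illustrative data only).
direction = [1.0, 2.0, -1.0, 0.5]
normal = [[t / 20.0 * d for d in direction] for t in range(-20, 21)]

net = TinyAANN(n_in=4, n_hidden=2)
error_before = sum(net.reconstruction_error(z) for z in normal) / len(normal)
rng = random.Random(1)
for _ in range(400):
    rng.shuffle(normal)
    for z in normal:
        net.train_step(z, rate=0.02)
error_after = sum(net.reconstruction_error(z) for z in normal) / len(normal)

normal_errors = [net.reconstruction_error(z) for z in normal]
fault_error = net.reconstruction_error([2.0, -4.0, 2.0, 1.0])  # pattern off the line
```

After training, patterns resembling the training set are reconstructed with a small error, while the pattern that does not lie on the training line is reconstructed poorly, which is the behavior the novelty detector relies on.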

The Euclidean distance between the pattern $z$ and the neural network output $\hat{z}$ can be used as the novelty index $v(z)$, i.e.

$$v(z) = \lVert z - \hat{z} \rVert \qquad (20)$$

If the new data pattern is typical of a normal condition, it is reproduced accurately by the network and the distance is close to zero; otherwise a significant value is obtained. Note that there is no guarantee that this value will increase monotonically with the level of damage. A threshold can be used to flag a faulty condition. This threshold can be defined as

$$T = \mu + \alpha\sigma, \qquad (21)$$

where $\mu$ is the mean value of the novelty index obtained from the training set and $\sigma$ is the standard deviation for this dataset. The factor $\alpha$ corresponds to the confidence level for normality. A comprehensive introduction to the theory of natural computing can be found in Worden et al. (2011). Previous applications of AANN-based novelty detectors are very encouraging, and therefore this approach has been used in the current investigations.
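Equations (20) and (21) translate directly into code. In the sketch below the function names, the example novelty values and the factor of 3 are illustrative assumptions; in particular, the population standard deviation is used, although the text does not say which estimator was applied.

```python
import math
import statistics

def novelty_index(pattern, reconstruction):
    """Euclidean distance between a pattern and its reconstruction, equation (20)."""
    return math.sqrt(sum((z - z_hat) ** 2 for z, z_hat in zip(pattern, reconstruction)))

def novelty_threshold(training_indexes, alpha):
    """Threshold of equation (21): mean plus alpha standard deviations."""
    mu = statistics.mean(training_indexes)
    sigma = statistics.pstdev(training_indexes)  # population estimator (assumption)
    return mu + alpha * sigma

# Made-up novelty indexes of "no fault" training patterns.
training_indexes = [0.10, 0.12, 0.08, 0.11, 0.09]
threshold = novelty_threshold(training_indexes, alpha=3.0)

# A new pattern whose reconstruction misses one component by a full unit.
new_index = novelty_index([1.0, 2.0, 3.0], [1.0, 2.0, 2.0])
is_faulty = new_index > threshold
```

A pattern is flagged only when its index exceeds the threshold, so the choice of the factor directly trades false alarms against missed detections.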

4. Fault detection procedure

The wavelet variance characteristics and novelty detection described in Sections 2 and 3 respectively were used for fault detection in rolling element bearings. The entire fault detection procedure is illustrated in Figure 5 as four steps.

The orthogonal wavelet transform is applied to the signal data in the first step. The fourth-order Daubechies wavelets were used in this analysis because of their previous successful application history (Staszewski et al., 1999; Staszewski, 2000). As a result the original signal is decomposed into a finite number of wavelet levels, where the given level $m$ corresponds to the detail signal at scale $2^m$. For guidelines on implementing the wavelet analysis algorithm the reader is referred to Press et al. (1992).

Figure 4. "Bottleneck" structure of the neural network used for novelty detection.

In the next step, the statistical variance of the estimated coefficients $x^m_n$ is calculated for each decomposition level. At this point, the wavelet coefficient variance plot can be produced. Then, the novelty detection algorithm is applied. The base-2 logarithm of the obtained wavelet coefficient variances is used as the input to the MLP neural network. The network used was a three-layer feed-forward network with sigmoid hidden neurons and linear output neurons. The data was divided into training, validation and testing sets in the proportion 70:15:15. For the network structure, the input and output layers consisted of 10 neurons, one for each variance value at a given level. A single hidden layer with five neurons was created to obtain the bottleneck architecture; this number was selected using a trial and error approach. At the stage of training, validation and testing, the reference datasets represented the "no fault" condition. The neural network was then tested with data from various conditions.

In the final step of the procedure, the novelty indexes were calculated using the Euclidean distance. Additionally, the threshold was calculated to allow for automated implementation.

5. Experimental fault detection results

This section presents fault detection results from rolling element bearings. Experimental data from laboratory and real industrial conditions are used in this investigation. The results are compared with the classical analysis used in condition monitoring.

5.1. Data from laboratory test rig

A simple laboratory experiment was conducted in order to obtain controlled experimental data. The laboratory test rig comprised two bearings, an AC motor with a gear reducer of ratio 2.8, a jaw clutch and an electric power generator. The tested component was the SKF EKTN9 double-row ball bearing. The rotational speed of the shaft was set via an inverter to a constant value corresponding to 45 Hz. Vibration responses were measured by an accelerometer mounted on the housing of the faulty bearing. Measurements were taken in the vertical direction. The experimental arrangement is presented schematically in Figure 6, where the investigated bearing is highlighted in gray.

The outer race of the investigated bearing was intentionally damaged for the tests by drilling a groove, which in the largest case was around 1 mm in width and depth. Figure 7 presents the introduced damage for the most severe case investigated. The characteristic frequency of the outer race defect for the described setup and speed was equal to 233 Hz. The experiment was performed for four different loading conditions. The motor was operated without additional load and then loaded by the generator producing 0.1, 0.2 and 0.3 kW of electrical power. For each damage and load case 10 seconds of signal were recorded at a sampling frequency of 25 kHz.

Figure 8 gives an example of the signal recorded for the largest groove introduced. A pattern of repeated high-amplitude peaks can be clearly observed. The repetition frequency of this pattern corresponds to the characteristic frequency of the outer-race fault (BPFO).

The fault strongly excites the resonance frequencies of the monitored element and modulates the entire signal. A classical envelope analysis was used to detect the fault at various stages of fault advancement. Figure 9 presents the results obtained for the condition with no additional load. It is important to note that the amplitude scale for the most advanced fault (damage 4 in Figure 9(d)) is one order of magnitude larger than for the other advancements investigated.

One more example, illustrating the effect of load, is given in Figure 10. For the most severe fault investigated (Figure 9(d)) the envelope spectrum shows a significant peak at the characteristic frequency of the bearing fault and its harmonics. With the additional load (Figure 10), the amplitudes of the relevant peaks were reduced by a factor of almost two.

Figure 5. Schematic diagram of the fault detection procedure based on wavelet variance and novelty detector.

Vibration responses gathered from the rolling element bearings were divided into 2.5-second long signals to obtain 140 datasets for further analysis. The wavelet variance characteristics were calculated for all analyzed 2.5-second datasets. Figure 11 gives exemplary wavelet variance characteristics for different fault conditions. Changes in signal energies are reflected in the wavelet coefficient variance plots by amplitude increases for certain wavelet levels. For example, a small peak can be observed in the case of damage 2 for wavelet levels 11–12, corresponding to the range of 390–780 Hz.
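The level-to-frequency correspondence quoted above can be checked with a short calculation. The sketch below assumes, without confirmation from this section, that each 2.5-second dataset (62,500 samples at 25 kHz) is zero-padded to the next power of two and that wavelet level j is associated with the frequency 2^(j-1)·fs/N. Under these assumptions levels 11 and 12 map to roughly 390 Hz and 780 Hz, which matches the range given in the text. The helper name `level_frequency_hz` is hypothetical.

```python
def level_frequency_hz(level, fs, n_samples):
    """Frequency associated with a dyadic wavelet level (assumed convention)."""
    return 2 ** (level - 1) * fs / n_samples


FS = 25_000.0                                         # sampling frequency from the text, Hz
SEGMENT_SAMPLES = int(2.5 * FS)                       # 62,500 samples per dataset
N_PADDED = 1 << (SEGMENT_SAMPLES - 1).bit_length()    # next power of two (assumed padding)

low_edge = level_frequency_hz(11, FS, N_PADDED)       # lower end of the quoted range
high_edge = level_frequency_hz(12, FS, N_PADDED)      # upper end of the quoted range
```

A different padding or level-numbering convention would shift these values by factors of two, so the mapping should be checked against the definition of the wavelet levels used in the analysis.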

More detailed analysis of this range in the spectrum plot confirmed that frequencies of 550–800 Hz were excited more strongly and modulated with the BPFO. In the case of the largest damage investigated, the influence of both spectral components, i.e. the rotational components (low-frequency vibration) and the modulations (high frequencies), is visible. The high variance values for small wavelet levels (i.e. up to the 6th level) should not be interpreted as a fault indicator; these values result from some non-periodic components appearing at time instants equal to 0.7, 1.8 and 2 seconds. Figure 12 gives 10 examples of wavelet variance characteristics from a fixed condition to demonstrate data variability.
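A wavelet variance characteristic of the kind discussed above can be sketched in a few lines. The example uses the Haar wavelet purely for brevity; the wavelet family used for the reported results is not specified in this section. Note also that the list returned here is ordered from the finest scale to the coarsest, whereas in the plots discussed above higher level numbers correspond to higher frequencies. The function name and the test signal are assumptions.

```python
import math


def haar_wavelet_variance(x):
    """Variance of the Haar detail coefficients at each decomposition level.

    The result is ordered from the finest scale (highest frequencies) to the
    coarsest. The input length must be a power of two.
    """
    n = len(x)
    if n < 2 or n & (n - 1):
        raise ValueError("input length must be a power of two")
    approx = list(x)
    variances = []
    while len(approx) > 1:
        pairs = range(0, len(approx), 2)
        detail = [(approx[i] - approx[i + 1]) / math.sqrt(2) for i in pairs]
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2) for i in pairs]
        mean = sum(detail) / len(detail)
        variances.append(sum((d - mean) ** 2 for d in detail) / len(detail))
    return variances


# Hypothetical 8-sample signal whose energy sits entirely at the finest scale.
example = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]
variances = haar_wavelet_variance(example)
```

For this signal only the first entry is non-zero, which mirrors how a change in signal energy within one frequency band shows up as an increase at particular wavelet levels. A long record is thereby reduced to one number per level.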

Finally, the MLP network was trained, validated and tested using all the analyzed wavelet variance characteristics. The test results are presented in Figure 13, where 12 sets are presented for each group representing different damage conditions. Within each group of data, the first three sets are given for the unloaded condition and then each three consecutive bars represent the 0.1, 0.2 and 0.3 kW loads.

The results show that the novelty index increases for all introduced damage cases. The horizontal line indicates the threshold value calculated for the undamaged dataset. The largest violation of the threshold was obtained for the most severe damage condition (left part of Figure 13); however, all damage conditions are clearly detected. The worst results in the context of damage severity assessment were obtained for the damage 1 and damage 2 cases, particularly when the system was unloaded.
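The thresholding logic described above can be illustrated with a deliberately simplified sketch. It does not implement the MLP novelty detector: the network is replaced by the mean of the "no fault" feature vectors, and the novelty index is the Euclidean distance from that mean. The threshold rule (largest training distance times a margin), the function names and the three-element feature vectors are all assumptions made for illustration.

```python
import math


def fit_reference(training):
    """Mean feature vector of the 'no fault' training set."""
    dim = len(training[0])
    return [sum(row[i] for row in training) / len(training) for i in range(dim)]


def novelty_index(features, reference):
    """Euclidean distance between a feature vector and the reference."""
    return math.sqrt(sum((f - r) ** 2 for f, r in zip(features, reference)))


def threshold_from_training(training, reference, margin=1.0):
    """Largest novelty index in the training set, scaled by an assumed margin."""
    return margin * max(novelty_index(row, reference) for row in training)


# Hypothetical wavelet-variance feature vectors (three levels only).
healthy = [[1.0, 2.0, 1.0], [1.2, 1.8, 1.0], [0.8, 2.2, 1.0]]
damaged = [1.0, 2.0, 3.0]  # increased variance at one level

reference = fit_reference(healthy)
threshold = threshold_from_training(healthy, reference)
is_novel = novelty_index(damaged, reference) > threshold
```

As in the approach described in the text, only data from the undamaged condition are needed to set the threshold; data from damaged conditions are used solely for testing.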

Figure 6. Photograph and a schematic diagram of the laboratory test rig used for fault detection in rolling element bearings.

Figure 7. Seeded fault introduced to the tested bearing – the last stage of fault advancement.

Figure 8. Bearing vibration data for the last stage of seeded fault.

Figure 9. Envelope spectra for various seeded bearing faults: (a) "no fault" condition; (b) fault advancement 1; (c) fault advancement 2; (d) fault advancement 3.

5.2. Wind turbine case study

Although many different types of bearing faults can be analyzed in practice, the most important task is always to detect these faults at the earliest stage in order to avoid damage to further elements of the kinematic chain. In wind turbine structures, the detection of faults in rolling element bearings is difficult because of variations in loads and rotational speeds.

The proposed method has been tested using data from a typical wind turbine with a nominal power of 1.5 MW.

The kinematic chain of this wind turbine can be described as follows. The main rotor was supported by a main bearing. The torque was transmitted from the rotor via the shaft to the planetary gearbox and further to the parallel gearbox. From the three-stage parallel gear the torque drove a generator, producing AC current. Two bearings were mounted in the generator. A vibration sensor was attached to the housing of the generator, in the vicinity of the bearings. Vibration data were acquired by a condition monitoring system permanently attached to the structure. The data were recorded over a seven-month period. The signals were recorded with the generator shaft speed varying in the range of 1073–1090 rpm. The sampling frequency of the data was equal to 25 kHz. A fault developed in one of the generator bearings during this period.

Figure 10. Envelope spectra for the fault advancement 3: (a) unloaded test rig; (b) maximum load used.

Figure 11. Wavelet coefficient variance plots for laboratory data representing different fault conditions.

Figure 12. Wavelet coefficient variance plots for 10 different sets of data from laboratory tests. The plots demonstrate data variability.

Figure 14 gives root mean square (green plot) and peak-to-peak amplitude (blue plot) values calculated online for the measured data over the monitored period. The data have been normalized according to the maximum expected value, i.e. 100% corresponds to a peak-to-peak value of 52.8 m/s² and a root mean square value of 26.4 m/s². Variations in the operating conditions are clearly reflected in the results; the fluctuations in the speed measurements are marked with stars.
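The normalization described above amounts to dividing each indicator by its expected maximum. A minimal sketch follows, using the two full-scale values from the text; the function names and the example signal are assumptions.

```python
import math

P2P_FULL_SCALE = 52.8  # m/s^2, peak-to-peak value corresponding to 100 % (from the text)
RMS_FULL_SCALE = 26.4  # m/s^2, RMS value corresponding to 100 % (from the text)


def rms(x):
    """Root mean square of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def peak_to_peak(x):
    """Difference between the largest and smallest sample."""
    return max(x) - min(x)


def normalised_trends(x):
    """Return (RMS, peak-to-peak) as percentages of the expected maxima."""
    return (100.0 * rms(x) / RMS_FULL_SCALE,
            100.0 * peak_to_peak(x) / P2P_FULL_SCALE)


# Hypothetical square-wave acceleration of amplitude 13.2 m/s^2.
example = [13.2, -13.2] * 100
rms_percent, p2p_percent = normalised_trends(example)
```

Both indicators come out at 50% for this example, since the amplitude is half of each full-scale value.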

An inner-race fault developed in one of the monitored bearings. This event is indicated by the vertical dashed arrow in Figure 14. Although the trends on the right-hand side of Figure 14 (the peak-to-peak values in particular) clearly indicate a fault, the moment at which the fault occurred is not so clear. The measured vibration datasets were divided into 2.5-second signals to obtain the database for the novelty detection analysis. First, wavelet variance characteristics were calculated; Figure 15 gives examples of these characteristics.
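Dividing a record into 2.5-second signals is straightforward; a minimal sketch is given below. Non-overlapping segments and the dropping of any trailing remainder are assumptions, as the text does not state how segment boundaries were handled. The example uses the 10-second, 25 kHz record length quoted for the laboratory tests, since the length of the wind turbine records is not given here.

```python
def split_into_segments(samples, fs, seconds=2.5):
    """Split a record into non-overlapping segments; a trailing remainder is dropped."""
    size = int(round(seconds * fs))
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


# Placeholder record: 10 s at 25 kHz (values are irrelevant for the illustration).
FS = 25_000
record = [0.0] * (10 * FS)
segments = split_into_segments(record, FS)
```

Each such record yields four segments of 62,500 samples, and each segment is then summarized by its wavelet variance characteristic.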

The results obtained for the sixth signal represent the "no fault" condition, whereas the results for the eighth and the 13th signals indicate the "fault" conditions. In Figure 14, these integer signal numbers indicate when the data were recorded. The wavelet variance characteristics in Figure 15 clearly exhibit frequency regions where resonance frequencies are excited. This is particularly visible between the seventh and 14th wavelet levels. It can be observed that when the fault develops the largest peak values are obtained for the 12th wavelet level. This is related to strong modulations (see the example provided in Section 2). The analyzed signals are modulated by the rotational speed signal of the generator shaft due to the fault. For comparison, the frequency and envelope spectra for the same signals are presented in Figures 16 and 17. The results show that the envelope spectra detect the fault (compare Figure 17(a) with Figures 17(b) and 17(c)) but do not distinguish between various fault advancements (compare Figures 17(b) and 17(c)).

Figure 13. Fault detection results based on the novelty detection analysis for the experimental laboratory test rig data.

Wavelet variance characteristics were used to form the novelty indicator. The MLP novelty detector was trained, validated and tested using the same parameters as in the previous section. The network was trained using the "no fault" condition data. The results from the novelty detection tests are given in Figure 18. The first 28 sets of data on the left-hand side represent the "no fault" condition (number 1 in Figure 14 indicates when the data were recorded), whereas the other 28 sets of data represent the largest fault advancement (signal 13 in Figure 14 indicates when the data were recorded). The results in Figure 18 show that these two conditions are clearly separated. In the next step, 14 measurements recorded over the entire monitored period were analyzed (the 14 integer numbers in Figure 14 indicate when these measurements were taken). The novelty detection results are presented in Figure 19.

Again, the separation of the "no fault" and "fault" conditions is clearly visible. The indicated threshold level flags the fault at the eighth measurement. More interestingly, the next four measurements exhibit increased values of the Euclidean distance, probably due to the deteriorating condition of the bearing.

6. Conclusions

This paper presents a signal processing tool for automatic fault detection in rolling element bearings. The proposed methodology originates from fractal theory and a novelty detection approach. The wavelet-based variance, originally developed for self-similarity analysis, and neural networks are used for fault detection. The former is used as a pre-processor for a neural-network-based novelty detector. The method is tested using simulated and experimental examples. The latter involve vibration data from laboratory experiments and real wind turbine measurements from a commercial monitoring system.

Figure 14. Wind turbine vibration measurement normalised trends for the monitored period of time. (Normalisation according to expected range.) The vertical dashed arrow indicates when the bearing fault developed.

Figure 15. Examples of wavelet variance characteristics for the wind turbine vibration data.

Figure 16. Examples of spectrum for the wind turbine vibration data. (a) signal 6; (b) signal 8; (c) signal 13.

Figure 17. Examples of envelope spectrum for the wind turbine vibration data. (a) signal 6; (b) signal 8; (c) signal 13.

The wavelet variance results show that long vibration measurements from bearings can be characterized by a short set of numbers which are able to represent the frequency content of the analyzed signal, being sensitive to modulations, impulsivity (in terms of spikes) and intensity (in terms of increased amplitudes) of the signal. This property of the wavelet variance could be very useful for bearing condition monitoring because bearing defects exhibit localized features that are often embedded in the noise across wide frequency bands.

The seeded (laboratory) and real (industrial) bearing faults have been clearly detected. Some evidence that the method could assess fault advancement has been found when the wind turbine vibration measurements were investigated. However, this needs further investigation.

The fact that wavelet variance characteristics are of relatively low dimension is also important for neural network analysis, where a large amount of data is usually required for training. The work demonstrates that wavelet variances can be used directly as inputs to the network. It is also important to note that the novelty detection approach used does not require training with data representing various fault conditions.

Further work is required to confirm these findings. Future work should involve more rigorous studies that compare the proposed methodology with other signal processing approaches. Also, more industrial case studies, involving various types of bearing faults and fault severities/advancements (i.e. fault sensitivity), should be investigated.

Funding

This work was supported by the Welcome research project (grant number 2010-3/2) sponsored by the Foundation for Polish Science (Innovative Economy, National Cohesion Programme, EU).

Figure 18. Fault detection results based on the novelty detection analysis for the wind turbine data; the "no fault" condition (left-hand side) and the largest fault advancement condition (right-hand side) are investigated.

Figure 19. Fault detection results based on the novelty detection analysis for the wind turbine data; the analysed data represent 14 consecutive measurements over the entire monitored period.

References

Barszcz T and Jablonski A (2011) A novel method for the optimal band selection for vibration signal demodulation and comparison with the kurtogram. Mechanical Systems and Signal Processing 25(1): 431–451.

Bradshaw GA and Spies TA (1992) Characterizing canopy gap structure in forest using wavelet analysis. The Journal of Ecology 80(2): 205–215.

Brown RD, Addison P and Chan AHC (1994) Chaos in the unbalance response of journal bearings. Nonlinear Dynamics 5(4): 421–432.

Carden EP and Fanning P (2004) Vibration based condition monitoring: a review. Structural Health Monitoring 3(4): 355–377.

Chiementin X, Bolaers F, Cousinard O and Rasolofondraibe L (2008) Early detection of rolling bearing defect by demodulation of vibration signal using adapted wavelet. Journal of Vibration and Control 14(11): 1675–1690.

Craig C, Neilson RD and Penman J (2000) The use of correlation dimension in condition monitoring of systems with clearance. Journal of Sound and Vibration 231(1): 1–17.

Fakhfakh T, Bartelmus W, Chaari F, Zimroz R and Haddar M (2012) Condition Monitoring of Machinery in Non-stationary Operations. Heidelberg: Springer.

Flandrin P (1999) Time-frequency/Time-scale Analysis. San Diego: Academic Press.

Forrester BD (1992) Time-frequency analysis in machine fault detection. In: Boashash B (ed.) Time-Frequency Signal Analysis. Cheshire: Longman.

Li B, Zhang PL, Mi SS, Zhang YT and Liu DS (2012) Bearing fault detection using multi-scale fractal dimensions based on morphological covers. Shock and Vibration 19(6): 1373–1383.

Li F, Meng G, Ye L and Chen P (2008) Wavelet transform-based higher-order statistics for fault diagnosis in rolling element bearings. Journal of Vibration and Control 14(11): 1691–1709.

Logan D and Mathew J (1996) Using the correlation dimension for vibration fault diagnosis of rolling element bearings – I. Basic concepts. Mechanical Systems and Signal Processing 10(3): 241–250.

Lou X, Loparo KA, Discenzo FM, Yoo J and Twarowski A (2004) Bearing fault diagnosis based on wavelet transform and fuzzy inference. Mechanical Systems and Signal Processing 18(5): 1077–1095.

Luo GY, Osypiw D and Irle M (2003) On-line vibration analysis with fast continuous wavelet algorithm for condition monitoring of bearing. Journal of Vibration and Control 9(8): 931–947.

McFadden PD and Smith JD (1984a) Model of the vibration produced by a single point defect in a rolling element bearing. Journal of Sound and Vibration 96(1): 69–82.

McFadden PD and Smith JD (1984b) Vibration monitoring of rolling element bearings by the high-frequency resonance technique – a review. Tribology International 17(1): 3–10.

McFadden PD and Wang WJ (1991) Time-frequency domain analysis of vibration signals for machinery diagnostics. (I) Introduction to the Wigner-Ville distribution. Ouel report 1859/90. Oxford: Oxford University.

McFadden PD and Wang WJ (1996) Application of wavelets to gearbox vibration signals for fault detection. Journal of Sound and Vibration 192(5): 927–939.

Mallat S (1998) A Wavelet Tour of Signal Processing. San Diego: Academic Press.

Mori K, Kasashima N, Yoshioka T and Ueno Y (1996) Prediction of spalling on a ball bearing by applying the discrete wavelet transform to vibration signals. Wear 195(1): 162–168.

Nelwamondo DV, Marwala T and Mahola U (2006) Early classification of bearing faults using hidden Markov models, Gaussian mixture models, mel-frequency cepstral coefficients and fractals. International Journal of Innovative Computing, Information and Control 2(6): 1281–1299.

Press WH, Teukolsky SA, Vetterling WT and Flannery BP (1992) Numerical Recipes in C. The Art of Scientific Computing, 2nd edn. New York: Cambridge University Press.

Randall RB (2011) Vibration-based Condition Monitoring. UK: Wiley.

Randall RB and Antoni J (2011) Rolling element bearing diagnostics – a tutorial. Mechanical Systems and Signal Processing 25(2): 485–520.

Rees JM, Staszewski WJ and Winkler JR (2001) Case study of a wave event in the stable atmospheric boundary layer overlaying an Antarctic ice shelf using the orthogonal wavelet transform. Dynamics of Atmospheres and Oceans 32(2–4): 245–261.

Staszewski WJ (2000) Wavelets for mechanical and structural damage identification. In: Studia i Materialy, Monograph No. 510/1469/2000, Institute of Fluid-Flow Machinery, Polish Academy of Sciences, Gdansk.

Staszewski WJ and Tomlinson GR (1993) Time-variant methods in machinery diagnostics. In: Tomlinson GR, Natke HG and Yao JTP (eds) Safety Evaluation Based on Identification Approaches Related to Time-Variant and Nonlinear Structures. Braunschweig/Wiesbaden: Vieweg.

Staszewski WJ and Tomlinson GR (1994) Application of the wavelet transform to fault detection in a spur gear. Mechanical Systems and Signal Processing 8(3): 289–307.

Staszewski WJ and Worden K (1997) Classification of faults in gearboxes – pre-processing algorithms and neural networks. Neural Computing & Applications 5(3): 160–183.

Staszewski WJ and Worden K (1999) Wavelet analysis of time-series: coherent structures, chaos and noise. International Journal of Bifurcation and Chaos 9(3): 455–471.

Staszewski WJ, Ruotolo R and Storer D (1999) Fault detection in ball bearings using wavelet variance. In: Proceedings of 17th International Modal Analysis Conference, Kissimmee, Florida, 8–11 February 1999, pp. 1335–1339.

Staszewski WJ, Worden K and Tomlinson GR (1997) Time-frequency analysis in gearbox fault detection using Wigner-Ville distribution and pattern recognition. Mechanical Systems and Signal Processing 11(5): 673–692.

Wang YF and Kootsookos PJ (1998) Modeling of low shaft speed bearing faults for condition monitoring. Mechanical Systems and Signal Processing 12(3): 415–426.

Worden K (1997) Structural fault detection using a novelty measure. Journal of Sound and Vibration 201(1): 85–101.

Worden K, Staszewski WJ and Hensman JJ (2011) Natural computing for mechanical systems research: a tutorial overview. Mechanical Systems and Signal Processing 25(1): 4–111.

Wornell GW (1993) Wavelet-based representations for the 1/f family of fractal processes. Proceedings of the IEEE 81(10): 1428–1450.

Yang J, Zhang Y and Zhu Y (2007) Intelligent fault diagnosis of rolling element bearing based on SVMs and fractal dimension. Mechanical Systems and Signal Processing 21(5): 2012–2024.
