IEEE TRANSACTIONS ON RELIABILITY, VOL. 61, NO. 2, JUNE 2012

Ensemble Based Real-Time Adaptive Classification System for Intelligent Sensing Machine Diagnostics

Minh Nhut Nguyen, Chunyu Bao, Kar Leong Tew, Sintiani Dewi Teddy, Member, IEEE, and Xiao-Li Li, Member, IEEE

Abstract—The deployment of a sensor node to manage a group of sensors and collate their readings for system health monitoring is gaining popularity within the manufacturing industry. Such a sensor node is able to perform real-time configurations of the individual sensors that are attached to it. Sensors are capable of acquiring data at different sampling frequencies based on the sensing requirements. The different sampling rates affect power consumption, sensor lifespan, and the resultant network bandwidth usage due to the data transfer incurred. These settings also have an immediate impact on the accuracy of the diagnostics and prognostics models that are employed for system health monitoring. In this paper, we propose a novel adaptive classification system architecture for system health monitoring that is well suited to accommodate and take advantage of the variable sampling rate of sensors. As such, our proposed system is able to yield a more effective health monitoring system by reducing the power consumption of the sensors, extending the sensors' lifespan, as well as reducing the resultant network traffic and data logging requirements. We also propose an ensemble based learning method to integrate multiple existing classifiers with different feature representations, which can achieve significantly better and more stable results compared with the individual state-of-the-art techniques, especially in the scenario where we have very limited training data. This result is extremely important in many real-world applications because it is often impractical, if not impossible, to hand-label large amounts of training data.

Index Terms—Adaptive classifiers, classifiers, data driven diagnostics and prognostics, ensemble learning, sensor data classification.

ACRONYMS AND ABBREVIATIONS

ACS adaptive classification system

CV cross-validations

FFT fast Fourier transform

KNN k-nearest neighbors

KNN_f  KNN for the frequency domain

KNN_t  KNN for the time domain

Manuscript received October 27, 2011; revised December 31, 2011; accepted January 26, 2012. Date of publication May 11, 2012; date of current version May 28, 2012. Associate Editor: Q. Miao.

M. N. Nguyen, C. Bao, S. D. Teddy, and X.-L. Li are with the Department of Data Mining, Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore 138632, Singapore (e-mail: [email protected]; [email protected]; [email protected]; [email protected]).

K. L. Tew is currently with SAP Research Centre, Singapore (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TR.2012.2194352

LIBSVM library for support vector machines

MFS Machinery Fault Simulator

MLP multilayer perceptron

MLP_f  MLP for the frequency domain

MLP_t  MLP for the time domain

RMS root mean squared

SVM support vector machine

SVM_f  SVM for the frequency domain

SVM_t  SVM for the time domain

WSN wireless sensor network

NOTATION

$N$  number of models

$b_{i,j}$  binary value: 1 if model $M_i$ outputs machine state $s_j$, and 0 otherwise

$F$  F-measure, or the harmonic mean of precision and recall

$K$  kurtosis value

$acc_{i,j}$  estimated accuracy of model $M_i$ performed on machine state $s_j$

$x_p$  peak value

$x_{RMS}$  root-mean-squared value

$S$  number of machine states

$y^*$  final machine state prediction

$\sigma$  standard deviation

$w_{i,j}$  confidence score of model $M_i$ performed on machine state $s_j$

I. INTRODUCTION

ADVANCEMENTS in wireless sensor network (WSN) technology have allowed a wider adoption and application of sensor networks. Currently, we are harnessing the capability of such WSN configurations to collect invaluable monitoring data in both indoor and outdoor environments.



The indoor environment in this case refers to locations such as a manufacturing shop floor [1]–[3], a home for the elderly [4], [5], and healthcare environments [6], [7]. In such environments, sensors are deployed, and the readings collected by the sensors serve as the basis for studying and analyzing the current health of the system or subject. One good example of such a monitoring scenario is the large-scale deployment of sensors on a manufacturing shop floor. The data collected in this case contain signals and important information about the machine health status and operating conditions. In contrast, outdoor monitoring systems frequently involve the collection of outdoor environment variables such as atmospheric readings. Two examples are the detection of rainfall-induced landslides [8], and avalanche forecasting via the continuous monitoring of the snow coverage [9].

The data collected from the sensors provide an entry point for the prognostics and health management community and the data mining community to perform studies and in-depth analyses to obtain insights into a particular process. The readings produced by the sensors provide a wealth of information for knowledge extraction, modeling, analysis, and intelligent prediction as well as decision making. There has been a substantial amount of research conducted in diagnostics and prognostics based on sensor data [1]–[7], [10]–[14]. In most of these studies, sensors are usually set to sample their readings at a predefined sampling rate. It is important to note that having a constant sampling rate is a prerequisite for most data modeling and analysis methods, as it ensures time consistency and similar data distributions in the collected samples.

Our previous research into machine fault diagnostics for the manufacturing shop floor has led us to the conclusion that a higher sampling rate usually yields better diagnostics performance [1]. However, in practice, when the machines are running in a healthy state, it is not necessary to collect the sensor readings at such a high sampling rate. More often than not, higher sampling rates are required only when anomalies or faults occur during the machine's operation, to provide the diagnostics system with enough information for fault isolation, analysis, and identification.

Moreover, a higher sampling rate consequently results in higher energy consumption by the sensors. A direct correlation between the sampling rate and the energy consumption of sensors has been well established [15]–[18]. In any WSN architecture (whether indoors or outdoors), sensors are typically powered by batteries. This approach in turn constrains the sensor lifespan and capability. It has been reported that the energy constraint is the main factor preventing the full exploitation of WSN technology [19].

One of the directions of WSN research is thus to improve the power efficiency of the sensors. One common approach is to allow the sampling frequencies to be adaptive, as different levels of granularity of information are required in different situations [8], [9], [16], [20]. Exploring the use of such an adaptive sampling rate is a logical approach, as it reduces the power consumption of the sensors and thereby extends their lifespan and usability period. On top of this, it also means that only the necessary amount of data is collected, stored, and transmitted, thus freeing up network bandwidth.

On the other hand, such an adaptive sampling rate generates a new set of challenges for existing data modeling and analysis methods. Existing models largely work on a consistent data sampling rate. With an adaptive sampling rate, the models might work under certain scenarios and assumptions, but their performance will be unpredictable. In this paper, we introduce a novel Adaptive Classification System (ACS) architecture that is designed to cater for, as well as actively adapt, the sampling frequencies of the sensors, while maintaining the fault identification performance of the health monitoring system.

In summary, our ACS is constructed as follows. First, a set of pre-determined sampling rates, low and high, is identified, and models are trained on data collected at these predefined sensor sampling frequencies. Next, a set of controls is set up to screen the monitoring results of the models based on the current sampling rate. Lastly, the monitoring results can trigger a change in the sampling rate of the sensor as necessary, and in turn, each change in the sampling rate can also trigger a change in the model employed to monitor the data. This ensures that high-sampling-rate sensor data are only collected and utilized when necessary.

The adoption of our ACS model provides a more logical, sophisticated mechanism to adapt the sampling frequencies of the sensors. It allows for a more advanced condition monitoring paradigm, and for a suitable architecture to be configured to adapt the sensors' sampling frequencies based on the perceived system status. In addition, it fulfills the need and requirements of reducing power consumption and network bandwidth utilization.

Note that previous research on machine fault diagnostics employed traditional supervised learning techniques for classification that typically rely on large amounts of labeled examples from predefined classes for their learning process. In practice, this approach is not practical because collecting and labeling large sets of data for training is often very expensive, if not impossible [23]. Sometimes, the data from a certain class could be rare because most of the time machines operate under normal conditions (the normal class), and only a few examples (representing machines breaking down, or the defective class) are available to train a model to recognize an abnormal status [24]. As such, we also propose an ensemble based learning method to find an optimal solution for the real situations where we have only a small set of training data.

We observe that the proposed ensemble based classifier offers our ACS system much better performance measures compared to individual traditional state-of-the-art classifiers such as the multilayer perceptron (MLP), support vector machine (SVM), or k-nearest neighbors (KNN). This result occurs because, by combining the outputs of several classifiers, we are able to minimize the potential bias and risk of individual predictions, so the expected errors of our ensemble approach can be expected to be reduced. A similar example in the real-life medical diagnosis domain is that people consult many doctors before undergoing major surgery, and based on their combined diagnoses they can obtain a fairly accurate diagnosis.

The rest of the paper is organized as follows. Section II describes some of the state-of-the-art sensor-based monitoring methodologies. We investigate their performance measures in utilizing sensor readings of different sampling rates. We also discuss the pros and cons of real-time sampling rate adjustment of the sensors. Section III presents a detailed description of the architectural design of our proposed ACS model, which aims to address these issues and harness the power of an adaptive sensor sampling rate. We also describe our ensemble learning method in detail. In Section IV, a use case scenario is presented to demonstrate the applications of our ACS model, as well as to illustrate the effectiveness of our proposed techniques through comprehensive experiments. Finally, Section V concludes this paper.

II. EXISTING APPROACH

The main procedure for sensor-based data modeling and analysis is summarized as follows. 1) Suitable sensors (vibration, temperature, humidity, etc.) are selected and strategically positioned on the subject of study. 2) The sensors are then configured to operate at a suitable sampling rate along with other sensing parameters. 3) The layout of the sensor network is then drawn up to decide on the locations that might contain the critical information for monitoring purposes. 4) Data are then accumulated over a period of time, and stored for study and analysis. 5) Preprocessing (e.g., filling in missing values, smoothing noisy data, identifying or removing outliers, and resolving inconsistencies) [22] and signal transformation are performed via algorithms such as the Fast Fourier Transform (FFT) for frequency domain analysis, or other time- and time-frequency-based analyses, to obtain features that can be used to construct the data models and classifiers. 6) Training of the selected models and classifiers then takes place. 7) Results and performance measures of the models are then analysed and studied, together with the subject matter experts, to derive insights into the process, to understand the underlying fault behavior of the system, or even to exploit the knowledge discovered to optimize the manufacturing process.

Some common application scenarios based on the above-mentioned approach are in manufacturing, healthcare systems, homes for the elderly, aerospace, disaster prediction and prevention, and intelligent transportation systems. To date, most such applications employ a constant setting of sensor parameters, including the sensor's sampling rate. While this approach is acceptable for a short monitoring duration, it is not desirable when monitoring is to be performed over an extended continuous period of time.

The lack of sensor sampling rate adaptation leads to 1) an increase in power consumption, 2) a reduction in sensor lifespan, and 3) an increase in the network traffic incurred to transfer the collected data. There has been consistent effort by the WSN research community to address these three problems. Such work is important, as some sensors are to be placed in environments where it is impossible to connect them to a power source.

To address the issues mentioned above, we propose a novel ACS architecture for system health monitoring. We will introduce the proposed ACS architecture in detail in the next section. An example of a useful application of ACS would be the sensor-based monitoring of the machines in a manufacturing process. In such cases, if the machine is not in operation (powered off, idle, or under maintenance) or is in a functional condition, it would be desirable to either switch off the sensors, or to sample sensor readings at a significantly lower sampling rate.

Fig. 1. Setup of the Adaptive Classification System. Classifier #1 differentiates between 0—Normal and 1—Faulty states. Classifier #2 differentiates among 0—Normal, Status 1—Inner fault, Status 2—Outer fault, Status 3—Ball fault, and Status 4—Combination fault.

Sensors powered by alternative power sources (e.g., batteries) have a limited lifespan. To assure the functionality of the sensors, periodic checks and maintenance are needed. At the scale of sensor deployment environments such as hi-tech manufacturing, the sheer effort required to maintain the sensors is tremendous. Thus, if a sensor works at a relatively low sampling rate under a normal status, and only increases the sampling rate when necessary, its lifetime can be extended and its cost lowered. Due to these needs and requirements, we believe that there will be an increase in the adoption of WSNs with adaptive sampling rates. It is therefore imperative that a suitable modeling and monitoring system is developed which will not only react to the changes in the sensors' sampling rate but also take advantage of them.

III. PROPOSED SOLUTION

This section presents our proposed solution: the ACS architecture. The ACS reacts to different sampling rates, and reaps the benefits of the adaptive sensor sampling rate as applied in a manufacturing environment.

In the existing studies on adaptive sensor sampling rate strategies [9], [15]–[20], the changes in the sampling rate of the sensors are mostly governed by a predefined rule based system. Such an arrangement is not suitable for complex monitoring scenarios or sensor setups, as the decision to change the sensors' sampling rate is logically dependent on the predefined rules, and not representative of the actual machine conditions.

Rather than designing the whole system as a reactive system, which switches the models to be used according to the sensors' sampling rate, our ACS model adopts a proactive approach. In our model, the required sampling rate of the sensors is decided by the monitoring output of the model. This design is significantly different from the existing approach. Each of the models and classifiers embedded within the ACS provides a monitoring and classification result based on the sensor readings. One scenario is illustrated in Fig. 1. In this setup, we employ two models, namely classifier #1 and classifier #2, where classifier #1 is trained with low sampling rate data, and classifier #2 is trained with high sampling rate data.

While the machine is in a normal state, the sensors sample at the lower rate. Classifier #1 is used at times when the sampling rate is low, and its task is to monitor and classify the machine condition between two states, namely, 0—Normal and 1—Faulty. When classifier #1 outputs a monitoring result of 1, which corresponds to a faulty state, it prompts the sensor node to switch to the higher sampling rate to trigger an in-depth investigation. This approach not only allows more detailed information to be collected, it also allows detailed fault identification to be performed by the second classifier.

This second classifier #2 differentiates among Normal, fault status 1: Inner Race fault, fault status 2: Outer Race fault, fault status 3: Ball fault, and fault status 4: Combination fault. Here the combination fault combines three different faults, namely, the inner race fault, the outer race fault, and the ball fault. This detailed analysis and identification of faults is only possible when the model has been trained with data of higher granularity. When the fault has been rectified and the machine has returned to the normal state, the ACS informs the sensor node to switch back to the lower sampling rate.
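For concreteness, this two-stage control flow can be summarized in a short Python sketch. It is only an illustration under assumed interfaces: the sensor-node methods (set_sampling_rate, read_window, report) and the two classifier objects are hypothetical placeholders, not the actual implementation.

# Illustrative sketch of the two-stage ACS control loop (not the authors'
# implementation). The sensor-node interface and the two trained
# classifiers are hypothetical placeholders.

LOW_RATE_HZ, HIGH_RATE_HZ = 256, 5120
NORMAL, FAULTY = 0, 1

def acs_loop(node, clf_low_time, clf_high_freq, time_features, freq_features):
    rate = LOW_RATE_HZ
    node.set_sampling_rate(rate)
    while True:
        window = node.read_window(seconds=1, rate=rate)
        if rate == LOW_RATE_HZ:
            # Classifier #1: binary Normal/Faulty decision on low-rate data.
            if clf_low_time.predict([time_features(window)])[0] == FAULTY:
                rate = HIGH_RATE_HZ            # trigger in-depth investigation
                node.set_sampling_rate(rate)
        else:
            # Classifier #2: detailed fault identification on high-rate data.
            fault = clf_high_freq.predict([freq_features(window)])[0]
            node.report(fault)                 # e.g., raise an asynchronous alert
            if fault == NORMAL:                # fault rectified
                rate = LOW_RATE_HZ
                node.set_sampling_rate(rate)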

By assigning the control of changing the sampling frequencies to our ACS model, we allow a more sophisticated, precise diagnostic process to decide when to intelligently change the sampling frequencies. In addition, whenever necessary, the models and classifiers can be trained to provide in-depth monitoring and analysis of the machine health status, as in the case of the second model. With this functionality, the proposed ACS model can lower the network traffic and power consumption. Thus, the model can extend the usability lifespan of battery-powered sensors when applied to real-world applications, such as continuous health monitoring of aerospace equipment (engines, bearings, gear boxes, etc.), to enable the assessment of system status and prevent catastrophic failures. In addition, manufacturing also needs to equip sensors to collect data for monitoring the system health, to prevent machines from breaking down. Other applications include disaster monitoring systems (e.g., fire alarms), patient health monitoring, and environmental monitoring (for chemical plant accidents, air and water quality), etc.

Another useful feature that can be incorporated within our system would be asynchronous alerts for the end users based on the monitoring results. As the monitoring results are captured and sent to the sensor node for sampling rate adjustment, messages can also be sent to web services or other devices to notify the end users of the machine status accordingly. This ability also enables other, more factory-specific alert actions such as messaging, lighting control, and alarms to be activated based on the machine health status.

In the ACS, one challenge is how to build accurate models and classifiers. Typically, supervised learning methods are applied to learn these models from labeled training data. As discussed in the Introduction, current supervised learning methods for fault diagnostics have to rely on sufficient training examples to build accurate classifiers. However, obtaining large amounts of training data can be a labor-intensive, time-consuming process, and it also depends highly on the expertise of domain experts. Thus, in many real-world applications, we have to face the challenge of learning with limited training examples. In this paper, we build our ensemble-based classifier by integrating multiple state-of-the-art classifiers such as MLP, SVM, KNN, etc., where each of them can be trained using the features extracted from the raw sensor signals.

To minimize the bias and risk of individual classifiers, we introduce an ensemble based learning method to our ACS model. The strategy of ensemble systems is to create many classifiers, and combine their outputs such that the combination improves upon the performance measures of the single classifiers. The intuition is that if each classifier makes different errors, then a strategic combination of these classifiers can reduce the total error. The overall principle in ensemble systems is therefore to make each classifier as unique as possible, particularly with respect to misclassified instances [25]. In this paper, our ensemble-based ACS model is built by integrating multiple, diverse classifiers, i.e., MLP, SVM, and KNN models for the time domain, and MLP, SVM, and KNN models for the frequency domain, and it consists of two steps, i.e., computing confidence scores, and integrating the classifiers for prediction.

For each machine state $s_j$, the confidence score $w_{i,j}$ of a model $M_i$ is computed from its estimated accuracy $acc_{i,j}$ on that state as

(1)

This estimated accuracy $acc_{i,j}$ can be computed using the training data with $k$-fold cross-validation (CV). In particular, in a $k$-fold CV, we randomly divide the training data into $k$ equal partitions; each time, one partition is reserved for testing, and the remaining $k-1$ partitions are used for computing the accuracy for that particular round. This process is repeated $k$ times until each partition has been used as a test partition exactly once. By averaging the $k$ accuracies, a good estimation of $acc_{i,j}$ can be computed. Note that the higher the confidence score $w_{i,j}$, the more confidently the classifier $M_i$ can be used to classify the state $s_j$.

The confidence scores computed in the first step indicate that some classifiers are more qualified than others in classifying a particular data set. In the following step, we integrate the classifiers by accumulating their votes, taking the confidence scores into consideration. In particular, the final machine state prediction $y^*$ of our ensemble system is the state that satisfies

$y^{*} = \arg\max_{1 \le j \le S} \sum_{i=1}^{N} w_{i,j}\, b_{i,j}$   (2)
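The two steps can be sketched as follows. For brevity, the sketch assumes that all base classifiers consume a single feature matrix (in the actual system each model uses its own time- or frequency-domain features) and simply takes the $k$-fold cross-validated per-state accuracy as the confidence score $w_{i,j}$; it illustrates the voting of (2) rather than reproducing the exact implementation.

# Sketch of the ensemble: per-state confidence scores from k-fold CV,
# then confidence-weighted voting as in (2). Not the authors' code; the
# confidence score is assumed here to be the CV per-state accuracy.
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import StratifiedKFold

def confidence_scores(models, X, y, n_states, k=3):
    """w[i, j]: confidence of model i on machine state j (labels 0..n_states-1)."""
    w = np.zeros((len(models), n_states))
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=0)
    for i, model in enumerate(models):
        hits = np.zeros(n_states)
        totals = np.zeros(n_states)
        for train_idx, test_idx in skf.split(X, y):
            m = clone(model).fit(X[train_idx], y[train_idx])
            pred = m.predict(X[test_idx])
            for j in range(n_states):
                mask = y[test_idx] == j
                totals[j] += mask.sum()
                hits[j] += (pred[mask] == j).sum()
        w[i] = hits / np.maximum(totals, 1)   # per-state accuracy averaged over folds
    return w

def ensemble_predict(models, w, x, n_states):
    """Equation (2): the state with the largest confidence-weighted vote wins.
    The models are assumed to be already fitted on the full training set."""
    votes = np.zeros(n_states)
    for i, model in enumerate(models):
        j = int(model.predict(x.reshape(1, -1))[0])  # b_{i,j} = 1 for this j only
        votes[j] += w[i, j]
    return int(np.argmax(votes))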

We will show in our experiments that the proposed ensemble based classifier indeed achieves much better performance measures compared to the individual traditional state-of-the-art classifiers.

The next section discusses and evaluates the performance of our ACS model, as well as the ensemble learning method, based on data collected from sensors operating under varying sampling rates.

IV. USE CASE SCENARIO

In this section, we present the use-case scenarios employed to validate the effectiveness of our proposed system. In Section IV-A, we explain our experimental setup, data collection process, the various feature extraction methodologies, the learning models and classifiers, as well as the evaluation metric. In Section IV-B, we perform comprehensive experimental studies to investigate the classification performance measures for binary classification, multi-class classification, and the ACS without and with ensemble-based learning for intelligent sensing machine fault diagnostics.

Fig. 2. Main components of the MFS-Lite Fault Simulator.

A. Experimental Setup, Data Collection, Feature Extraction, Classification Model, and Evaluation Metric

1) Machinery Fault Simulator: Due to the lack of real machinery data to validate our proposed system, an industrial Machinery Fault Simulator (MFS) is employed in our study to generate the different sets of normal, idle, and faulty machine data. The MFS that we employed is the MFS Lite from SpectraQuest, which provides an innovative tool for studying signatures of common machinery faults without having to compromise factory production or profits. The MFS allows us to introduce various faults, either individually or jointly, in a controlled environment, and is ideal as a test bed for our system.

The MFS setup is shown in Fig. 2. The hardware components of the MFS consist of a motor, two balance rotors, a speed controller, a tachometer display, and two bearing housings. To simulate the effect of different bearing faults, the good bearing of the MFS is subsequently replaced with a known defective bearing. Based on the bearing geometry, the MFS provides the following four types of faulty bearings: outer race defects (outer), inner race defects (inner), ball spin defects (ball), and combined defects (combination) faults.

Data Collection: In our data collection process, the sensors are placed near the bearing housing of the defective bearing to obtain the most pronounced fault signatures in the vibration signal (Fig. 3). Two channels of accelerometer readings, which capture the horizontal and vertical vibration signals, are acquired for each simulation. The fault simulations are done at three different machine operating speeds: 800 rotations per minute (rpm), 1780 rpm, and 3600 rpm. For each machine speed, 100 one-second window samples are collected under three sampling frequencies: 256 Hz (low), 1024 Hz (medium), and 5120 Hz (high). A one-second snapshot of the vibration signals collected via the front right sensor is shown in Fig. 4.

Fig. 3. Experimental setup. The sensors are placed in the red circles, respectively.

Fig. 4. Snapshot of vibration signals collected via the right front channel.

In this study, we investigate the performance of the ACS, which reacts to different sampling rates, and reaps the benefits of the adaptive sensor sampling rate as applied to diagnosing faulty bearing conditions based on vibration data collected via accelerometers. Five datasets, each representing a different machine state (normal, ball fault, inner fault, outer fault, and combination fault), are collected with different sampling rates for each machine operating speed. This sums up to a total of 1500 seconds' worth of sensor data; among them, 50% of the data are used for training the models and classifiers, and 50% of the data are used for testing purposes. The data are subsequently windowed for analysis.
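As an illustration of how such recordings can be organized, the fragment below cuts a continuous signal into non-overlapping one-second windows and performs a 50/50 train/test split; the function names and layout are assumptions, not the actual preprocessing code.

# Sketch: cut a continuous accelerometer recording into non-overlapping
# one-second windows and split them 50/50 for training and testing.
import numpy as np

def one_second_windows(signal, fs_hz):
    """Return an array of shape (n_windows, fs_hz) of 1-s windows."""
    n_windows = len(signal) // fs_hz
    return signal[:n_windows * fs_hz].reshape(n_windows, fs_hz)

def split_half(windows, labels, seed=0):
    """Use 50% of the windows for training and 50% for testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(windows))
    half = len(windows) // 2
    return (windows[idx[:half]], labels[idx[:half]],
            windows[idx[half:]], labels[idx[half:]])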

Fig. 5. Frequency domain features for bearing detection.

Feature Extraction: Vibration monitoring and analysis is a commonly used technique for bearing fault detection. There are two main approaches to the analysis of bearing vibration signatures: time-domain and frequency-domain analyses [21]. In time-domain analysis, four time-domain features are extracted from the raw vibration signal of length $L$, $x(n)$, $n = 1, \ldots, L$, to characterize each of the bearing faults. They are the peak value, kurtosis, root-mean-squared (RMS) value, and standard deviation, computed as in (3)–(6).

1) Peak value:
$x_{p} = \max_{1 \le n \le L} \left| x(n) \right|$   (3)

2) Root-mean-square (RMS) value:
$x_{RMS} = \sqrt{ \frac{1}{L} \sum_{n=1}^{L} x(n)^{2} }$   (4)

3) Standard deviation:
$\sigma = \sqrt{ \frac{1}{L} \sum_{n=1}^{L} \left( x(n) - \bar{x} \right)^{2} }$, where $\bar{x} = \frac{1}{L} \sum_{n=1}^{L} x(n)$   (5)

4) Kurtosis value:
$K = \frac{1}{L\,\sigma^{4}} \sum_{n=1}^{L} \left( x(n) - \bar{x} \right)^{4}$   (6)
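For reference, the four features of (3)–(6) can be computed from a windowed signal as in the following sketch (population, i.e., 1/L, estimators are assumed; the conventions in the original implementation may differ slightly).

# Sketch: the four time-domain features of (3)-(6) for one signal window.
import numpy as np

def time_domain_features(x):
    x = np.asarray(x, dtype=float)
    peak = np.max(np.abs(x))                           # (3) peak value
    rms = np.sqrt(np.mean(x ** 2))                     # (4) root-mean-square value
    sigma = np.std(x)                                  # (5) standard deviation
    kurt = np.mean((x - x.mean()) ** 4) / sigma ** 4   # (6) kurtosis
    return np.array([peak, rms, sigma, kurt])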

These four features are chosen as they have been widely used to characterize bearing faults in industrial machines [1].

The frequency-domain analysis approach, on the other hand, involves examining the vibration spectrum to discover the presence of defect frequencies in the vibration signal [21]. In this study, we calculate the band magnitude spectrum for each segment of the vibration signal, and use this as the set of features for learning the different machine states, as shown in Fig. 5. Specifically, the raw vibration signal is first normalized to the $[-1, 1]$ range using the formula

$\tilde{x}(n) = 2\, \frac{ x(n) - \min_{m} x(m) }{ \max_{m} x(m) - \min_{m} x(m) } - 1$   (7)

The Fast Fourier Transform (FFT) is then used to transform the normalized data from the time domain to the frequency domain. Finally, the input signal in the frequency domain is transformed into discrete band-magnitude features with bandwidths of 32 Hz, 50 Hz, and 100 Hz.
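The frequency-domain feature extraction can be sketched as follows; the per-band aggregation (mean magnitude per band) is an assumption, as the text only states that discrete band-magnitude features with 32 Hz, 50 Hz, or 100 Hz bandwidths are produced.

# Sketch: band-magnitude features from a 1-s vibration window.
# The per-band aggregation (mean magnitude) is an assumption.
import numpy as np

def band_magnitude_features(x, fs_hz, band_hz=32):
    x = np.asarray(x, dtype=float)
    # (7) min-max normalization to the [-1, 1] range
    x = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0
    spectrum = np.abs(np.fft.rfft(x))                  # magnitude spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs_hz)
    edges = np.arange(0, fs_hz / 2 + band_hz, band_hz)
    # assumes a window of about one second, so every band contains FFT bins
    features = [spectrum[(freqs >= lo) & (freqs < hi)].mean()
                for lo, hi in zip(edges[:-1], edges[1:])]
    return np.array(features)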

Classification Models: For the data modeling and monitoring (classification) process, three classifiers are employed: MLP, SVM, and KNN. Each classifier is applied to both the time domain and the frequency domain to build different classification models for bearing fault detection. This results in six bearing detection models, namely, MLP for the time domain (MLP_t), MLP for the frequency domain (MLP_f), SVM for the time domain (SVM_t), SVM for the frequency domain (SVM_f), KNN for the time domain (KNN_t), and KNN for the frequency domain (KNN_f).
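A minimal sketch of how the six models could be assembled with off-the-shelf implementations is given below (scikit-learn is used for illustration, and its SVC is backed by LIBSVM; the hyper-parameters shown are assumptions).

# Sketch: the six bearing-detection models (3 classifiers x 2 feature domains).
# scikit-learn is used for illustration; hyper-parameters are assumptions.
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

def make_models():
    base = {
        "MLP": lambda: MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000),
        "SVM": lambda: SVC(kernel="rbf"),        # LIBSVM-backed multi-class SVM
        "KNN": lambda: KNeighborsClassifier(n_neighbors=5),
    }
    # One independent model per (classifier, feature-domain) combination.
    return {f"{name}_{dom}": factory()
            for name, factory in base.items()
            for dom in ("t", "f")}               # t: time domain, f: frequency domain

# Usage sketch: fit each model on its own feature representation, e.g.
# models = make_models()
# models["SVM_f"].fit(X_train_freq, y_train)
# models["MLP_t"].fit(X_train_time, y_train)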

2) Evaluation Metric: In this study, we use the F-measure to evaluate the performance measures of the different classification models. The F-measure is the harmonic mean of precision (denoted as $p$) and recall (denoted as $r$), and it is defined as

$F = \frac{2\,p\,r}{p + r}$   (8)

In other words, the F-measure reflects an average effect of both precision $p$ and recall $r$. The F-measure is large only when both precision and recall are good. This is suitable for our purpose of accurately classifying each particular machine state. Having either too small a precision or too small a recall is unacceptable, and would be reflected by a low F-measure.

Note that in this paper we compute the F-measure for each class, or machine state. We report both the F-measure for each class, as well as the average F-measure over all the classes and states, which represents the overall performance measure.
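The per-class F-measures and their average can be computed as sketched below; the use of scikit-learn's f1_score is a convenience assumption rather than the evaluation code used in the paper.

# Sketch: per-class F-measure and the overall (macro) average, as in (8).
from sklearn.metrics import f1_score

def report_f_measures(y_true, y_pred):
    per_class = f1_score(y_true, y_pred, average=None)   # one F-measure per state
    overall = per_class.mean()                            # average over all states
    return per_class, overall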

B. Comprehensive Studies of Bearing Fault Detection

In this subsection, we perform comprehensive experiments to evaluate the performance measures of four different types of classification models: binary classification, multi-class classification, the ACS system without ensemble-based learning, and the ensemble based ACS system for intelligent sensing machine fault diagnostics. To provide a holistic picture of these techniques, we build these models using different sampling rates (low frequency 256 Hz, medium frequency 1024 Hz, and high frequency 5120 Hz), different machine operating speeds (800 RPM, 1780 RPM, and 3600 RPM), and different feature representation methods (both time domain and frequency domain feature extraction). In each of the following test cases, the classification models are trained and evaluated with data collected at their respective sampling frequencies.

1) Binary Classification: We investigate the performance measures of the six trained models (MLP_t, MLP_f, SVM_t, SVM_f, KNN_t, and KNN_f) when they are employed to detect a machine bearing's binary condition, i.e., whether the machine is currently operating under a healthy condition, or whether a fault or a combination of faults has happened. Therefore, for the purpose of this study, all the faulty bearing states mentioned above (i.e., outer, inner, ball, and combination faults) are combined into a single "Defective" or "Fault" class, while the normal state corresponds to the "Normal" class.

The detection and classification performance measures of the three time domain models (MLP_t, SVM_t, and KNN_t) for three machine operating speeds (800 RPM, 1780 RPM, and 3600 RPM) and three sampling rates (low, medium, and high) are shown in Table I. We observe that both MLP_t and SVM_t are able to achieve 100% of the F-measure for all operating speeds with all sampling rates. This result indicates that the healthy bearing condition is well differentiated from general defective conditions using time domain features. The KNN based model KNN_t, as expected, performed worse than the other two strong classifiers MLP_t and SVM_t.

TABLE I
BINARY CLASSIFICATION PERFORMANCE MEASURES OF TIME DOMAIN MODELS

TABLE II
BINARY CLASSIFICATION PERFORMANCE MEASURES OF FREQUENCY DOMAIN MODELS

The detection and classification performance measures of the other three models, the frequency domain models (MLP_f, SVM_f, and KNN_f), for three machine operating speeds (800, 1780, and 3600 RPM) and three sampling rates (low, medium, and high) are shown in Table II. The three frequency domain models obtain good results using high sampling rates, but give quite poor results when we apply low sampling rates. Such results are not surprising, as some of the bearing faults can only be discovered in signals with high sampling rates.

2) Multi-Class Classification: Binary classification can only distinguish between the machine's normal and faulty states. More often than not, in practice, we may need to identify the specific types of faults. As such, we investigate the performance measures of the six models and classifiers in recognizing the types of bearing faults. In particular, we have five classes in our classification: normal, outer fault, inner fault, ball fault, and combination fault. In this paper, we build the multi-class classifiers for all three machine operating speeds and all three sampling rates. Note that the multi-class SVM is implemented using the library for support vector machines (LIBSVM) [26].

The evaluation results are shown in Tables III and IV for the time domain and frequency domain modes, respectively. Compared to the classification models of the time domain, the models of the frequency domain with high sampling rates produce consistently better classification results across the three machine speeds. In particular, the SVM for the frequency domain model (SVM_f) with the high sampling rate (5120 Hz) achieves average F-measures of 99%, 99%, and 100% for the three machine speeds 800 rpm, 1780 rpm, and 3600 rpm, respectively, which are significantly better than the results with low sampling rates, indicating that a high sampling rate yields better diagnostics performance. We also observe that both MLP_f and KNN_f (MLP and KNN with frequency domain features) are able to achieve satisfactory results using the high sampling rate (5120 Hz), although they are slightly worse than the SVM_f model.

3) ACS System Without Ensemble-Based Learning: In this case study, we evaluate the performance of our ACS system without ensemble-based learning, where the classification system adapts the sampling frequencies of the sensors according to the observed machine condition. When the sampling rate changes, the ACS will consequently employ the corresponding classifier that is best suited to that sampling frequency to perform the next classification.

In our test case, when the machine is in the normal state, the sampling rate of the sensors is set to 256 Hz. The low-sampling-rate time-domain classifier is trained to differentiate between the normal and faulty machine states. This classifier is the one that was trained in the binary classification study, where it has an average of 100% in terms of the F-measure (see Table I). When the machine condition has been identified to be in a faulty state, the ACS informs the sensor node to switch to the higher sampling rate of 5120 Hz. Subsequently, the next monitoring task is executed by the appropriate model, in this case the frequency-domain model trained under a 5120 Hz data sampling rate.

TABLE III
MULTI-CLASS CLASSIFICATION RESULTS OF TIME DOMAIN MODELS

TABLE IV
MULTI-CLASS CLASSIFICATION RESULTS OF FREQUENCY DOMAIN MODELS

From Table V, we can see that our ACS achieves very good results, with 100% F-measures for all machine states at a machine speed of 3600 rpm. In fact, only four numbers in Table V are not 100%: for the outer fault, we have 97.4% for both machine speeds 800 rpm and 1780 rpm; for the ball and combination faults, we have 97.6% for machine speeds 800 rpm and 1780 rpm, respectively. The overall average results are 99% for machine speeds 800 rpm and 1780 rpm, and 100% for machine speed 3600 rpm.

TABLE V
ADAPTIVE CLASSIFICATION RESULTS WITHOUT ENSEMBLE-BASED LEARNING

TABLE VI
CLASSIFICATION RESULTS OF THE PROPOSED ENSEMBLE BASED CLASSIFIER

Moreover, by applying the ACS system, we are able to address the issues of 1) extending the usability of battery-powered sensors via better power efficiency, 2) reducing the network traffic load, 3) allowing classifiers to maintain performance while the sampling frequency varies, and 4) allowing for a better, higher-level control system over the change in sampling rate via the classifiers.

4) Ensemble Based ACS System: Finally, we study whether our proposed ensemble based ACS system can enhance the classification results, especially when relatively few training examples are available. For this purpose, we varied the percentages of training and testing data, shown in the first two columns of Table VI. We list the results of the six individual classifiers and our proposed ensemble based method (denoted Ensemble in Table VI) for the different machine speeds (here we fixed the medium sampling rate of 1024 Hz in our experiments). We observe that our proposed ensemble based method produces consistently better classification results across the three machine speeds compared to those of the individual classifiers, regardless of the percentages of training and test data. When we have relatively less training data, our proposed method achieves bigger improvements, indicating that we can effectively minimize the potential bias and risk of individual classifiers, and thus reduce the expected errors. For example, in the challenging situation with 5% training data and 95% test data, we achieve 6.70%, 9.06%, and 10.89% better results than the second best models for machine speeds 800 rpm, 1780 rpm, and 3600 rpm, respectively.

In summary, the proposed ensemble based ACS model outperforms the individual methods at various training and test data percentages in terms of prediction accuracy, robustness, and stability. As pointed out in [25], the performance measure of an ensemble system is not always better than that of the best individual classifier. It mainly depends on two key components of the ensemble system: the diversity of the individual classifiers, and the strategy used to combine their outputs. In this paper, the superiority of our proposed ensemble based ACS model comes from its six diverse classifiers, as well as our effective integration mechanism, which can reduce the classification errors by eliminating the potential bias and risk of individual classifiers.

Recall that we have employed $k$-fold cross-validation for estimating the confidence scores of each individual classifier for the different states. Given that we could have very limited training samples when the training percentage is small (e.g., 5%), we choose a small $k$ to avoid tiny partitions in our implementation.

V. CONCLUSION

The deployment and usage of sensors for system health monitoring are common and widespread in modern manufacturing. Their main objective is to provide a non-intrusive method for remote monitoring of machine performance. Hence, the reduction of power consumption in sensor-based machine health monitoring is undoubtedly of exceptional benefit to the manufacturing domain.

With the decline in the cost of, and technical barriers to, the installation and setup of WSNs, more sensors are being added and deployed onto manufacturing floors. As the number of deployed sensors increases, there is a need to maximize the usability lifespan of battery-powered sensors, and to reduce the amount of network bandwidth consumed by the transmission of sensor data.

The recent developments in the sensor community to address the above-mentioned issues have generally moved in two main directions. The first direction is to identify and create new algorithms that allow for a reduction in the sensors' sampling rate while maintaining the resolution and monitoring capability of the system. The second approach attempts to establish the capability of the sensor nodes to adapt the sampling frequencies of their attached sensors according to the requirements.

In light of such recent developments, we have developed the ensemble based ACS approach to take advantage of these latest sensor capabilities. We have developed a system that integrates various models and classifiers trained under different scenarios, and adapts the sampling frequencies of the sensors according to the observed machine status. We have also demonstrated the use of such an ACS configuration in monitoring the health status of a machine. The results have shown that such an ACS configuration is able to maintain the accuracy of its health monitoring effort while at the same time reducing the sensors' data rate, as compared to when a constant high sampling rate is employed for the monitoring task.

As part of our future work, we will investigate the optimal range of sensor sampling frequencies for which models and classifiers have to be trained. It would be highly impractical to require models to be trained at all possible sampling frequencies. This effort will allow us to avoid overloading the ACS with too many models and classifiers.

REFERENCES

[1] K. L. Tew, S. D. Teddy, P. Y. Watt, and X.-L. Li, "A systematic approach towards manufacturing health management," in Prognostics and System Health Management Conference (PHM-Shenzhen), May 24–25, 2011, pp. 1–6.

[2] G. Niu, D. Anand, and M. Pecht, "Prognostics and health management for energetic material systems," in Prognostics and Health Management Conference (PHM '10), January 12–14, 2010, pp. 1–7.

[3] S. Massam and S. McQuillan, "Verification of PHM capabilities: A joint customer/industrial perspective," in IEEE Aerospace Conference Proceedings, 2002, vol. 6, pp. 2799–2813.

[4] C. Phua, V. S.-F. Foo, J. Biswas, A. Tolstikov, A.-P.-W. Aung, J. Maniyeri, M.-H. That, and A. K.-W. Chu, "2-layer erroneous-plan recognition for dementia patients in smart homes," in 11th International Conference on e-Health Networking, Applications and Services (Healthcom 2009), December 16–18, 2009, pp. 21–28.

[5] K. Sim, G.-E. Yap, C. Phua, J. Biswas, A.-P.-W. Aung, A. Tolstikov, W. Huang, and P. Yap, "Improving the accuracy of erroneous-plan recognition system for activities of daily living," in 12th IEEE International Conference on e-Health Networking Applications and Services (Healthcom 2010), July 1–3, 2010, pp. 28–35.

[6] S. K. Sahoo, W. Lu, S. D. Teddy, D. Kim, and M. Feng, "Detection of atrial fibrillation from non-episodic ECG data: A review of methods," in Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2011.

[7] M. Feng, Z. Zhang, F. Zhang, Y. Ge, L. Y. Loy, K. Vellaisamy, W. Guo, P. L. Chin, N. K. K. King, B. T. Ang, and C. Guan, "iSyNCC: An intelligent system for patient monitoring & clinical decision support in neuro-critical-care," in Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2011.

[8] M. V. Ramesh and P. Ushakumari, "Threshold based data aggregation algorithm to detect rainfall induced landslides," in Proceedings of the 2008 International Conference on Wireless Networks (ICWN), 2008, vol. 1, pp. 255–261.

[9] C. Alippi, G. Anastasi, C. Galperti, F. Mancini, and M. Roveri, "Adaptive sampling for energy conservation in wireless sensor networks for snow monitoring applications," in IEEE International Conference on Mobile Adhoc and Sensor Systems (MASS 2007), October 2007, pp. 1–6.

[10] D. Jiang and C. Liu, "Machine condition classification using deterioration feature extraction and anomaly determination," IEEE Transactions on Reliability, vol. 60, no. 1, pp. 41–48, March 2011.

[11] Y. Z. Rosunally, S. Stoyanov, C. Bailey, P. Mason, S. Campbell, G. Monger, and I. Bell, "Fusion approach for prognostics framework of heritage structure," IEEE Transactions on Reliability, vol. 60, no. 1, pp. 3–13, March 2011.

[12] Y. Zhang, G. W. Gantt, M. J. Rychlinski, R. M. Edwards, J. J. Correia, and C. E. Wolf, "Connected vehicle diagnostics and prognostics, concept, and initial practice," IEEE Transactions on Reliability, vol. 58, no. 1, pp. 286–294, 2009.

[13] T. J. Sheskin, "Fault diagnosis with imperfect tests," IEEE Transactions on Reliability, vol. R-30, no. 2, pp. 156–160, 1981.

[14] P. Yip, "Estimating the number of errors in a system using a martingale approach," IEEE Transactions on Reliability, vol. 44, no. 2, pp. 322–326, 1995.

[15] W. R. Dieter, S. Datta, and K. K. Wong, "Power reduction by varying sampling rate," in Proceedings of the 2005 International Symposium on Low Power Electronics and Design (ISLPED '05), August 8–10, 2005, pp. 227–232.

[16] Y. Imaoka and Y. Sanada, "Experimental investigation of sampling rate selection with fractional sampling for IEEE802.11b WLAN system," in 2009 IEEE 70th Vehicular Technology Conference Fall (VTC 2009-Fall), September 20–23, 2009, pp. 1–5.

[17] I. Crk, F. Albinali, C. Gniady, and J. Hartman, "Understanding energy consumption of sensor enabled applications on mobile phones," in Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2009), September 3–6, 2009, pp. 6885–6888.

[18] M. Qiu and E. H. Sha, "Energy-aware online algorithm to satisfy sampling rates with guaranteed probability for sensor applications," in IEEE International Conference on High Performance Computing and Communications, 2007, pp. 156–167.

[19] T. Kurp, R. X. Gao, and S. Sah, "An adaptive sampling scheme for improved energy utilization in wireless sensor networks," in 2010 IEEE Instrumentation and Measurement Technology Conference (I2MTC), May 3–6, 2010, pp. 93–98.

[20] R. Rieger and J. T. Taylor, "An adaptive sampling system for sensor nodes in body area networks," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 17, no. 2, pp. 183–189, April 2009.

[21] N. Tandon and A. Choudhury, "A review of vibration and acoustic measurement methods for the detection of defects in rolling element bearings," Tribology International, vol. 32, pp. 469–480, 1999.

[22] J. Han, M. Kamber, and J. Pei, Data Mining: Concepts and Techniques, 3rd ed. New York: Morgan Kaufmann, 2011.

[23] M. N. Nguyen, X.-L. Li, and S.-K. Ng, "Positive unlabeled learning for time series classification," in International Joint Conference on Artificial Intelligence, 2011, pp. 1421–1426.

[24] H. Cao, X.-L. Li, Y. K. Woon, and S.-K. Ng, "SPO: Structure preserving oversampling for imbalanced time series classification," in IEEE International Conference on Data Mining, 2011.

[25] R. Polikar, "Ensemble based systems in decision making," IEEE Circuits and Systems Magazine, vol. 6, pp. 21–45, 2006.

[26] C.-C. Chang and C.-J. Lin, "LIBSVM: A library for support vector machines," Computer and Information Science, vol. 104, pp. 1–30, 2001.

Minh Nhut Nguyen received B.S. and M.S. degrees in computer engineering from Ho Chi Minh City University of Technology, Vietnam, in 2001 and 2005, respectively, and the Ph.D. degree in computer science from Nanyang Technological University, Singapore, in 2008. He is currently a Research Scientist at the Institute for Infocomm Research, A*STAR, Singapore. His research interests include machine learning for time series and stream data, pattern recognition, and neural networks.

Chunyu Bao received the Ph.D. degree in computer engineering from the National University of Singapore, Singapore, in 2009. He is currently a Research Scientist in the Data Mining Department, Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR). His current research interests include artificial neural networks, emotion synthesis on 3D face models, and machine learning on large scale data.

Kar Leong Tew received his Bachelor of Science degree (First Class Honors) from the University of Exeter, UK, in 2005. He is currently working as a Senior Researcher at SAP AG in the domain of Business Network Orchestration. His research interests are in the domains of data mining, network analysis, and information visualization.

Sintiani Dewi Teddy (S'03–M'06) received both her B.Eng. (First Class Honors) and Ph.D. degrees in computer engineering from Nanyang Technological University, Singapore, in 2003 and 2008, respectively. She is currently a Principal Investigator with the Data Mining Department of the Institute for Infocomm Research (A*STAR), Singapore. Her current research interests include the cerebellum and its computational models, artificial neural networks, the study of brain-inspired learning memory systems, computational finance, and autonomous control of bio-physiological processes.

Xiao-Li Li (M'08) is currently a Principal Investigator in the Data Mining Department at the Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore. He also holds an appointment as adjunct professor in the School of Computer Engineering, Nanyang Technological University. His research interests include data mining, machine learning, and bioinformatics. He has been serving as a member of the technical program committees of the leading data mining and machine learning conferences KDD, ICDM, CIKM, SDM, AAAI, and PKDD/ECML, etc. He is the Editor-in-Chief of the International Journal of Knowledge Discovery in Bioinformatics (IJKDB). In 2005, he received the best paper award at the 16th International Conference on Genome Informatics (GIW 2005). In 2008, he received the best poster award at the 12th Annual International Conference on Research in Computational Molecular Biology (RECOMB 2008). In 2011, he received the Best Paper Runner-Up Award at the 16th International Conference on Database Systems for Advanced Applications (DASFAA 2011). To learn more about Dr. Xiao-Li Li, please visit his Web page: http://www1.i2r.a-star.edu.sg/~xlli/.