Service-oriented architecture on the Grid for integrated fault diagnostics

CONCURRENCY AND COMPUTATION: PRACTICE AND EXPERIENCE
Concurrency Computat.: Pract. Exper. 2007; 19:223–234
Published online 11 May 2006 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/cpe.1047

X. Ren1,∗,†, M. Ong1, G. Allan1, V. Kadirkamanathan1, H. A. Thompson1,2 and P. J. Fleming1,2

1 Department of Automatic Control and Systems Engineering, University of Sheffield, Sheffield S1 3JD, U.K.
2 Rolls-Royce University Technology Centre in Control and Systems Engineering, University of Sheffield, Sheffield S1 3JD, U.K.

SUMMARY

For industrial fault diagnostics, many model-based fault diagnosis approaches have been proposed so far and some of them have been put into practice. However, for modern complex processes, owing to the variable nature of faults and model uncertainty, no single method can diagnose all faults and meet different contradictory criteria. In this paper, the importance of integration of different fault detection and isolation schemes in a generic problem-solving environment is emphasized. A service-oriented architecture for the integration is proposed, based on Grid technologies. As an engineering implementation, a decision support system for the gas turbine engine fault diagnosis is presented and some deployed services are discussed. Copyright © 2006 John Wiley & Sons, Ltd.

Received 9 January 2005; Revised 8 June 2005; Accepted 24 August 2005

KEY WORDS: service-oriented architecture; fault diagnostics; the Grid

∗Correspondence to: Xiaoxu Ren, Diamond Light Source Ltd, Diamond House, Rutherford Appleton Laboratory, Chilton, Didcot, Oxfordshire OX11 0QX, U.K.
†E-mail: [email protected]

Contract/grant sponsor: U.K. EPSRC; contract/grant number: GR/R67668/01
Contract/grant sponsor: Rolls-Royce PLC
Contract/grant sponsor: Data Systems & Solutions

1. INTRODUCTION

In the aviation industry, a great deal of effort has been made to reduce the number of in-flight engine shutdowns, aborted take-offs and flight delays by using engine fault diagnosis and health monitoring technologies. Among these technologies, model-based approaches are promising modern approaches for aero engine fault detection and isolation (FDI). Model-based FDI is based on the idea that measurements from dissimilar sensors are functionally related because they are all derived from the same state of a system. Any violation of these relationships indicates the occurrence of faults. Although the model-based approach is commonly accepted as promising for fault diagnosis, there is no single widely accepted generic solution, owing to model uncertainties, demanding computational requirements and the unknown, complicated nature of fault diversity. Researchers working in this area have proposed different approaches using different algorithms under the name ‘model-based’ [1–3]. Each approach, however, has its own focus, and none is universal: none is suitable or available for all fault types. To overcome these shortfalls and exploit the advantages of different approaches, the integration of different methods, or hybrid schemes, is strongly advocated by experts [4,5]. The system architecture of such an integrative approach remains one major concern.

Modern aero engines are being instrumented with engine monitoring units possessing significantly greater capability to record and analyse data. Each engine on a civil airliner is capable of generating at least 1 gigabyte of data per flight. As a result, in future, one can envisage a whole fleet of aircraft transmitting terabytes of engine monitoring data every day for analysis. Thus, the challenge is not only to provide a set of fault diagnostic tools for FDI and high-level maintenance decision support, but also to provide a suitable problem-solving infrastructure to link these tools, manage the large amounts of data and perform the high-performance computing to support these fault diagnostic algorithms and decision making.

With the latest development in Internet and Intranet technologies, especially with the development of Grid computing, it is now possible to provide different algorithms as individual Grid services and combine these services dynamically in a workflow, together with users and resources, to generate a ‘virtual organization’ for fault diagnostic purposes. In this paper, the importance of integration of different FDI schemes in a generic problem-solving environment is emphasized. A service-oriented architecture (SOA) on the Grid for integration is proposed. Different fault diagnosis algorithms and analysis tools are provided as Grid services in this framework. Through configuration and workflow management, these services can be dynamically invoked to form a flexible distributed decision support environment for aero engine fleet fault diagnosis and maintenance.

This paper is organized as follows. Section 2 presents an overview of model-based fault diagnosis and summarizes and compares the different approaches developed for it. In Section 3, the concept of Grid computing is introduced. A SOA on the Grid for fault diagnosis is proposed and the advantages are highlighted. In Section 4, the Distributed Aircraft Maintenance Environment (DAME) project is introduced and some experimental work on the FDI integration is presented. The developed gas turbine engine simulation services for fault detection and case-based reasoning services for high-level decision support are detailed.

2. MODEL-BASED FAULT DIAGNOSIS

Model-based methods are becoming a widely accepted approach for solving fault diagnosis problems. Model-based FDI focuses on dynamic consistency (parity) relations and parameter estimation. The basic procedure for using model-based FDI is firstly to generate analytic symptoms by using analytical knowledge about a dynamic process based on observation. Then the generated analytic symptoms as well as heuristic symptoms are analysed at the fault diagnosis stage to find the type, size and location of a fault as well as its time of detection.

In general, model-based FDI is a two-step procedure: residual generation and residual evaluation. Residual generation is a process in which the input and output of a dynamic system are monitored and manipulated to generate a signal or vector, the so-called residual. The residual should be normally zero or close to zero when no fault is present but is distinguishably different from zero when a fault occurs. Residual generation is thus a procedure for extracting fault symptoms from the system, with the fault symptom represented by the residual signal. The residual should ideally carry only fault information. To ensure reliable FDI, the loss of fault information in residual generation should be as small as possible.

Residual evaluation is the analysis of the residual to examine the likelihood of faults. A decision is made based on the knowledge about the process and the symptoms. If a fault has occurred, more analysis should be made to isolate or even identify the fault. A decision process may consist of a simple threshold test on instantaneous values or moving averages of residuals, or it may consist of methods of more sophisticated decision theories.
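The simple threshold test on moving averages of residuals mentioned above can be sketched as follows. This is a minimal illustration, not the evaluation logic used in the paper: the function name, the window length, the threshold and the synthetic signals are all assumptions chosen for the example.

```python
from collections import deque


def moving_average_alarm(residuals, window=5, threshold=0.5):
    """Return the index of the first sample at which the moving average of
    |residual| exceeds the threshold, or None if no alarm is raised."""
    recent = deque(maxlen=window)  # holds the last `window` absolute residuals
    for k, r in enumerate(residuals):
        recent.append(abs(r))
        if len(recent) == window and sum(recent) / window > threshold:
            return k
    return None


# Hypothetical data (assumed values): model estimate y_hat and measured output y.
y_hat = [1.0] * 20
y = [1.0 + 0.05 * (-1) ** k for k in range(10)] + [2.0] * 10  # bias fault from k = 10

# Residual generation: difference between measured and estimated output.
residuals = [yk - yhk for yk, yhk in zip(y, y_hat)]

# Residual evaluation: threshold test on the moving average.
alarm_index = moving_average_alarm(residuals, window=5, threshold=0.5)
```

Averaging over a window suppresses the small alternating disturbance before the fault, at the price of a short detection delay after the fault appears; this is the promptness versus false-alarm trade-off in its simplest form.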

Depending on how the residuals are generated and evaluated, model-based FDI approaches fall roughly into four groups.

• Observer approach or parity relations approach. The underlying idea of an observer approach is to estimate the system outputs from the available input and output of that system. The residual will then be a weighted difference between the estimated and actual outputs. In a similar way, a parity relations approach is based either on a technique of direct redundancy, making use of static algebraic relations between sensor and actuator signals, or alternatively upon temporal redundancy, when dynamic relations between inputs and outputs are used.

• Parameter estimation approach. This approach makes use of the fact that component faults of a dynamic system can be thought of as reflected in the physical parameters of a system, for example, friction or mass velocity resistance. A fault can then be detected through parameter estimation or model identification.

• Statistical approaches. Mainly used to improve the fault detection capability, statistical approaches such as the generalized likelihood ratio (GLR) test can be used to find changes of a residual signal more quickly and accurately. Principal component analysis (PCA) and Fisher discriminant analysis (FDA) are among the most widely used techniques for dimensionality reduction and pattern classification in FDI.

• Qualitative approach. The qualitative approach is based on the concept of a qualitative model that, unlike its quantitative counterpart, only requires declarative (heuristic) information. An expert system, for example, is one of the qualitative approaches that use if–then rules to represent the human knowledge of the relation between the normal/abnormal system behaviour and the causes of faults. The fault tree approach traces the evolution of the fault through the dynamic system described by a fault tree, event trees or causal networks. There are also qualitative model-based approaches that use a qualitative model derived directly from physical laws of the system under consideration. Bond graphs and Petri nets, for example, can be used to assist this modelling purpose.
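As an illustration of the parameter estimation approach in the list above, the sketch below fits a first-order discrete-time model by least squares and flags a fault when an estimated parameter drifts from its nominal value. The model structure, the nominal values, the tolerance and all names are assumptions made for this example; they do not come from the paper.

```python
def simulate(a, b, u, y0=0.0):
    """Simulate y[k+1] = a*y[k] + b*u[k]; returns a list as long as u."""
    y = [y0]
    for uk in u[:-1]:
        y.append(a * y[-1] + b * uk)
    return y


def estimate_first_order(u, y):
    """Least-squares estimate of (a, b) in y[k+1] = a*y[k] + b*u[k],
    obtained by solving the 2x2 normal equations directly."""
    syy = syu = suu = sy1y = sy1u = 0.0
    for k in range(len(y) - 1):
        syy += y[k] * y[k]
        syu += y[k] * u[k]
        suu += u[k] * u[k]
        sy1y += y[k + 1] * y[k]
        sy1u += y[k + 1] * u[k]
    det = syy * suu - syu * syu
    if abs(det) < 1e-12:
        raise ValueError("input/output data are not sufficiently exciting")
    a = (sy1y * suu - sy1u * syu) / det
    b = (sy1u * syy - sy1y * syu) / det
    return a, b


# Assumed nominal (healthy) parameters and detection tolerance.
A_NOMINAL, B_NOMINAL = 0.9, 0.5
TOLERANCE = 0.05

# Square-wave input so that both parameters can be identified.
u = [1.0 if (k // 5) % 2 == 0 else -1.0 for k in range(50)]

a_healthy, b_healthy = estimate_first_order(u, simulate(0.9, 0.5, u))
a_faulty, b_faulty = estimate_first_order(u, simulate(0.7, 0.5, u))  # component fault changes a

healthy_flag = abs(a_healthy - A_NOMINAL) > TOLERANCE
faulty_flag = abs(a_faulty - A_NOMINAL) > TOLERANCE
```

Because the estimate needs a batch of data before it settles, this kind of scheme reacts slowly to abrupt faults, which matches the assessment of the parameter estimation approach in Table I.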

In general, a fault diagnosis technique should be able to complete the following two main tasks:

• detect and isolate different faults occurring in a dynamic system, which include sensor faults, actuator faults and internal faults of a controlled system;

• detect and isolate incipient faults as well as abrupt faults.

Table I. A comparison of different model-based approaches for FDI.

| FDI scheme | Suitability | Promptness and sensitivity | Design and implementation | Modelling and robustness |
|---|---|---|---|---|
| Observer and parity relation approaches | More suitable for fault detection and isolation in sensor and actuator. | Reaction to both abrupt and incipient fault is fast. Sensitive to sensor fault. | Design procedure is systematic and simple. Easy to implement. | Nonlinear observer is difficult to design. Mature techniques are available for robust observer. |
| Parameter estimation approach | Suitable for fault detection and isolation of system components. | Reaction to abrupt faults is slow. More suitable for incipient faults. | Design procedure is systematic but not simple. Difficult to implement. | Nonlinear system parameter estimation is possible to handle but robustness depends on methods used. |
| Statistical approach | Suitable for both fault detection and isolation. | Depends on methods used; reaction to abrupt faults could be slow. | Design procedure is systematic. Easy to implement. | Kernel density estimation is possible to handle. Very robust. |
| Qualitative approach | Suitable for fault isolation. | Reaction to incipient faults is usually slow. | Design procedure is difficult. Easy to implement. | Suitable symbolic model is not easy to obtain. Very robust. |

A fault diagnosis scheme should also consider the following criteria for better fault diagnosis performance:

• promptness of fault detection;

• sensitivity to incipient faults;

• false alarm rate and missed fault detection;

• incorrect fault identification.
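Some of the criteria above can be quantified once a detector has been run on data with a known fault onset. The sketch below computes a false alarm rate, a missed detection rate and a detection delay from a per-sample alarm sequence. The function name, the definitions of the rates and the example sequence are assumptions for illustration only.

```python
def detection_metrics(alarms, fault_onset):
    """Score a per-sample alarm sequence against a known fault onset index.

    Returns (false_alarm_rate, missed_detection_rate, detection_delay), where
    the delay is counted in samples after the onset, or None if never detected.
    """
    healthy = alarms[:fault_onset]   # samples before the fault starts
    faulty = alarms[fault_onset:]    # samples from the fault onset onwards
    false_alarm_rate = sum(healthy) / len(healthy) if healthy else 0.0
    missed_rate = faulty.count(False) / len(faulty) if faulty else 0.0
    delay = next((k for k, raised in enumerate(faulty) if raised), None)
    return false_alarm_rate, missed_rate, delay


# Hypothetical alarm sequence: one spurious alarm at k = 8, fault starts at k = 10,
# and the detector first reacts two samples later.
alarms = [False] * 8 + [True, False] + [False, False, True, True, True]
false_alarm_rate, missed_rate, delay = detection_metrics(alarms, fault_onset=10)
```

Lowering a detection threshold typically shortens the delay but raises the false alarm rate, which is why these criteria are described in this paper as contradictory.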

Although various model-based approaches are designed to solve fault diagnosis problems, each approach has its own focus and none of the above-mentioned methods is a generic solution for fault diagnosis of complex modern systems. This is due to the complicated nature of the monitored system and faults, the applicability of different modelling approaches or the insufficient knowledge about the monitored process. Table I summarizes a comparison of selected model-based approaches for fault diagnosis. From this comparison, it is clear that none of these approaches can fully satisfy all requirements of modern fault diagnosis such as promptness, accuracy and sensitivity to faults. It is commonly agreed that hybrid schemes would provide better solutions for future complex system fault diagnostics [6,7]. Thus an evaluation of different FDI approaches in varied operation/fault scenarios and a suitable mechanism for integration of assorted FDI approaches in an open computational environment are crucial. In addition, it is important to consider how an integrative approach should be used to provide the most accurate diagnosis and better maintenance suggestions in the decision-making process.

Another restriction on the use of some model-based FDI techniques is their inherent demand for intensive computing power for modelling and simulation. These computational requirements also limit their application to large-scale complex systems. Grid technology provides cheaper and easier access to the required high-performance, high-throughput computing power to overcome these restrictions. The distributed computing structure and the organized resource sharing features provide excellent opportunities to integrate different FDI schemes to obtain better fault diagnostic performance.

3. SOA ON THE GRID FOR FDI INTEGRATION

A SOA for distributed problem solving is not a new concept [8]. Some early SOAs used the Distributed Component Object Model (DCOM) [9] or Object Request Brokers (ORBs) based on the Common Object Request Broker Architecture (CORBA) specification [10]. Over the last few decades, software structure has been slowly decoupled. The introduction of the client/server structure removed the database from the fat client. The thin client decoupled the user interface from the business logic. SOAs were proposed to decouple the integration logic from the business logic. Basically, services and SOAs are about designing and building systems using heterogeneous network addressable software components. A SOA is thus an architecture made up of components and interconnections that stresses interoperability and location transparency. With the introduction of the Web services and Grid services, there has been a renewed interest in building ‘virtual organizations’ based on the SOA for distributed problem solving [11,12].

Web services are the most likely connection technology for SOAs. Web services essentially use XML to create a robust connection. At the core of the Web services model is the notion of a service, which is defined as a collection of operations that carry out some type of task. Within the context of Web services, there are three components, namely service providers, service requesters and service brokers. A service is deployed on the Web by the service provider. The functions provided by a given Web service are described using the Web Services Description Language (WSDL) [13] and published on the Web. A service broker helps the service provider and service requester find each other through a UDDI (Universal Description, Discovery, and Integration) [14] based registry. A service requester uses the standard Application Programming Interface (API) to ask the service broker about the services it needs and then uses SOAP (Simple Object Access Protocol) [15] to invoke the remote service provider side applications.
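The publish, find and invoke interaction between the three roles can be mimicked in a few lines. The sketch below is an in-memory toy, not WSDL, UDDI or SOAP: the `ServiceBroker` class, its method names and the two registered services are hypothetical, and a local Python callable stands in for a remote endpoint.

```python
class ServiceBroker:
    """Toy stand-in for a registry; it only mimics the publish/find/bind roles."""

    def __init__(self):
        self._registry = {}

    def publish(self, name, description, endpoint):
        # A provider registers a description and an endpoint (here: a callable).
        self._registry[name] = {"description": description, "endpoint": endpoint}

    def find(self, keyword):
        # A requester searches the published descriptions by keyword.
        return [name for name, entry in self._registry.items()
                if keyword.lower() in entry["description"].lower()]

    def bind(self, name):
        # The requester obtains the endpoint and then talks to the provider directly.
        return self._registry[name]["endpoint"]


def threshold_detector(residuals, threshold=0.5):
    """Hypothetical provider-side operation: True if any residual is too large."""
    return any(abs(r) > threshold for r in residuals)


broker = ServiceBroker()
broker.publish("ThresholdDetector", "Fault detection by residual threshold test",
               threshold_detector)
broker.publish("CaseRetriever", "Case-based reasoning lookup", lambda query: [])

matches = broker.find("fault detection")        # requester asks the broker
service = broker.bind(matches[0])               # requester binds to the provider
fault_present = service([0.1, -0.2, 0.9])       # requester invokes the operation
```

The point of the pattern is that the requester depends only on the published description, so a provider can be replaced or relocated without changing the requester.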

The Grid is a name that was first coined in the mid-1990s to describe a vision for a distributed computing infrastructure for advanced science projects. As explained by Foster and Kesselman [16], the Grid should enable ‘resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations’. With the first-generation Grid involving ‘Metacomputers’ and second-generation Grid focused on middleware and communication protocols, it is now claimed that the third-generation Grid is combining SOA concepts and Web services technologies to create the Open Grid Services Architecture (OGSA) [17]. The Open Grid Services Infrastructure (OGSI) service specification is a keystone in implementing this architecture, followed by the recent Web Service Resource Framework (WSRF) [18], which is a set of six Web services specifications that try to meld Web services with Grid computing by defining how to model and manage state in a Web service context.

Figure 1. Integrated fault diagnostics on the Grid.

Defined by OGSA, the Grid service is basically a Web service, which is a set of Internet-based distributed processes. Based on standards such as XML, SOAP, WSDL and UDDI, the promise of Grid services is to enable a distributed environment in which any number of applications or application components can interoperate seamlessly among organizations in a platform-neutral, language-neutral fashion on the Grid.

To support the complex system fault diagnostics, a methodology has been proposed to integrate suites of modelling, estimation and analysis tools for fault diagnosis on the Grid. The SOA is adopted here to support this integration and the latest development on the Grid has been implemented to meet the specifications of OGSA and WSRF.

By adopting this open SOA for the integrated fault diagnostics, the most commonly used techniques for FDI, which include process modelling and simulation, parameter estimation, state observer, parity relations, statistical approaches and symbolic approaches, can all be developed individually as fault diagnostic services. As illustrated in Figure 1, these services can be created and maintained by different institutions for different fault diagnostic purposes and are all defined through a commonly accepted description format, namely the WSDL. Registered with the UDDI registry, these fault diagnostic services can be discovered and organized in a flexible way. The result is a versatile virtual FDI toolbox on the Grid. Users can access this toolbox by using an Internet browser via a Grid portal and they can select different FDI schemes to fit their own unique requirements.

For dealing with the intensive and distributed nature of the fault diagnostic data, such as in the aero-engine health monitoring domain, a distributed data management architecture has to be adopted. Grid middleware such as the San Diego Supercomputer Center (SDSC) Storage Resource Broker (SRB) [19] provides a uniform interface for connecting to heterogeneous data resources over a network and accessing replicated data sets on the Grid. Thus, in conjunction with the Metadata Catalogue (MCAT), the SRB can be used in a way that fault diagnostic data sets and resources can be accessed based on their attributes and/or logical names rather than their names or physical locations. The benefit of using Grid middleware for fault diagnostic data management is not only that remotely distributed monitoring data can be accessed transparently by different FDI services, but also that it removes the need for expensive transfer of data to a traditional central data repository.
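The idea of addressing data by attributes and logical names instead of physical locations can be illustrated with a toy catalogue. This sketch does not use or reproduce the SRB or MCAT interfaces: the `MetadataCatalogue` class, the attribute names, the site labels and the paths are all hypothetical.

```python
class MetadataCatalogue:
    """Toy catalogue mapping logical data set names to attributes and replicas."""

    def __init__(self):
        self._entries = {}

    def register(self, logical_name, attributes, replicas):
        # replicas: list of (site, physical_path) pairs holding copies of the data set
        self._entries[logical_name] = {
            "attributes": dict(attributes),
            "replicas": list(replicas),
        }

    def query(self, **wanted):
        # Find logical names whose attributes match every requested key/value.
        return sorted(
            name for name, entry in self._entries.items()
            if all(entry["attributes"].get(k) == v for k, v in wanted.items())
        )

    def resolve(self, logical_name, preferred_site=None):
        # Pick a replica at the preferred site if one exists, else the first one.
        replicas = self._entries[logical_name]["replicas"]
        for site, path in replicas:
            if site == preferred_site:
                return site, path
        return replicas[0]


catalogue = MetadataCatalogue()
catalogue.register("flight-0001-engine-1",
                   {"engine": "E1", "kind": "vibration"},
                   [("site-a", "/store/a/f0001_e1.dat"), ("site-b", "/store/b/f0001_e1.dat")])
catalogue.register("flight-0002-engine-1",
                   {"engine": "E1", "kind": "performance"},
                   [("site-b", "/store/b/f0002_e1.dat")])

names = catalogue.query(engine="E1", kind="vibration")
location = catalogue.resolve(names[0], preferred_site="site-b")
```

An FDI service written against such a catalogue never hard-codes a storage location, so it can work on the nearest replica without the data first being copied to a central repository.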

In this SOA for distributed fault diagnostics, when an individual FDI service is invoked, it can use the global or local Grid resources to fulfil its commitment. One FDI service can also invoke other Grid services if necessary. With the help of the fault diagnostic domain service broker and domain workflow advisor, the most suitable FDI services can be aggregated optimally and dynamically as a service composition to provide the best fault diagnosis practice in different fault scenarios. At the low level, the physical Grid infrastructure provides the potential high-performance computing power and large-scale data handling capabilities. By adopting the OGSI specification, the proposed architecture is a Grid solution to integrated fault diagnostics, which allows different fault diagnostic and health monitoring applications to share algorithms, data and computing resources as well as to access them across multiple organizations in an efficient and secure way.

4. IMPLEMENTATION

The U.K. e-Science pilot project DAME is developing a distributed diagnostics and prognostics system for the maintenance of civil aerospace engines. The techniques can be generalized to other diagnostic domains such as medicine and manufacturing. The DAME system uses Grid technology to demonstrate how remote and diverse applications and services can be linked into a virtual diagnostic environment. Various techniques are employed in the project, which include advanced pattern matching to search very large data sets (terabytes), modelling for fault diagnosis and simulation for decision making, case-based advice, workflow management and collaboration environments [20,21].

One effort within the DAME project is the implementation of the proposed service-oriented architecture for integrated fault diagnostics. A gas turbine engine performance model was first provided as a Grid service to facilitate the further development and analysis of different model-based FDI approaches. Figure 2 illustrates a running scenario of this Grid service through a Web portal.

Figure 3 shows one basic usage of the engine simulation Grid service for fault diagnosis. When an accurate system performance simulation is available on the Grid, the experienced maintenance engineers can invoke this simulation against the real monitored process data. The system that is being analysed is compared against the simulation results. The differences between the current state of the engine and the ideal model generate residuals. These residuals then need to be intelligently analysed to form a decision about the current state of the engine. This can be used to track changes in engine parameters that may indicate impending faults.

Figure 2. Engine simulation Grid service.

Figure 3. Simulation-based fault diagnosis. (Block diagram labels: input u(t), Gas Turbine Engine, output y(t), Engine Model, z(t), F2(z, y), residual.)

The advantage of providing an engine performance simulation as a Grid service is that the engine simulation service is identified by a Universal Resource Identifier (URI), and its public interfaces and bindings are defined and described using XML. Authorized users can perform the engine performance simulation through a Web browser remotely without knowing details of the execution of the simulation. The simulation service itself is distributed among a set of high-performance computers on the Grid. Based on the Globus Toolkit 3 [22], this engine simulation Grid service can be invoked simultaneously in different ‘virtual organizations’ for different applications. The usage and management of the Grid resources are made through the Globus middleware and are transparent to users. Through its public interface, authorized developers can also invoke this service to develop their own applications. The factory service can generate distinct engine simulation instances for different client requirements at the same time and the security is enhanced by implementing both the Grid service message level security and the Secure Sockets Layer (SSL) two-way authentication.

Figure 4. Event generation and analysis. (Block diagram labels: Raw Measurement; Engine Performance Model; Observation Processing (Event History Generation); Combination (CBR or Other); Fault Diagnosis, Maintenance Advice.)

The engine simulation Grid service has also been used in the event generation and analysis for engine fault diagnosis and maintenance. As illustrated in Figure 4, there are two stages for this work. In the observation processing stage, the inputs, namely raw measurements of different engine performance variables and engine simulation results, are analysed. A change detection method is used to characterize the input time series. The goal is to recognize changes that are important in the context of engine performance behaviour and that correspond to engine faults. This process has two aspects: segmenting data in a meaningful way and extracting features that are useful for deciding whether the engine is exhibiting normal or abnormal behaviour. In the combination stage, two event sequences from both the raw measurement and the engine simulation are compared, with any discrepancy indicating a possible fault.

The advantage of introducing a separate event history based on the engine simulation is that a reference of healthy engine history is provided to assist the fault diagnostic decision making. The comparison of two discrete event histories instead of the original binary time series data can help to overcome the model uncertainty and unmodelled noise. As a result, the robustness of fault diagnosis is improved.
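The two stages can be sketched with a deliberately simple stand-in. The paper does not specify its change detection method, so the step detector below, the event labels, the matching tolerance and the two example series are all assumptions made for illustration.

```python
def detect_events(series, threshold):
    """Observation processing (simplified): emit (index, 'up' or 'down')
    whenever two consecutive samples differ by more than the threshold."""
    events = []
    for k in range(1, len(series)):
        step = series[k] - series[k - 1]
        if step > threshold:
            events.append((k, "up"))
        elif step < -threshold:
            events.append((k, "down"))
    return events


def unmatched_events(measured, simulated, tolerance=1):
    """Combination (simplified): return measured events that have no simulated
    event of the same kind within `tolerance` samples."""
    return [
        (k, kind) for k, kind in measured
        if not any(kind == s_kind and abs(k - s_k) <= tolerance
                   for s_k, s_kind in simulated)
    ]


# Hypothetical series: the simulation steps up at k = 3; the measurement follows
# one sample later with small noise, then drops unexpectedly at k = 7.
simulated_series = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0]
measured_series = [1.0, 1.1, 1.0, 1.0, 3.1, 3.0, 3.0, 2.0, 2.0, 2.0]

simulated_events = detect_events(simulated_series, threshold=0.5)
measured_events = detect_events(measured_series, threshold=0.5)
discrepancies = unmatched_events(measured_events, simulated_events, tolerance=1)
```

Comparing events rather than raw samples means the small timing offset and the measurement noise are ignored, while the unexplained drop remains visible as a discrepancy.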

Another effort of the DAME project on the integration of fault diagnostics on the Grid is the use of Case-Based Reasoning (CBR) [23] services to correlate and integrate fault indicators from different aero engine input monitoring systems, Built-In Test Equipment (BITE) reports, maintenance data and dialogue with maintenance personnel to allow the troubleshooting of faults. As a qualitative fault diagnosis approach, CBR is a problem-solving paradigm that resolves new problems by adapting the solutions used to solve problems of a similar nature experienced in the past. A further advantage of this approach is that it allows the consolidation of rule knowledge and provides a reasoning engine that is capable of probabilistic-based matching. With CBR technology, development can take place in an incremental fashion, facilitating rapid prototyping of an initial system. The development of robust strategies for integration of multiple health information sources can be achieved with reasoning algorithms of progressively increasing complexity.

Figure 5. Case-based reasoning Grid services.

Also deployed as Grid services on the Grid environment, the CBR services can be invoked by other authorized Grid services or maintenance analysts to perform the high-level fault diagnostics and decision making, as illustrated in Figure 5. The advantage of deploying CBR in Grid services is that maintenance personnel can access a secured connection to the service via a Web browser on any computer connected to the Internet. The maintenance personnel will then have access to stores of accumulated diagnostic knowledge and maintenance data as well as large computing resources to support the fault analysis and the decision-making process. Other fault diagnostic services can be used to perform the preliminary fault diagnosis and the results can be used to facilitate the CBR analysis. As standard Grid services, the CBR services can be invoked to produce outcomes that are useful to other decision-making services as well. In the future, the CBR services can be upgraded to accommodate a dynamic learning process. Anomalous data (data containing unknown faults) may be analysed in DAME to produce new fault cases that are dynamically appended to the casebase, further increasing the knowledge of the system.

A typical use case that encompasses both the engine simulation and CBR services in the fault analysis and maintenance process is described as follows. Data downloaded from an aircraft is first analysed for novelties (known fault occurrences). The existence of a fault and the possible fault type can be checked against the engine simulation. If a novelty exists, then further information is extracted from the data and other available fault diagnostic services to form a query within the CBR services. The result returned to the maintenance personnel consists of previous similar fault cases, known solutions to the problem, as well as a confidence ranking for each case. The maintenance analysts and domain experts can further take advantage of the integrated fault diagnostic tools to confirm the fault diagnosis findings. For example, the domain experts can substantiate a proposed fault analysis by injecting a similar fault into an engine model and performing a simulation to check the consistency.
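The retrieval step of such a CBR query, returning similar past cases with a confidence ranking, can be sketched as below. The similarity measure (fraction of matching symptom values), the symptom names and the three cases are invented for illustration; the DAME CBR services are described as using probabilistic-based matching, which this toy does not reproduce.

```python
def similarity(query, case_features):
    """Fraction of symptom keys on which the query and the stored case agree."""
    keys = set(query) | set(case_features)
    if not keys:
        return 0.0
    matches = sum(1 for k in keys if query.get(k) == case_features.get(k))
    return matches / len(keys)


def retrieve(query, casebase, top_n=3):
    """Return the top_n cases as (confidence, fault, solution), best match first."""
    scored = sorted(
        ((similarity(query, case["features"]), case) for case in casebase),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [(round(score, 2), case["fault"], case["solution"])
            for score, case in scored[:top_n]]


# Hypothetical casebase: symptom values, fault labels and actions are made up.
casebase = [
    {"features": {"egt": "high", "vibration": "normal", "fuel_flow": "high"},
     "fault": "fault A", "solution": "action A"},
    {"features": {"egt": "high", "vibration": "high", "fuel_flow": "normal"},
     "fault": "fault B", "solution": "action B"},
    {"features": {"egt": "normal", "vibration": "high", "fuel_flow": "normal"},
     "fault": "fault C", "solution": "action C"},
]

query = {"egt": "high", "vibration": "normal", "fuel_flow": "high"}
ranking = retrieve(query, casebase)
```

A learning casebase of the kind outlined in the text would append a confirmed new case to `casebase`, so later queries can retrieve it.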

By using the Grid technologies and the proposed SOA, more modelling, simulation and decision-making services from different providers or institutes can be shared, coordinated and integrated for fault diagnostics, prognostics and engine maintenance.

5. CONCLUDING REMARKS

In this paper, the use of SOA and Grid technologies to support the distributed fault diagnosis of complex systems was discussed. Different FDI schemes have been summarized and the importance of integrated diagnostics has been emphasized. An open framework based on the OGSA has been proposed and demonstrated on the DAME project to address the aftermarket requirements of the aero-engine industry. The business benefits of this open, flexible approach to integrated fault diagnostics not only include improved fault diagnosis performance but also reusable service assembly, better maintainability, better parallelism in development, higher availability and better scalability.

ACKNOWLEDGEMENTS

This work and the DAME project are supported by Grant Number GR/R67668/01 from the Engineering and Physical Sciences Research Council (EPSRC) in the U.K. and through contributions from Rolls-Royce plc and Data Systems & Solutions. The authors would also like to thank the anonymous referees for their valuable suggestions on improving the presentation.
