DISTRIBUTED AERO-ENGINE CONDITION MONITORING AND DIAGNOSIS ON THE GRID: DAME

Martyn Fletcher, Jim Austin, Tom Jackson.

Advanced Computer Architectures Group, Department of Computer Science, University of York. Heslington, York, YO10 5DD, UK.

{martyn.fletcher, jim.austin, tom.jackson}@cs.york.ac.uk www.cs.york.ac.uk/dame

ABSTRACT

The adoption of advanced condition monitoring and diagnosis technology in the aero-engine domain has been shown to offer significant benefits. The diagnosis and prognosis of aero-engine conditions would be further enhanced through the use of a diagnosis infrastructure of remote services, tools and human experts. The Distributed Aircraft Maintenance Environment (DAME) provides a Grid-based collaborative and interactive workbench that can support remote analysis of vibration and performance data by various geographically dispersed users: Maintenance Engineers, Maintenance Analysts and Domain Experts. The diagnosis environment is built around a workflow system and an extensive set of data analysis tools, which can provide automated diagnosis for known conditions. Where automated diagnosis is not possible, DAME provides remote experts with a collaborative and interactive diagnosis and analysis environment. The DAME project has produced a proof of concept demonstrator which addresses the requirements for a virtual organisation, the Grid-enabled analysis tools required for the distributed diagnosis and prognosis activity and the data architecture required to manage the vast, distributed, homogeneous data repositories of engine health-monitoring data. Future work will look towards the deployment of a DAME diagnostic and prognosis system in the aero-engine domain and towards its use in other domains.

KEYWORDS

Aero-engine, condition monitoring, health monitoring, diagnosis, AURA, fast terabyte data searching, virtual organisation, diagnosis tool bench, Grid, workflow, data mining, case based reasoning, vibration data.

INTRODUCTION AND MOTIVATION

Modern aero engines operate in highly demanding operational environments with extremely high reliability. However, Data Systems & Solutions and Rolls-Royce have shown that the adoption of advanced engine condition monitoring and diagnosis technology can reduce costs and flight delays through enhanced maintenance planning [1]. Such aspects are increasingly important to aircraft and engine suppliers where business models are based on Fleet Hour Agreements (FHA) and Total Care Packages (TCP).

Rolls-Royce has collaborated with Oxford University in the development of an advanced on-wing monitoring system called QUICK [2]. QUICK performs analysis of data derived from continuous monitoring of broadband engine vibration for individual engines. Known conditions and situations can be determined automatically by QUICK and its associated Ground Support System (GSS). Less well-known conditions (e.g. very early manifestations of problems) require the assistance of a remote expert (a Maintenance Analyst or Domain Expert) to interpret and analyse the situation. The remote expert may want to consider and review the current data, search and review historical data in detail and run various tools, including simulations and signal processing tools, in order to evaluate the situation.

Without a supporting diagnostic infrastructure this can be problematic because the data, services and experts are usually geographically dispersed and advanced technologies are required to manage and search the massive data sets. Each aircraft flight can produce up to 1 Gigabyte of data per engine, which, when scaled to the fleet level, represents a collection rate of the order of Terabytes of data per year. The storage of this data also requires vast data repositories that may be distributed across many geographic and operational boundaries. This scenario is also typical of many other domains, for example healthcare.

Therefore, there are advantages in providing a diagnostic infrastructure to allow enhanced condition monitoring and diagnosis. The diagnostic process includes:

• Maintenance Engineers at airports who perform checks and tests and minor maintenance. The Maintenance Engineer uses the automated features of DAME and situations that cannot be resolved here are referred to the appropriate remote Maintenance Analyst.

• Remote Maintenance Analysts who have expertise on the maintenance and logistics planning and some engine expertise. The Maintenance Analyst uses the DAME tools interactively and situations that cannot be resolved here are referred to the appropriate remote Domain Expert.

• Remote Domain Experts who use the DAME tools interactively and have extensive expertise for a particular type of engine.

• The DAME tools and services, which are used to provide diagnosis and prognosis of engine conditions.

• The Maintenance and Repair Organisation, which undertakes major maintenance, e.g. engine overhaul.

The requirements of such an infrastructure are:

• The diagnostic processes require collaboration between Maintenance Engineers, Maintenance Analysts and Domain Experts from different organisations: airline, support contractors and engine manufacturer. These individuals are geographically dispersed and need to deploy a range of different engineering and computational tools to analyse the problem.

• To allow appropriate access by users to the data from the engine under consideration.

• To allow appropriate access by users to the historical data from the engine under consideration and other similar engines.

• The ability to search the vast stores of historical vibration and performance data. Advanced pattern matching and data mining methods are required to search for matches to novel features detected in the vibration data. These methods must be able to operate on the large volumes of data and have a response time that meets operational demands. This allows the diagnostic information found on one engine in one place to be reused in another place.

• Provide case based reasoning techniques.

• Provide signal processing and engine simulation tools using data from the engine or historical data.

• Create, edit and execute diagnostic workflows.

• The diagnostic process must be completed in a timely and dependable manner commensurate with the turn round times of the aircraft.

The requirements for DAME have been captured and developed via Use Case analysis with the industrial partners: Rolls-Royce and Data Systems & Solutions (DS&S). The following sections describe a typical diagnostic process followed by a description of the DAME diagnostic infrastructure and some of the main tools.

THE TYPICAL DIAGNOSTIC PROCESS

The QUICK airborne system and its associated ground based system are used prior to the use of DAME. In addition to the existing diagnosis provided by QUICK, DAME is always used to provide an automated diagnosis. This is desirable because DAME can detect additional situations, for example:

• A recurring errant diagnosis.

• A new condition that has not yet been uploaded to the airborne QUICK.

• A condition that can only be detected using tools that require extensive (ground based) processing facilities.

The resultant automatic diagnoses (from QUICK and from DAME) can then be assessed. In the vast majority of cases normal situations are indicated. However, if a condition is detected with a known cause then appropriate maintenance action can be planned. Additionally, in the rare case that a condition is detected without a clear cause, the situation will be “escalated” to one of various remote experts who can look into the matter further. The Maintenance Analysts and Domain Experts have access to the data from the current engine flight, and can run searches on historical data, get workflow advice, and run signal processing and simulation tasks to gain an insight into any given situation.

[Figure 1: flow chart. Process boxes: Aircraft Provides Data / Information; QUICK Performs Diagnosis; DAME Performs Brief Diagnosis; Maintenance Engineer Performs Inspection; DAME / Maintenance Analyst Performs Detailed Diagnosis / Prognosis; DAME / Domain Expert Performs Detailed Analysis. Decisions: Has any problem (further situation) been detected?; Is a high certainty diagnosis available?; More information required from the Maintenance Engineer?; Is a Major Overhaul Required?; Is Major Overhaul required Immediately? Outcomes: Inform Maintenance Engineer (No Action; Minor Maintenance Action; Major Overhaul Required Immediately; No Action now); Schedule Major Overhaul at a Later Date; Provide Maintenance Engineer with Advice or Request for further information / investigative action. Note on the chart: this flow may be performed several times if, for example, the Maintenance Analyst or Domain Expert asks for further information / investigative action or if the minor maintenance action does not correct the problem.]

Figure 1 Typical Diagnosis Flow Chart.

The interactions between the various users can be complex, particularly in situations where additional tests are requested by the remote experts in order to confirm a diagnosis. An overview of the typical diagnostic scenario including escalation to the remote experts is described below. Figure 1 outlines this process as a flow chart.

1. An aircraft lands and data from the on-wing system (QUICK) is automatically downloaded to the associated GSS.

2. The data is then stored in the DAME data storage system. This assumes that the landing site is equipped for DAME. If this is not the case then the QUICK diagnosis is used alone.

3. QUICK and its GSS indicate whether any abnormality (this is a detected condition for which there is a known cause) or novelty (this is a detected deviation from normality for which there is currently no known cause) has been detected.

4. DAME executes an automatic workflow to determine its diagnosis (see figure 1 DAME Performs Brief Diagnosis). This is a standard pre-programmed diagnostic sequence (set up by a Domain Expert) including:

a. Run the ground based signal-processing tool on the new data.

b. Automatically assess the output and determine regions of interest in the data.

c. Use any defined region of interest (from step b) as a search query to the pattern matching service, to look for previous occurrences in historical data.

d. The result of the search is fed to a case based reasoning diagnosis system, which determines the most likely diagnosis based on previous knowledge. This is the automatic diagnosis provided by DAME.

5. Depending on the result of the QUICK and DAME automatic diagnoses there are three outcomes:

a. Everything is normal: the engine is ready for the next flight.

b. A condition, which has a known cause, has been detected. This can be resolved by immediate maintenance action or planned for future maintenance action, as appropriate.

c. A condition, which currently does not have a clear cause, has been detected or there is some ambiguity about the cause. This case is referred to the remote experts for consideration.

6. The first stage in the DAME escalation process is consideration by a Maintenance Analyst (see figure 1 DAME Maintenance Analyst Performs Detailed Diagnosis / Prognosis) who will use the tools available within DAME to consider the condition. If the Maintenance Analyst is able to resolve the situation he will provide advice to the Maintenance Engineer. Alternatively, if he is unable to provide a confident diagnosis then he will escalate it to the Domain Expert.

7. The next stage in the DAME escalation process is consideration by a Domain Expert (see figure 1 DAME Domain Expert Performs Detailed Analysis) who will use the tools available within DAME to consider the condition. He will provide advice to the Maintenance Analyst and Maintenance Engineer.
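The decision logic of steps 5 to 7 can be summarised as a small sketch. It is illustrative only: the names `Outcome`, `AutoDiagnosis`, `triage` and `escalate` are hypothetical and do not come from DAME, and treating a disagreement between the QUICK and DAME causes as "ambiguity" is an assumption made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional, Tuple


class Outcome(Enum):
    NORMAL = "normal"            # step 5a: engine ready for the next flight
    KNOWN_CAUSE = "known cause"  # step 5b: plan maintenance action
    ESCALATE = "escalate"        # step 5c: refer to the remote experts


@dataclass
class AutoDiagnosis:
    """Result of one automatic diagnosis (QUICK or DAME). Hypothetical shape."""
    condition_detected: bool
    cause: Optional[str] = None  # None means no clear cause was found
    ambiguous: bool = False


def triage(quick: AutoDiagnosis, dame: AutoDiagnosis) -> Outcome:
    """Combine the two automatic diagnoses into one of the three outcomes."""
    detected = [d for d in (quick, dame) if d.condition_detected]
    if not detected:
        return Outcome.NORMAL
    if any(d.cause is None or d.ambiguous for d in detected):
        return Outcome.ESCALATE
    # Assumption: two different known causes count as ambiguity.
    if len({d.cause for d in detected}) > 1:
        return Outcome.ESCALATE
    return Outcome.KNOWN_CAUSE


# An assessor returns advice, or None if no confident diagnosis is possible.
Assessor = Callable[[str], Optional[str]]


def escalate(case_id: str,
             chain: List[Tuple[str, Assessor]]) -> Tuple[Optional[str], Optional[str]]:
    """Walk the escalation chain (Maintenance Analyst, then Domain Expert)."""
    for role, assess in chain:
        advice = assess(case_id)
        if advice is not None:
            return role, advice
    return None, None
```

Keeping the chain as an ordered list means the Domain Expert is only consulted when the Maintenance Analyst returns no confident advice, which mirrors the stated aim of reducing demand on Domain Experts.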

For stages 6 and 7 the DAME infrastructure provides the remote Maintenance Analyst and Domain Expert with the following facilities:

a. A sophisticated interactive data viewer in order to allow exploration and viewing of the particular regions of interest in engine vibration data from this flight and from the historical fleet archives.

b. A high performance pattern matching facility to allow searching of historical fleet data archives. This is closely integrated with the data viewer and together they provide a particularly powerful diagnosis aid. The central gain of the use of pattern matching is to allow diagnostic information found on one engine in one place to be reused in another place, where the same symptoms exist.

c. Configurable signal processing tools capable of detecting prescribed conditions.

d. A configurable and interactive engine performance simulation to explore various engine situations.

e. A case based reasoning tool for providing ranked diagnoses.

f. All the tools are closely integrated with a workflow suite, which provides facilities for diagnostic workflow creation, editing, debugging and execution. This includes the maintenance of the workflow used to provide the initial automated DAME diagnosis and of a library of other diagnostic workflows.

g. A case based reasoning tool for providing advice on the workflow to use for varying situations.

h. A collaboration environment for the various users.
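The paper does not say how the case based reasoning tool ranks diagnoses. As a minimal sketch of the general idea, the following ranks stored cases by feature overlap with the current observation. The Jaccard similarity, the `Case` layout and the function names are assumptions chosen for illustration, not the DAME method.

```python
from typing import Dict, FrozenSet, List, Tuple

# A stored case: (set of observed feature labels, confirmed diagnosis).
Case = Tuple[FrozenSet[str], str]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Overlap between two feature sets, from 0.0 (disjoint) to 1.0 (equal)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def rank_diagnoses(query: FrozenSet[str],
                   case_base: List[Case]) -> List[Tuple[str, float]]:
    """Return diagnoses ordered by their best-matching stored case."""
    best: Dict[str, float] = {}
    for features, diagnosis in case_base:
        score = jaccard(query, features)
        if score > best.get(diagnosis, -1.0):
            best[diagnosis] = score
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))
```

In this sketch, feeding a confirmed diagnosis back into the system amounts to appending a new `(features, diagnosis)` pair to `case_base`.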

The aim is to resolve as many problems as early in the process as possible in order to reduce the demand on the time of the Domain Experts. However, for the rare cases that cannot be resolved automatically the remote experts will become involved. Once a diagnosis has been made this knowledge can be fed back into the system, for example, into the case based reasoning models. This is vital so that future occurrences of presently unknown conditions can eventually be detected automatically without remote expert intervention. Information would also be fed back to the design process in case preventative measures can be taken. DAME provides a high level of confidence that any maintenance action can be planned with minimal cost and flight disruption to the operator.

THE DAME DIAGNOSTIC INFRASTRUCTURE

The DAME diagnostic infrastructure is a Grid [3, 4] based environment where users in different organisations and locations can work together and use a variety of tools and processes to determine a diagnosis. The infrastructure has been demonstrated for the aero-engine domain on the White Rose Grid [5] but may be equally applicable to many other industrial areas and healthcare. The diagnostic infrastructure requires the use of Grid technology because:

• The volume of engine data to be downloaded requires the use of a high bandwidth connection (the Grid) so that it can be presented to DAME applications in a timely manner.

• The engine data download and Grid execution of the DAME diagnosis (by Maintenance Engineer, Maintenance Analyst and Domain Expert) must occur within the turnaround time of the aircraft.

• The Maintenance Engineer, Maintenance Analyst and Domain Expert come from different organisations and are located in different parts of the world – a Virtual Organisation (VO).

The diagnosis infrastructure and tools are the result of technology developments made prior to and during the DAME project. The following sub-sections provide overviews of the data, the innovative data viewer and search facility together with a brief overview of the other services and the distributed architecture. Much of the diagnostic infrastructure can be used easily in other domain areas. Many of the core services are generic, but are conditioned for use in the specific domain by the data used.

The Data

DAME uses performance and broadband vibration time-series data captured from engine sensors during flight. A time-series is a sequence containing the values of a variable over time. Typically the variable may be a sensor reading, for example temperature, pressure or fuel flow, taken at constant intervals of time. Also, from the broadband vibration data various components can be extracted for analysis, for example, harmonics (tracked orders) of the various shaft speeds. As an engine accelerates, the rotation speed of the shafts increases and so does the frequency of the vibrations caused by the shafts. (A tracked order is the amplitude of the vibration signal in a narrow frequency band centred on a harmonic of the rotation frequency of a shaft as it changes frequency.) There are usually some harmonics present, though most of the energy in the vibration spectrum is concentrated in the fundamental tracked orders; these constitute the “vibration signature” of the engine. Departures from the normal or expected shapes of these tracked orders provide very useful diagnostic information. Certain conditions or pre-cursors to conditions are manifested in specific tracked orders. For example, evidence of a bird strike can be detected by looking for a specific pattern in a specific harmonic (and by detecting other subtle changes in certain performance parameters).

Time-series data can be analysed in the time or frequency domains, and the monitoring and fast searching for similar patterns is applicable to many domains. Many systems and processes within industry, healthcare, business and research can be characterised by sets of time-series data.
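To make the definition of a tracked order concrete, the sketch below estimates the vibration amplitude at a chosen harmonic of the shaft rotation frequency, window by window, as the shaft speed changes. It is a simplified illustration under stated assumptions: the shaft speed is taken as known and constant within each window, a single-frequency Fourier sum stands in for the narrow band filter, and the function names and parameters are hypothetical rather than taken from QUICK or DAME.

```python
import math
from typing import List, Sequence


def tone_amplitude(window: Sequence[float], freq_hz: float,
                   sample_rate_hz: float) -> float:
    """Amplitude of the sinusoidal component of `window` at `freq_hz`."""
    n = len(window)
    step = 2.0 * math.pi * freq_hz / sample_rate_hz
    re = sum(x * math.cos(step * i) for i, x in enumerate(window))
    im = sum(x * math.sin(step * i) for i, x in enumerate(window))
    return 2.0 * math.hypot(re, im) / n


def tracked_order(signal: Sequence[float], shaft_hz_per_window: Sequence[float],
                  order: int, sample_rate_hz: float,
                  window_len: int) -> List[float]:
    """Follow one harmonic (`order`) of the shaft frequency through the signal.

    `shaft_hz_per_window[k]` is the assumed shaft rotation frequency during
    the k-th consecutive window of `window_len` samples.
    """
    amplitudes = []
    for k, shaft_hz in enumerate(shaft_hz_per_window):
        window = signal[k * window_len:(k + 1) * window_len]
        if len(window) < window_len:
            break  # ignore a trailing partial window
        amplitudes.append(tone_amplitude(window, order * shaft_hz, sample_rate_hz))
    return amplitudes
```

The window length sets the width of the frequency band: longer windows give a narrower band but follow speed changes more slowly.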

The Signal Data Explorer and High Performance Pattern Matcher

The Signal Data Explorer (SDE) is a sophisticated visualisation tool, which will allow the remote Domain Expert to view and examine the vibration data. The Domain Expert can extract components of his choosing from the broadband data and view them. Any regions of interest (patterns) can be selected and submitted to the high performance pattern matching service, which can search the entire historical fleet database for similar occurrences. The occurrences of similar patterns are returned and the Domain Expert can view these and link them to defined cases and a previous diagnosis. This can provide support or otherwise for any current diagnosis.

The SDE is a very powerful tool for assisting the Domain Expert in providing early detection and diagnosis of deviations from normal engine behaviour. Insight can be gained from viewing and searching such data for the presence of patterns known to be associated with certain conditions and for deviations from a model of normal operation (novelty detection). In the process of diagnosing a condition a Domain Expert can see if a region of interest has occurred before by searching for any previous occurrences through the historical data for that engine and for other engines. The SDE provides:

• A Graphical User Interface that integrates the input and output to and from all other remote services with the purpose of providing facilities to explore the data. Data can be “played” forwards, backwards, speeded up, slowed down, stopped, etc.

• A signal extractor, which allows the extraction of signal components from the broadband engine vibration data. Such components can be directly displayed or fragments of these can be selected for use as a query to the AURA search engine.

• A programmable filter module that provides signal pre-processing for purposes such as noise suppression.

• A pattern template library of regions of interest. The built-in pattern template library stores and manages the pattern templates; these might be examples of conditions or examples of unknown events. Templates can be used separately or organized as a set of features that together represent a particular condition.

• A toolbox providing data probing and scaling capabilities. The probing tools allow the user to point to areas of the data display and view information about that area. The information can be frequency spectra or data source information as required. The scaling facility allows the user to rescale the various display axes.

• In addition to the provision of an integrated environment for interactive pattern matching, feature extraction and detection, the SDE also provides programmable facilities. A Domain Expert can use the Signal Data Explorer to perform a programmed set of parallel and sequential operations to search for multiple patterns in the course of diagnosis. This is a very powerful feature.
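The idea of selecting a region of interest and searching a time-series for similar occurrences can be illustrated with a brute-force sliding-window search. This is deliberately not the high performance pattern matching service: it is a plain stand-in using z-normalised Euclidean distance, and the function names and the `top_k` parameter are assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple


def znorm(values: Sequence[float]) -> List[float]:
    """Scale to zero mean and unit variance so that shape, not level, is compared."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        return [0.0] * n  # a flat window has no shape
    return [(v - mean) / std for v in values]


def find_matches(query: Sequence[float], series: Sequence[float],
                 top_k: int = 3) -> List[Tuple[int, float]]:
    """Return up to `top_k` (start index, distance) pairs, closest first."""
    m = len(query)
    q = znorm(query)
    scored = []
    for start in range(len(series) - m + 1):
        w = znorm(series[start:start + m])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(q, w)))
        scored.append((start, dist))
    scored.sort(key=lambda item: (item[1], item[0]))
    return scored[:top_k]
```

The cost of this approach grows with the product of the query length and the archive size, which is why a search over the historical fleet data calls for a dedicated high performance service.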

Figure 2 shows the SDE graphical user interface, annotated to show the various parts.

[Figure 2: annotated screenshot of the SDE. Labelled parts: Message window; AURA Results Window; Extraction and Control; Broadband Data Display Window; Tracked Order Display with selected pattern.]

Figure 2 SDE Graphical User Interface.

The high performance pattern matching facility is provided by AURA (Advanced Uncertain Reasoning Architecture [6]). AURA provides for very fast searching operations on extremely large data sets (i.e. > 1Tb) using a simple, analysable and scaleable form of a Neural Network known as a Correlation Matrix Memory (CMM). The AURA Encoder encodes low-level features into a binary vector appropriate for input to the AURA search engine. Data is stored within AURA in the form of binary vectors. AURA therefore requires two phases of operation: storage (or training) and search (or recall). The SDE uses AURA in the search mode i.e. after it has been trained. This is a vital service within DAME to permit high-speed data mining and pattern recognition to be applied to the vast historical archives of data for regions of interest selected automatically or by the Domain Expert using the Signal Data Explorer. AURA is a generic Grid service that is also used to provide the automated diagnosis prior to the involvement of a remote expert. In this case the search pattern is provided automatically.

The major gain offered to the DAME system by this tool is in enabling engineers to compare unusual vibration behaviour detected by QUICK with data collected from all engines on all previous flights by searching the historical fleet archives. By identifying the best matches to novel data and then looking at the subsequent behaviour of the engines exhibiting matching vibration patterns, it is possible to reason about probable causes. It also enables diagnostic information for a fault detected on one engine in one place to be identified and re-used in another place for the same detected fault. Alternatively, in regard to maintenance and prognosis, early detection of small changes in the engine’s characteristics can lead to efficient scheduling of maintenance requirements.

Other DAME Services

The diagnostic facilities and tools provided by the DAME infrastructure include:

• Case Based Reasoning for Diagnosis: this is a generic Grid service that can be used in a wide range of diagnostic domains. It is used to suggest the most probable diagnosis for a given problem, based on stored cases, to provide the automated diagnosis, and is also used by the remote experts as desired.

• Additional Signal Processing Tools: this is a specific aero-engine Grid service used to provide useful diagnostic information through feature detection.

• Engine Simulation: this allows fault, normal and other situations to be simulated, and is an existing engine simulation provided as a Grid service.

• Workflow Manager: the workflow system provides for workflow (sequence of services) definition and execution. Workflow Management (WFM) is a fast evolving technology, which is increasingly being exploited by organisations in a variety of industries. Its primary characteristic is the automation of processes involving combinations of human and machine-based activities, particularly those involving interaction with computer-based applications and tools. In DAME the workflow manager is used to create, edit, debug, run specific workflows and maintain the workflow library.

• Case Based Reasoning for Workflow Advice: this is a generic Grid service that provides advice on the diagnostic workflow to use in order to isolate a particular problem (for example, when multiple and equally probable diagnoses are possible). It will also be used to suggest diagnostic workflows to use when new problems are identified. This will be closely integrated with the generic workflow system. Based on previous workflows, a Maintenance Analyst or Domain Expert can obtain advice on which set of tools to use, and more importantly how to use them. These tools can then be composed into a workflow process and sent off for execution on the distributed Grid resources.
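As background to the AURA search service used by several of these tools, the two phases of a binary Correlation Matrix Memory (storage, then recall) can be sketched in a few lines. This is a textbook-style binary CMM written for illustration. It is not the AURA implementation, and the class name, the one-output-bit-per-stored-pattern layout and the default threshold are assumptions.

```python
from typing import List, Optional, Sequence


class BinaryCMM:
    """Minimal binary correlation matrix memory (illustrative only)."""

    def __init__(self, n_in: int, n_out: int) -> None:
        self.n_out = n_out
        # weights[i][j] is 1 once input bit i has been stored with output bit j.
        self.weights = [[0] * n_out for _ in range(n_in)]

    def store(self, x: Sequence[int], y: Sequence[int]) -> None:
        """Storage (training): OR the outer product of x and y into the matrix."""
        for i, xi in enumerate(x):
            if xi:
                for j, yj in enumerate(y):
                    if yj:
                        self.weights[i][j] = 1

    def recall(self, x: Sequence[int],
               threshold: Optional[int] = None) -> List[int]:
        """Search (recall): sum the rows selected by x, then threshold.

        By default every set input bit must agree; a lower threshold
        tolerates partial or noisy queries.
        """
        active = [i for i, xi in enumerate(x) if xi]
        if not active:
            return [0] * self.n_out
        if threshold is None:
            threshold = len(active)
        sums = [sum(self.weights[i][j] for i in active) for j in range(self.n_out)]
        return [1 if s >= threshold else 0 for s in sums]
```

Recall touches only the rows selected by the set bits of the query, which is one reason this kind of memory suits fast searching with sparse binary vectors.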

The Distributed Architecture

When DAME is deployed it is expected to use a distributed data architecture. This is due to the large volumes of data and the fact that some services consume and produce large amounts of data. It is anticipated that all data that arrives at an airport will be stored at that location and all processing and search queries against that data are processed at that site. This results in data relating to any one engine being spread around the airports that it has visited.

Typically, a Domain Expert may be required to analyse data from a specific engine. The Domain Expert will use an SDE instance and identify the engine, flight, etc. and then the DAME infrastructure will transparently identify and handle the retrieval of the data from the appropriate airport node to the SDE. The Domain Expert can then view and analyse the data and may identify a region of interest and submit it for searching.

The DAME infrastructure includes distributed Pattern Match Controllers (PMCs), which accept search requests from various sources, including in this case, an SDE instance. A PMC will manage the distributed search activity and is responsible for selecting which airport nodes contain data that must be searched against (in many cases this may be all airports). The PMC replicates the search request to the search services at each airport node. All nodes processing a search request perform their search in parallel, searching all engine data held across the system as required. The PMC is also responsible for correlating the results and returning them to the Domain Expert via the SDE as if it were a single search.

CONCLUSIONS

The aim of DAME is to contribute to the enhanced diagnosis and prognosis of engine problems through the use of remote services, tools and human experts.

The DAME project has built a proof of concept demonstrator which addresses the requirements for a virtual organisation, the Grid-enabled analysis tools required for the distributed diagnosis and prognosis activity and the data architecture required to manage the vast, distributed, homogeneous data repositories of engine health-monitoring data. This demonstrator illustrates how the Grid could be used to maintain and manage fleet-wide repositories of aero-engine data with the goal of providing an enhanced ability to anticipate maintenance requirements and monitor engine health.

In the aero-engine application enhanced diagnosis / prognosis will enable more detailed planning of maintenance activities and logistics, leading to:

• Reduction in disruption costs i.e. activities necessary to accommodate unplanned maintenance.

• Reduction in maintenance costs i.e. an early preventative repair usually costs less than a repair when failure has actually occurred.

• Increased customer (airline) satisfaction through reduced operating and disruption costs.

The DAME diagnosis infrastructure and many of the tools could be used in a wide range of other diagnostic domains. Future work will look towards the deployment of a DAME diagnostic and prognosis system in the aero-engine domain and towards its use in other domains.

ACKNOWLEDGEMENTS

The work reported in this paper was developed and undertaken by the DAME team at the Universities of York, Leeds, Sheffield and Oxford with grateful assistance from Rolls-Royce, Data Systems & Solutions and Cybula Ltd. This research was supported by Grant Number GR/R67668/01 from the Engineering and Physical Sciences Research Council in the UK and through contributions from Rolls-Royce plc and Data Systems and Solutions, LLC.

REFERENCES

[1] http://www.ds-s.com/corecontrol.asp

[2] Nairac A, Townsend N, Carr R, King S, Cowley P, Tarassenko L. A system for the analysis of jet engine vibration data. Integrated Computer-Aided Engineering, 53-65, 1999.

[3] The Grid: 2nd Edition, edited by Ian Foster and Carl Kesselman. MKP/Elsevier, Oct 2003.

[4] Austin, Jackson, et al, Chapter 5, Predictive Maintenance: Distributed Aircraft Engine Diagnostics, in The Grid: 2nd Edition, edited by Ian Foster and Carl Kesselman. MKP/Elsevier, Oct 2003.

[5] White Rose Grid, http://www.wrgrid.org.uk/index.html.

[6] www.cs.york.ac.uk/aura, The AURA and AURA-G web site, Advanced Computer Architectures Group, University of York, UK.