A Pattern-Based Framework for Software Anomaly Detection


Software Quality Journal, 12, 99–120, 2004
© 2004 Kluwer Academic Publishers. Manufactured in The Netherlands.

Invited Paper

A Pattern-Based Framework for Software Anomaly Detection ∗

S.C. KOTHARI ∗∗, LUKE BISHOP and JEREMIAS SAUCEDA ∗∗∗
Department of Electrical and Computer Engineering, Iowa State University, Ames, IA 50011

GARY DAUGHERTY
Advanced Technology Center, Rockwell Collins, Cedar Rapids, IA

Abstract. This paper presents a pattern-based framework for developing tool support to detect software anomalies. The use of a pattern-based approach is important because it provides the flexibility needed to address domain-specific needs, with respect to the types of problems the tools detect and the strategies used to inspect and adapt the code. Patterns can be used to detect a variety of problems, ranging from simple syntactic issues to difficult semantic problems requiring global analysis. Patterns can also be used to describe transformations of the software, used to rectify problems detected through software inspection, and to support interactive inspection and adaptation when full automation is impractical. This paper describes a part of the Knowledge Centric Software (KCS) framework that embodies the pattern-based approach and provides capabilities for addressing different languages and different application domains. While only the part of the framework relevant to code inspections is addressed in this paper, in the future we also expect to address UML analysis and design models. As an application of the research, we present an overview of an inspection tool being developed for high assurance software for avionics systems.

Keywords: software inspection, software tools framework, pattern specification language, high assurance software, safety-critical avionics systems

1. Introduction

With an ever-growing reliance on software as a critical infrastructure for medical, energy, transportation, and financial systems, the quality and reliability of software is a prime concern. It is especially so in safety-critical applications, such as avionics and medical systems, where there is a danger of loss of life.

Inspection (Fagan, 1976; Parnas and Lawford, 2003; Aurum, Petersson, and Wahlin, 2002; Gilb and Graham, 1993; Parnas, 1994; Laitenberger, 2002; Anderson et al., 2003; Cscope homepage, 2004; Reasoning Inc., 2004; Viega, 2004; Features: Code inspection, 2004; Kamperman, 2004) and testing are widely used techniques for software quality control. Creating and executing test cases, however, is often very time consuming and expensive. Performing tests to cover all possible execution paths is also infeasible for real-world applications. This has led to the definition of various test coverage criteria, ranging from simple statement coverage, to decision coverage, to Modified Condition/Decision Coverage (Chilenski and Miller, 1994) and its variants. Software inspection provides an alternative and complementary approach. Inspections are based on static analysis rather than execution of the code (Software metrics and static analysis, 2004; Wagner, 2004), and can identify software defects off-line before they cause system failures. Catching defects earlier can save a significant amount of time and money in commercial applications. As a result, even though testing is still necessary to analyze certain types of dynamic behavior, there has been an increasing reliance on inspections in high assurance software systems.

∗ The application of this research to high assurance avionics systems has been funded under contract F33615-00-C-1624 of the DARPA Software Enabled Control (SEC) program.

∗∗ Contact author.

∗∗∗ Kothari and Sauceda are also affiliated with EnSoft Corp. in Ames, Iowa.

Software inspections may be conducted for different purposes, including certification, verification, diagnostics, and metric analysis. A number of inspection tools (Anderson et al., 2003; Cscope homepage, 2004; Reasoning Inc., 2004; Viega, 2004; Features: Code inspection, 2004; Kamperman, 2004) exist to support the process. In general, these tools focus on language-specific programming errors. For example, C tools check the code for uses of uninitialized pointers, uses of the assignment operator in if conditions, etc. These tools also have limitations when it comes to addressing the special needs of real-time, high assurance, secure, distributed, and fault-tolerant systems. For example, in a distributed system it is important to perform inspection to detect potential scenarios for race conditions and deadlocks. Similarly, in a commercial aviation application, it is important that object-oriented features such as dynamic dispatch, polymorphic assignment, and inheritance be used only in a manner consistent with FAA certification (Software Considerations, 2004; Bishop et al., 2002; OOTIA web site, 2004; ISO Ada Standard, 2004; Multiple Inheritance, 2004). Specialized inspection capabilities based on domain-specific needs are important in practice. Another important issue is customization. Instead of a checklist being hardwired in the tool, it should be easy to customize inspection to support evolving requirements related to changes in company-specific best practices or new certification standards.

At present, many companies have to rely on manual inspections because of a lack of inspection tools that address domain-specific needs. For example, this is the case in the avionics industry, where extensive manual inspections are performed to meet high assurance software requirements. Manual inspections are time consuming and expensive. They can also be very tedious and error-prone, especially when an inspection requires a global analysis of a large body of software.

We are developing a domain-specific tools framework to enable the automation of many tedious and time-consuming software evolution and maintenance tasks (Kothari, 2002; Mitra and Kothari, 1997). The framework addresses inspection, comprehension, and transformation: the three primary tasks required to support the evolution and maintenance of large software systems. Our framework includes: (a) an extensible common intermediate language (XCIL), (b) an extensible pattern specification language (XPSL), (c) catalogs of patterns that capture domain-specific knowledge, (d) tool support for the program analysis necessary for inspection, (e) a database repository for storing and querying the analysis results, and (f) an interactive visualization to view analysis results. Overall we refer to the framework as the Knowledge-Centric Software (KCS) framework, and to the tools associated with it as the KCS tools. This paper focuses on the part of the KCS framework that is relevant for developing inspection tools.


To support the construction of domain-specific tools, the KCS framework takes a pattern-based approach. XPSL, the pattern specification language, provides a formal mechanism to represent domain-specific knowledge of the strategies and policies associated with various software evolution and maintenance tasks. This knowledge is captured in the form of pattern catalogs. It can be categorized broadly as related either to the search or transformation of associated software artifacts. For inspections, we are concerned only with search problems. Semiformal descriptions, as in the case of commonly used object-oriented design patterns (Gamma et al., 1995), are meaningful for human developers. However, more formal descriptions of patterns are necessary to provide automated tool support. We have designed XPSL as a language for the formal specification of patterns. It supports a broader notion of pattern than that defined by design patterns (Gamma et al., 1995), clichés (Fiutem et al., 1996), or regular expressions (Paul and Prakash, 1994). This notion of patterns is intended to capture knowledge, not just about designs but also about other aspects of analyzing and transforming software. The KCS inspection tools implement the search operations based on formal specifications given in XPSL.

XCIL, the common intermediate language, provides language independence. It is based on the JVM, MS-IL and UML, and includes extensions to cover C and C++ semantics (ISU–Rockwell DARPA SEC project website, 2004; JSIS, 2004; High-Assurance Java Virtual Machine, 2004). JVM is an acronym for Java Virtual Machine, which is the software implementation of a “CPU” designed to run compiled Java code. MS-IL is the intermediate language used in the .NET platform. The Unified Modeling Language™ (UML) is the industry-standard language for specifying, visualizing, constructing, and documenting the artifacts of software systems. There are a variety of compilers and modeling tools that support these standards. This makes it easier to map a multitude of languages to/from XCIL. The analysis is performed on the XCIL representation. This makes it possible to reuse the KCS analyzer components across different programming languages.

We also take advantage of XML, XSLT and XQuery technologies (W3 XML Query web site, 2004; XSLT, 2004; Flesca, Furfaro, and Greco, 2002). The XCIL representation of the source code and the results of the analysis are created in XML. By storing the original source code and the results of the analysis in a common format, it is possible to create integrated tool support for querying the source code along with the results produced through program analysis. Also, the use of XML makes it easier to provide tool support, by building on available XML, XSLT and XQuery tools.
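As a rough illustration of querying source structure and analysis results from a single XML document, the following Python sketch uses the standard-library ElementTree module. The element and attribute names (model, method, loop, analysis, result, testVariable) are invented for this example and are not taken from the XCIL schema; Python stands in here for the XSLT and XQuery tooling mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element and attribute names are invented for this
# sketch and do not reflect the actual XCIL XML representation.
DOC = """
<model>
  <method name="update">
    <loop id="L1" kind="for" testVariable="i"/>
    <loop id="L2" kind="while" testVariable="done"/>
  </method>
  <analysis name="modified-in-body">
    <result loop="L1" variable="i"/>
  </analysis>
</model>
"""

root = ET.fromstring(DOC)

# Query 1: a structural query over the program representation.
for_loops = {loop.get("id"): loop for loop in root.findall(".//loop[@kind='for']")}

# Query 2: join the structural result with stored analysis results.
flagged = [
    result.get("loop")
    for result in root.findall(".//analysis[@name='modified-in-body']/result")
    if result.get("loop") in for_loops
]

print(flagged)
```

Because both kinds of information live in one tree, the second query can refer to the outcome of the first without any format conversion.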

The KCS framework is currently being used to develop an inspection tool for safety-critical avionics software. Using XPSL, Rockwell Collins has created a catalog of patterns (Daugherty and Kothari, 2002). The patterns are intended for inspection of high assurance middleware and the applications that use it. Current patterns address issues related to: control flow, data integrity, pointer integrity, synchronization, multiple interface inheritance, design by contract and subtyping, OO metrics, dynamic resource allocation, etc. A complete list of the issues addressed by these patterns and the patterns themselves are available from our website (ISU–Rockwell DARPA SEC project website, 2004).

The paper is organized as follows. Section 2 provides examples used to illustrate our pattern-based approach for performing inspections. Section 3 gives an overview of XCIL, the common intermediate language. Section 4 gives an overview of XPSL, the pattern specification language. Section 5 describes formalization using XPSL. Section 6 gives an overview of the analysis framework. In Section 7, we summarize the application of the KCS framework toolset to develop an inspection tool for safety-critical avionics software systems. In Section 8, we give conclusions and discuss opportunities for future research.

2. Pattern-based software inspection: Examples

We give two examples of patterns useful for detecting certain types of problems during inspection. The patterns used to detect the problems are described informally, but serve as the basis for the formal specifications provided in Section 5. We have intentionally selected inspection problems that are simple to describe informally, but more difficult to specify formally. Often, the informal description of a pattern glosses over many difficult issues. In contrast, a formal definition of a pattern must include an exact and explicit specification of all that one must look for. This makes formal specification of patterns and the design of XPSL non-trivial. In Section 5, we revisit the examples in a formal context.

2.1. Test variable reassignment—A pattern for high assurance inspection

This pattern is motivated by a certification requirement for avionics systems. To eliminate the potential for subtle errors, modification of the loop control and loop condition variables in for loops is forbidden at the level A certification of safety-critical avionics systems (Software Considerations, 2004). In this case, the pattern might be informally stated as “look for any for loop in which a test variable (a loop control or loop condition variable) is modified inside the body of the loop”.

A number of non-trivial details about what to look for are hidden within this apparently simple and informal definition. The problem is that there are endless possibilities for the ways the test variables can be modified. Some examples are shown in the code segments given in Figure 1.

It is non-trivial to elegantly formalize into one pattern the description of what to look for so that all the possible variations of the violations are captured. The problem is all the more difficult when we must consider a variety of different languages.
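To make the gap between the informal statement and a real check concrete, here is a minimal sketch of the simplest possible search, written in Python against Python source using the standard ast module. It is not the XPSL pattern and it does not operate on XCIL; the function name reassigned_loop_vars and the sample program are assumptions made for illustration. It only catches direct assignment to the loop variable, which is exactly the easy case.

```python
import ast

def reassigned_loop_vars(source):
    """Return (line, name) pairs where a for-loop variable is assigned in its own body."""
    findings = []
    tree = ast.parse(source)
    for loop in ast.walk(tree):
        if not isinstance(loop, ast.For):
            continue
        # Names bound by the loop header, e.g. `i` in `for i in ...`.
        loop_vars = {n.id for n in ast.walk(loop.target) if isinstance(n, ast.Name)}
        for stmt in loop.body:
            for node in ast.walk(stmt):
                # A Name in Store context is a direct write: `i = ...` or `i += ...`.
                if (isinstance(node, ast.Name)
                        and isinstance(node.ctx, ast.Store)
                        and node.id in loop_vars):
                    findings.append((node.lineno, node.id))
    return findings

SAMPLE = """
for i in range(10):
    total = i
    if i == 3:
        i += 2
for j in range(3):
    print(j)
"""

findings = reassigned_loop_vars(SAMPLE)
print(findings)  # [(5, 'i')]
```

Modifications made through an alias, through a called function, or through a pointer in a language such as C are invisible to this purely syntactic search, which is why the variations mentioned above call for semantic analysis.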

2.2. Reverse locking—a pattern for inspection of concurrent software

In this case, the objective is to check if the software contains certain types of coding patterns that may result in deadlock. Figure 2 shows one such pattern. The code example is taken from Tanenbaum’s book on operating systems (Tanenbaum, 2001). It involves two concurrent functions function_A( ) and function_B( ) with a reverse locking pattern. The down and up operations are performed on binary semaphores.

Figure 1. Examples of test variable modifications to be formalized as a single pattern.

Figure 2. Example of reverse locking.

Suppose one thread executes down(&resource_1) in function_A( ), while a second thread executes down(&resource_2) in function_B( ). This creates a potential for a deadlock. A deadlock occurs when the threads execute the second down operation in their respective functions. When that happens, the first thread holds resource_1 and waits for resource_2 while the second thread holds resource_2 and waits for resource_1. Thus, there is a circular wait that creates a deadlock and neither of the two threads can proceed.

In this case, the problematic programming pattern involves two down (lock) operations that are performed in the reverse order by the concurrent functions. The purpose of the inspection is to look for this “reverse order of locking” pattern.

A number of non-trivial details about what to look for are hidden within this apparently simple and informal definition of the pattern. For example:

1. One must look for functions that may be executed concurrently. This involves the following additional details: one must look for the calls that create and launch threads. For example, the function to be executed by a thread may be specified as a parameter in the thread library call. We may require a points-to analysis if function pointers (rather than direct names) are being used. Moreover, one must do a control flow analysis to determine if the execution intervals of threads can overlap.

2. After one has identified concurrent functions, one must look for those functions that have “reverse order of locking.” This involves looking for the following: One concurrent function has a down operation on a binary semaphore S1 “followed by” another down operation on a binary semaphore S2, without an up operation on S1 “in between” these two down operations. Another concurrent function has a down operation on a binary semaphore S2 “followed by” another down operation on a binary semaphore S1, without an up operation on S2 “in between” these two down operations. Formalizing the “followed by” notion is complicated by the fact that the down and up operations may be conditional (e.g., enclosed in if statements) and may not be controlled by the same condition. This makes it hard to say what sequences of downs and ups really occur during execution of the program and when they really represent pairs and when they do not. Again, a control flow analysis is required here to check the “followed by” and “in between” constraints. Also, it may not be so simple to identify each pair of lock and unlock operations on the same semaphore. Pointers could have been used instead of explicit names of semaphores. In general, an inter-procedural dataflow and pointer analysis would be required to identify the target semaphore on which the lock or unlock operations are being performed.

3. Instead of specifying lock calls syntactically, it is preferable to specify them semantically in a formal specification. In general we do not want to depend upon specific class and operation names since these may vary from system to system; for example, other syntax may have been used instead of down and up.
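The core of step 2 can be sketched in a few lines of Python once every hard part listed above has been assumed away. The sketch below assumes that the functions are already known to be concurrent, that each function is a straight-line sequence of operations with no conditionals, and that semaphores are referred to by explicit names. The helper names lock_order_pairs and reverse_locking, and the order of the up operations in the sample traces, are assumptions for illustration only.

```python
from itertools import combinations

def lock_order_pairs(ops):
    """Return (held, acquired) pairs for a straight-line list of ('down'|'up', semaphore) steps."""
    held = []
    pairs = set()
    for action, sem in ops:
        if action == "down":
            # Every semaphore still held is "followed by" this down with no up in between.
            pairs.update((h, sem) for h in held)
            held.append(sem)
        elif action == "up" and sem in held:
            held.remove(sem)
    return pairs

def reverse_locking(functions):
    """Return (f, g, s1, s2) where f takes s1 then s2 and g takes s2 then s1."""
    orders = {name: lock_order_pairs(ops) for name, ops in functions.items()}
    conflicts = []
    for f, g in combinations(sorted(orders), 2):
        for s1, s2 in sorted(orders[f]):
            if (s2, s1) in orders[g]:
                conflicts.append((f, g, s1, s2))
    return conflicts

TRACES = {
    "function_A": [("down", "resource_1"), ("down", "resource_2"),
                   ("up", "resource_2"), ("up", "resource_1")],
    "function_B": [("down", "resource_2"), ("down", "resource_1"),
                   ("up", "resource_1"), ("up", "resource_2")],
}

conflicts = reverse_locking(TRACES)
print(conflicts)  # [('function_A', 'function_B', 'resource_1', 'resource_2')]
```

Each assumption removed from this sketch brings back one of the analyses named in the list: thread-creation and points-to analysis for concurrency, control flow analysis for conditional operations, and inter-procedural pointer analysis for semaphore identity.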

Based on these seemingly simple examples, it should be apparent that it is non-trivial to formalize an informal description of the problem the pattern is intended to detect. It is also often difficult to look for these patterns in code without tool support. These difficulties are only compounded when we consider patterns that are intended to recognize more difficult problems, and when we take into account the number of patterns that must be applied to fully inspect software as part of the certification process.

3. XCIL: eXtensible Common Intermediate Language

Traditionally programming languages have divided the software research community. For users, these divisions have made it difficult to apply the research results and tools developed for one language to others. This situation is a particular problem for high assurance software, where a variety of analysis methods are typically required, and a greater emphasis is placed on their use. However, the recent emphasis on virtual machines as targets for compilation, and the introduction of standards for executable analysis and design model representations have made it possible to focus on execution models that are not language specific.

Taking advantage of this opportunity, we have created XCIL to provide a common semantic representation that is needed in order to apply patterns and other transformations at a target language independent level. This helps us achieve the goals of:

• addressing many languages with a single toolset,

• a well defined common semantics as a basis for tool development for multiple languages,

• interoperability with UML based tools,

• support for executable requirements and design models.

A semantic representation common to all programming languages is probably too much to ask for. As a more modest goal, however, it seems possible to define the semantics for a family of languages that includes the UML Action Semantics, the JVM, Microsoft’s Common Type System and Intermediate Language (CTS + MS-IL) and the programming languages and graphical notations that map to them.

So why do we need a new language (XCIL)? Why not just use an existing intermediate language or an existing program representation? For example, UML/XMI does not define low-level actions (for arithmetic, for Boolean logic, etc.). MS-IL and the JVM do so, but lack the high level concepts provided by UML for modeling, and need to be reconciled within a common language definition for us to map to/from them both. Compiler intermediate representations are often proprietary and lack the industry and research community support found for open standards such as the JVM, MS-IL and UML. They also tend to lack the formal semantic definitions associated with the VM standards (JSIS, 2004; High-Assurance Java Virtual Machine, 2004).

To overcome the limitations of these representations we have unified and integrated the above representations. The key insights are that (a) UML and the VM models are complementary, (b) the two VM models are very closely aligned, and (c) these VM models can be extended to support C and C++ specific features. We have discussed the possible choices including XMI, MS-IL, JVM, and compiler specific intermediate language representations for designing XCIL in the XCIL reference document available on the web (ISU–Rockwell DARPA SEC project website, 2004). At the modeling level, XCIL is based on UML’s XML representation, XMI (Object Management Group, 2002a, 2002c). It includes the UML 1.4 action semantics (Object Management Group, 2002c) and some extensions (with regard to the redefinition/renaming of inherited elements) taken from the UML 2 specifications adopted by OMG. The XML representations of XCIL and XMI are virtually identical except with respect to their representation of elements from the action semantics. In terms of the action semantics, the XML representation for XCIL is more compact, although it contains the same information.

In summary:

• In general, in XCIL, all structural elements are based on corresponding UML definitions.

• At the action level, XCIL provides high-level definitions corresponding to those in the UML action semantics. Because the action semantics does not define low level actions, these definitions provide only a high level organization for the low level instructions defined by the JVM and MS-IL, one that allows us to classify them in a manner consistent with UML. All actions are specified in a precondition/postcondition style, with a list of exceptions thrown.

• In XCIL, all subtyping relationships imply substitutability in strict compliance with the principles of behavioral subtyping and the Liskov Substitution Principle (LSP).

• In terms of low-level actions, XCIL provides definitions corresponding to JVM and MS-IL instructions. Due to the close correspondence between MS-IL and the JVM, it is usually possible to define most actions in terms of a superclass with semantics common to them both. Where the JVM or .NET supports similar concepts (e.g., classes, packages, and assemblies) we take a “least common denominator approach”, defining an XCIL element that provides an invariant common to them all. Where this results in a significant loss of semantic information, subtypes are introduced that extend the common definition to provide the original semantics of the JVM and MS-IL instructions. In accordance with LSP, this means the superclass definition must specify a precondition stronger than or equal to the preconditions for both the MS-IL and JVM instruction subclasses, and a postcondition weaker than or equal to the postconditions for both the MS-IL and JVM instruction subclasses.

• Where MS-IL defines instructions with no JVM counterpart, these result in their own XCIL definitions. Conversely, JVM instructions with no MS-IL counterpart also result in their own XCIL definitions.

• Given that the JVM and MS-IL do not support all of the features of C and C++, we have extended XCIL at a low level to include definitions for those elements that appear in the compiler-specific AST representations of these languages, but are missing from the MS-IL and JVM execution models.

• Adopting this approach, XCIL is not an exact match with any of the representations from which it was derived. However, it is “extremely close” to all of them, making it practical to perform a mapping to any of them with very little effort (e.g., using XSLT).
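The direction of the two LSP conditions stated in the list above (superclass precondition at least as strong, superclass postcondition at least as weak) is easy to get backwards, so a small Python sketch may help. It checks both implications by enumeration over finite samples. The "integer divide" action, its contracts, and the names implies_on and substitutable are hypothetical; they are not XCIL, JVM, or MS-IL definitions.

```python
def implies_on(samples, p, q):
    """True if q holds for every sample on which p holds (p => q, by enumeration)."""
    return all(q(s) for s in samples if p(s))

def substitutable(samples_in, samples_out, super_pre, super_post, sub_pre, sub_post):
    """Check the two contract conditions on finite samples.

    super_pre => sub_pre   (the subtype may only weaken the precondition)
    sub_post  => super_post (the subtype may only strengthen the postcondition)
    """
    return (implies_on(samples_in, super_pre, sub_pre)
            and implies_on(samples_out, sub_post, super_post))

# Hypothetical divide action: inputs are (a, b), outcomes are (a, b, result).
inputs = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]
outcomes = [(a, b, r) for (a, b) in inputs for r in range(-4, 5)]

def common_pre(s):
    return s[1] > 0                  # common definition: positive divisor only

def common_post(o):
    return abs(o[2]) <= abs(o[0])    # common definition: result no larger than dividend

def variant_pre(s):
    return s[1] != 0                 # variant: any non-zero divisor

def variant_post(o):
    return o[1] != 0 and o[2] == int(o[0] / o[1])   # variant: truncating division

ok = substitutable(inputs, outcomes, common_pre, common_post, variant_pre, variant_post)
swapped = substitutable(inputs, outcomes, variant_pre, variant_post, common_pre, common_post)
print(ok, swapped)  # True False
```

Enumeration over a small sample is only a sanity check, not a proof; it is used here because it keeps the two implications visible.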

For more details, including a number of examples related to the representation of C++ programs in XCIL, we refer the reader to our web site (ISU–Rockwell DARPA SEC project website, 2004).

4. XPSL: eXtensible Pattern Specification Language

XPSL provides a language for the specification of patterns that extends the foundation provided by aspect-oriented software development (AOSD) (Kiczales et al., 1997; AspectJ Quick Reference, 2004). XPSL provides a much broader definition of pattern and pattern-based transformation founded on program semantics rather than syntax. It supports the composition of patterns from component patterns, and permits the definition of domain-specific pattern families.

In AOSD, pointcuts define sets of points (joinpoints) in the execution of a program at which code (advice) may be ‘woven’ with the underlying application (Kiczales et al., 1997). This is sufficient to address a number of common problems in which code representing a particular strategy or policy (for synchronization, caching, error logging, security, fault tolerance, etc.) must be introduced at many points in the code. Such strategies/policies are said to “crosscut” the normal functionality of the program. AOSD, through its separate definition of crosscutting strategies and policies and its ability to “weave” them into the code at specified joinpoints, allows one to “untangle” code that would otherwise mix these different aspects of the software at the source level.

AOSD tools such as AspectJ™ (AspectJ Quick Reference, 2004), however, have a number of important limitations. They define pointcuts syntactically rather than semantically. They view behavior at the granularity of methods rather than individual actions. They focus primarily on control flow rather than data flow. And they have only weak support for code transformations, which consist primarily of simple code insertions before, after or around method code and method calls. The limitations associated with pointcut definitions and the type of program information the inserted code can access also limit the range of problems that can be addressed.

Pattern = problem abstraction + pointcuts + transformations

Figure 3. Pattern components.

In part this is by intent. AOSD is intended to be less powerful, but more elegant, more accessible and easier to apply than meta-level programming and transformations based on low level rewriting rules. At its current level of maturity, however, AOSD is simply insufficient to address the full range of patterns needed by high assurance, real-time, embedded, or distributed systems.

The definition of a pattern in XPSL includes an abstraction of the problem, followed by a list of pointcut definitions and transformations that provide a solution (Figure 3).

Patterns are defined by specifying what to look for (in the problem section), by specifying the information we need to extract from the program to understand and solve the problem (in the pointcut section), and by specifying the high level changes to be made (in the transformation section). To address software inspection, we focus on the first two parts: the identification of the problem, and the pointcut queries related to its solution.

The pointcut portion of the pattern is similar to a database query. The objective is to extract information from the software; this information can be for program comprehension (e.g., recognizing variables that correspond to key domain-specific concepts), or can represent the information needed to apply a given transformation (e.g., the points where synchronization code must be inserted). Similar to query processing in databases, we support a unified representation that provides access to both software artifacts and analysis results. A unified representation is essential to support composition in XPSL, allowing the results obtained by applying one pattern to be consumed by another pattern.

AspectJ treats pointcuts as descriptors representing sets of points in the execution of the program, and provides set-like operators to compute the union, intersection, and complement of pointcuts. In our framework, we broaden the definition of pointcut to include arbitrary collections of model elements¹, and support additional operations on collections, sequences, and sets (such as those provided by XSLT (XSLT, 2004) and Object Constraint Language (OCL) (Object Management Group, 2002c)).

XPSL queries fall into one of two categories: Basic XCIL Model Queries and Analysis Based Queries. The first category includes basic queries of the XCIL program representation. XCIL is an abstraction of the syntax that also carries rich semantic information. Traditional syntactic patterns are handled at this level. The second category includes more advanced queries based on the results of static analysis. Analysis tools effectively extend the XCIL model, providing their own views of it, supporting different user perspectives. Like pointcuts in AspectJ, a query may have parameters representing values taken from the runtime environment. As sets of model elements, pointcuts may also be composed using set operations (such as intersection, difference, union, and complement).
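The idea of a pointcut as a set of model elements that composes through set operations can be sketched with Python's built-in frozenset. The Element class, its fields, and the sample model are hypothetical stand-ins, not XCIL model elements; complement is taken relative to the whole model, which is an assumption about how a universe would be chosen.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Element:
    """Hypothetical model element; the fields are invented for this sketch."""
    kind: str
    name: str
    owner: str

MODEL = frozenset({
    Element("call", "down", "function_A"),
    Element("call", "up", "function_A"),
    Element("call", "down", "function_B"),
    Element("assignment", "i", "function_B"),
})

def pointcut(predicate, model=MODEL):
    """A pointcut here is simply the set of model elements satisfying a predicate."""
    return frozenset(e for e in model if predicate(e))

calls = pointcut(lambda e: e.kind == "call")
in_b = pointcut(lambda e: e.owner == "function_B")
locks = pointcut(lambda e: e.name == "down")

calls_in_b = calls & in_b        # intersection
calls_or_b = calls | in_b        # union
non_lock_calls = calls - locks   # difference
not_calls = MODEL - calls        # complement, relative to the whole model
```

Because every result is again a set of elements, any of these can be fed into a further composition, which is the property the text relies on for composing patterns.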

4.1. Basic XCIL model queries

XCIL provides an extensible common intermediate language representation for meta-models with a semantics based on UML, the JVM, and MS-IL. All of the attributes,components, and associations defined by the XCIL model are accessible via XPSLmeta-level queries. The metaclass associated with an instance of a XCIL model el-ement may be requested using the query ‘metaclass()’, corresponding to oclType in(Warmer and Kleppe, 1999). We have given below a couple of examples of operationsavailable on XCIL model elements.

To access a property p of element e, we invoke a query with the same name. The result of the query is a value whose type is that of the property. For example, the name of a model element ‘e’ may be referred to as ‘e.name()’. Simple set operations on element properties are also assumed.

OCL path expressions (Object Management Group, 2002c) can be used to access related elements by following the associations between them. For example, given an element e, the name of the Namespace to which the element belongs may be referred to as ‘e.namespace().name()’.
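The property and path queries described above can be mimicked in a few lines of Python. This is only an illustrative sketch: the classes `ModelElement` and `Namespace`, their fields, and the sample names are hypothetical stand-ins and are not part of XCIL or XPSL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Namespace:
    """Hypothetical stand-in for an XCIL Namespace element."""
    label: str

    def name(self) -> str:
        return self.label


@dataclass(frozen=True)
class ModelElement:
    """Hypothetical stand-in for an XCIL model element."""
    label: str
    owner: Namespace

    def name(self) -> str:
        # Property query: named after the property it exposes.
        return self.label

    def namespace(self) -> Namespace:
        # Association query: returns the related element, so calls can be chained.
        return self.owner


e = ModelElement("update_attitude", Namespace("flight_control"))
element_name = e.name()                 # mirrors 'e.name()'
namespace_name = e.namespace().name()   # mirrors 'e.namespace().name()'
print(element_name, namespace_name)
```

Because each association query returns another element, a path expression is simply a chain of such calls.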

In addition to basic queries/sets and the ability to navigate through the model using path expressions, the metaclass definitions in XPSL provide additional operations on XCIL model elements, and the ability to construct and iterate over collections of such elements. These operations are based on the OCL operations select, iterate, and collect, and provide a context for the execution of associated transformations.
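As a rough analogy, the three OCL operations named above behave like filter, map, and fold. The Python sketch below assumes this reading; the function names, the dictionary representation of elements, and the sample data are illustrative assumptions rather than XPSL definitions.

```python
from functools import reduce


def select(collection, predicate):
    """Keep the elements that satisfy the predicate (OCL 'select')."""
    return [x for x in collection if predicate(x)]


def collect(collection, mapping):
    """Map every element to a derived value (OCL 'collect')."""
    return [mapping(x) for x in collection]


def iterate(collection, initial, step):
    """Fold the collection into one accumulated value (OCL 'iterate')."""
    return reduce(step, collection, initial)


# Hypothetical model elements, represented here as plain dictionaries.
elements = [
    {"kind": "ForLoop", "name": "L1"},
    {"kind": "Statement", "name": "S1"},
    {"kind": "ForLoop", "name": "L2"},
]

loops = select(elements, lambda e: e["kind"] == "ForLoop")
loop_names = collect(loops, lambda e: e["name"])
element_count = iterate(elements, 0, lambda acc, _e: acc + 1)
print(loop_names, element_count)
```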

The XPSL collection metaclass includes collection, set, sequence, and bag. The detailed descriptions, including the operations defined for different types of collections, are given in the XPSL reference document available on the web (ISU–Rockwell DARPA SEC project website, 2004). These are based on those defined by OCL (Object Management Group, 2002a; Warmer and Kleppe, 1999). Since a pointcut is defined to be a collection of model elements, operations on a collection such as union, intersection, and difference can be used to support the composition of pointcuts in a manner similar to AspectJ. It is also possible to extend the current set of operations, e.g., to include other operations on collections supported by XML Query, XSLT, etc. In the current version of XPSL we have not done so, restricting ourselves to a useful subset of the operations provided by OCL.
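Treating a pointcut as a plain set of element identifiers makes this composition easy to picture. The sketch below uses Python's built-in `frozenset`; the element identifiers and the two sample pointcuts are hypothetical, and the complement is taken relative to an explicitly supplied universe of elements.

```python
# Hypothetical element identifiers standing in for XCIL model elements.
all_elements = frozenset({"s1", "s2", "s3", "s4", "s5"})
locks = frozenset({"s1", "s4"})            # e.g., every lock operation
in_loops = frozenset({"s1", "s2", "s3"})   # e.g., everything inside a loop body

locks_in_loops = locks & in_loops    # intersection
locks_or_loops = locks | in_loops    # union
loop_non_locks = in_loops - locks    # difference
non_locks = all_elements - locks     # complement within the given universe

print(sorted(locks_in_loops), sorted(loop_non_locks))
```

The result of each operation is again a set of elements, so it can serve as the input of a further query.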

A query statement defines a persistent variable to which results are assigned or in which results are collected. The XPSL definition contains a class called QueryStatement. Its semantics are further defined by its subtypes, which include IteratePointcut, SelectPointcut, and CollectPointcut.

4.2. Analysis-based queries

In addition to basic queries involving the properties directly defined by XCIL, analysis tools may extend the XCIL model, and create their own views of it. The following queries provide access to this information.

The meta level type associated with an instance of an XCIL model element may be requested using the query ‘metaType()’, corresponding to oclType in (Warmer and Kleppe, 1999). The meta type of an element may be tested to determine whether it is a subtype of another meta type using ‘isSubtypeOf(metaType)’.
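The following Python sketch shows one way such a subtype test could work, by walking a chain of parent links. Everything in it is assumed for illustration: the classes `MetaType` and `Element`, the single-parent hierarchy, the intermediate type `Loop`, and the choice to treat a type as a subtype of itself are not taken from the XCIL definition.

```python
class MetaType:
    """Hypothetical meta-level type with a single parent link."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def is_subtype_of(self, other):
        # Walk up the parent chain; a type counts as a subtype of itself here.
        current = self
        while current is not None:
            if current is other:
                return True
            current = current.parent
        return False


class Element:
    """Hypothetical model element that knows its meta type."""

    def __init__(self, meta_type):
        self._meta_type = meta_type

    def meta_type(self):
        return self._meta_type


# Assumed hierarchy, for illustration only.
statement = MetaType("Statement")
loop = MetaType("Loop", parent=statement)
for_loop = MetaType("ForLoop", parent=loop)

e = Element(for_loop)
print(e.meta_type().name, e.meta_type().is_subtype_of(statement))
```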

Operations may be referred to using OCL path expressions (Warmer and Kleppe, 1999). For example, all elements in the control flow of a model element ‘e’ may be requested by writing ‘e.cflow()’.

To support a notation similar to AspectJ, many of the same operations are also provided on sets of XCIL elements, e.g., on a ModelElementSet. In some cases, a particular operation (e.g., sharedVariables) applies only to the set as a whole (ThreadSet), and not to its individual members. In other cases, an operation applies only to the individual members and not to the set as a whole.

In XPSL, where it is possible to do so, we match the UML meta-model operations and the naming conventions (for model element collections and operations) standardized by OMG (Object Management Group, 2004).

For more details, we refer the reader to the XPSL reference document and pattern catalogs on our web site (ISU–Rockwell DARPA SEC project website, 2004).

5. Formal patterns

In an earlier section, we gave two examples of patterns. We described these patterns informally and pointed out a number of issues and difficulties to address in creating a formal description of such patterns. In this section, we provide formal descriptions of the two patterns.

5.1. XPSL syntax

An overview of the XPSL syntax is given in Tables 1–4. We do not cover the entire XPSL syntax here, only the syntax used in this paper to formalize the two example patterns. The syntax is divided into the following broad categories: (a) control structures, (b) basic constructs based on the XCIL model, (c) analysis based constructs, and (d) constraints.

Basic XPSL constructs are based on XCIL model elements to make it possible to define patterns at the semantic level instead of depending on a programming language specific syntax. The benefit is that the syntactic differences are abstracted out and XPSL patterns become applicable across different languages. For example, XCIL does not depend on the specific syntax patterns used by programming languages; instead, it captures the semantics of loop statements by three types of loops: DoWhileLoop, ForLoop, and WhileLoop. Another example is the use of the XCIL constructs CriticalRegionEnter and CriticalRegionExit to capture the semantics of the wait and signal operations, respectively. Again, the point is not to depend on syntactic variations. For example, P and V operations may have been used instead of wait and signal.

Table 1. XPSL constructs for control structures.

| Construct | Description |
|---|---|
| SelectPointcut | A pointcut representing a selection of model elements from a given collection |
| IteratePointcut | A pointcut representing an iteration over the model elements of a given collection |
| CollectPointcut | A pointcut representing a collection of model elements related by a mapping expression to those in a given collection |
| Body | The body of an iteration pointcut |
| From | An expression to associate with the collection of model elements to be searched |
| Constraint | The constraint (if any) that must be satisfied for an element to be selected/included |
| ElementType | The type of element to look for/iterate over |
| Expression | An expression (identical to those defined by OCL) |

Table 2. XPSL constructs based on XCIL model elements.

| Construct | Description |
|---|---|
| ForLoop | A loop that executes its body for each value of a loop index in a given range |
| ForLoop.test | The test of the ForLoop |
| ForLoop.body | The body of the ForLoop |
| Statement | The basic unit of behavioral specification, that does not return a value |
| CriticalRegionEnter | Acquire the lock to enter into a critical region |
| CriticalRegionExit | Release the lock while exiting from a critical region |
| CriticalRegionEnter.shared | The lock (semaphore) acquired while entering the critical region |
| CriticalRegionExit.shared | The lock (semaphore) released while exiting the critical region |

Table 3. XPSL constructs based on program analysis.

| Construct | Description |
|---|---|
| action → variablesRead() | All variables that may be read (either directly or indirectly) by executing the action |
| action → variablesAssigned() | All variables that may be assigned to (either directly or indirectly) by executing the action |
| cflowbetween(elem1, elem2) | Returns all elements in the path(s) of control flow between two points elem1 and elem2 |
| elem1 → cflowfirst(elem2) | Returns the first instances of elem2 from the set of all elements on every control flow path starting at elem1 |

Table 4. XPSL constructs used in constraints.

| Construct | Description |
|---|---|
| set1 → intersection(set2) | Returns the intersection of set1 and set2 |
| set → size() | Returns the number of elements in set |
| set → includes(elem) | Returns true if elem is a member of set, otherwise returns false |

The analysis based XPSL constructs provide a way to encode complex properties of program behavior. The benefit is that domain-specific strategies and policies can be expressed succinctly. For example, constructs such as variablesRead() or variablesAssigned() make it possible to describe the behavior of reading or writing without getting entangled in the endless possibilities of how it might happen, as we have pointed out for the test variable reassignment pattern. Similarly, constructs such as cflowfirst() and cflowbetween() allow us to construct queries that require control flow information.

The XPSL constructs for specifying constraints are identical to the OCL constructs. These include constructs for checking equality, forming a union or intersection, or checking whether a given element belongs to a specified set. For example, we can specify those statements that affect the outcome of a test by using the constraint that the intersection of certain sets is not empty. This constraint is used in the formalization of the test variable reassignment pattern as follows:


    test.variablesRead() → intersection(statement.variablesAssigned()) → size() > 0

XPSL is a specification language. It is intended to focus on what as opposed to how. An analysis-based query in XPSL succinctly describes the characteristics of program behavior without having to go into the details of the program analysis needed to identify those characteristics. For example, the constraint shown above specifies concisely and precisely what to look for to identify the loops in which a test variable is modified by a statement in the loop body. The construct statement.variablesAssigned() specifies the set of variables modified (assigned) by a statement. A significant amount of program analysis may be needed to find the set. For example, a variable may have been accessed through a pointer, and pointer analysis may be required to identify the actual variables being assigned by the statement. The specification does not get entangled in the type of analysis that is required. Similarly, to find a pair of lock and unlock operations, the XPSL specification uses lock.cflowfirst(unlocks). The construct cflowfirst() allows one to refer to the first instances of a given type of element along every program execution path starting from a given point in the program. This is done without having to go into the details of the control flow analysis required to find such instances.
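To make the meaning of cflowfirst() concrete, the sketch below computes it over a small control flow graph given as an adjacency dictionary. It is one possible reading of the construct and not the analysis used by the KCS tools; the graph encoding, the function signature, and the node names are assumptions made for this example.

```python
def cflowfirst(cfg, start, targets):
    """Return the members of `targets` that are met first on paths leaving `start`.

    `cfg` maps each node to a list of its successor nodes.
    """
    found = set()
    visited = set()
    stack = list(cfg.get(start, []))
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        if node in targets:
            found.add(node)   # first hit on this path: do not search past it
            continue
        stack.extend(cfg.get(node, []))
    return found


# Hypothetical graph: one branch unlocks at once, the other unlocks twice in a row.
cfg = {
    "lock_a": ["branch"],
    "branch": ["unlock_1", "work"],
    "work": ["unlock_2"],
    "unlock_1": ["exit"],
    "unlock_2": ["unlock_3"],
    "unlock_3": ["exit"],
    "exit": [],
}
unlocks = {"unlock_1", "unlock_2", "unlock_3"}
first_unlocks = cflowfirst(cfg, "lock_a", unlocks)
print(sorted(first_unlocks))
```

In this graph `unlock_3` is not reported, because every path that reaches it has already passed `unlock_2`.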

5.2. Formalization of the test variable reassignment pattern

The pattern in Table 5 represents a query that returns those for loops whose test variables are modified inside the loop body. The outermost SelectPointcut statement indicates that we are interested in all ForLoops within a user-specified body of software. The name of the resulting set is loops. As we examine each of these loops individually, they are assigned to an iteration variable named loop. The innermost SelectPointcut statement extracts those statements from the body of each loop that affect the outcome of its test. The for loops where such statements are present are gathered using the CollectPointcut statement.

Table 5. Test variable reassignment pattern specified in XPSL.

    SelectPointcut loops
        Description  The set of all for loops
        From         $AllProgramElements
        ElementType  ForLoop
    IteratePointcut loop
        Comment      For each loop from this set
        From         loops
        Body
            SelectPointcut testAssignments
                Comment      Identify any assignments that affect the outcome of the test condition
                From         loop.body
                ElementType  Statement
                Constraint   test.variablesRead() → intersection(statement.variablesAssigned()) → size() > 0
            CollectPointcut reassignLoops
                Comment      A loop in which a test variable is modified in the body of the loop
                Expression   loop
        end
    end

The pattern references the children “body” and “test” of ForLoop, in accordance with its XCIL definition. Other expressions refer to analysis results. For instance, test.variablesRead() and statement.variablesAssigned() are XPSL constructs that require program analysis capabilities.
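The structure of the pattern in Table 5 can be traced in a short Python sketch. The sketch assumes that the read and assigned variable sets have already been computed by some analysis and are simply stored on each element; the classes `ForLoop` and `Statement`, their fields, and the sample program are hypothetical. Following the prose description, a loop is collected only when at least one matching statement was selected.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    text: str
    assigned: frozenset    # stand-in for statement.variablesAssigned()


@dataclass
class ForLoop:
    label: str
    test_reads: frozenset  # stand-in for test.variablesRead()
    body: list


def reassign_loops(elements):
    # SelectPointcut loops: every ForLoop among the given elements.
    loops = [e for e in elements if isinstance(e, ForLoop)]
    result = []
    for loop in loops:  # IteratePointcut loop
        # SelectPointcut testAssignments: the intersection must be non-empty.
        test_assignments = [
            s for s in loop.body if len(loop.test_reads & s.assigned) > 0
        ]
        if test_assignments:
            result.append(loop)  # CollectPointcut reassignLoops
    return result


program = [
    ForLoop("L1", frozenset({"i", "n"}), [
        Statement("x = x + 1", frozenset({"x"})),
        Statement("i = i + 2", frozenset({"i"})),
    ]),
    ForLoop("L2", frozenset({"j"}), [
        Statement("y = j", frozenset({"y"})),
    ]),
    Statement("z = 0", frozenset({"z"})),
]
flagged = [loop.label for loop in reassign_loops(program)]
print(flagged)
```

Only `L1` is flagged, since its body assigns `i`, which its test reads.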

This example also shows how we extend the definition of pointcuts provided by AspectJ (AspectJ Quick Reference, 2004). In the for loop test assignment pattern, for instance, we need to be able to ask for all the variables appearing in a given expression (the test expression) and ask which statements from the body of the loop appear within a data flow that sets these variables. Neither of these expressions can be written in AspectJ, which works only at the method level.

In the definition of a pattern, an appropriate problem abstraction is critical in order to prevent the analysis from becoming intractable. Essentially, the problem abstraction enables one to focus on the core information necessary for solving a problem. In this example, the core information includes the for loop, its index variable, and the write set of the loop body.

5.3. Formalization of the reverse locking pattern

We focus only on reverse locking, one of the many issues discussed in the informal example. A formal description of patterns that cover all of those issues is quite long. The issues are fairly complex to describe even in English, so it should not be surprising that the formal description is complex too. A formal specification of the “reverse locking” pattern is given in Table 6.

The first two pointcuts, locks and unlocks, match all critical region entry and exit points in a given body of code. From these points we create pairs of corresponding enter (lock) and exit (unlock) operations. The criteria for creating a pair are that (a) the entry and exit operations must share the same lock for synchronization, and (b) the exit is the first such operation on an execution path from the given entry operation.

After these lock/unlock pairs have been identified, the execution intervals between corresponding lock and unlock operations may be examined. These intervals are compared to each other, two at a time. Assume the intervals being compared are those associated with the pairs syncPair1 and syncPair2. A reverse locking violation occurs if a lock (innerLock1) in the interval defined by syncPair1 is the same as the lock used by syncPair2, and a lock (innerLock2) within the interval defined by syncPair2 is the same as the lock used by syncPair1.
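The pairwise comparison just described can be sketched in Python as four nested loops, mirroring the four iterations in the XPSL specification. The sketch assumes that the lock/unlock pairs and the locks acquired inside each interval have already been extracted by control flow analysis; the `SyncPair` class, its fields, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncPair:
    name: str
    resource: str   # stand-in for the shared lock of the pair
    inner: tuple    # resources locked between the lock and its unlock


def reverse_locking(pairs):
    violations = []
    for p1 in pairs:                        # syncPair1
        for inner1 in p1.inner:             # innerLock1
            if inner1 == p1.resource:
                continue                    # inner lock must be a different resource
            for p2 in pairs:                # syncPair2
                if p2.resource == p1.resource:
                    continue                # the two outer locks must differ
                for inner2 in p2.inner:     # innerLock2
                    if inner2 == p2.resource:
                        continue
                    # Each interval acquires the lock that guards the other one.
                    if p1.resource == inner2 and p2.resource == inner1:
                        violations.append((p1.name, p2.name))
    return violations


pairs = [
    SyncPair("t1", "A", ("B",)),   # holds A, then takes B
    SyncPair("t2", "B", ("A",)),   # holds B, then takes A: reverse order
    SyncPair("t3", "A", ("C",)),   # holds A, then takes C: harmless
]
violations = reverse_locking(pairs)
print(violations)
```

Because both pairs are iterated independently, each violation is reported once per ordering of the two pairs.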

5.4. Other pattern examples

We have created the SEC High Assurance Patterns Catalog as a part of a DARPA Software Enabled Controls (SEC) project. It contains a number of other formally specified patterns related to control flow, data integrity, pointer integrity, synchronization, permissible side effects, the use of multiple inheritance, design by contract, behavioral subtyping, coupling and cohesion, object-oriented metrics, dynamic resource allocation, and safe language subsets.


Table 6. Reverse locking pattern specified in XPSL.

    SelectPointcut locks
        Comment      The set of all critical region entry (lock) operations
        From         $AllProgramElements
        ElementType  CriticalRegionEnter
    SelectPointcut unlocks
        Comment      The set of all critical region exit (unlock) operations
        From         $AllProgramElements
        ElementType  CriticalRegionExit
    IteratePointcut lock
        Comment      An invocation of a ‘lock’ operation from this set
        From         locks
        Body
            IteratePointcut unlock
                Comment      A corresponding ‘unlock’
                From         lock.cflowfirst(unlocks)
                Constraint   lock.shared() = unlock.shared()
                Body
                    CollectPointcut syncPairs
                        Comment      All pairs of corresponding lock/unlock invocations
                        Expression   sequence [lock, unlock]
            end
    end
    IteratePointcut syncPair1
        Comment      For each pair, syncPair1
        From         syncPairs
        Body
            IteratePointcut innerLock1
                Comment      For each lock, innerLock1, appearing within this scope
                From         cflowbetween(syncPair1 → first(), syncPair1 → last())
                Constraint   locks → includes(self) and (self.shared() != syncPair1 → first().shared())
                Comment      If the action itself is a lock, and the inner lock applies to a different resource
                Body
                    IteratePointcut syncPair2
                        Comment      For each pair, syncPair2
                        From         syncPairs
                        Constraint   (syncPair1 → first().shared() != syncPair2 → first().shared())
                        Comment      The two outer locks are different
                        Body
                            IteratePointcut innerLock2
                                Comment      For each lock, innerLock2, appearing within this scope
                                From         cflowbetween(syncPair2 → first(), syncPair2 → last())
                                Constraint   locks → includes(self) and (self.shared() != syncPair2 → first().shared())
                                Comment      If the action is itself a lock, and the inner lock is for a different resource
                                Body
                                    CollectPointcut reverseLocking
                                        Comment      Violation of the reverse locking pattern
                                        From         sequence [syncPair1, syncPair2]
                                        Constraint   (syncPair1 → first().shared() = innerLock2.shared()) and
                                                     (syncPair2 → first().shared() = innerLock1.shared())
                                        Comment      that are locked in reverse order by the two inner locks
                                        Expression   sequence [syncPair1, syncPair2]
                            end
                    end
            end
    end


Figure 4. A schematic diagram of the framework.

6. The KCS framework for developing tools

A schematic diagram for the KCS framework appears in Figure 4. A parser converts source code to an extensible common intermediate language (XCIL) representation of the program. We currently have XCIL converters for FORTRAN, C, C++, and COBOL, and plan to develop a converter for Java in the near future. In addition, the close relationships between XCIL and MS-IL, between XCIL and the JVM, and between XCIL and UML/XMI make it possible to integrate with a variety of other tools.

The core analyzers perform various types of generic program analysis whose results can be reused by the pattern-based analyzers. For example, the Block-Level Abstract Syntax Tree (BLAST) analyzer creates a BLAST, and annotates it with the READ and WRITE information for each statement and block. With respect to this analysis, a block represents a piece of code such as a loop body or a linear code segment with no control statements.
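A block-level annotation of this kind might be pictured as follows. The sketch assumes that the READ and WRITE sets of a block are the unions of the sets of everything nested in it; that rule, the `Stmt` and `Block` classes, and the sample code are assumptions made for illustration and do not describe the actual BLAST analyzer.

```python
from dataclasses import dataclass, field


@dataclass
class Stmt:
    text: str
    read: set
    write: set


@dataclass
class Block:
    children: list                        # Stmt or nested Block instances
    read: set = field(default_factory=set)
    write: set = field(default_factory=set)


def annotate(block):
    """Fill in READ/WRITE for a block bottom-up (assumed rule: union of children)."""
    for child in block.children:
        if isinstance(child, Block):
            annotate(child)
        block.read |= child.read
        block.write |= child.write
    return block


# Hypothetical fragment: n = 10; loop body { i = i + 1; x = a[i]; }
loop_body = Block([
    Stmt("i = i + 1", {"i"}, {"i"}),
    Stmt("x = a[i]", {"a", "i"}, {"x"}),
])
outer = annotate(Block([Stmt("n = 10", set(), {"n"}), loop_body]))
print(sorted(outer.read), sorted(outer.write))
```

A pattern such as test variable reassignment could then consult the WRITE set of a loop body without re-examining its statements.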

To date, XPSL has been used to formally specify the patterns appearing in the High Assurance Patterns Catalog (ISU–Rockwell DARPA SEC project website, 2004), and as a basis for their subsequent manual implementation. In the future, XPSL patterns will be automatically converted to an executable form. The information required to execute an XPSL pattern generally falls into three categories: local tree matching on a semantically rich AST, control and data flow analyses, and global relationship queries. Although we have not found cases that cannot be handled by a combination of these three, our architecture can accommodate other types of analysis by allowing other analyzers to place information into the repository.

We have had great success implementing local patterns using XSLT (XSLT, 2004). For example, the use of pointer arithmetic can be detected by looking for an arithmetic operator with at least one pointer operand. While the description of this pattern is deceptively simple, the determination that an operand is a pointer may require the examination of a number of different declarations in the source code. For example:


    typedef int *handle;
    typedef handle transaction_handle;
    transaction_handle p1;
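Deciding that `p1` is a pointer means following the chain of typedefs back to the underlying type. The Python sketch below illustrates that step under strong simplifications: types are plain strings, typedefs are kept in a dictionary, and a type counts as a pointer if its resolved text ends in `*`. The function name and this representation are assumptions for the example, not the XSLT implementation mentioned above.

```python
def is_pointer(type_name, typedefs):
    """Follow typedef aliases until a non-alias type is reached."""
    seen = set()
    while type_name in typedefs and type_name not in seen:
        seen.add(type_name)            # guard against cyclic aliases
        type_name = typedefs[type_name]
    return type_name.rstrip().endswith("*")


# The declarations from the example, as alias -> aliased type.
typedefs = {
    "handle": "int *",
    "transaction_handle": "handle",
}
p1_is_pointer = is_pointer("transaction_handle", typedefs)
print(p1_is_pointer)
```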

Although the analysis itself is straightforward, it is difficult to obtain reliable results when it must be applied manually to a large body of software. This is particularly true when the relevant information is spread across multiple source files. As a result, a variety of use and flow analyses such as read/write, def-use chaining, points-to analysis, and abstract interpretation such as null pointer detection have been implemented in XSLT, C++, and Java. Global relationship patterns such as the detection of less visible operations (LVO) in subclass relationships are handled using an SQL database. An LVO violation is found when the visibility of an overridden virtual method is reduced by an inheriting class. For example:

    class A {
    public:
        virtual void foo() = 0;
    };

    class B : public A {
    protected:
        void foo();  // overrides A::foo with reduced visibility
    };

Although this violation is obvious, manually searching for all LVO subclass operations in a large body of software (such as the 450,000 line middleware services software for the SEC DARPA project), where class definitions are spread across many header files, is difficult and error prone.
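A global relationship query of this kind might look as follows when the class information is stored in a relational database. The sketch uses Python's standard `sqlite3` module with an in-memory database. The schema (tables `methods` and `inherits`), the numeric visibility ranking, and the restriction to direct superclasses are all assumptions made for this example; the paper does not describe the actual database layout.

```python
import sqlite3

# Higher rank means more visible; the column name is substituted below.
RANK = "CASE {col} WHEN 'public' THEN 3 WHEN 'protected' THEN 2 ELSE 1 END"

# Report overriding methods that are less visible than the virtual method
# they override in a direct superclass.
LVO_QUERY = f"""
    SELECT sub.class_name, sub.method, sup.visibility, sub.visibility
    FROM methods AS sub
    JOIN inherits AS i ON i.subclass = sub.class_name
    JOIN methods AS sup
      ON sup.class_name = i.superclass AND sup.method = sub.method
    WHERE sup.is_virtual = 1
      AND {RANK.format(col='sub.visibility')} < {RANK.format(col='sup.visibility')}
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE methods (class_name TEXT, method TEXT,
                          visibility TEXT, is_virtual INTEGER);
    CREATE TABLE inherits (subclass TEXT, superclass TEXT);
    INSERT INTO methods VALUES ('A', 'foo', 'public', 1);
    INSERT INTO methods VALUES ('B', 'foo', 'protected', 1);
    INSERT INTO methods VALUES ('B', 'bar', 'public', 0);
    INSERT INTO inherits VALUES ('B', 'A');
""")
lvo_violations = conn.execute(LVO_QUERY).fetchall()
print(lvo_violations)
```

With the two classes from the example loaded, the query reports `B.foo`, whose visibility drops from public to protected.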

7. High assurance avionics systems

Increasingly, software is one of the most critical components in high assurance, safety-critical systems. For example, software provides the critical control logic necessary for flight guidance in commercial avionics systems, and is used to perform extreme maneuvers (a goal of the DARPA Software Enabled Controls program that funded this research) that would not be possible without software assistance. In such systems, software defects can easily cause a loss of life, loss of aircraft, or mission failure. For this reason, software quality is a key concern when it comes to the certification of such systems to FAA or military standards. FAA standards (DO-178B) (Software Considerations, 2004) for software development, in particular, place a premium on analysis and verification activities as part of an overall high assurance software development process. Such activities can consume as much as two thirds of a project’s budget. Even so, tight schedules and cost considerations place limits on what can be done, making it difficult to achieve the desired degree of confidence.

To meet the goals of the DARPA Software Enabled Controls (SEC) program, Rockwell Collins created a list of 83 key issues (ISU–Rockwell DARPA SEC project website, 2004) related to the development of high assurance avionics software. This list includes both design and language level issues, issues that represent obstacles to the use of the types of analysis called for by (Software Considerations, 2004) and (ISO Ada Standard, 2004), issues related to the verification and testing process, issues related to long term maintenance of the software, and potential run-time errors.

In general, addressing these issues is difficult. The inspection process must look not only at individual software modules but also at their potential interactions in a number of different system and operational contexts.

In the case of the SEC program, our focus has been on analysis and adaptation of the middleware services needed to support flight control using hybrid automata to achieve a degree of maneuverability that is only possible with software assistance. The software is object-oriented and written in C++.

A number of the patterns in our catalog are based on guidelines for the use of object-oriented technology in aviation (OOTiA) (OOTIA web site, 2004) developed by Rockwell Collins, Boeing, Honeywell, and Goodrich, and reviewed in a series of FAA/NASA sponsored workshops (Multiple Inheritance, 2004). Other patterns are based on guidelines concerned with data integrity, the use of pointers, concurrency, the use of dynamic allocation following initialization, etc. These guidelines are taken from the design and coding standards for high assurance software used by Boeing and Rockwell Collins, from the ISO High Integrity Ada standard (ISO Ada Standard, 2004), and from other similar sources. There are currently some forty patterns that address the most important of the issues from our list.

Initially we performed a number of manual inspections of the SEC middleware to help us determine the nature of the problems most often found in practice. Over the course of this project, however, we learned to increasingly rely on the tools to help us detect likely problems, and to create refined versions of our initial patterns based on the problems found. Doing so without basic tool support for exploring the software and refining the patterns would have been infeasible for an application the size of the SEC middleware services.

Currently the KCS toolset implements eight of the patterns addressing ten of the most important issues. Some of these patterns require only a local analysis, while others require a system-wide global analysis. In applying these tools to the SEC software, we have been able to find hundreds of violations of the patterns with an efficiency and accuracy that would have been impossible to achieve by more conventional means, even with a larger inspection team. For the purpose of adaptation, we have chosen to focus on patterns related to the verification of subtyping relationships, and the refactoring of the class hierarchy to correct commonly found problems.

8. Conclusions

Our research is aimed at the construction of a Knowledge-Centric Software (KCS) framework for building tools that perform software inspections, analysis, and transformations. Compared to frameworks such as (Antoniol et al., 1997; Harandi and Ning, 1990), the KCS framework provides:

• A pattern-based approach for creating customized tools for different application domains.

• An eXtensible Common Intermediate Language (XCIL) to address many languages with a single toolset.

• An eXtensible Pattern Specification Language (XPSL) to serve as an umbrella for multiple analysis/transformation technologies.

• A core set of analyzers to support queries and transformations that require global data and control flow analysis.

• XML-based representations to leverage XML tools and to provide interoperability with other tools.

• Standardization and compliance of XCIL with MSIL, JVM, and UML action semantics.

This paper focuses on the development of pattern-based tools for automated software inspection. The tools are customizable for different domains through the development of domain-specific pattern catalogs. The paper describes our approach to the development of such tools, and gives examples to illustrate the approach. As a part of a DARPA SEC project, we are applying the KCS toolset to the inspection of high assurance software for safety-critical avionics systems. In applying these tools to the SEC software, we have been able to find hundreds of violations of the patterns with an efficiency and accuracy that would have been impossible to achieve by more conventional means.

To fully support the features of XPSL, we intend to develop a powerful query-processing engine that leverages XML technologies such as XQuery (W3 XML Query web site, 2004; XSLT, 2004) and the proposed OMG Query View Transformation (QVT) standard (Object Management Group, 2002b). We also expect to draw on ideas from database research on graph-based query-processing and query optimization (Flesca, Furfaro, and Greco, 2002).

While creating the KCS framework, we have paid attention to the practical barriers that often preclude the use of automation tools. In our experience, two of the most common problems are the lack of usability and scalability. For example, we must address the problem of information explosion. This is especially important for large and complex systems. Of course, such systems are also the ones with the greatest needs and potential benefits. Based on our experience working with large software systems, such as the 450,000 lines of C++ SEC code for middleware services, we are deploying parallel computing to speed up the analysis. To present the analysis results to the user in a meaningful way, we apply a variety of visualization techniques. We also report results in a number of different formats, including database representations from which users can create multiple views by writing SQL queries.

This area is rich with opportunities for future research, related both to the enhancement of the framework and to its application to build tools for different domains. In collaboration with Rockwell Collins, we plan to develop additional pattern catalogs that capture their knowledge of the real-time, embedded, and fault tolerance aspects of software. We plan to use these, and patterns gathered from other sources (Douglass, 2002; Schmidt et al., 2000), to help us in developing a formalism that is firmly anchored in real-life software problems.

Other topics of interest include developing a variety of tools including tools for software metrics (McCabe and Watson, 1994; Littlefair, 2001), applying the toolset to various domains, supporting various modeling and programming languages, and integrating with other analysis and development tools. Recent trends toward the generation of code from high-level models (model based development), and the integration of this code with commercial off-the-shelf software (e.g., for CORBA) also suggest a need to support a mixture of model and code analysis on the same project.

Note

1. Rather than only those elements associated with program traces.

References

Anderson, P., Reps, T., Teitelbaum, T., and Zarins, M. 2003. Tool support for fine-grained software inspection, IEEE Software, July: 42–50.

Antoniol, G., Fiutem, R., Lutteri, G., Tonella, P., Zanfei, S., and Merlo, E. 1997. Program understanding and maintenance with the CANTO environment, International Conference on Software Maintenance, Bari, Italy, 1–3 October 1997, pp. 72–81.

AspectJ Quick Reference. http://www.aspectj.org/doc/dist/quick.pdf.

Aurum, A., Petersson, H., and Wahlin, C. 2002. State-of-the-art: Software inspection after 25 years, Software Testing Verification Reliability 12(3): 133–154.

Bishop, P. et al. 2002. Software critical analysis of COTS/SOUP, Computer Safety, Reliability and Security: Proc. 21st Int’l Conf. Safecomp 2002, Lecture Notes in Computer Science, Vol. 2434, pp. 198–211. Springer-Verlag.

Chilenski, J.J. and Miller, S.P. 1994. Applicability of modified condition/decision coverage to software testing, Software Engineering Journal 7(5): 193–200.

Cscope homepage. http://cscope.sourceforge.net/.

Daugherty, G. and Kothari, S.C. 2002. Pattern examples, version 1.2o, Rockwell Collins Advanced Technology Center and Iowa State University, November.

Douglass, B.P. 2002. Real-Time Design Patterns, Robust Scalable Architecture for Real-Time Systems. Addison-Wesley Professional.

FAA/NASA. 2002. Multiple inheritance, FAA/NASA workshop position paper, OOTiA-1, version 2.1, Proceedings of the 1st FAA/NASA Workshop on Object-Oriented Technology in Aviation (OOTiA), Norfolk, VA, April 2002, to appear.

Fagan, M.E. 1976. Design and code inspection to reduce errors in program development, IBM Systems J. 15(3): 182–211.

Features: Code inspection. http://www.intellij.com/idea/features/6.

Fiutem, R., Tonella, P., Antoniol, G., and Merlo, E. 1996. A cliché-based environment to support architectural reverse engineering, Proc. of IEEE Conf. on Software Maintenance, Monterey, November, pp. 319–328.

Flesca, S., Furfaro, F., and Greco, S. 2002. XGL: A graphical query language for XML, Proceedings of the International Database Engineering and Applications Symposium (IDEAS’02), July 17–19, 2002, Edmonton, Canada.

Gamma, E., Helm, R. et al. 1995. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley.

Gilb, T. and Graham, D. 1993. Software Inspection. Addison-Wesley.

Harandi, M.T. and Ning, J.Q. 1990. Knowledge-based program analysis, IEEE Software 7(1): 74–81.

High-Assurance Java Virtual Machine. http://www.kestrel.edu/home/projects/java/.

ISO Ada Standard. http://www.adaic.org/compilers/acaa.html.

ISU–Rockwell DARPA SEC project website. http://dirac.ee.iastate.edu/sec/.

JSIS: Semantic Interface Specification for Java Technology. http://www.jsistools.com/download/.

Kamperman, J. Automated software inspection: A new approach to increased software quality and productivity, http://www.apacheweek.com/issues/.


Kiczales, G., Lamping, J., Mendhekar, A., Maeda, C., Lopes, C.V., Loingtier, J.-M., and Irwin, J. 1997. Aspect-oriented programming, Proceedings of the European Conference on Object-Oriented Programming (ECOOP),Finland, Lecture Notes in Computer Science, Vol. 1241. Springer-Verlag.

Kothari, S.C. 2002. Automatic parallelization, Aspect-Oriented Programming, and Beyond, High-PerformanceComputing Asia Conference, Banglore, India, December 2002 (invited paper).

Laitenberger, O. 2002. A survey of software inspection technologies, Handbook on Software Engineering andKnowledge Engineering, Vol. 2, pp. 517–555. World Scientific Publishing.

Littlefair, T. 2001. An Investigation into the Use of Software Code Metrics in the Industrial Software DevelopmentEnvironment, Ph.D. Thesis, Cowan University, http://www.fste.ac.cowan.edu.au/~tlittlef/.

McCabe, T. and Watson, A. 1994. Software complexity, Crosstalk, J. Defense Software Engrg. 7(12).

Mitra, S. and Kothari, S. 1997. Parallelization agent: A new approach to parallelization of legacy codes, 8th SIAM Conference on Parallel Processing for Scientific Computing.

Object Management Group. IDL available from http://www.omg.org/.

Object Management Group. 2002a. OMG Unified Modeling Language Specification (Action Semantics), version 1.4 with Action Semantics, final adopted specification, available from http://www.omg.org/.

Object Management Group. 2002b. Request For Proposal: MOF 2.0 Query/View/Transformations RFP, ad/2002-04-10, April 24.

Object Management Group. 2002c. UML 1.4 with Action Semantics, Chapter 6, Object Constraint Language Specification, available from http://www.omg.org/.

OOTIA web site. http://shemesh.larc.nasa.gov/foot/.

Parnas, D.L. 1994. Inspection of safety critical software using function tables, Proc. IFIP 13th World Computer Congress, Vol. 3, pp. 270–277. North-Holland.

Parnas, D.L. and Lawford, M. 2003. Inspection’s role in software quality assurance, IEEE Software, July: 16–20.

Paul, S. and Prakash, A. 1994. A framework for source code search using program patterns, IEEE Transactions on Software Engineering 20(6): 463–475.

Reasoning Inc. Automated software inspection: A new approach to increased software quality and productivity, http://www.reasoning.com.

Schmidt, D. et al. 2000. Pattern-Oriented Software Architecture, Vol. 2. Patterns for Concurrent and Networked Objects. New York, Wiley.

Software Considerations in Airborne Systems and Equipment Certification. 1992. Document No. RTCA/DO-178B, RTCA Inc., Washington, DC, December 1.

Software metrics and static analysis. http://www.cs.queensu.ca/Software-Engineering/archive/static.html.

Tanenbaum, A.S. 2001. Modern Operating Systems, 2nd ed. Prentice-Hall.

Viega, J., Bloch, J.T., Kohno, T., and McGraw, G. ITS4: A static vulnerability scanner for C and C++ code, http://www.cigital.com/its4/.

Wagner, D. Static analysis and software assurance, http://www.cs.berkeley.edu/~daw/talks/sas01.ppt.

Warmer, J. and Kleppe, A. 1999. The Object Constraint Language: Precise Modeling with UML. Reading, MA, Addison-Wesley.

W3 XML Query web site. http://www.w3.org/XML/Query.

XSLT: Extensible stylesheet language transformation. http://www.w3.org/TR/xslt.

Dr. Kothari received his Ph.D. in mathematics from Purdue University in 1977. After teaching at the University of Oklahoma, he joined the Computer Science Department at Iowa State University in 1984. He joined the Electrical and Computer Engineering Department in 1999. He leads the Software Systems Group in the department.

He has pioneered the Knowledge-Centric Software (KCS) technology for automation tools for maintenance and evolution of large software. He has developed significant applications of the technology in a number of areas including parallel computing, high assurance software, and legacy software in business applications. He is the founder of EnSoft, a company at ISU Research Park that commercializes the KCS technology. He teaches courses in distributed computing, operating systems, parallel computing, and software engineering.

Luke Bishop received his B.S. in computer engineering from Iowa State University, and is currently in graduate school at ISU. Luke will graduate with an M.S. in computer engineering from the Software Systems Group in May 2004. Luke works in the Knowledge-Centric Software (KCS) laboratory to develop technology and tools for analysis, maintenance, and evolution of complex software. His main research work involves developing a method for performing incremental slicing-based impact analysis. Other research interests include software engineering, real-time systems, and distributed systems. Luke is also an employee of EnSoft, a company formed to commercialize KCS technology.

Jeremias Sauceda has no formal education in computer science or any related field. He taught himself how to program at age 6 from a college text. From ages 14 to 18 he was interested in micro-kernel operating systems design, compiler optimization, and radiosity rendering. Since age 19 Jeremias has researched Knowledge-Centric Software (KCS) technology for automation tools for maintenance and evolution of large software at Iowa State University with Dr. Kothari. He has developed significant applications of the technology in a number of areas, including high assurance software and legacy software in business applications. He is the Chief Technology Officer of EnSoft, a company at ISU Research Park that commercializes KCS technology. He hopes to complete a degree in Physics in the near future.

Gary Daugherty is a Senior Technical Staff member with Rockwell Collins’ Advanced Technology Center in Cedar Rapids, IA. He received his B.S. in electrical engineering from West Virginia University and attended graduate school at the same institution. His primary research interests are in object-oriented modeling and meta-modeling, aspects and patterns, feature-oriented programming, the formal specification of software interfaces and behavior, and their application to the development of safety-critical, distributed, real-time, embedded software systems. Mr. Daugherty helped lead efforts to develop standards for the use of Object-Oriented Technology in Aviation, since adopted by the FAA and NASA, and worked with Iowa State University in the development of tools to support the pattern-driven adaptation of middleware for high assurance applications as part of the DARPA Software Enabled Control (SEC) program.