A Generic Integration Architecture for Cooperative Information Systems

A Generic Integration Architecture for Cooperative InformationSystemsJohn Mylopoulos Avigdor Gal Kostas KontogiannisMartin StanleyDepartment of Computer ScienceUniversity of TorontoAbstractCooperative information systems consist of existinglegacy systems integrated in terms of a generic architec-ture which supports data integration and coordinationamong the integrated components. This paper presentsa proposal for a generic integration architecture namedCoopWARE. The architecture is presented in terms ofthe mechanisms it provides for data integration, andcoordination. Data integration is supported by an in-formation repository with an extensible schema, whilecoordination is facilitated by a rule set and an event-driven rule execution mechanism. In addition, the pa-per describes implementation and application experi-ences for the architecture in the context of a 3-yearsoftware engineering project.keywords: cooperative information systems1 IntroductionTraditionally, information systems have been de-�ned as software systems consisting of databases, appli-cation programs and user interfaces. However, currenttrends in business organizations point to a paradigmshift in organizational structures, away from tradi-tional, task-based forms and towards goal- or customer-oriented processes [16]. These trends are forcing a newview of information systems, hereby referred to as co-operative information systems. According to this view,the e�ectiveness of an information system within anorganization is not determined solely by the quality ofits components, which may have been developed inde-pendently and may pre-date the cooperative informa-tion system, but also (and perhaps most importantly)by the architecture through which the system is inte-grated with the rest of the organization so that it cancontribute to global business processes and organiza-

tional objectives. A cooperative information systemconsists of legacy systems, work ows and other soft-ware components developed independently and chang-ing over time. All these are integrated through an ar-chitecture so that the overall system continuously pro-vides useful information services to the organizationwithin which it functions.Such integration architectures are being explored inseveral di�erent areas of computer science, includingDatabases where the emphasis is on interoperationand data integration [19], Software Engineering wheretool and environment integration issues dominate [5],AI where systems consisting of distributed intelligentagents are being developed and explored [3] and ofcourse, Information Systems [23].This paper proposes CoopWARE, a generic integra-tion architecture, based on active database technology.It presents a natural, exible and easy to set mecha-nism for supporting cooperative information systems.In particular, CoopWARE supports a cooperation withdi�erent levels of data integration, which results in ahighly exible tool for emerging cooperation among in-formation systems. Our reported experiences are basedprimarily on a project whose aim has been to integratea variety of reverse engineering tools in support of re-verse engineering tasks [31].2 The CoopWARE Integration Archi-tectureIn a nutshell, an integration architecture provides ageneric framework for information exchange and co-ordination among a variety of existing software sys-tems. A primary requirement for the integration archi-tecture is that it provides a set of information serviceswhich may not have been anticipated during develop-ment of any of the component software systems. More-over, these services evolve over time to meet changing

Tool1

InterfaceLibrary

RuleSet

RuleEngine

InformationSchema

InterfaceLibrary

Machine A

Tool2

Tool3

InterfaceLibrary

InterfaceLibrary

Machine B

InterfaceLibrary

Figure 1. The Integration Architecturebusiness requirements. To be general, the frameworkmust make as few assumptions as possible about thesoftware systems being integrated. In particular, theframework should require as little as possible in modi-�cations of the components being integrated. Finally,the framework needs to be open in the sense that com-ponent systems can be added and/or removed easilywithout drastically a�ecting the functionality or the ef-fectiveness of the overall system. Two primary designgoals for the integration architecture proposed here aremodularity and the ability to operate either on a sin-gle host or over a network. Modularity was adoptedbecause it facilitates customization (or even replace-ment) of implementation components as needed. Forinstance, the underlying message transport software (inour case mbus, see below) can be easily changed, if amore desirable system is found.These requirements are addressed in the Coop-WARE Integration Architecture through a centralizedarchitecture which consists of an information reposi-tory for representing and maintaining a variety of in-formation generated or used by any of the integratedcomponents, also a coordination mechanism throughwhich the integrated components can participate andcontribute in a coordinated fashion to global informa-tion services provided by the cooperative informationsystem.Figure 1 presents the general structure of Coop-WARE. A particular instance of CoopWARE consistsof a coordinator which contains an information schemabased on an extensible informationmodel,1 a coordina-tion rule set, a rule execution engine, also a collection ofinterfaces, one for each integrated software componentand one for the coordinator. Each interface de�nes aset of services which can be executed by the compo-nent associated with the interface, also a set of eventsthrough which the component signals the beginning or1We use the term information model in the sense that datamodel is used in databases, i.e. an information model that pro-vides a set of data structures and associated operations and con-straints for representing information.

end of execution of an internal process. One set ofservices common to all interfaces provides facilities formessage passing among components.As a concrete example of the integration architec-ture, suppose that we wish to integrate two reverseengineering tools, Re�ne and Rigi. Re�ne is a commer-cial code analysis tool [21] which provides facilities forparsing the code being reverse engineered to generateabstract syntax trees, as well as a high level languagefor specifying patterns to be matched against abstractsyntax trees. Such patterns can be de�ned to supportthe analysis of code by determining, for example, thecall- or use-graph structure of a given piece of code.Rigi, on the other hand, is a research prototype [25]whose primary function is to assist in program under-standing by o�ering useful visualizations of code struc-ture. Rigi also o�ers a set of code analysis functions.One would ideally want to use either Re�ne or Rigi toanalyze code and then display the results of the anal-ysis using Rigi. Hence, there is a need to integrate thetwo tools in support of a reverse engineering activity.In integrating a new component into the environ-ment, there are three steps that need to be taken, pos-sibly in an iterative fashion, namely data integration,de�nition of services, events and rules and �nally reg-istration.Data Integration determines the information needsof each new component to be integrated and up-dates the information schema as appropriate. Inthe case of the reverse engineering project, thisstep determines what information is input or pro-duced by each tool. For instance, both Rigi andRe�ne operate on �les, but treat them di�erentlyand associate to �le objects di�erent informa-tion. If there already exists a schema integratingk components, data integration of the (k+1)stcomponent involves extending this schema to ac-commodate the information used by the (k+1)sttool. To properly support the data integrationactivity, an information model needs to be exten-sible and to provide for annotations of attributes,so that one can note which component is usingwhich attribute.Services, Events and Rules: This step determineswhat services will be o�er by the new compo-nent to the cooperative information system, alsowhat internal events it will report on. In ad-dition, this step determines what coordinationneeds to exist between the newly integrated com-ponent and others. This is accomplished throughevent-condition-action rules which trigger certainrules when certain events-condition pairs occur.

Rigi services:Parse: hSource-CodeiSG-Analysis: hRigi-AST, Grouped-ComponentsiRigi-Upload: hKB-Name, QueryiRigi-Download: hKB-Name, Set-of-TuplesiRigi events:Rigi-Parses: hSource-CodeiAST-updated: hRigi-ASTiRe�ne services:Parse: hSource-CodeiCBC-Analysis: hRe�ne-ASTiRe�ne-Upload: hKB-Name, QueryiRe�ne-Download: hKB-Name, Set-of-TuplesiRe�ne events:Grouped-Components-Updated:hGrouped-ComponentsiFigure 2. The services, events and messagesof the case studyFigure 2 presents an example of some of the ser-vices and events that might be declared for Rigiand Re�ne. Each service has a name, possibly fol-lowed by a set of parameters, some of which maybe part of the information schema. For instance,Parse is a service o�ered by both tools and takessource code to generate an abstract syntax tree.SG-Analysis performs a code structure analysisby taking as input an abstract syntax tree andgenerating a graph. CBC-Analysis is a Re�ne ser-vice that performs a clone-based-clustering anal-ysis on code data, measuring a number of met-rics on code fragments and then clustering thecode fragments according to the metric values re-turned. Both components have upload and down-load operations for retrieving and updating data.Services are activated through messages. In gen-eral, the coordinator needs to maintain a map-ping from the service logical name to the physicalprocedure which carries out this service withina particular component. For example, Re�necan translate a Parse operation to a command asgiven in an Emacs environment. Each such ser-vice execution results in a success/failure messagefrom the tool to the coordinator.Events represent occurrences of some phe-nomenon happening within a tool. For example,Rigi-Parse informs the world each time Rigi per-forms a parse operation. This event has as associ-

ated informationa pointer to the source code thatis being parsed. Rigi's AST-updated informs theworld that an abstract syntax tree has been up-dated. The event Grouped-Components-Updatednoti�es the world of a change in Re�ne's groupedcomponents.Registration make use of an existing interface libraryto provide a message-based communication inter-face between a new component and the coordi-nator, (...and through the coordinator, commu-nication with any other integrated component).How each component actually connects to thearchitecture depends on what kind of access toits internals is available. We support a range ofmethodologies from black box to white box [30]and various shades of gray in between. For ex-ample, given that Re�ne runs on top of LISP,we may want to treat Re�ne as a black box withwhich we communicate through an external me-diator program. In the case of Rigi, on the otherhand, where we have full access to its source code,we may be able to integrate directly. From thepoint of view of the coordinator and the integra-tion architecture in general, both tool interfacesoperate identically.Once the communication architecture is in place,the new component then registers with the coor-dinator. This is done dynamically and becomese�ective immediately. Thus, components can beadded to the architecture at any time and theycan also modify their capabilities at will. As eachcomponent registers its capabilities with the co-ordinator, the capabilities of the cooperative in-formation system grow.The architecture can evolve both because ofchanges in the registration pro�le of particularcomponents or because of changes in the coor-dinator's information schema. For example, thenotion of Request-Message may be added to theschema so that the coordinator can deal with anew type of messages. For the current imple-mentation, there are a number of generic mes-sage types and events, e.g. Request and Responsemessages, and Message-Delivered events. If onewanted to de�ne a new, special purpose messagetype, it is a simple matter to specialize the Mes-sage class, adding whatever message handling isrequired for the new message type through coor-dinator rules.

Figure 3. The three layers of the repositoryschema3 The Information RepositoryAn information repository is a computer-based sys-tem which stores, accesses and manages informationabout information sources and/or information pro-cesses. These information sources may be computer-based �les, databases and other data structures, whileinformation processes may be application programs orother procedures which operate on information sourcesto update or retrieve information. An informationrepository is di�erent from a database in that, like datadictionaries and data warehouses, it stores informationabout computerized information, rather than informa-tion about some application external to computers.The repository's main function is to facilitate dataintegration among the various components of the ar-chitecture. As such, it needs to provide an informa-tion model that is expressive, extensible and e�cient.It needs to be expressive so that it can model thedata sharing needs of the components easily and un-derstandably; it needs to be extensible so that it canaccommodate run-time changes (in particular it needsto support dynamic schema evolution). The informa-tion model needs to be supported with an e�cient im-plementation so it does not become a bottleneck.The information model adopted for CoopWARE isbased on Telos [26]. Its features include an object-oriented framework which supports generalization (in-cluding support for multiple inheritance), classi�cation(including support for multiple instantiation) and attri-bution (with multi-valued attributes), a general meta-modelling facility whereby classes are objects too andare instances of metaclasses and a novel treatment ofattributes including multiple inheritance and instanti-ation of attributes and attribute classes. This treat-

ment of attributes provides a powerful mechanism forde�ning multiple overlapping views of an informationobject. This allows the repository to have a consis-tent and singular representation of information that isshared among the components in the integration archi-tecture and at the same time allow for each compo-nent's unique perspective on that information.Figure 3 presents a schema to facilitate informationsharing among the integrated tools to share data for thereverse engineering project. It consists of three layers.The top layer exploits the meta-modelling facilities ofTelos to de�ne: (i) the types of attribute values thatthe repository will support; and (ii) useful groupingsof attributes for purposes of distinguishing informa-tion that is pertinent to each of the individual tools.The use of this layer facilitates schema evolution andprovides a �ltering mechanism by which each tool canaccess its own view of a commondata object. The mid-dle layer de�nes the repository schema with the aid ofthe metaclasses and attribute de�nitions from the toplayer. The bottom layer stores the actual data sharedamong the individual tools.Figure 4 shows a detailed example of a portion ofthis global schema. At the metaclass level, the schemaincludes the metaclass ObjectClass which has associ-ated attribute metaclasses singleValue, setValue, andsequenceValue. All instances of this metaclass can haveassociated attributes which are classi�ed under one (ormore) of these categories, depending on the range ofvalues intended. ObjectClass has two specializations,RigiClass and Re�neClass. The former has as instancesclasses that are manipulated by the Rigi tool, whilethe latter has instances that are classes manipulatedby the Re�ne tool. As indicated in Figure 4, thesemetaclasses have attribute metaclasses which identifyattributes used by Rigi and ones used by Re�ne.At the class level, the class File is de�ned as an in-stance of both RigiClass and Re�neClass. This declaresthat a �le object may be shared (and manipulated)both by Rigi and Re�ne. File has many attributes clas-si�ed under one or more attribute metaclasses. Eachof these attribute metaclasses de�ne capabilities (at-tributes) speci�c to their respective tool. For exam-ple, the attribute metaclass rigiAttribute, when usedin the class File, serves to group together all the Rigiattributes. In the Re�ne case, we have sub-dividedits attributes into two classes, mirroring the way Re-�ne represents its data (either as a tree or non-treeattribute). Having done this we are now able to easilyand e�ciently \�lter" objects so that, if desired, eachtool gets only the attributes it is interested in.

Figure 4. The File class description4 The Control Rule Set and Rule Exe-cution EngineThe coordinator is a set of tools and mechanismsthat enables an active monitoring, coordination and co-operation among systems. The coordinator is designedto provide services while enabling the required levelof local autonomy. It does not require full data inte-gration, although the more integrated the cooperativeinformation system is, the better the performance ofthe coordinator would be. The coordinator uses rulesto de�ne cooperation among information systems. Itsintellectual origins resides in the active database re-search area, and it uses latest advances in this researcharea [14] to generate a mechanism that maintains therelationships among information systems in a cooper-ative environment. This section presents the controlrule set and its use in the cooperative information sys-tem architecture environment (Section 4.1), and theexecution engine for rules (Section 4.2).4.1 RulesRules are programming constructs which make itpossible to specify the activation of services in re-sponse to event occurrences. In accordance with activedatabase conventions [10], each rule consists of threeparts, namely an event, a condition and an action. Anevent is a logic formula that consists of system events

Source code

Groupedcomponents

Straucturedgraph

Abstracr syntax tree Abstracr syntax tree

Parse Parse

CBCanalysis

SGanalysis

re ri

re ri

Independent analysis scenario

Straucturedgraph

Sequential analysis scenario

Source code

Abstracr syntax tree Abstracr syntax tree

Parse Parse

SGanalysis

re ri

re ri

CBCanalysis

Groupedcomponents

Figure 5. Possible cooperation scenarios be-tween Refine and Rigiand messages. Events can capture an occurrence ex-ternal to the system as well as the starting and endingof services. Each rule is associated with one or moreevent types. A condition is a Boolean formula thatconsists of data values, constants, etc. An action is aset of services to be activated, each service with theappropriate parameters.For example, consider again the reverse engineeringcase study. While each system can perform the parsingand analysis of a source code independently, the combi-nation of the analysis results of both systems providesthe user with a better understanding of the source code.Figure 5 presents two possible scenaria of cooperationamong the two tools. Data elements are representedusing rectangles, and operations are represented usingcircles. Parsere and CBC-Analysis are the operationsassociated with Re�ne, while Parseri and SG-Analysisare the operations associated with Rigi. The indepen-dent analysis scenario represents a scenario where eachtool analyzes and stores its own data. The sequentialanalysis scenario represents a scenario where Rigi gen-erates a structure graph based on the analysis of Re�neand its own. In the latter scenario, the results of CBC-Analysis is combined with Rigi's source code represen-tation. Consequently, nodes that are considered clonescollapse into a single node in the graph forming mod-ules. Following this scenario, links in the graph rep-resent multiple calls between subsystems to generatea more abstract system decomposition. This scenariocan be easily achieved by combining Rigi's and Re�ne'sanalysis capabilities, through the use of two rules, asgiven in Figure 6. Whenever a Rigi-Parses event oc-curs, Re�ne parses the source code and performs theCBC-Analysis service. One result of this analysis isthe instantiation of the Grouped-Components-Updatedevent. This event, combined with the AST-updatedevent, activates the SG-Analysis.

Rigi Event: Re�ne-Grouped-Componentsupdated (Re�ne-AST)and AST updated (Rigi-AST)Condition: noneAction: SG-Analysis (Rigi-AST,Grouped-Components)Re�ne Event: Rigi-Parses (Source-Code)Condition: noneAction: Parsere (Source-Code);CBC-Analysis (Re�ne-AST)Figure 6. Rules of the case studyThe use of the ECA (Event-Condition-Action)model enables us to make use of existing executionmodels for activating rules. In the case of coopera-tive information systems, the use of an explicit eventclause is essential. Since the event is generated by theinformation system, while the condition is veri�ed bythe coordinator, it is important to di�erentiate the con-dition clause from the event clause.

4.2 The Rule Execution EngineRules are triggered by events. An event detectionresults in the evaluation of a condition. If a conditionis evaluated to be true, the services in the action partof the rule are activated. If there is no condition foractivating a rule, the action part is activated as a directresult of an event detection. For example, the detectionof the event Rigi-Parses results in the direct activationof the �rst service Parsere.The rule execution engine in the CoopWARE envi-ronment di�ers from the rule execution engine of anactive database. While an active database performs allof the operations of the action part of the rule, the co-ordinator activates services of the related informationsystems. Consequently, the relationships among ser-vices is far more vague than the relationships amongactions in an active database. A well de�ned semanticsfor these relationships is currently devised, based onprevious works (e.g. [12]) to support priorities amongservices. The relationships between a detection of anevent and an evaluation of a condition, and the rela-tionships between an evaluation of a condition and anactivation of an action generate a special case of inter-transaction relationships that de�ne the atomicity ofthe operations within CoopWARE. An algorithm is de-vised to generate and maintain a correct set of inter-related atomic transactions based on the relationshipsamong events and services. A similar approach wastaken in transaction models dealing with long transac-tions (e.g. [20]).

Services & Events

TMB

mbus

Unix sockets

Figure 7. Coordinator implementation layers5 Implementation and ExperiencesThis section presents some implementation issues(Section 5.1) and some of the experiences in the frame-work of the reverse engineering case study (Section5.2.)5.1 ImplementationThe Information Repository (IR) has been im-plemented in C++ and uses ODI's object-orienteddatabase ObjectStore for persistence. The IR imple-ments a Telos language interpreter for data modelingand o�ers services such as object creation, deletion andupdate, also object retrieval through a query facility.The IR is currently being re-implemented as a main-memory version using at �les for persistence.Turning to the implementation of the coordinator,we have implemented the coordinator layer architec-ture from the TMB (Telos Message Bus) layer down(Figure 7) and are currently in the process of imple-menting the Services & Events layer. The TMB layer,implemented in C++, is an extensible message serverthrough which all tools can communicate, both withthe coordinator and with each other, using the com-mon schema. These messages form the basis for allcommunication in the system. The message server hasbeen implemented on top of mbus, an existing publicdomain software bus technology [6]. This, in turn, isdependent on the presence of Unix software sockets;these protocols have been ported to almost all mod-ern computer platforms, including Dos/Windows andApple machines. Since the basis for integration is thiscommunication, our architecture can incorporate toolsrunning in heterogeneous environments.The TMB o�ers message-passing using message ob-jects with client facilities for message creation, deletionand archiving. The design goals of the TMB includedextensibility, in the sense that a client can dynamicallyinform the system that it can handle a new kind of

request (for example, when Re�ne implements a newcomplexity measure it would register this capability,and thereafter other tools could make use of it). Animportant requirement is that the TMB support point-to-point as well as broadcast communication so thattools, for example, can send messages to a particulartool or to all tools of a certain type. E�cient transmis-sion of bulk data was deemed to be critical (rather thanobject-by-object retrieval), since that is the intendedmodus operandi for the integration architecture.The implementation of the rule language and exe-cution engine is currently under design. It is basedon work reported in [15], notably the notion of a de-pendency graph, built by analyzing given rules anddata elements. Figure 8 presents a partial dependencygraph for the rules of the reverse engineering case study.Events are depicted as triangles, operations as circles,and data elements as rectangles. Triggering edges con-nect an event with a condition and a condition withan action of the same rule. In case there is no condi-tion part in a rule, a triggering edge connects an eventdirectly with an action. For example, the event Rigi-Parses triggers a rule of Rigi and therefore a triggeringedge connects the event with the �rst action of this ruleParse. Update edges connect actions with the dataelements they update. For example, CBC-Analysisupdates Grouped-Components, resulting in an updateedge from CBC-Analysis to Grouped-Components. Re-quest edges are depicted using dotted lines, and repre-sent data element that a service uses, yet its modi�ca-tion does not trigger the service.The dependency graph is used by the coordinatorat run-time, based on an execution model as given in[14]. The results of events and messages are evalu-ated and any outgoing edges are triggered. For exam-ple, consider the dependency graph as given in Figure8. Whenever a Rigi-Parses event occurs, Re�ne's ruleis activated. Re�ne parses the source code and per-forms the CBC analysis. The result of this analysisis the update of Grouped-Components. This updateis informed through the event Grouped-Components-Updated event. This event, in addition to the AST-updated activates Rigi's rule, which performs the SG-Analysis.The generation of the dependency graph is astraight-forward procedure. The time-complexity ofgenerating a dependency graph is bounded by O(jVj+ jEj). It should be noted that the number of nodesand edges re ects the number of schema elements andthe relationships among them, and not the number ofactual objects in the information systems. Thus, thespace complexity of the graph, which is bounded byO(jVj + jEj) is relatively small. It is also worth noting

LEGEND:

DATA ELEMENT

EVENT

ACTION

TRIGGER/UPDATE RELATION

REQUESTRELATION

Parse

SG-Analysis

re

CBCAnalysis

REFINE-AST

Rigi-AST

Source-Code

Grouped-Components

Structured-Graph

Rigi-Parses(Source-Code)

AST-updated Grouped-

Components-updated

Figure 8. A partial dependency graph of thecase studythat the generation of the dependency graph is carried-out only whenever there is a change in the schema def-initions or changes in operation clauses. The complex-ity of the coordination depends more on the numberof rules than on the number of attributes devised as aresult of the data integration. Therefore, scaling up de-pends on an e�cient implementation of the rule enginerather than on the result of the data integration.

5.2 ExperiencesOur experiences in integrating Re�ne and Rigi wasgained over a three-year period and involved use ofthe integration architecture by three di�erent researchgroups which used Re�ne or Rigi and wanted to in-tegrate their code analysis tools with others, in orderto support more sophisticated reverse engineering sce-naria.We found that data integration involves not only theidenti�cation of commonly used information by di�er-ent tools, but also the identi�cation of a number ofpossible use scenaria for this information. For exam-ple, the analysis that is carried out by combining Rigi'sand Re�ne's capabilities requires a more complex in-formation schema than that used for other purposes.Moreover, the use of browsers and user friendly in-terfaces made it possible to experiment with di�erenttypes of analyses in order to understand the contents ofthe repository. Consequently, a customized repository

browser was built.As claimed, the integrated environment providedmore functionality than each of the individual toolsworking on its own. By combining analyses from thetwo tools we obtained results that would not be easilyobtained by using only one tool. For example, a Re-�ne cluster analysis based on use of common resources,that when imported to Rigi and combined with clustersobtained by analyzing the call graph and data types inthe Rigi environment we generated system views thatare not available by using Re�ne or Rigi separately.Moreover, the integrated system allows for a numberof developers working with di�erent machines and en-vironments to cooperate for the analysis of a softwaresystem, thus leveraging the capabilities and the exper-tise of maintainers at di�erent sites inside an organi-zation. The repository allows for more than one userto access the analysis environment allowing for multi-ple analyses to be performed simultaneously at di�er-ent machines. This is especially important for compu-tationally expensive analyses applied to large systems(>1MLOC).6 Related WorkThe CoopWARE architecture relates to a numberof di�erent research areas. Distributed databases,heterogeneous databases, multidatabases, federateddatabases, multi-view systems, work ows, global infor-mation systems, Internet applications and many othertypes of information systems are considered to be co-operative information systems.Global information systems [24] are systems thatinvolve a large number of information resources dis-tributed over computer networks, with autonomousmaintenance of data. Global information systems aresimilar to multidatabases (e.g. [27]) in their approach.Both system types seek to support global updates,while preserving site autonomy. Distributed databases[7], unlike global information systems, do not have afull autonomy of their data. Issues in structuring het-erogeneous databases are discussed in [4], and oth-ers. Federated databases [28] can be implementedbased on several architectures. For example, feder-ated databases can be implemented using loosely cou-pled systems, where the responsibility for understand-ing the semantics of the global system is transferredto the user. Federated databases can also be imple-mented using tightly coupled systems, where a globalconceptual schema is used to de�ne the semantics ofthe global schema. Research in the work ow area hasbeen in uenced by the concepts of long running ac-tivities [11], multi-system applications [1], polytrans-

multidatabases

federated databases

distributed databasesheterogeneous databases

workflows

global information systemsinternet applicationsmulti-view systems

NONE FULLLOCAL AUTONOMY

HIGH

LOW

DATASTRUCTURE

Figure 9. Distribution of information systemsactions [29], and application multiactivities [18]. ME-TEOR [22] is an example of a project whose objectiveis to support multi-system work ow applications.Having considered the common features of the abovementioned models, we design the integration architec-ture to support common features while enabling dif-ferent levels of autonomy, rigid structure and data in-tegration. Figure 9 presents the distribution of theseinformation system types on a two-axis grid. The hor-izontal axis depicts the level of local autonomy givenin each information system type and the vertical axisrepresents the rigidity level of the information system'sdata structure.The architecture is aimed at serving as an under-lying architecture for systems with di�erent levels oflocal autonomy, di�erent rigidity levels of data struc-ture, and di�erent levels of data integration. As such, itprovides a mechanism for the cooperation among infor-mation systems, without loosing the model generality.Such a model enables research in this area to be morefocused on issues that are common to all the abovementioned systems, such as alternative paths of opera-tions in case of network failures and the application ofmeta-data.MARVEL [2] is a process-centered environment thatsupports teams of users working on medium to largescale projects. An instantiated environment is createdby an administrator who provides the data schema,process model, tool envelopes (which are equivalent tocomponents), and coordination model for a speci�c ap-plication. The process is described in a process mod-eling language. Each process step is encapsulated in arule. The body of a rule consists of a query to bind lo-

cal variables; a logical condition that is evaluated priorto initiating the activity; an optional activity in whichan arbitrary external tool or application program maybe invoked through an envelope; and a set of e�ectsthat each assert one of the activity's alternative results.While CoopWARE use events to generate an automati-cally triggered environment, the MARVEL model doesnot use events as part of the model. In addition, themodel imposes a rigid structure of rules, while Coop-WARE assumes a rule where its last phase (the ser-vices) conclude the liability of the central mechanism,thus enabling a simpler transaction model.ARCHON (Architecture for Cooperative Heteroge-neous On-line Systems) [17] aimed to develop an ar-chitecture, software framework and methodology formulti-agent systems for real-world industrial applica-tions in the area of power system control supervision.The technology is based on a distributed AI approach,where coordination and responsibility are transferredto the components (termed subsystems in ARCHON)rather than being handled by a central mechanism.This approach may require major modi�cations to thecomponents, and therefore far less useful in the envi-ronment of existing softwares.In ARCHON, tasks are grouped together intorecipes, which are production rules, with TRIGGER,ACTIONS and RESULTS. The trigger part combinesevents and conditions. This method, in many cases isless e�ective that the separation of events and condi-tions to di�erent elements, evaluated at di�erent steps.In addition, the recipes requires a long-term control,while rules in CoopWARE enable a exible mechanismwith short term atomicity enforcement.CORBA [9] provides a widely accepted formalismfor specifying process communication in a client-serverenvironment. In the CORBA model objects provideservices, and clients issue requests. Object referencesare used for a client to locate the appropriate servers,so that it can direct its requests to them. The architec-ture proposed in this paper targets the enhancement ofa CORBA by providing a formalism for modeling com-munication in terms of rules that allow for the speci�-cation of data dependencies between processes as wellas high level task scheduling. Moreover, the approachis not tightly coupled to a particular language or imple-mentation (i.e. CORBA/C++) and treats any processor tool as a black box that o�ers in the environment aset of services as well as data via a registered interface.The area of active databases has gained an increas-ing interest in the research community during the lastfew years, resulting in numerous models and proto-types. The leading paradigm in this area is the ECA(Event-Condition-Action) that was proposed in the

framework of the HiPAC project [8]. Many of the ECA-based systems lack global control of an application's setof rules. This may result in non-deterministic behaviorwhich would have disastrous consequences in coopera-tive information systems.An alternative approach to the ECA paradigm wasproposed in the framework of the PARDES model [13]that is aimed at providing a high-level tool for deriveddata using a data-driven invariant language. The ac-tive model, as given in this work, is an extension ofthe PARDES and other similar models [14]. It usesthe analysis and control capabilities of these models togenerate a deterministic, well-understood cooperativearchitecture.7 ConclusionsWe have presented a view of cooperative informationsystems which is based on the premise that such sys-tems consist of legacy components integrated throughan architecture. The paper then proposed a generic ar-chitecture for cooperative information systems calledCoopWARE. The architecture supports facilities fordata integration of the components constituting thecooperative information system, also mechanisms forcoordination and policy enforcement among these com-ponents. The novelty of the proposed architecture liesin the integration of concepts from data modelling, ac-tive databases and organizational systems. In additionto the proposal, the paper reports on a (partial) proto-type implementation of the architecture and some ex-periences in applying it to build an integrated reverseengineering environment.Several issues require further research. Firstly, thedesign of the rule execution engine with an appropriatetransaction mechanism and the implementation on topof the TMB is currently under way. Secondly, we needto study several other cooperative information systemapplications to establish the adequacy of the proposedarchitecture, or to further re�ne the current proposal.Thirdly, we would like to study the co-existence of sev-eral coordinators within one cooperative informationsystem, each of which manages a partial informationrepository and only supports some coordination andpolicy enforcement activities. This will form a dis-tributed CoopWARE environment; we are currentlystudying how this might be applied to Internet servicesand data.References[1] M. Ansari, L. Ness, M. Rusinkiewicz, and A. Sheth.

Using exible transaction to support multi-systemtelecommunication application. In Proceedings of the18th VLDB Conference, Vancouver, british Columbia,Canada, 1992.[2] N. S. Barghouti. Supporting cooperation in the Mar-vel process-centered SDE. In Proceedings of the FifthACM SIGSOFT Symposium on Software DevelopmentEnvironments, pages 21{30, Tyson's Corner, Virginia,Dec. 1992.[3] S. D. Bird. Towards a taxonomy of multi-agent sys-tems. International Journal of Man- Machine Studies,39:689{704, Feb. 1993.[4] Y. Breitbart, H. Garcia-Molina, and A. Silberschatz.Overview of multidatabase transaction management.VLDB journal, 1(2):181{239, 1992.[5] A. Brown et al. Principles of CASE Tool Integration.Oxford University Press, 1995.[6] A. Carroll. ConversationBuilder: A CollaborativeErector Set. PhD thesis, University of Illinois, 1993.[7] S. Ceri and G. Pelagatti. Distributed Databases: Prin-ciples and Systems. Mcgraw-Hill, 1984.[8] U. Chakravarthy. Rule management and evaluation:An active DBMS perspective. ACM SIGMOD Record,18(3):20{28, Sep 1989.[9] W. Cook. Application integration, not application dis-tribution. In ACM OOPS Messenger, Addendum tothe Proceedings of OOPSLA 1993, pages 70{71, Apr.1994. Published as ACM OOPS Messenger, Adden-dum to the Proceedings of OOPSLA 1993, volume 5,number 2.[10] U. Dayal, A. Buchmann, and D. McCarthy. Rules areobjects too: A knowledge model for an active object-oriented database model. In Proceedings of the 2nd In-ternational Workshop on Object-Oriented Databases,pages 140{149, Sep 1988.[11] U. Dayal, M. Hsu, and R. Ladin. A transactionalmodel for long-running activities. In Proceedings ofthe 17th VLDB, pages 113{122, Sep 1991.[12] O. Etzion. Active interdatabase dependencies. Infor-mation Sciences, 75:133{163, 1993.[13] O. Etzion. PARDES | a data-driven oriented activedatabase model. SIGMOD RECORD, 22(1):7{14, Mar1993.[14] A. Gal. TALE | A Temporal Active Language andExecution Model. PhD thesis, Technion|Israel Insti-tute of Technology, Technion City, Haifa, Israel, May1995. Available through the author's WWW homepage, http://www.cs.toronto.edu/�avigal.[15] A. Gal, O. Etzion, and A. Segev. Tale | a tempo-ral active language and execution model. In Proc.CAiSE'96, Crete, Greece, May 1996.[16] M. Hammer and J. Champy. Re-Engineering theCorporation: A Manifesto for Business Revolution.HarperCollins Publishers, 1993.[17] N. Jennings, L. Varga, R. Aarnts, J. Funchs, andP. Skarek. Transforming standalone expert systemsinto a community of cooperating agents. Engineer-ing applications of Arti�cial Intelligence, 6(4):317{331, 1993.

[18] L. Kalinichenko. A declerative framework for captur-ing dynamic behavior in heterogeneous interoperableinformation resource environment. In Proceedings ofthe 3rd RIDE International Workshop on Interoper-ability in Multidatabase Systems (IMS'93), Apr 1993.[19] K. Karlapalem, Q. Li, and C. Shum. Hodfa: An ar-chitecture framework for homogenizing heterogeneouslegacy databases. SIGMOD RECORD, 24(1):15{20,Mar 1995.[20] J. Klein and F. Upton IV. Constraint based trans-action management. IEEE Data Eng. Bull., 16(2):20,June 1993.[21] G. Kotik and L. Markosian. Automating softwareanalysis and testing using a program transformationsystem. Technical report, Reasoning Systems Inc.,1989.[22] N. Krishnakumar and A. Sheth. Specifying multi-system work ow applications in meteor. Technical Re-port TM-24198, Bellcore, May 1994.[23] S. Laufmann, S. Spaccapietra, and T. Yokoi, editors.Proceedings Third International Conference on Coop-erative Information Systems, Vienna, May 1995.[24] A. Levy, D. Srivastava, and T. Kirk. Data modeland query evaluation in global information systems.Journal of Intelligent Information Systems, 5:121{143,1995.[25] H. Muller, B. Corrie, and S. Tilley. Spatial and visualrepresentations of software structures. Technical Re-port Tech. Rep. TR-74. 086, IBM Canada Ltd., Apr.1992.[26] J. Mylopoulos, A. Borgida, M. Jarke,and M. Koubarakis. Telos: Representing knowledgeabout information systems. ACM Transactions on In-formation Systems, Oct. 1990.[27] M. Rusinkiewicz et al. OMNIBASE: Design and im-plementation of a multidatabase system. In Proceed-ings 1st International Conference on Parallel and Dis-tributed Processing, pages 162{169, Dallas, TX, May1989.[28] A. Sheth and J. Larson. Federated database sys-tems for managing distributed, heterogeneous, andautonomous databases. ACM Computing Surveys,22(3):183{236, 1990.[29] A. Sheth, M. Rusinkiewicz, and G. Karabatis. Us-ing polytransactions to manage interdependent data.In A. Elmagarmid, editor, Transaction Models forAdvanced Database Applications, chapter 14. MorganKaufmann, 1992.[30] G. Valetto and G. Kaiser. Enveloping sophisticatedtools into computer-aided software engineering envi-ronments. In Proceedings of the 7th InternationalWorkshop on Computer-Aided Software Engineering(CASE'95), pages 40{48, 1995.[31] M. Whitney et al. Using an integrated toolset forprogram understanding. In Proceedings of the CAS-CON'95, pages 262{274, Nov. 1995.

A Generic Integration Architecture for Cooperative Information Systems

Documents

Transcript of A Generic Integration Architecture for Cooperative Information Systems