Workflows, Transactions, and Datalog

Anthony J. Bonner
University of Toronto
Department of Computer Science
Toronto, Ontario, Canada M5S 1A4
web: www.cs.toronto.edu/~bonner
email: [email protected]
phone: (416) 978-7441
fax: (416) 978-4765

November 26, 1998

Abstract

Transaction Datalog (abbreviated TD) is a concurrent programming language that provides process modeling, database access, and advanced transactions. This paper illustrates the use of TD for specifying and simulating workflows, with examples based on the needs of a high-throughput genome laboratory. In addition to database support, these needs include concurrent access to shared resources, synchronization of work, and networks of cooperating workflows. We also use TD to explore the computational complexity of workflows in data-intensive applications. We show, for instance, that workflows can be vastly more complex than database transactions, largely because concurrent processes can interact and communicate via the database (i.e., one process can read what another one writes). We then investigate the sources of this complexity, focusing on features for data modeling and process modeling. We show that by carefully controlling these features, the complexity of workflows can be reduced substantially. Finally, we develop a sublanguage called fully bounded TD that provides a practical blend of modeling features while minimizing complexity.

1 Introduction

The management of workflows and business processes is a ubiquitous task faced by many organizations in a wide range of industries [31, 37, 23, 48]. The problem is to coordinate the various activities involved in a complex process, such as trip planning, student registration, telephone installation, laboratory testing, and loan application processing. Because they can access and generate large volumes of data, many workflows require database support, including both data modeling and transaction management. However, because the demands of workflows are more complex than those of traditional database applications, conventional data management techniques are not enough. As one prominent researcher puts it, "The next big challenge for the information technology industry is the management and automation of business processes" [31]. This paper focuses on one aspect of this challenge: high-level languages for specifying workflows. In particular, we investigate the effects of data-oriented and process-oriented features, especially their influence on the computational complexity of workflows.

Background: The need for database support in workflow management shows up in several ways. For instance, declarative queries and updates are needed both for accessing application data, and for recording and monitoring the history of workflow execution. In fact, this need is so great in some applications that database benchmarks have been developed based on the queries and updates produced by high-volume workflows [22, 20]. In addition, since workflows may access shared persistent data, transactions are required. This presents a challenge since workflows can also create dependencies between transactions. For instance, depending on the workflow, if one transaction fails or commits, then others may have to be started or aborted [7, 35, 15]. Such dependencies are not accounted for in the classical theory of transaction management [12]. The result is that "business process management cannot be handled by means of conventional database techniques alone" [31]. For this reason, there has been considerable research on advanced transaction models [35] and their application to transactional workflows [47, 67, 61].

In addition to database support, workflow management requires process modeling. A process model describes the flow of control among the various activities that make up a workflow. For instance, in a simple linear workflow, the process model simply lists the order in which the activities must be carried out. In general, a process model can be highly complex, and can specify that some activities be carried out concurrently, that some be synchronized, that some be executed repeatedly, and that some be invoked as subprocesses of others. Numerous formalisms have been developed for this purpose, including various process algebras [56, 45, 43, 11, 8, 57], many kinds of Petri net [60, 51, 52, 33, 5], as well as temporal logic [28], state charts [39], concurrent transition systems [41], and concurrent logic programming [62]. These formalisms all allow concurrently executing processes to interact and communicate, and some (such as process algebras) allow new processes to be created recursively at runtime.

Computational problems associated with workflows have been extensively studied by the process-modeling community. This includes complexity results for problems such as verification, liveness, deadlock detection, and goal reachability (e.g., [41, 27, 10, 36, 28, 53, 58]).
However, most of this work focuses on control flow, and either ignores data entirely, or uses a highly simplistic model of data (such as tokens in Petri nets [60]).1 This leaves many questions unanswered. For instance, what is the influence of data-oriented features (such as queries and updates) on workflow complexity? How does the combination of data modeling and process modeling affect complexity? How does complexity depend on database size? What is the effect of transactional features? This paper begins to address these questions.

1 An exception is a kind of Petri net known as predicate/transition nets [52]. However, to our knowledge, there has been no analysis of these nets comparable to that presented in this paper.

Our vehicle for this investigation is Transaction Datalog (abbreviated TD), a concurrent programming language that provides both process modeling and database support [15, 17, 13]. TD has many of the features of process algebras. These include concurrent access to shared resources, communication between sequential processes, and the ability to isolate (or hide) the inner workings of a group of processes from the outside world. Like all process algebras, TD is compositional, so processes can be defined recursively in terms of subprocesses. It is therefore possible to specify so-called multi-level processes [31], even when the number of levels is determined at runtime. However, unlike process algebras, TD also provides high-level support for database functions. These include declarative queries, bulk updates, views, and serializability [18, 16, 17]. TD also has many features of advanced transaction models, including subtransaction hierarchies, relaxed ACID requirements, and fine-grained control over abort and rollback [15]. This integration of process modeling and database functionality is reflected in the formal semantics of TD, which is based on both database states and events, while the semantics of process algebras is based entirely on events. Papers on the formal semantics of TD, a prototype implementation, and the results of benchmark tests are available at: http://www.cs.toronto.edu/~bonner/transaction-logic.html

Computational Complexity: As mentioned above, most computational studies of workflows and process models focus on problems such as verification, deadlock detection, goal reachability, etc. Unfortunately, while many of these problems are decidable for finite state machines or for propositional process models, they are fully undecidable (outside of RE) for many realistic workflows, where the universe of database states is infinite. To deal with this problem, some database researchers have chosen to restrict the process model instead of the data model [4, 30, 66]. Unfortunately, the result, once again, is that many realistic workflows are not considered.

In this paper, we take a different approach and address a different set of questions. Our goal is to start with a highly expressive workflow language (like TD), and pinpoint the sources of complexity, including the tradeoff between data-oriented and process-oriented features. We do not attempt to reason about undecidable properties of workflows. Instead, we measure what might be called the "data complexity" of a workflow. Specifically, we treat a workflow as a concurrent program that accesses a database. The process starts from an initial database state, updates the database as it executes, and leaves the database in some final state when it terminates (if it terminates). In this way, the workflow defines a partial mapping from initial to final states, just as a Turing machine does. More generally, non-deterministic processes define a binary relation on states. We define the data complexity of a workflow to be the complexity of this relation. With this approach, we can measure the complexity of executing a single workflow instance, or the complexity of spawning many concurrent (and possibly interacting) instances for processing a set of work items. Section 3 gives examples of both.

The complexity of database transactions has been defined in the same way [3, 1]. However, workflows can be much more complex than traditional transactions.
For instance, Section 4.1 shows that TD can simulate an arbitrary Turing machine, even though it has none of the features that normally lead to this kind of power in a database language. Specifically, TD does not expand the data domain or the database schema during program execution. Transaction languages with this property are said to be safe, and typically their data complexity is complete for PSPACE [2, 3, 1, 25]. In contrast, the data complexity of TD is complete for RE. This dramatic increase in complexity, from PSPACE to RE, is due entirely to interactions between processes.

Section 4 studies the sources of this complexity in more detail by developing a family of syntactic restrictions. Most of these restrictions are simple and eliminate a single modeling feature, such as queries, updates, concurrency, or recursion. We show that these restrictions reduce the complexity of TD to various levels, including EXPTIME, PSPACE, PTIME and LOGSPACE. These results pinpoint the precise effect on complexity of particular modeling features. However, each feature
plays an important role in defining workflows, and in practice, we do not want to entirely eliminate any one of them. With this in mind, Section 5 develops a more complex restriction, called full boundedness, that provides a practical blend of modeling capabilities, and is complete for NP. Of course, we would prefer a restriction whose complexity is in PTIME, but cooperative concurrency2 is inherently non-deterministic, and leads very quickly to NP-completeness.

2 In cooperative concurrency, processes can communicate, synchronize, or otherwise cooperate [40]. It is the dominant form of concurrency in distributed systems and process modeling [34, 44, 55], and is the form of concurrency considered in this paper. In contrast, the massive parallelism that is possible in evaluating some database queries is non-cooperative.

In general, our complexity results are in line with and improve upon related work in the literature. For instance, like TD, most process algebras can simulate arbitrary Turing machines [65]. Moreover, Harel has shown that cooperative concurrency increases the complexity of many problems by an exponential, even when the number of processes is carefully bounded [41, 42, 40]. Thus, since safe transaction languages are typically PSPACE-complete, one might expect that adding a bounded number of concurrent processes would increase their complexity to EXPSPACE. In this light, our syntactic restrictions are very effective at keeping complexity down. Additional comparison with related work can be found in the long version of this paper on the Web [14].

2 Overview of Transaction Datalog

Transaction Datalog is a fragment of Concurrent Transaction Logic (abbreviated CTR), which we developed in previous work [17, 13]. CTR is an extension of classical logic that seamlessly integrates concurrency and communication with queries and updates. It has a purely logical semantics, including a natural model theory and a sound-and-complete proof theory. Like classical logic, CTR has a "Horn" fragment with a procedural interpretation, in which programs can be specified and executed. TD is based on the Horn fragment of CTR, just as classical Datalog is based on the Horn fragment of classical logic. This section reviews the syntax of TD and its procedural interpretation, summarizing material from [15]. We adopt much of the terminology of deductive databases.

TD provides three operators for combining simple programs into more complex ones: sequential composition, denoted ⊗; concurrent composition, denoted |; and a modality of isolation, denoted ⊙. In addition, logical rules provide a subroutine facility, exactly as in classical Datalog. The following definition makes the syntax more precise.

Definition 2.1 (Syntax) An atomic formula is a goal. If φ1, ..., φk are goals, then so are φ1 ⊗ ... ⊗ φk, φ1 | ... | φk and ⊙φ1. If φ is a goal and p is an atom, then p ← φ is a rule. A finite set of rules is a rulebase. A rulebase together with a goal is a program. □

Intuitively, goals are procedures, and rules are subroutine definitions (in the logic-programming tradition). In particular, if α and β are goals, then

• α ⊗ β means "First execute α, then execute β, and commit iff both α and β commit."
• α | β means "Execute α and β concurrently, and commit iff both α and β commit."
• ⊙α means "Execute α in isolation, and commit iff α commits."
• p ← α means "An execution of α is also an execution of p, where p commits if α commits."

A program executes in isolation if it does not communicate or interact with other programs. Isolation is a fundamental property of database transactions [12, 38], and is closely related to serializability. For instance, if t1, t2, ..., tn are database programs, then the goal ⊙t1 | ⊙t2 | ... | ⊙tn executes them serializably.
Isolation also plays an important role in defining sub-transaction hierarchies, as described below.

Using the four operators above, a TD programmer combines elementary operations into complex processes. In general, an elementary operation can be any activity that accesses a database, including activities that require human intervention (as in many workflows). Examples include simple database updates, complex application programs, and legacy systems. Semantically, an elementary operation is treated as a black box. This idea is formalized in [17].

In general, the complexity of a TD program depends on its elementary operations. However, we would like to factor out these operations in order to focus on the complexity of TD itself. For this reason, we base our complexity analysis on a small set of simple operations. This does not result in any loss of expressiveness or computational power, since with these operations, TD is expressively complete, i.e., it can express any computable database transaction (Theorem 4.4). These operations are also minimal, since if any one of them is removed, then expressive completeness is lost.3

3 Another approach would be to prove relativized complexity results [59, 9] with elementary operations modeled by a variation of oracle Turing machines. All of the results in this paper can be relativized in this way, although additional formal machinery is required. Due to space limitations, we do not develop this approach here.

In this paper, we use four kinds of elementary operation, which we denote by four kinds of atomic formula: p(x), p.empty, ins.p(x), del.p(x), where p is a base predicate symbol. The formal semantics of these formulas is given in [17, 15]. Intuitively, the first two formulas are yes/no queries, and the last two formulas are updates. In particular,

• p(x) means "Commit iff p(x) is in the database."
• p.empty means "Commit iff the database contains no atoms of the form p(x)."
• ins.p(x) means "Insert atom p(x) into the database, and commit."
• del.p(x) means "Delete atom p(x) from the database, and commit."

Transaction Programs. TD can be used to program database transactions. As a simple example, the goal del.p(b) ⊗ ins.q(b) represents a program that first deletes p(b) from the database, and then inserts q(b). This program always commits since its components, del.p(b) and ins.q(b), always commit. The goal p(b) ⊗ del.p(b) ⊗ ins.q(b) represents a similar program, but with preconditions. This program first asks if p(b) is in the database, and if so, it deletes p(b) and then inserts q(b). This program commits if and only if p(b) is in the database just before the updates occur. By using rules, a programmer can define subroutines. For instance, the rule r(X) ← p(X) ⊗ del.p(X) ⊗ ins.q(X) defines a subroutine with name r and parameter X. Using b as the parameter value, r(b) commits if p(b) is in the database just before the updates occur.

Advanced Transactions. As shown in [15], TD accounts for many basic properties of "advanced" transaction models [35], including nested transactions. These properties include subtransaction hierarchies, non-vital subtransactions, relative commit, and partial rollback. For example, the goal ⊙(α1 | ⊙(α2 | ⊙α3)) is an isolated transaction, which contains an isolated subtransaction, ⊙(α2 | ⊙α3), which contains an isolated sub-subtransaction, ⊙α3. As another example, in the transaction α1 ⊗ α2, the commit of α1 is relative to the whole transaction: if α2 aborts, then the whole transaction aborts and is undone; in particular, α1 is undone, even though it has already terminated and committed. Likewise for α1 | α2.

Declarative Queries. When a TD program contains no updates, it defines a read-only transaction, i.e., a query. As shown in Section 4.1, for such programs, the connectives ⊗ and | both reduce to classical conjunction, and Transaction Datalog reduces to classical Datalog.
For instance, the goals p(X,Y) ⊗ q(Y,Z) and p(X,Y) | q(Y,Z) both express the join of relations p and q. Likewise, the rule r(X,Z) ← p(X,Y) | q(Y,Z) expresses the projection of this join onto two attributes. Any classical Datalog query can be expressed in this fashion.
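To illustrate this reduction on a recursive query, here is a small sketch of our own (it is not taken from the paper; the base predicate edge and the derived predicate reach are hypothetical). Since tuple testing is the only elementary operation in these rules, ⊗ behaves as classical conjunction, and the rules can be read as an ordinary Datalog program computing the transitive closure of edge:

reach(X,Y) ← edge(X,Y)
reach(X,Z) ← edge(X,Y) ⊗ reach(Y,Z)

For instance, the goal reach(a,Z) is then a read-only transaction that leaves the database unchanged and commits with Z bound to any node reachable from the (hypothetical) constant a.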

3 Examples

Like classical logic programs, TD has both a declarative semantics and a procedural interpretation. These have been published in detail elsewhere [15, 17, 16], and are reviewed in a long version of this paper, which is available on the Web [14]. This section illustrates the procedural interpretation through a series of examples.4 Each example has been tested on our prototype implementation [49, 63], and performs exactly as described below.

4 A version of Example 3.4 originally appeared in [17]. With this one exception, none of the examples in this paper have been published before.

The examples focus on so-called production workflow, which forms the core of a business or enterprise [37, 54]. Production workflows are typically complex, well defined, high volume, and mission critical. Many production workflows are organized around work items of some kind, which the workflow activities operate on. Examples of work items include insurance claims, loan applications, and laboratory samples. Because they process large numbers of work items, production workflows are often data intensive. A concrete example is provided by the laboratory workflows used at the Whitehead Institute/MIT Center for Genome Research [21, 22, 64], which is engaged in several large-scale genome mapping and sequencing projects [29]. Each project involves the completion of tens of millions of experiments, organized into a network of factory-like production lines. Coordinating the flow of materials through the production lines, and recording and querying the history of experimental steps and the results they produce are the main data and workflow management requirements [21]. The workflows at the Genome Center are data intensive. In fact, as laboratory automation increased, database performance became a bottleneck in workflow throughput, and we had to develop a workflow/database benchmark to evaluate new storage managers for their laboratory information system [22, 20, 21]. To make the examples in this paper more concrete, we shall sometimes describe them in terms of genome laboratory workflow.

Example 3.1 (Workflow Specification) The rules below define a simple workflow made up of a collection of tasks and a sub-workflow. The tasks execute sequentially and concurrently, and the flow of work in the sub-workflow depends on a database query. Intuitively, the predicate workflow(W) represents the flow of work for a single work item, W. Likewise, the predicate subflow(W) represents the sub-workflow, and the predicate taski(W) represents a workflow task.

workflow(W) ← task1(W) ⊗ [task2a(W) | task2b(W)] ⊗ task3(W) ⊗ subflow(W)
subflow(W) ← p(W) ⊗ task4(W) ⊗ task5(W)
subflow(W) ← q(W) ⊗ task6(W) ⊗ task7(W)

The first rule says the following: first task1 should be applied to W; then task2a and task2b should be applied concurrently; then task3 should be applied; and finally W should be passed to a sub-workflow for further processing. The sub-workflow applies a series of tasks to W depending on the outcome of a test: if p(W) is true, then task4 and task5 are applied; but if q(W) is true, then task6 and task7 are applied.5 Note that p and q are arbitrary database queries, and may be defined by a complex set of TD rules. Also note the synchronization implicit in the first rule, since task3 cannot start until both task2a and task2b have finished. □

5 If both tests are true, then the sub-workflow chooses one series of tasks non-deterministically. If neither test is true, then the sub-workflow waits until one of them becomes true.

Example 3.2 (Workflow Simulation) The rules below simulate the execution of a workflow on a set of work items. They assume that the predicate workflow(W) has been defined, as in Example 3.1. The rules simulate the pipelining found in many workflow systems: items enter the workflow sequentially, but are processed by the workflow concurrently.
We assume that a record identifying each work item is initially stored in a database relation called item, which acts as an "in basket" for the workflow. The first rule recursively removes one work item after another from the in basket. After each removal, the rule spawns a workflow instance to process the item, where different instances of the workflow execute concurrently. The second rule terminates the recursion (and the simulation) when there are no work items left to process.

simulate ← getItem(W) ⊗ [simulate | workflow(W)]
simulate ← item.empty
getItem(W) ← ⊙[item(W) ⊗ del.item(W)]   □

In Example 3.2, many instances of the workflow execute concurrently. Normally, however, there are physical limits on the number of workflow instances that may be active at one time. Typically, each task in a workflow is performed by an "agent" (e.g., a machine or a person), only a fixed number of agents is available, and only qualified agents can be assigned to each task [37]. In effect, the agents are resources that must be shared by the various workflow instances, thus limiting the number of instances that can be active at one time. For this reason, an important part of workflow specification is assigning agents to tasks [37]. This is easily accomplished in a language like TD that integrates databases and processes, since we can record the status of agents in the database, and update them as the workflow progresses. The next example shows one way of doing this. The example also suggests how to keep track of work that has been performed. This allows for monitoring, tracking and querying the status of workflow activities, another important aspect of workflow management [32, 37, 22].

Example 3.3 (Shared Resources) The rules below refine the predicate taski(W) used in Example 3.1 to account for the resources needed to execute the task. In this case, the resources are qualified agents. The rules assign an agent, A, to carry out taski on work item W. The rules access two base predicates: qualifiedi(A), which means that agent A is qualified to carry out taski; and available(A), which means that agent A is currently not working on any other task. In addition, after agent A has completed the task, the atom donei(A,W) is inserted into the database, which provides a record of the work performed.

taski(W) ← requesti(A) ⊗ doTaski(A,W) ⊗ release(A) ⊗ ins.donei(A,W)
requesti(A) ← ⊙[qualifiedi(A) ⊗ available(A) ⊗ del.available(A)]
release(A) ← ins.available(A)

The first rule requests an agent, A, qualified to perform taski. When such an agent becomes available, the rule invokes the predicate doTaski(A,W), which represents agent A carrying out taski on item W. When the task has been completed, the rule releases the agent from service, and inserts a record of the work into the database. The second rule selects the agent requested by the first rule. It retrieves an agent, A, who is qualified to perform taski and who is available. (If no such agent is currently available, then the process waits until one becomes available.) The rule then deletes A from the pool of available agents. The third rule simply returns A to the pool. □

In many applications, a workflow is made up of sub-workflows that run concurrently and synchronize themselves. This is often the case when a work item is made up of several parts, where each part is processed by a different sub-workflow. Since the parts are related, the workflows may have dependencies between them.
Typically, one workflow needs information produced by another workflow, and may have to wait for this information to become available before it can continue. This is the case, for instance, in the workflow described in [22], in which the work items are DNA samples, and the purpose of the workflow is to construct a physical genome map.6

6 This particular workflow consists of two concurrent sub-workflows that synchronize themselves at several points. One sub-workflow processes DNA samples called clones, and the other processes shorter samples called tclones. At several points, the clone workflow needs information generated by the tclone workflow.

Example 3.4 (Communication and Synchronization) The rules below specify a workflow with two sub-workflows that execute concurrently. Each sub-workflow executes a sequence of tasks. Moreover, the two sub-workflows are not independent: each contains a task that cannot execute until the other sub-workflow reaches a certain point.

workflow(W) ← subflow1(W) | subflow2(W)
subflow1(W) ← task1a(W) ⊗ task1b(W) ⊗ done2b(W) ⊗ task1c(W)
subflow2(W) ← task2a(W) ⊗ done1a(W) ⊗ task2b(W) ⊗ task2c(W)

The first rule simply splits the main workflow into two sub-workflows, denoted subflow1 and subflow2, each consisting of three tasks. As in Example 3.3, we assume that upon completion, each task inserts an atom into the database, as a record of its activity. Specifically, we assume that taski(W) inserts the atom donei(W). Observe that both sub-workflows contain atoms of the form donei(W). When these atoms appear in a goal (as they do here), they represent transactions that cannot commit until the atom is true. Thus, donei(W) will not commit until taski(W) has finished. The execution of the workflow is therefore subject to two constraints. First, task1a(W) must finish before task2b(W) can start. This is because subflow2 will wait at the atom done1a(W) until it is true. Likewise, task2b(W) must finish before task1c(W) can start. This is because subflow1 will wait at the atom done2b(W) until it is true. In effect, inserting the atom donei(W) into the database sends a synchronization message from one sub-workflow to the other. □

4 Data Complexity

The rest of this paper establishes the data complexity of TD, and investigates the sources of this complexity, focusing first on data-oriented features, and then on process-oriented features. (The next section considers the interplay of the two.) As discussed in the introduction, we measure the complexity of a TD program by treating it as a transaction and observing its effect on the database. Recall that a TD program is defined by a rulebase, P, and a goal, φ. We use the expression P, D1 ⇒ D2 |= φ to mean that the program can transform database D1 into database D2 when executed in isolation. For example, for any rulebase P,

P, {a, b} ⇒ {} |= del.a ⊗ del.b
P, {} ⇒ {c, d} |= ins.c ⊗ ins.d
P, {a, b} ⇒ {c, d} |= (del.a ⊗ del.b) | (ins.c ⊗ ins.d)

Likewise, if P contains the two rules p ← del.a ⊗ del.b and q ← ins.c ⊗ ins.d, then

P, {a, b} ⇒ {} |= p
P, {} ⇒ {c, d} |= q
P, {a, b} ⇒ {c, d} |= p | q

The formal semantics of these expressions is given in [15, 17], and is reviewed in the long version of this paper on the Web [14].

Our results are based on a number of standard definitions, adapted from [3, 24]. A database schema is a finite set of base predicate symbols (with associated arities). A database with schema S is a finite set of ground atoms constructed from the predicate symbols in S. A database transaction of type ⟨S1, S2⟩ is a binary relation on databases of schema S1 and S2. The data complexity of a transaction is the complexity of this relation when viewed as a language. Consider a TD program defined by rulebase P and goal φ. We say that the program expresses a transaction T of type ⟨S1, S2⟩ if for any database D1 with schema S1, the pair ⟨D1, D2⟩ ∈ T if and only if P, D1 ⇒ D2 |= φ. The data complexity of TD is the data complexity of the most complex transaction expressed by a TD program. Likewise for the data complexity of a subset of TD.
For instance, given a subset of TD, if the most complex transaction is an NP-complete language, then we say that this subset is data complete for NP.

Due to space limitations, no proofs are given in this abstract. However, detailed proofs of some theorems can be found in the long version of this paper on the Web [14].
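To connect these definitions with the earlier examples, here is a small worked illustration of our own (assuming that a, b, c and d are 0-ary, i.e., propositional, base predicates). Let P contain the rules p ← del.a ⊗ del.b and q ← ins.c ⊗ ins.d, and let S be the schema {a, b, c, d}. Then the goal p | q expresses a transaction of type ⟨S, S⟩, and this transaction contains the pair ⟨{a, b}, {c, d}⟩, since, as noted above, P, {a, b} ⇒ {c, d} |= p | q. The theorems below bound the complexity of the hardest such relations expressible under various restrictions of TD.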

4.1 Data-Oriented Features

Recall that the version of TD considered in this paper has four kinds of elementary database operation: tuple testing, p(x); tuple insertion, ins.p(x); tuple deletion, del.p(x); and emptiness testing, p.empty. This section shows how these operations affect the data complexity of TD, and that with all four of them, TD is an expressively-complete transaction language. In addition, our first result shows that Transaction Datalog is an extension of classical Datalog.

Suppose that tuple testing is the only elementary operation in a TD program. Then the program expresses a read-only transaction, i.e., a database query. We show that this query language is equivalent to classical Datalog. More specifically, any such TD program can be transformed into a classical Datalog program that expresses the same query. The transformation is simple: remove all modalities of isolation, and replace sequential and concurrent composition by classical conjunction. For instance, the rule b ← (c1 ⊗ c2) | ⊙(c3 ⊗ c4) in Transaction Datalog is transformed into the rule b ← c1 ∧ c2 ∧ c3 ∧ c4 in classical Datalog. Clearly, any classical Datalog program can be generated in this way, as illustrated in Section 2. The following theorem shows that the transformed program and the original TD program are equivalent.

Theorem 4.1 (Relationship to Classical Datalog) Let P be a rulebase, let φ be a goal, and let Pc and φc be their transformed versions as described above. If tuple tests are the only elementary operations in P and φ, then P, D1 ⇒ D2 |= φ if and only if D1 = D2 and Pc ∪ D1 |=c φc, for any pair of databases, D1, D2, where |=c denotes entailment in classical logic.

It follows that when tuple testing is the only elementary operation, TD is data complete for PTIME, since classical Datalog is. In addition, Theorem 4.1 suggests a natural optimization. Suppose that a TD goal (or rule body) contains an expression of the form ⊙φ, where the predicates in φ are defined by rules in which tuple testing is the only elementary operation. Then φ can be treated as a classical Datalog query. As such, well-known optimization techniques (such as magic sets or tabling) can be applied.

Suppose we augment tuple testing with tuple insertion (but not deletion). This is a natural set of operations for many scientific workflows, where workflow activities are laboratory experiments that gather, store and analyze information [6]. This is certainly the case for workflows at the Whitehead Institute/MIT Center for Genome Research [21, 64], where experimental results are accumulated in the database, and queried by analysis programs, but never deleted or altered. Examples 3.1 and 3.4 show TD programs in which tuple testing and insertion are the only elementary operations (assuming the basic workflow activities, taski(W), also satisfy this restriction). Example 3.4 shows that communication and synchronization are possible under this restriction. The next theorem shows that tuple insertion increases the data complexity of TD from PTIME to PSPACE.

Theorem 4.2 With tuple testing and tuple insertion as the only elementary operations, TD is data complete for PSPACE.

Suppose we augment tuple testing and tuple insertion with tuple deletion. This allows stored data to be altered, instead of just accumulated. In addition, it allows concurrently executing programs to interact in much more complex ways.
In fact, because of these interactions, the data complexity of TD skyrockets from PSPACE to RE, as the following theorem shows.

Theorem 4.3 With tuple testing, tuple insertion, and tuple deletion as the only elementary operations, TD is data complete for RE.

From a database perspective, this result is the most unusual in the paper. This is because TD has none of the features that normally lead to RE-completeness in a database language. In particular, it is a safe language that does not generate an unbounded number of tuples during transaction execution.
In contrast, other transaction languages achieve RE-completeness by expanding the data domain during transaction execution [2, 3, 19], or by expanding the database schema [24]. Typically, the proof of RE-completeness involves encoding the tape of a Turing machine in the database. In such approaches, the database grows to arbitrary size during transaction execution, since it encodes a machine tape that grows to arbitrary length. In contrast, TD achieves RE-completeness with a fixed data domain, and a fixed database schema, and thus with databases of polynomial size. This demonstrates a fundamental difference between TD and other safe transaction languages, which are typically PSPACE-complete [2, 3, 1, 25].

Our proof of Theorem 4.3 bears some resemblance to the simulation of Turing machines in process algebras such as CCS [65]. However, in CCS, the simulation relies heavily on the restriction operator [65, 27], which allows a CCS program to create an unbounded number of new (private) communication channels during execution. Since TD is built around databases, not communication channels, restriction is not a primitive operation in TD. Instead, our simulation relies on sequential composition (which is not a primitive operation in CCS). In effect, to encode a tape storing the string abcd, we use the sequential goal a ⊗ b ⊗ c ⊗ d. Of course, this is not the whole story, since in addition to encoding the tape contents, we must be able to read and write the tape. For this, insertion, deletion and concurrency are also required, as shown by Theorems 4.1, 4.2 and 4.5.

To close this section, we note that Theorem 4.3 shows that with three of the four kinds of elementary operation, TD can express some RE-complete transaction. Nevertheless, there are many transactions that it cannot express, even transactions with very low complexity. This is because without the fourth elementary operation (emptiness testing), TD can only express monotonic transactions. A monotonic transaction has the following property: if it terminates and commits when started from database D, then it terminates and commits when started from any database containing D. In contrast, emptiness testing is a non-monotonic transaction: q.empty succeeds when relation q is empty, but fails for all non-empty relations. Thus, even though TD with three elementary operations can simulate an arbitrary Turing machine, it cannot determine whether a base relation is empty, simply because this is a non-monotonic operation.7 To express non-monotonic transactions, we must add a source of non-monotonicity to the language. Theorem 4.4 shows that emptiness testing is enough: adding it as a fourth elementary operation enables TD to express all computable transactions, both monotonic and non-monotonic, as long as they are safe.8

7 This is a common phenomenon in logical languages. For instance, even though classical Horn logic (without negation) is data complete for RE, it cannot compute the difference of two base relations, simply because this is a non-monotonic query.

8 A transaction is safe if it does not expand the data domain. Also, as usual in expressibility results of this kind, we assume that the constant symbols are uninterpreted. Formally, this means extending the notion of genericity from queries to transactions, as in [3, 1]. Details are given in the long version of this paper [14].

Theorem 4.4 (Expressive Completeness) With all four elementary operations, TD is expressively complete for safe transactions, i.e., it can express any computable safe transaction.

4.2 Process-Oriented Features

This section explores the effect of concurrency and recursive processes on the data complexity of TD. (Due to lack of space, we do not explore the effects of isolation.) We first show that without any restrictions on recursion, TD is very sensitive to concurrency, since a little concurrency can create a huge increase in complexity. We then develop a restricted form of recursion that is more robust.
It allows iterated workflows to be specified, and concurrency has no effect on its complexity.

Concurrency is essential to the power of TD. As Theorem 4.5 shows, when concurrent composition is removed, the data complexity of TD plummets from RE to EXPTIME. This version of the language, which we call sequential TD, is comparable to safe transaction languages [2, 3, 1, 25].

The main difference is that such languages are typically complete for PSPACE, not EXPTIME. The extra power of sequential TD comes from an ability to simulate alternating PSPACE machines [26]. As shown in the long version of this paper [14], the ability to alternate comes from the combination of recursive subroutines and sequential composition. Other transaction languages typically have one of these features, but not both.

Theorem 4.5 Sequential TD is data complete for EXPTIME.

It takes only a little concurrency to achieve RE-completeness. For instance, recursion through concurrency is not needed. Nor is it necessary to spawn new processes at runtime. In fact, examining the proof of Theorem 4.3 in [14] shows that there need not be any concurrency in the rulebase at all! Concurrency is needed only in the goal. (Recall that a TD program consists of two parts: a rulebase, P, and a goal, φ.) More specifically, it is enough for the goal to have the form φ1 | φ2 | φ3, where each φi is sequential. This means that RE-completeness can be achieved by three sequential processes executing concurrently. In the proof, these three processes are used to simulate a 2-stack machine [46], where two of the processes encode the stacks, and the third process encodes the finite control [14]. We therefore have the following result.

Corollary 4.6 TD programs with sequential rulebases are data complete for RE.

Restrictions on recursive processes also have a dramatic effect on data complexity. As Theorem 4.7 shows, if we eliminate recursion altogether, then data complexity plummets from RE to less than PTIME. Without recursion, we can still specify workflows like those in Examples 3.1 and 3.4, but we cannot simulate their execution on a set of work items, as in Example 3.2.

Theorem 4.7 Non-recursive TD is in LOGSPACE.

Together, Theorem 4.7 and Corollary 4.6 establish two extreme points, one of high complexity (RE), and one of low complexity (LOGSPACE). These two extremes correspond to unrestricted recursion and no recursion, respectively. Our last result in this section establishes an intermediate point (PSPACE). It allows more concurrency than Corollary 4.6, and more recursion than Theorem 4.7. It is based on a syntactic restriction we call sequential tail recursion. There are two essential elements to this restriction: one corresponds to tail recursion in classical logic programming, and the other forbids recursion through concurrent composition.

Definition 4.8 (Sequential Tail Recursion) Let P be a TD rulebase. A rule in P exhibits sequential tail recursion if it has the form p ← γ ⊗ q, where γ is a goal, and q is the only atom in the rule body that is mutually recursive with p. The rulebase P exhibits sequential tail recursion if every recursive rule in P does. □

Observe that concurrent composition is allowed in the goal γ in Definition 4.8. This goal may therefore contain concurrent processes that interact and communicate, like the workflows in Examples 3.1 and 3.4. Tail recursion allows these workflows to be iterated. That is, they can be executed over and over again until some condition is satisfied. For instance, in a scientific laboratory, an experimental protocol may be repeated until a conclusive result is achieved. This is the case for the genome workflow described in [22]. Note that sequential tail recursion allows processes to be created and destroyed at runtime, but only inside γ.
In particular, the number of processes does not grow with each recursive call, as in the simulation of Example 3.2.

Theorem 4.9 TD with sequential tail recursion is data complete for PSPACE.

Finally, we show that the complexity of sequential tail recursion is insensitive to concurrency. In fact, the lower bound in Theorem 4.9 does not require concurrent composition at all.

Corollary 4.10 Sequential TD with sequential tail recursion is data complete for PSPACE.
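To make Definition 4.8 concrete, here is a small sketch of our own (it is not taken from the paper; the predicates protocol, conclusive and inconclusive are hypothetical, and we assume they are defined by non-recursive rules). The rulebase below iterates an experimental protocol until a conclusive result is recorded:

repeat(W) ← protocol(W) ⊗ conclusive(W)
repeat(W) ← protocol(W) ⊗ inconclusive(W) ⊗ repeat(W)

In the second rule, repeat(W) is the only atom in the body that is mutually recursive with the head, and it appears after the sequential goal protocol(W) ⊗ inconclusive(W), so the rulebase exhibits sequential tail recursion. The goal protocol(W) itself may contain concurrent composition, like the workflows in Examples 3.1 and 3.4, and by Theorem 4.9 the data complexity of such a rulebase remains within PSPACE.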

5 Fully Bounded TD

Section 4 established the effect of simple syntactic restrictions on the data complexity of workflows. Each of these restrictions targets a single syntactic feature of TD, such as queries, updates, concurrency, or recursion. In practice, however, each of these features has an important role in the modeling of workflows and business processes, so we do not want to entirely eliminate any one of them. To address this concern, this section develops a more complex restriction, called full boundedness, that has relatively low complexity, but retains a wide range of modeling capabilities. Due to space limitations, we can only give a brief and informal description of the restriction here. A formal development is available in the long version of this paper on the Web [14]. An example based on genome laboratory workflow is given in the appendix of this paper.

Full boundedness is based on two ideas. First, each recursive call to a predicate must remove a tuple from a base relation. For instance, in Example 3.2, each recursive call to simulate removes a tuple from the item relation. Second, tuples that are removed from a relation in this fashion must not find their way back into the relation, as in the following example:

p ← item1(W) ⊗ del.item1(W) ⊗ task1(W) ⊗ ins.item2(W) ⊗ p
q ← item2(W) ⊗ del.item2(W) ⊗ task2(W) ⊗ ins.item1(W) ⊗ q

Here, p moves tuples from item1 to item2, and q moves them back again. Thus, if p and q are executed concurrently, there is no guarantee of termination. To eliminate this possibility, we define a dataflow graph that keeps track of the flow of tuples from one base relation to another at each level of recursion. If this graph is acyclic, then we say that the rulebase is fully bounded.

Theorem 5.1 Fully bounded TD is data complete for NP.

Full boundedness has a wide range of modeling capabilities. For instance, it is not restricted to tail recursion, and it allows recursion through sequential and concurrent composition. It also allows two-way communication between processes, concurrent access to shared resources, unlimited use of isolation, and all four elementary database operations. This means that a wide variety of practical workflows can be specified and simulated, including all of the examples in this paper. In addition, full boundedness allows separate workflows to be "hooked up" into a network of interacting workflows, as illustrated in the appendix.

A APPENDIX: Workflow Networks

An enterprise may consist not of a single workflow, but of a collection of cooperating workflows. This is typically the case when one workflow prepares items needed by another workflow. The result is often a network of loosely-coupled workflows. For instance, in automobile manufacturing, one workflow may assemble carburetors, another transmissions, another exhaust systems, etc. The products of these individual workflows may then be fed to another workflow that assembles them into a finished automobile. Obviously, workflows may be cascaded and combined in this way to produce a complex network of workflows. Note that in this manufacturing example, the movement of items from one workflow to the next is acyclic. In such cases, the entire network of workflows can be represented by a fully bounded TD program.

This is illustrated in the next example, which involves three interacting workflows.
Two of the workflows process items, which are then combined and further processed by the third workflow. The example is based on a common situation in large genome laboratories: some workflows prepare samples and reagents, while other workflows determine how they interact. A typical situation would be the following: one workflow processes long DNA samples, a second workflow processes short DNA samples, and a third workflow selects pairs of long and short samples that seem likely to overlap, and processes the two samples together to determine the exact nature of their overlap.

Example A.1 (Cooperating Workflows) The rules below simulate three interacting workflows, represented by the predicates workflow1(I), workflow2(J) and workflow3(I,J). Here, workflow1 and workflow2 process items which are then combined and processed together by workflow3. However, work items are not handed directly from one workflow to another.9 Instead, workflow1 and workflow2 place processed items in "baskets," from which workflow3 then selects suitable pairs of items. A pair of items, I, J, is suitable if the predicate suitable(I,J) is true. This predicate is an arbitrary database query, and may be defined by a complex set of TD rules.

The first set of rules below simulates the execution of the two producer workflows, workflow1 and workflow2. As in Example 3.2, a record identifying each work item for workflowi is initially stored in a database relation called itemi, which acts as an "in basket" for the workflow. In addition, after each work item has been processed, it is inserted into the relation basketi, which acts as an "out basket" for the workflow.

producei ← simulatei ⊗ ins.finishedi
simulatei ← getItemi(W) ⊗ [simulatei | (workflowi(W) ⊗ putItemi(W))]
simulatei ← itemi.empty
getItemi(W) ← ⊙[itemi(W) ⊗ del.itemi(W)]
putItemi(W) ← ins.basketi(W)

The first rule invokes the simulation of workflowi, and inserts the atom finishedi into the database when the simulation is finished. The second rule carries out the actual simulation. It recursively removes one work item after another from the in basket for workflowi. After each removal, the rule spawns a workflow instance to process the item, where different workflow instances execute concurrently. After the workflow instance terminates, the item is put into the out basket. The third rule terminates the recursion (and the simulation) when there are no work items left to process.

The next set of rules simulates the execution of workflow3. This workflow has two inputs, basket1 and basket2. The workflow selects a pair of items, one from each basket, and processes them together. This process is repeated until one of the inputs is permanently empty, at which time, the workflow stops.

consume ← simulate3 ⊗ ins.finished3
simulate3 ← selectItems(I,J) ⊗ [simulate3 | workflow3(I,J)]
simulate3 ← finished1 ⊗ basket1.empty
simulate3 ← finished2 ⊗ basket2.empty
selectItems(I,J) ← ⊙[basket1(I) ⊗ basket2(J) ⊗ suitable(I,J) ⊗ del.basket1(I) ⊗ del.basket2(J)]

The first rule invokes the simulation of workflow3, and inserts the atom finished3 into the database when the simulation is finished. The second rule carries out the actual simulation. It first selects and removes a pair of items, I and J, from basket1 and basket2, respectively. After each selection, the rule spawns a workflow instance, workflow3(I,J), to process the pair. The third and fourth rules terminate the recursion (and the simulation) when one of the inputs is permanently empty. Specifically, the third rule stops the simulation when workflow1 has finished and basket1 is empty. Likewise, the fourth rule stops the simulation when workflow2 has finished and basket2 is empty. The fifth rule does the actual work of selecting and removing a pair of items from the two baskets.

Finally, the following rule executes all three workflows concurrently:

work ← produce1 | produce2 | consume   □

9 In this case, the separate workflows could be viewed as a single large workflow.
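As a rough sanity check of our own (an informal reading of these rules in terms of the dataflow graph of Section 5, not an argument made in the paper), the flow of tuples between base relations in this network is:

item1 → basket1 (tuples removed by getItem1, inserted by putItem1)
item2 → basket2 (tuples removed by getItem2, inserted by putItem2)
basket1, basket2 → removed by selectItems, with nothing re-inserted into item1, item2, or the baskets

Since this flow is acyclic, the network fits the fully bounded pattern described in Section 5, in contrast to the cyclic p/q rulebase shown there.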

References

[1] S. Abiteboul. Updates, a new frontier. In Int'l Conference on Database Theory, pages 1-18, 1988.
[2] S. Abiteboul and V. Vianu. Procedural languages for database queries and updates. Journal of Computer and System Sciences, 41:181-229, 1990.
[3] S. Abiteboul and V. Vianu. Datalog extensions for database queries and updates. Journal of Computer and System Sciences, 43:62-124, 1991.
[4] S. Abiteboul, V. Vianu, B. Fordham, and Y. Yesha. Relational transducers for electronic commerce. In ACM Symposium on Principles of Database Systems, pages 179-187, Seattle, Washington, June 1998.
[5] N. Adam, V. Atluri, and W. Huang. Modeling and analysis of workflows using Petri nets. Journal of Intelligent Information Systems, 10(2):131-158, March 1998.
[6] Standard Guide for Laboratory Information Management Systems (LIMS). American Society for Testing and Materials, 1916 Race St., Philadelphia PA 19103, U.S.A., 1993.
[7] P. Attie, M. Singh, A. Sheth, and M. Rusinkiewicz. Specifying and enforcing intertask dependencies. In Int'l Conference on Very Large Data Bases, Dublin, Ireland, August 1993.
[8] J.C.M. Baeten and W.P. Weijland. Process Algebra. Cambridge University Press, 1990.
[9] T. Baker, J. Gill, and R. Solovay. Relativizations of the P ?= NP question. SIAM Journal of Computing, 4:431-442, 1975.
[10] J. Balcazar, J. Gabarro, and M. Santha. Deciding bisimilarity is P-complete. Formal Aspects of Computing, 4(6):638-648, 1992.
[11] J.A. Bergstra and J.W. Klop. Algebra of communicating processes with abstraction. Theoretical Computer Science, 37(1):77-121, 1985.
[12] P.A. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency Control and Recovery in Database Systems. Addison Wesley, 1987.
[13] A.J. Bonner. Concurrent transaction logic. Presented at the Dagstuhl Seminar on Transactional Workflows, July 15-19, 1996, International Conference and Research Center for Computer Science, Schloss Dagstuhl, Wadern, Germany.
[14] A.J. Bonner. Workflow, transactions, and datalog. Submitted to PODS'99. Long version available at http://www.cs.toronto.edu/~bonner/papers.html#transaction-logic.
[15] A.J. Bonner. Transaction Datalog: a compositional language for transaction programming. In Proceedings of the International Workshop on Database Programming Languages, number 1369 in Lecture Notes in Computer Science. Springer-Verlag, 1998. Workshop held in Estes Park, Colorado, August 1997.
[16] A.J. Bonner and M. Kifer. Transaction logic programming. In Int'l Conference on Logic Programming, pages 257-282, Budapest, Hungary, June 1993. MIT Press.
[17] A.J. Bonner and M. Kifer. Concurrency and communication in transaction logic. In Joint Int'l Conference and Symposium on Logic Programming, pages 142-156, Bonn, Germany, September 1996. MIT Press.
[18] A.J. Bonner, M. Kifer, and M. Consens. Database programming in transaction logic. In C. Beeri, A. Ohori, and D.E. Shasha, editors, Proceedings of the International Workshop on Database Programming Languages, Workshops in Computing, pages 309-337. Springer-Verlag, February 1994. Workshop held on Aug 30-Sept 1, 1993, New York City, NY.
[19] A.J. Bonner, L.T. McCarty, and K. Vadaparty. Expressing database queries with intuitionistic logic. In North American Conference on Logic Programming, pages 831-850, Cleveland, Ohio, October 16-20, 1989. MIT Press.
[20] A.J. Bonner, A. Shrufi, and S. Rozen. Benchmarking object-oriented DBMSs for workflow management. In OOPSLA Workshop on Object Database Behavior, Benchmarks, and Performance, Austin, TX, October 15, 1995.
[21] A.J. Bonner, A. Shrufi, and S. Rozen. Database requirements for workflow management in a high-throughput genome laboratory. In NSF Workshop on Workflow and Process Automation in Information Systems: State-of-the-Art and Future Directions, pages 119-125, Athens, GA, May 8-10, 1996.
[22] A.J. Bonner, A. Shrufi, and S. Rozen. LabFlow-1: a database benchmark for high-throughput workflow management. In Int'l Conference on Extending Database Technology, number 1057 in LNCS, pages 463-478, Avignon, France, March 25-29, 1996. Springer-Verlag. Full technical report available at http://www-genome.wi.mit.edu/informatics/informatics papers/bibliography.html.
[23] O. Bukhres and E. Kueshn, editors. Special issue on software support for workflow management. Distributed and Parallel Databases - An International Journal, 3(2), April 1995.
[24] A.K. Chandra and D. Harel. Computable queries for relational databases. Journal of Computer and System Sciences, 21(2):156-178, 1980.
[25] A.K. Chandra and D. Harel. Structure and complexity of relational queries. Journal of Computer and System Sciences, 25(1):99-128, 1982.
[26] A.K. Chandra, D. Kozen, and L.J. Stockmeyer. Alternation. Journal of the ACM, 28:114-133, 1981.
[27] S. Christensen, Y. Hirshfeld, and F. Moller. Decidable subsets of CCS. The Computer Journal, 37(4):233-242, 1994. Special issue on process algebra.
[28] Edmund M. Clarke, E. Allen Emerson, and A. Prasad Sistla. Automatic verification of finite-state concurrent systems using temporal logic specifications. ACM Transactions on Programming Languages and Systems (TOPLAS), pages 244-263, 1986.
[29] Communications of the ACM, 34(11), November 1991. Special issue on the Human Genome Project.
[30] H. Davulcu, M. Kifer, C.R. Ramakrishnan, and I.V. Ramakrishnan. Logic based modeling and analysis of workflows. In ACM Symposium on Principles of Database Systems, pages 25-33, Seattle, Washington, June 1998.
[31] U. Dayal and Q. Chen. From database programming to business process programming. In Proceedings of the International Workshop on Database Programming Languages, Gubbio, Umbria, Italy, September 1995. Springer-Verlag.
[32] U. Dayal, H. Garcia-Molina, M. Hsu, B. Kao, and M.-C. Shan. Third generation TP monitors: A database challenge. In ACM SIGMOD Conference on Management of Data, pages 393-397, Washington, DC, May 1993.
[33] J. Desel. Free Choice Petri Nets. Cambridge University Press, 1995.
[34] D. Drusinsky and D. Harel. On the power of bounded concurrency I: Finite automata. Journal of the ACM, 41(3):517-539, 1994.
[35] A.K. Elmagarmid, editor. Database Transaction Models for Advanced Applications. Morgan-Kaufmann, San Mateo, CA, 1992.
[36] E.A. Emerson. Temporal and modal logic. In Handbook of Theoretical Computer Science, pages 997-1072. Elsevier, 1990.
[37] D. Georgakopoulos, M. Hornick, and A. Sheth. An overview of workflow management: From process modeling to infrastructure for automation. Journal on Distributed and Parallel Database Systems, 3(2):119-153, April 1995.
[38] J. Gray and A. Reuter. Transaction Processing: Concepts and Techniques. Morgan Kaufmann, San Mateo, CA, 1993.
[39] D. Harel. StateCharts: A visual formalism for complex systems. Science of Computer Programming, 8:231-274, 1987.
[40] D. Harel. A thesis for bounded concurrency. In Proceedings of the 14th Symposium on Mathematical Foundations of Computer Science, number 379 in Lecture Notes in Computer Science, pages 35-48. Springer-Verlag, 1989.
[41] D. Harel, O. Kupferman, and M.Y. Vardi. On the complexity of verifying concurrent transition systems. Technical Report CS97-01, Faculty of Mathematical Sciences, The Weizmann Institute of Science, Rehovot, Israel, 1997.
[42] D. Harel, R. Rosner, and M. Vardi. On the power of bounded concurrency III: Reasoning about programs. In Int'l Symposium on Logic in Computer Science, pages 478-488. IEEE Press, New York, 1990.
[43] M. Hennessy. An Algebraic Theory of Processes. MIT Press, Cambridge, MA, 1988.
[44] T. Hirst and D. Harel. On the power of bounded concurrency II: Pushdown automata. Journal of the ACM, 41(3):540-554, 1994.
[45] C.A.R. Hoare. Communicating Sequential Processes. Prentice Hall, Englewood Cliffs, NJ, 1985.
[46] J.E. Hopcroft and J.D. Ullman. Introduction to Automata Theory, Languages and Computation. Addison-Wesley, Reading, MA, 1979.
[47] Special issue on workflow and extended transaction systems. Bulletin of the Technical Committee on Data Engineering (IEEE Computer Society), 16(2), June 1993.
[48] Special issue on workflow systems. Bulletin of the Technical Committee on Data Engineering (IEEE Computer Society), 18(1), March 1995.
[49] Samuel Y.K. Hung. Implementation and Performance of Transaction Logic in Prolog. Master's thesis, Department of Computer Science, University of Toronto, 1996.
[50] S. Jajodia and L. Kerschberg, editors. Advanced Transaction Models and Architectures. Kluwer Academic Publishers, 1997.
[51] K. Jensen. Colored Petri Nets. Springer-Verlag, Berlin, 1996.
[52] K. Jensen and G. Rozenberg, editors. High Level Petri Nets. Springer-Verlag, Berlin, 1991.
[53] N.D. Jones, L.H. Landweber, and Y.E. Lien. Complexity of some problems in Petri nets. Theoretical Computer Science, 4:277-299, 1977.
[54] Setrag Khoshafian and Marek Buckiewicz. Introduction to Groupware, Workflow, and Workgroup Computing. John Wiley & Sons, Inc., 1995.
[55] R.P. Kurshan. Computer Aided Verification of Coordinating Processes. Princeton University Press, 1994.
[56] R. Milner. Communication and Concurrency. Prentice Hall, 1989.
[57] R. Milner, J. Parrow, and D. Walker. A calculus of mobile processes, I. Information and Computation, 100(1):1-40, September 1992.
[58] W.F. Ogden, W.E. Riddle, and W.C. Rounds. Complexity of expressions allowing concurrency. In ACM Symposium on Principles of Programming Languages, pages 185-194, 1978.
[59] C.H. Papadimitriou. Computational Complexity. Addison-Wesley, 1994.
[60] W. Reisig. Petri Nets: An Introduction. Springer-Verlag, Berlin, 1985.
[61] M. Rusinkiewicz and A. Sheth. Specification and execution of transactional workflows. In W. Kim, editor, Modern Database Systems: The Object Model, Interoperability, and Beyond. Addison-Wesley, 1994.
[62] E. Shapiro. A family of concurrent logic programming languages. ACM Computing Surveys, 21(3), 1989.
[63] Amalia Sleghel. Implementation of Concurrent Transaction Logic. Master's thesis, Department of Computer Science, University of Toronto. In preparation.
[64] L. Stein, S. Rozen, and N. Goodman. Managing laboratory workflow with LabBase. In Proceedings of the 1994 Conference on Computers in Medicine (CompMed94). World Scientific Publishing Company, 1995.
[65] D. Taubner. Finite Representations of CCS and TCSP Programs by Automata and Petri Nets. Number 369 in Lecture Notes in Computer Science. Springer-Verlag, 1989.
[66] Dirk Wodtke and Gerhard Weikum. A formal foundation for distributed workflow execution based on state charts. In Int'l Conference on Database Theory, pages 230-246, 1997.
[67] D. Worah and A. Sheth. Transactions in transactional workflows. In [50], chapter 1, pages 3-45. Kluwer Academic Publishers, 1997.

[62] E. Shapiro. A family of concurrent logic pro-gramming languages. ACM Computing Surveys,21(3), 1989.[63] Amalia Sleghel. Implementation of ConcurrentTransaction Logic. Master's thesis, Departmentof Computer Science, University of Toronto. Inpreparation.[64] L. Stein, S. Rozen, and N. Goodman. Managinglaboratory work ow with LabBase. In Proceed-ings of the 1994 Conference on Computers inMedicine (CompMed94). World Scienti�c Pub-lishing Company, 1995.[65] D. Taubner. Finite Representations of CCS andTCSP Programs by Automata and Petri Nets.Number 369 in Lecture Notes in Computer Sci-ence. Springer-Verlag, 1989.[66] Dirk Wodtke and Gerhard Weikum. A formalfoundation for distributed work ow executionbased on state charts. In Int'l Conference onDatabase Theory, pages 230{246, 1997.[67] D.Worah and A. Sheth. Transactions in transac-tional work ows. In [50], chapter 1, pages 3{45.Kluwer Academic Publishers, 1997.