Abstractions of data types



Acta Informatica (2006)
DOI 10.1007/s00236-006-0010-3

ORIGINAL ARTICLE

Ferucio Laurenţiu Ţiplea∗ · Constantin Enea

Abstractions of data types

Received: 7 July 2004 / Revised: 10 January 2006
© Springer-Verlag 2006

Abstract The use of abstraction in the context of abstract data types is investigated. Properties to be checked are formulas in a first order logic under Kleene’s 3-valued interpretation. Abstractions are defined as pairs consisting of a congruence and a predicate interpretation. Three types of abstractions are considered, ∀∀, ∀∃, and ∃0,1∀, and for each of them corresponding property preservation results are established. An abstraction refinement property is also obtained. It shows how one can pass from an existing abstraction to a (less) finer one. Finally, equationally specified abstractions in the context of equationally specified abstract data types are discussed and exemplified.

Keywords Data type · Universal algebra · Verification · Abstraction

1 Introduction and preliminaries

In the last decade a lot of progress has been made in the development of automated tools for formal verification, such as PVS [35], HOL [18], STeP [39], COSPAN [24], MURPHI [12], SMV [4], and SPIN [20]. They have been successfully applied to analyze the correctness of a large variety of reactive systems, ranging from circuit designs to communication protocols [6, 28, 33]. Although verification tools such as those given above are currently in use by companies like Intel, Motorola, Siemens etc., industrial-size systems are still far from being satisfactorily checked. Traditional software testing is already a notoriously hard, time-consuming and expensive process, and stochastic systems, security protocols and web site testing present even greater challenges. All this shows that additional research is needed to develop new techniques that enable larger hardware systems and certain types of software systems, such as security protocols and probabilistic programs, to be verified.

∗ On leave from the Department of Computer Science, “Al. I. Cuza” University, Iaşi 740083, Romania

The research reported in this paper was partially supported by the program ECO-NET 08112WJ/2004-2005 and by the National University Research Council of Romania, grants CNCSIS 632(28)/2004 and CNCSIS 632(50)/2005.

F. L. Ţiplea (B)
School of Electrical Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA
E-mail: [email protected]

C. Enea
Department of Computer Science, “Al. I. Cuza” University, Iaşi 740083, Romania
E-mail: [email protected]

Acta Informatica, volume 42, numbers 8-9, April 2006, 639 - 671

Two such techniques, generally recognized as the only methods that can ever scale up to handle industrial-size design and verification, are abstraction and modularization, which break the task of verifying a large system into several smaller tasks of verifying simpler systems. Modularization exploits the modular structure of a complex system composed of multiple processes running in parallel. Abstraction techniques, often based on abstract interpretation [8], provide a method for symbolically executing systems using the abstract instead of the concrete domain. Familiar data-flow analysis algorithms are examples of abstract interpretation. In particular, abstract interpretation can be used in order to build the abstract state-space of the system.

Many abstraction techniques have been proposed in the literature [1, 2, 5, 9, 11, 15, 17, 21–23, 26, 29, 34, 36–38, 40–43] and many of them deal with data type reduction, that is the reduction of a large (possibly unbounded) data type to a small one. For example, shape analysis [32, 38], which is a data flow analysis technique used mainly for complex analysis of dynamically allocated data, is based on representing the set of possible memory states (“stores”) that arise at a given point in the program by shape graphs. In such a graph, heap cells are represented by shape-graph nodes and, in particular, sets of “indistinguishable” heap cells are represented by a single shape-graph node. Predicate abstraction is another prominent abstraction technique [11, 17, 42, 43]. The main idea of predicate abstraction is to map concrete objects (states of a transition system, data of a data type etc.) to “abstract objects” according to their evaluation under a finite set of predicates. In [3], Bidoit and Boisseau consider algebraic abstractions in order to verify properties of security protocols modeled by universal algebras. Their abstraction is based on homomorphisms, and the technique of duplicating predicate symbols [7, 10] is used for validation and refutation. Ehrig and Kreowski [15] provide a well-written survey on refinement and modularization techniques in the context of algebraic specifications.

In order that an abstraction be useful, it must be property-preserving. Three forms of property preservation are mostly studied in the literature [9, 43]: weak, strong, and error preservation. An abstraction is weakly (strongly, error, respectively) preserving if a set of properties that are true (with truth-values true or false, false, respectively) in the abstract system has corresponding properties in the concrete system that are also true (with the same truth-values, false, respectively).

Contribution and paper organization Many abstraction techniques we have found in the literature are driven by almost the same mechanism (e.g., surjective functions) and based on similar property preservation results, but the formalisms used by authors are quite different. We may say that all these abstraction techniques have their roots in Cousot’s abstract interpretation framework [8], but this framework offers only a general methodology which, in particular cases, should be complemented by specific techniques. Therefore, in our opinion, the development of specialized abstraction formalisms that allow reasonable instantiations in practical cases is necessary.

This paper is an attempt to provide a solution to this problem with respect to data types. It tries to capture the essence of data type reduction and, in order to do that, (abstract) data types are modeled by universal algebras enriched by sets of predicate symbols. This is a widely accepted formalism for specifying data types which offers mathematical precision and, on the other side, is practically relevant in that many modern programming languages, such as C++ and Java, allow the users to define abstract data types beyond the basic ones. We define an abstraction as being a pair consisting of a congruence and a predicate interpretation. The congruence partitions the original data type and redefines its operations in order to operate properly on the quotient data type. The predicate interpretation interprets the predicate symbols into the quotient data type. It is shown that the predicate interpretation cannot be substituted by the congruence as in the case of the operations. Therefore, a data type abstraction should necessarily include a predicate interpretation function.

The definition of an abstraction we adopted has proved to be very suitable from many points of view. First, we were able to classify abstractions based on the property preservation they assure. Thus, ∀∀-abstractions are strongly preserving, ∀∃-abstractions are weakly preserving, and ∃0,1∀-abstractions are error preserving. This classification shows clearly that the property preservation an abstraction assures depends directly on the predicate interpretation and indirectly on the congruence. Secondly, the abstraction technique proposed in the paper generalizes and clarifies the nature of many abstraction techniques found in the literature, such as the technique of duplicating predicate symbols [3, 7, 10], shape analysis [32, 38], predicate abstraction [11, 17, 42, 43], McMillan’s approach [29] etc. For example, it is shown that the technique of duplicating predicate symbols, which is based on associating two versions to each formula, one used for validation and the other one used for refutation, consists of two abstractions. One of them is a ∀∃-abstraction and the other one is an ∃0,1∀-abstraction, and both of them are based on the same congruence. The ∀∃-abstraction is used in conjunction with validation formulas (because these abstractions are weakly preserving), and the ∃0,1∀-abstraction is used in conjunction with refutation formulas (because these abstractions are error preserving). Therefore, the nature of this technique is clearly emphasized. It consists of two abstractions based on the same congruence: one of them is weakly preserving (used for validation), and the other one is error preserving (used for refutation).

The abstraction technique proposed in the paper scales well from data types to abstract data types. Here, abstractions are applied to initial specifications by means of equations. The result of such an abstraction is a specification of a quotient abstract data type, which is in fact a new abstract data type. Therefore, analysis techniques specific to abstract data types can be combined with the abstraction technique proposed in this paper and applied in order to reason on (quotient) abstract data types.


The paper is organized as follows. The rest of this section recalls basic concepts on many-sorted algebras. The second section provides a very brief introduction to (abstract) data types, motivates the usefulness of the universal algebra apparatus in their study, and fixes the logic to be used in order to reason about data types. Abstractions of data types are introduced in Sect. 3, where property preservation results and comparisons with other abstraction techniques are also provided. Thus, it is shown that the technique of duplicating predicate symbols, shape analysis, predicate abstraction and McMillan’s approach are all particular cases of our approach (from the abstraction’s point of view).

The last section extends the abstraction technique discussed in the previous section to abstract data types. Consistent examples, such as the keeping-up program [28] and the bakery algorithm [25], are discussed.

Preliminaries on many-sorted algebras We recall a few concepts regarding many-sorted universal algebras (for details the reader is referred to [13, 14, 27, 30, 31]).

For a given non-empty set $A$, $A^*$ stands for the free monoid generated by $A$, $\lambda$ is the unity of $A^*$, and $A^+ = A^* - \{\lambda\}$.

Let $S$ be a set of sorts, that is a non-empty set. An $S$-sorted set is an $S$-indexed family of sets $A = (A_s \mid s \in S)$. For an $S$-sorted set $A = (A_s \mid s \in S)$ and $w \in S^*$ we denote by $A_w$ the set

– $A_w = \{\emptyset\}$, if $w = \lambda$;
– $A_w = A_{s_1} \times \cdots \times A_{s_n}$, if $w = s_1 \ldots s_n \in S^n$, $n \geq 1$.

The basic set-theoretic operations between $S$-sorted sets are pointwise defined. For example, if $A$ and $B$ are two such sets, then $A \cap B$ denotes the family $(A_s \cap B_s \mid s \in S)$. By $A \subseteq B$ we understand the pointwise inclusion, and $A \subset B$ is the strict inclusion. As usual we let $a \in A$ denote the membership of $a$ to $A_s$, for some $s \in S$.

An $S$-sorted signature is an $(S^* \times S)$-sorted set of pairwise disjoint sets

$$\Sigma = (\Sigma_{w,s} \mid (w, s) \in S^* \times S).$$

The elements $(w, s) \in S^* \times S$ are called types over $S$, and the elements $\sigma \in \Sigma_{w,s}$ are called function or operation symbols of type $(w, s)$; the elements $\sigma \in \Sigma_{\lambda,s}$ are also called constant symbols of sort $s \in S$ (the others being sometimes called non-constant symbols).

An $S$-sorted $\Sigma$-algebra (or algebra, for short) is a pair $A = (A, \Sigma^A)$ consisting of an $S$-sorted set $A$ and an $(S^* \times S)$-sorted set of operations

$$\Sigma^A = (\Sigma^A_{w,s} \mid (w, s) \in S^* \times S),$$

where $\Sigma^A_{w,s} = \{\sigma^A \mid \sigma \in \Sigma_{w,s}\}$ such that $\sigma^A$ is a function from $A_w$ into $A_s$, for all $(w, s) \in S^* \times S$ and $\sigma \in \Sigma_{w,s}$. Denote by $Alg_\Sigma$ the class of all $\Sigma$-algebras.

An $S$-sorted function from $A$ into $B$ is an $S$-sorted set of functions

$$h = (h_s \mid s \in S)$$

such that $h_s$ is a function from $A_s$ into $B_s$ for all $s \in S$. $h$ is called injective (surjective, bijective, respectively) if each component is an injective (surjective, bijective, respectively) function. The composition between $S$-sorted functions is componentwise defined. A homomorphism from an algebra $A$ into an algebra $B$ is a function $h : A \to B$ such that:

– $h_s(\sigma^A) = \sigma^B$, for any $s \in S$ and constant symbol $\sigma$ of sort $s$;
– $h_s(\sigma^A(a_1, \ldots, a_n)) = \sigma^B(h_{s_1}(a_1), \ldots, h_{s_n}(a_n))$, for any non-constant symbol $\sigma \in \Sigma_{s_1 \ldots s_n, s}$ and $a_i \in A_{s_i}$, $1 \leq i \leq n$.

An $S$-sorted binary relation on $A$ is an $S$-sorted set

$$\rho = (\rho_s \mid s \in S)$$

such that $\rho_s$ is a binary relation on $A_s$, for all $s \in S$. $\rho$ is an equivalence relation on $A$ if each $\rho_s$ is an equivalence relation on $A_s$. An equivalence relation $\rho$ on $A$ is called a congruence on $A$ if

$$\sigma^A(a_1, \ldots, a_n) \;\rho_s\; \sigma^A(b_1, \ldots, b_n),$$

for any non-constant symbol $\sigma \in \Sigma_{s_1 \ldots s_n, s}$ and $a_i, b_i \in A_{s_i}$ such that $a_i \;\rho_{s_i}\; b_i$ for all $i$. That is, $\rho$ is a congruence if it is an equivalence relation on $A$ compatible with the algebra’s operations.

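Given an algebra $A$ and a congruence $\rho$ on $A$, the quotient of $A$ by $\rho$ is the algebra $A/\rho = (A/\rho, \Sigma^{A/\rho})$ defined by $A/\rho = (A_s/\rho_s \mid s \in S)$ and

– $\sigma^{A/\rho} = [\sigma^A]_{\rho_s}$, for any sort $s$ and constant symbol $\sigma$ of sort $s$;
– $\sigma^{A/\rho}([a_1]_{\rho_{s_1}}, \ldots, [a_n]_{\rho_{s_n}}) = [\sigma^A(a_1, \ldots, a_n)]_{\rho_s}$, for any non-constant symbol $\sigma \in \Sigma_{s_1 \ldots s_n, s}$ and $a_i \in A_{s_i}$, $1 \leq i \leq n$.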

One can easily prove that these operations are well-defined.

Assume that with each $S$-sorted signature $\Sigma$ a disjoint $S$-sorted set $X = (X_s \mid s \in S)$ of variables is associated. Moreover, we assume that

$$\Big(\bigcup_{s \in S} X_s\Big) \cap \bigcup_{(w,s) \in S^* \times S} \Sigma_{w,s} = \emptyset.$$

Define inductively the set of terms over $\Sigma$ and $X$ as follows:

– any variable and constant symbol of sort $s$ is a term of sort $s$;
– if $t_i$ is a term of sort $s_i$, $1 \leq i \leq n$, and $\sigma$ is a function symbol of type $(s_1 \ldots s_n, s)$, then $\sigma(t_1, \ldots, t_n)$ is a term of sort $s$.

Denote by $T_{\Sigma,X,s}$ the set of all terms of sort $s$. The $S$-sorted set $T_{\Sigma,X} = (T_{\Sigma,X,s} \mid s \in S)$ can be structured as a $\Sigma$-algebra, denoted $T_{\Sigma,X}$, by considering

– $\sigma^{T_{\Sigma,X}} = \sigma$, for any constant symbol $\sigma$, and
– $\sigma^{T_{\Sigma,X}}(t_1, \ldots, t_n) = \sigma(t_1, \ldots, t_n)$, for any non-constant symbol $\sigma \in \Sigma_{s_1 \ldots s_n, s}$ and terms $t_i$ of sort $s_i$, $1 \leq i \leq n$.

A term without variables is called a ground term. The $S$-sorted set $T_\Sigma$ of ground terms can be structured as a $\Sigma$-algebra as well, denoted by $T_\Sigma$. This is an initial algebra in the class $Alg_\Sigma$ of all $\Sigma$-algebras, that is there exists a unique homomorphism $eval_A$ from $T_\Sigma$ into $A$, for each $A \in Alg_\Sigma$.

An assignment of $X$ into an algebra $A$ is an $S$-sorted set of functions $\gamma = (\gamma_s \mid s \in S)$ such that $\gamma_s$ is a function from $X_s$ into $A_s$, for all $s \in S$. If $a \in A_s$ and $x \in X_s$ for some $s \in S$, then $\gamma[x/a]$ is the assignment obtained from $\gamma$ by replacing the value $\gamma_s(x)$ by $a$. Denote by $\Gamma(X, A)$ the set of all assignments of $X$ into $A$. Each assignment $\gamma$ into $A$ can be extended to a unique homomorphism from $T_{\Sigma,X}$ into $A$; it will be denoted by $\gamma$ too.
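The extension of an assignment to terms can be illustrated by a minimal Python sketch. It assumes a single-sorted signature, represents variables as strings and compound terms as tuples, and uses the hypothetical names `evaluate` and `NAT_OPS`, none of which come from the paper.

```python
def evaluate(term, ops, gamma):
    """Extend the assignment gamma homomorphically from variables to terms."""
    if isinstance(term, str):          # a variable: look it up in gamma
        return gamma[term]
    symbol, *args = term               # (symbol, t1, ..., tn); n = 0 for constants
    return ops[symbol](*(evaluate(t, ops, gamma) for t in args))


# Hypothetical single-sorted algebra: natural numbers with + and three constants.
NAT_OPS = {
    "+": lambda a, b: a + b,
    "0": lambda: 0,
    "1": lambda: 1,
    "2": lambda: 2,
}

ground = ("+", ("1",), ("2",))         # the ground term 1 + 2
open_term = ("+", "x", ground)         # the term x + (1 + 2)

# With no variables the evaluation plays the role of eval_A on ground terms.
ground_value = evaluate(ground, NAT_OPS, {})
# With variables it is the homomorphic extension of gamma, here gamma(x) = 4.
open_value = evaluate(open_term, NAT_OPS, {"x": 4})
```

The recursion mirrors the two homomorphism conditions: constants are mapped to their interpretations, and a compound term is evaluated by applying the interpreted operation to the values of its subterms.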

A $\Sigma$-equation of sort $s$ is a pair $(t, t')$ of terms of sort $s$; it is usually denoted by $t = t'$. We say that an equation $t = t'$ of sort $s$ is valid in the algebra $A$, or that the algebra $A$ is a model of $t = t'$, denoted $A \models t = t'$, if $\gamma_s(t) = \gamma_s(t')$ for all assignments $\gamma$ into $A$. This concept is naturally extended to ($S$-sorted) sets of equations. The class of all $\Sigma$-algebras that are models for a set $E$ of equations is denoted by $Mod_\Sigma(E)$.

Given a class $C$ of $\Sigma$-algebras, denote by $=_C$ the binary relation on $T_\Sigma$ given by

$$t =_C t' \Leftrightarrow (\forall A \in C)(eval_A(t) = eval_A(t')).$$

$=_C$ is a congruence on $T_\Sigma$. When $C = Mod_\Sigma(E)$, this congruence is usually denoted by $=_E$ and the quotient algebra $T_\Sigma/{=_E}$ is denoted by $T_{\Sigma,E}$. This algebra belongs to the class $Mod_\Sigma(E)$, being an initial algebra in this class.

Convention In order to simplify the notation we shall write $f(a)$ instead of $f_s(a)$ and $[a]$ instead of $[a]_{\rho_s}$, whenever the sort $s$ is understood from the context ($f$ being an $S$-sorted function, and $\rho$ an $S$-sorted congruence).

2 Reasoning about data types

We model data types by universal algebras, a widely accepted formalism for specifying data types [13, 14, 27, 31]. To address the problem of verifying or analyzing a particular program that uses a certain data structure, we enrich signatures with logical symbols used to build formulas. Questions about properties of data structures will be answered by evaluating such formulas.

(Abstract) data types In this section we provide a brief introduction to (abstract) data types and motivate the usefulness of the universal algebra apparatus in their study. We follow mainly [13, 14, 27, 31].

A data type consists of one or more sets of values, such as natural numbers, booleans, characters or strings, together with a collection of functions on these sets. Examples of basic data types provided by most programming languages include integer, boolean, array, record. From a mathematical point of view, a data type is an $S$-sorted $\Sigma$-algebra. The signature $\Sigma$ associates names to operations, while the algebra associates domains to sorts and interprets correspondingly the operation names.

Many modern programming languages, such as C++ or Java, allow the user to define additional data types beyond the basic ones, such as stack, queue, tree or counter. As is usual in such cases, operations are defined by equations. For example, if we define a data type of stacks over a set element of arbitrary elements (stacks are sequences of elements; the empty stack is the empty sequence $\lambda$) with the operations Push of type (stack element, stack), Pop of type (stack, stack), and Top of type (stack, element), then the following equational axioms specify the properties of these operations:¹

1. $Pop(\lambda) = \lambda$;
2. $Pop(Push(x, y)) = x$;
3. $Top(\lambda) = error$;
4. $Top(Push(x, y)) = y$,

where $x$ is a variable of type stack and $y$ is a variable of type element. Such data types are usually called abstract data types. In fact, a specification as the one above stands for all data types of stacks over equipotent² sets of elements. These data types are “similar” in that they differ only by the “nature” of their elements. From a mathematical point of view, an abstract data type is a class of $\Sigma$-algebras closed under isomorphism (the isomorphism models the “similarity” concept above). More about abstract data types and specifications is provided in Sect. 4.

¹ This is a crude example used only to illustrate the need for specifying operations in modern programming languages by equations. It does not take into consideration the treatment of partial functions, underspecification etc. just because this is not the goal of the paper.
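One possible model of the four stack axioms can be sketched in Python. This is only an illustration under stated assumptions: stacks are represented as tuples, the empty tuple plays the role of λ, and the names `push`, `pop`, `top`, `EMPTY` and `ERROR` are hypothetical choices rather than part of the specification.

```python
# Hypothetical model of the stack axioms; stacks are tuples of elements.
EMPTY = ()          # plays the role of the empty sequence λ
ERROR = "error"     # stand-in for the distinguished error element


def push(x, y):
    """Push element y onto stack x."""
    return x + (y,)


def pop(x):
    """Axioms 1 and 2: Pop(λ) = λ and Pop(Push(x, y)) = x."""
    return x[:-1] if x else EMPTY


def top(x):
    """Axioms 3 and 4: Top(λ) = error and Top(Push(x, y)) = y."""
    return x[-1] if x else ERROR


# Spot check of the axioms on a few sample stacks and elements.
samples = [EMPTY, (1,), (1, 2), ("a", "b", "c")]
axioms_hold = (
    pop(EMPTY) == EMPTY
    and top(EMPTY) == ERROR
    and all(pop(push(x, y)) == x and top(push(x, y)) == y
            for x in samples for y in (0, "z"))
)
```

Any other representation satisfying the same equations, for instance linked lists, would be an equally valid model, which is the sense in which the specification abstracts from the implementation.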

There are many advantages which follow from the utilization of the universal algebra apparatus in modeling (abstract) data types, such as:

– mathematical precision;
– independence of any particular implementation in a computer language. Therefore, we may reason about effects of the operations, relations to other data types etc.;
– ease in defining new operations and predicates on data types. For example, from the axioms above, one may easily define the predicate $IsEmpty(x)$ by the following additional axioms:
  – $IsEmpty(\lambda) = true$;
  – $IsEmpty(Push(x, y)) = false$;
– ease in checking program correctness. If a program using a set of specified data types is designed so that the correctness of the program depends only on the specification, then the primary concern of the data type implementor is to satisfy the specification. In this way, neither the user nor the implementor of a data type needs to worry about additional details of the other’s program.

Abstract data types are central to object-oriented programming where every “class” is an abstract data type. An “object” is a data structure encapsulated with a set of routines, called “methods,” which operate on the data. Operations on the data can only be performed via these methods, which are common to all objects that are instances of a particular class. Thus the interface to objects is well defined, and allows the code implementing the methods to be changed as long as the interface remains the same.

A motivating example We consider a toy example in order to motivate the concepts we are going to introduce. More developed examples are provided later in the paper.

² Two sets are equipotent if there exists a bijective function from one set into the other one.

Example 1 The set of natural numbers together with the addition operation defines a data type that can be modeled by an $S$-sorted $\Sigma$-algebra $A$, where:

– $S$ contains only one sort (denoted by nat);
– $\Sigma$ contains one constant symbol of type nat for each natural number, and one operation symbol $+$ of type (nat nat, nat);
– $A_{nat} = \mathbb{N}$;
– the constant symbol associated to a natural number is interpreted as the number itself, and $+^A$ is the usual addition operation on natural numbers.

In this algebra, the following property trivially holds:

$$(\varphi_1)\quad (\forall x, y \in A_{nat})(Isgrz^A(x) \vee Isgrz^A(y) \Rightarrow Isgrz^A(x +^A y)),$$

where $Isgrz$ is a unary predicate symbol whose meaning is “is greater than zero,” $Isgrz^A$ is its interpretation in $A$, ‘$\vee$’ and ‘$\Rightarrow$’ are the usual “or” and “implies” predicates, $x$ and $y$ are variables, and ‘$\forall$’ is the universal quantifier. These new elements do not belong to the algebra $A$; they are part of a meta-language used to express properties of $A$.

Assume now that we are interested in distinguishing between 0 and the other natural numbers. That is, we want to treat all the natural numbers greater than 0 as a whole. In order to do that we define a congruence $\rho$ by

$$a \;\rho\; b \ \text{ iff either } a = b = 0 \text{ or } a, b \neq 0,$$

for all natural numbers $a$ and $b$. The equivalence classes induced by $\rho$ are $[0]$, containing only the number 0, and $[1]$, containing all the other numbers. The addition operation on these equivalence classes is given in the table below.

$$\begin{array}{c|cc} +^{A/\rho} & [0] & [1] \\ \hline [0] & [0] & [1] \\ {[1]} & [1] & [1] \end{array}$$

The algebra $A/\rho$ acts as an abstraction of $A$ with respect to the property mentioned above (all the natural numbers greater than 0 are treated as a whole). The predicate symbol $Isgrz$ can be interpreted in $A/\rho$ by

$$Isgrz^{A/\rho}([a]) \ \text{ iff } \ (\forall a' \in [a])(a' > 0),$$

for any $a$. Now, the property $\varphi_1$ can be rewritten as follows:

$$(\varphi_2)\quad (\forall x, y \in A_{nat}/\rho)(Isgrz^{A/\rho}(x) \vee Isgrz^{A/\rho}(y) \Rightarrow Isgrz^{A/\rho}(x +^{A/\rho} y)).$$

We remark that the validity of $\varphi_2$ in $A/\rho$ leads to the validity of $\varphi_1$ in $A$. Moreover, $\varphi_2$ holds in $A/\rho$, and this can be easily checked by an automatic procedure due to the fact that $A/\rho$ contains only two elements. Therefore, $\varphi_1$ holds in $A$. Furthermore, $\varphi_1$ and $\varphi_2$ are in fact interpretations of the same formula $\varphi$ (of the meta-language)

$$(\varphi)\quad (\forall x, y)(Isgrz(x) \vee Isgrz(y) \Rightarrow Isgrz(x + y))$$

in two different algebras of the same signature. We can say that the validity of $\varphi$ in $A/\rho$ implies the validity of $\varphi$ in $A$.
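The automatic check on the two-element quotient can be sketched in Python as follows. The encoding is an assumption made for illustration: the class [0] is represented by 0 and the class [1] by 1, and the names `cls`, `add_q` and `isgrz_q` are hypothetical. The compatibility of ρ with addition is only spot-checked on a finite sample of natural numbers, not proved.

```python
from itertools import product

CLASSES = (0, 1)    # 0 stands for the class [0], 1 for the class [1]


def cls(a):
    """Class of a natural number under rho."""
    return 0 if a == 0 else 1


def add_q(u, v):
    """Addition on classes, computed on the representatives 0 and 1."""
    return cls(u + v)


def isgrz_q(u):
    """Isgrz on classes: it holds exactly for the class [1]."""
    return u == 1


# The addition table of the quotient algebra.
table = {(u, v): add_q(u, v) for u, v in product(CLASSES, repeat=2)}

# phi_2: for all classes x, y, Isgrz(x) or Isgrz(y) implies Isgrz(x + y).
phi2 = all(
    (not (isgrz_q(u) or isgrz_q(v))) or isgrz_q(add_q(u, v))
    for u, v in product(CLASSES, repeat=2)
)

# Spot check that rho is compatible with + on a finite sample of naturals.
sample = range(50)
compatible = all(
    cls(a + b) == add_q(cls(a), cls(b)) for a in sample for b in sample
)
```

Only four pairs of classes have to be examined to decide the formula on the quotient, whereas the original formula quantifies over all pairs of natural numbers.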

Our example works fine with the predicates we considered. However, for other predicates things can be totally different. Let us consider, for instance, the equality predicate on $A$, denoted $eq^A$. It can be interpreted in $A/\rho$ in various ways. One of the most natural ways is to consider a new truth value $\bot$ whose meaning is “indefinite,” and to define the predicate as in the table below.

$$\begin{array}{c|cc} eq^{A/\rho} & [0] & [1] \\ \hline [0] & 1 & 0 \\ {[1]} & 0 & \bot \end{array}$$

($eq^{A/\rho}([1], [1])$ is evaluated to $\bot$ because two arbitrary numbers in $[1]$ can be equal or different). As we can see, $eq^{A/\rho}$ is not the equality predicate on $A/\rho$ because it does not contain the identity.

Logically extended signatures The example in the paragraph above leads us to the following considerations:

1. the meta-language used to express properties of data types (algebras) should be specific to signatures and not to data types (algebras);
2. data type reductions can be captured by congruences. In such a case, the operations are automatically redefined to operate on the quotient data type (algebra), but the predicates need a special treatment (more arguments about this are provided in Sect. 3.1).

We will discuss (1) in this section, and (2) in the next section.

Let $S$ be a set of sorts. A logical type over $S$ is any nonempty word $w \in S^+$. Logical $S$-sorted signatures are defined as $S$-sorted signatures are, but with the difference that they contain only logical symbols (predicate symbols). The symbols of such a signature have two roles:

– to specify basic properties satisfied by elements of an algebra;
– to build formulas defining new properties.

Definition 1 Let $S$ be a non-empty set. A logically extended $S$-sorted signature is a pair $(\Sigma, \Sigma_L)$, where $\Sigma$ is an $S$-sorted signature and $\Sigma_L$ is a logical $S$-sorted signature.

We will use Kleene’s 3-valued interpretation of the propositional operators [16]. This interpretation is based on a set $\{0, 1, \bot\}$ of truth values, an information order $\sqsubseteq$, and a logical order. 0 and 1 denote the classical definite values, and $\bot$ denotes an indefinite value. The information order is given by

$$x \sqsubseteq y \Leftrightarrow x = y \vee y = \bot,$$

for all $x, y \in \{0, 1, \bot\}$, and the logical order is $0 < \bot < 1$. The interpretation of the propositional operators is based on the logical order by considering, as usual, $x \vee y = \max\{x, y\}$ and $x \wedge y = \min\{x, y\}$, for any truth values $x$ and $y$. The corresponding truth tables are given in Fig. 1.

Fig. 1 Kleene’s 3-valued interpretation of the propositional operators
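The operators described above can be written down as a small Python sketch. The encoding is an assumption chosen for illustration: 0 and 1 are the integers, the indefinite value is the string `"⊥"`, and the function names `k_or`, `k_and`, `k_not` and `info_leq` are hypothetical.

```python
BOT = "⊥"                       # the indefinite truth value
RANK = {0: 0, BOT: 1, 1: 2}     # logical order 0 < ⊥ < 1


def k_or(x, y):
    """Disjunction: the maximum with respect to the logical order."""
    return max(x, y, key=RANK.__getitem__)


def k_and(x, y):
    """Conjunction: the minimum with respect to the logical order."""
    return min(x, y, key=RANK.__getitem__)


def k_not(x):
    """Negation swaps 0 and 1 and leaves ⊥ unchanged."""
    return BOT if x == BOT else 1 - x


def info_leq(x, y):
    """Information order as stated in the text: x = y or y = ⊥."""
    return x == y or y == BOT


VALUES = (0, BOT, 1)
or_table = {(x, y): k_or(x, y) for x in VALUES for y in VALUES}
and_table = {(x, y): k_and(x, y) for x in VALUES for y in VALUES}
```

The dictionaries `or_table` and `and_table` enumerate all nine argument pairs, so they can be compared entry by entry with the truth tables the figure refers to.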


Definition 2 A $(\Sigma, \Sigma_L)$-algebra, where $(\Sigma, \Sigma_L)$ is a logically extended signature, is a 3-tuple $A = (A, \Sigma^A, \Sigma^A_L)$, where:

1. $(A, \Sigma^A)$ is a $\Sigma$-algebra;
2. $\Sigma^A_L$ is defined as $\Sigma^A$ but with the difference that $p^A$ is a function from $A_w$ into $\{0, 1, \bot\}$, for all $p \in (\Sigma_L)_w$.

The class of all $(\Sigma, \Sigma_L)$-algebras is denoted by $Alg_{\Sigma,\Sigma_L}$. Given a logical signature $\Sigma_L$ and a (denumerable) set $X$ of $S$-typed variables (as in Sect. 1), define inductively the set of first order formulas over $(\Sigma, \Sigma_L)$ and $X$ as follows:

1. atomic formulas:
   (a) $p(t_1, \ldots, t_n)$ is an atomic formula, for any logical symbol $p$ of type $s_1 \ldots s_n$ and any term $t_i$ of sort $s_i$, $1 \leq i \leq n$;
2. formulas:
   (a) every atomic formula is a formula;
   (b) if $\alpha$ and $\beta$ are formulas, then $(\alpha \vee \beta)$, $\neg\alpha$, and $(\alpha \wedge \beta)$ are formulas;
   (c) if $x$ is a variable and $\alpha$ is a formula, then $((\exists x)\alpha)$ and $((\forall x)\alpha)$ are formulas.

Denote by $L(\Sigma, \Sigma_L, X)$ the set of first order formulas over $(\Sigma, \Sigma_L)$ and $X$, and let $L^+(\Sigma, \Sigma_L, X)$ be the subset of formulas $\varphi$ which do not contain the negation operator.

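Given a $(\Sigma, \Sigma_L)$-algebra $A$, each term or first order formula $\varphi$ induces a function $I_A(\varphi)$ from the set $\Gamma(X, A)$ (of all assignments of $X$ into $A$, which are defined as for $\Sigma$-algebras) into $A \cup \{0, 1, \bot\}$, as follows:

– $I_A(x)(\gamma) = \gamma(x)$, for any variable $x$;
– $I_A(\sigma)(\gamma) = \sigma^A$, for any constant symbol $\sigma$;
– $I_A(\sigma(t_1, \ldots, t_n))(\gamma) = \sigma^A(I_A(t_1)(\gamma), \ldots, I_A(t_n)(\gamma))$, for any term $\sigma(t_1, \ldots, t_n)$;
– $I_A(p(t_1, \ldots, t_n))(\gamma) = p^A(I_A(t_1)(\gamma), \ldots, I_A(t_n)(\gamma))$, for any atomic formula $p(t_1, \ldots, t_n)$;
– $(\alpha \vee \beta)$, $\neg\alpha$, and $(\alpha \wedge \beta)$ are interpreted in accordance with the interpretation of the propositional operators on $\{0, 1, \bot\}$. For instance,

$$I_A(\neg\alpha)(\gamma) = \begin{cases} 0, & I_A(\alpha)(\gamma) = 1 \\ 1, & I_A(\alpha)(\gamma) = 0 \\ \bot, & I_A(\alpha)(\gamma) = \bot \end{cases}$$

– the quantifiers are interpreted by

$$I_A((\exists x)\alpha)(\gamma) = \begin{cases} 1, & \text{if } (\exists a)(I_A(\alpha)(\gamma[x/a]) = 1) \\ 0, & \text{if } (\forall a)(I_A(\alpha)(\gamma[x/a]) = 0) \\ \bot, & \text{otherwise} \end{cases}$$

$$I_A((\forall x)\alpha)(\gamma) = \begin{cases} 1, & \text{if } (\forall a)(I_A(\alpha)(\gamma[x/a]) = 1) \\ 0, & \text{if } (\exists a)(I_A(\alpha)(\gamma[x/a]) = 0) \\ \bot, & \text{otherwise} \end{cases}$$

for any formula $\alpha$ ($a$ and $x$ have the same sort).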

$I_A(\varphi)$ is called the interpretation function of $\varphi$ into $A$.
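For a finite carrier, the clauses above translate directly into a recursive evaluator. The following Python sketch makes several simplifying assumptions that are not in the paper: the signature is single-sorted, atomic formulas take only variables as arguments, formulas are encoded as nested tuples, and the name `interpret` is hypothetical. The sample predicate is the 3-valued equality on the two classes [0] and [1] from Example 1, encoded as 0 and 1.

```python
BOT = "⊥"                       # the indefinite truth value
RANK = {0: 0, BOT: 1, 1: 2}     # logical order 0 < ⊥ < 1


def interpret(phi, carrier, preds, gamma):
    """Evaluate formula phi over a finite carrier under assignment gamma."""
    kind = phi[0]
    if kind == "atom":                       # ("atom", p, x1, ..., xn)
        _, name, *variables = phi
        return preds[name](*(gamma[x] for x in variables))
    if kind == "not":
        value = interpret(phi[1], carrier, preds, gamma)
        return BOT if value == BOT else 1 - value
    if kind in ("or", "and"):
        left = interpret(phi[1], carrier, preds, gamma)
        right = interpret(phi[2], carrier, preds, gamma)
        choose = max if kind == "or" else min
        return choose(left, right, key=RANK.__getitem__)
    if kind in ("exists", "forall"):         # ("exists", x, body)
        _, x, body = phi
        values = [interpret(body, carrier, preds, {**gamma, x: a})
                  for a in carrier]          # gamma[x/a] for every a
        if kind == "exists":
            if 1 in values:
                return 1
            return 0 if all(v == 0 for v in values) else BOT
        if 0 in values:
            return 0
        return 1 if all(v == 1 for v in values) else BOT
    raise ValueError(f"unknown formula kind: {kind}")


# 3-valued equality on the classes [0] and [1], encoded as 0 and 1.
EQ_TABLE = {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): BOT}
PREDS = {"eq": lambda u, v: EQ_TABLE[(u, v)]}
CARRIER = (0, 1)

reflexive = ("forall", "x", ("atom", "eq", "x", "x"))
some_fixed = ("exists", "x", ("atom", "eq", "x", "x"))
total = ("forall", "x", ("exists", "y", ("atom", "eq", "x", "y")))
```

With this predicate, reflexivity of `eq` evaluates to the indefinite value, because the pair ([1], [1]) is mapped to ⊥, while the existential variant is definitely true thanks to the class [0].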

Definition 3 A formula $\varphi$ is called valid or true in a $(\Sigma, \Sigma_L)$-algebra $A$, denoted $A \models \varphi$, if $I_A(\varphi)(\gamma) = 1$ for all assignments $\gamma : X \to A$. If $\varphi$ is not true in $A$, then it is called false in $A$.


Example 2 The algebra in Example 1 can be simply transformed into a $(\Sigma, \Sigma_L)$-algebra by considering $\Sigma_L = \{Isgrz\}$. Denote this algebra by $A$ as well. The function $\gamma$ given by $\gamma(x) = 0$ and $\gamma(y) = 1$ is an assignment of $X$ into $A$. Under this assignment, $I_A(\varphi)$ evaluates to 1, where $\varphi$ is the formula in Example 1.

3 Abstractions of models

An algebra assigns a meaning to a signature by associating a set of data to each sort and an operation (function) to each operation symbol. Therefore, an algebra defines a concrete data type.

By a data type reduction/abstraction we mean the reduction of types/domains of large or unbounded range to small types/domains (types/domains of small range). A few words about “small domains” are in order. First of all, it is very hard to quantify the difference between large and small domains and, moreover, the difference is relative. In 1981, when model checking was invented, systems with $10^{20}$ states were regarded as large systems. Nowadays, there are techniques that can be applied to systems with $10^{120}$ states. If for a class of systems verification techniques can be applied in reasonable time, the systems in the class can be regarded as small. Large systems are not necessarily infinite systems. Surely, infinite systems pose more problems than finite ones, although there are techniques for verifying particular infinite-state systems. In conclusion, a system which allows practical verification can be regarded as a small one. In our paper, all the examples we are going to consider exhibit reductions to (finite) very small domains. However, we want to emphasize that the main goal of the paper is not to investigate abstractions leading to small domains. The main goal of the paper is twofold: first, we classify abstractions and show that many techniques proposed in the literature fall in one of the classes introduced in the paper, and secondly, we propose abstractions guided by equations.

As we have already mentioned in the previous section, an elegant way to formalize such reductions is to use congruences, which are equivalence relations compatible with the data type’s (algebra’s) operations. In this way, the operations of the original data type are automatically redefined to operate on the abstraction of the original data type.

In this section we study abstractions of data types; the extension to abstract data types will be studied in the next section.

3.1 Abstractions

Congruences on (�, �L)-algebras are defined as for classical algebras by takinginto consideration the function symbols but not the logical ones. This is a natu-ral requirement because logical symbols are interpreted into the domain {0, 1, ⊥}which is not necessarily a carrier set (more about this is provided in Remark 1).

An abstraction of a $(\Sigma, \Sigma_L)$-algebra $A$ is defined as a quotient algebra of $A$ under a congruence $\rho$ together with a consistent definition of the logical operations of this quotient algebra. We consider in this paper three types of abstractions.


Definition 4 Let $A$ be a $(\Sigma, \Sigma_L)$-algebra and $\rho$ a congruence on $A$.

1. A $(\Sigma, \Sigma_L)$-algebra $B = (A/\rho, \Sigma^{A/\rho}, \Sigma_L^{A/\rho})$ is called a ∀∀-abstraction or an abstraction of type ∀∀ of $A$ by $\rho$ if $B$ is defined as the classical quotient algebra of $A$ by $\rho$ but with the following definition for the logical operations:

– $p^{A/\rho}([a_1], \ldots, [a_n]) = b$ if $(\forall i)(\forall a'_i \in [a_i])(p^A(a'_1, \ldots, a'_n) = b)$;

– $p^{A/\rho}([a_1], \ldots, [a_n]) = \bot$, otherwise,

for all $p$ of type $s_1 \ldots s_n$, $b \in \{0, 1\}$, and $a_1 \in A_{s_1}, \ldots, a_n \in A_{s_n}$.

2. A $(\Sigma, \Sigma_L)$-algebra $B = (A/\rho, \Sigma^{A/\rho}, \Sigma_L^{A/\rho})$ is called a ∀∃-abstraction or an abstraction of type ∀∃ of $A$ by $\rho$ if $B$ is defined as the classical quotient algebra of $A$ by $\rho$ but with the following definition for the logical operations:

– $p^{A/\rho}([a_1], \ldots, [a_n]) = 1$ if $(\forall i)(\forall a'_i \in [a_i])(p^A(a'_1, \ldots, a'_n) = 1)$;

– $p^{A/\rho}([a_1], \ldots, [a_n]) = 0$ if $(\forall i)(\exists a'_i \in [a_i])(p^A(a'_1, \ldots, a'_n) = 0)$;

– $p^{A/\rho}([a_1], \ldots, [a_n]) = \bot$, otherwise,

for all $p$ of type $s_1 \ldots s_n$ and $a_1 \in A_{s_1}, \ldots, a_n \in A_{s_n}$.

3. A $(\Sigma, \Sigma_L)$-algebra $B = (A/\rho, \Sigma^{A/\rho}, \Sigma_L^{A/\rho})$ is called an $\exists^{0,1}\forall$-abstraction or an abstraction of type $\exists^{0,1}\forall$ of $A$ by $\rho$ if $B$ is defined as the classical quotient algebra of $A$ by $\rho$ but with the following definition for the logical operations:

– $p^{A/\rho}([a_1], \ldots, [a_n]) = 1$ if $(\forall i)(\forall a'_i \in [a_i])(p^A(a'_1, \ldots, a'_n) \in \{0, 1\})$ and $(\forall i)(\exists a'_i \in [a_i])(p^A(a'_1, \ldots, a'_n) = 1)$;

– $p^{A/\rho}([a_1], \ldots, [a_n]) = 0$ if $(\forall i)(\forall a'_i \in [a_i])(p^A(a'_1, \ldots, a'_n) = 0)$;

– $p^{A/\rho}([a_1], \ldots, [a_n]) = \bot$, otherwise,

for all $p$ of type $s_1 \ldots s_n$ and $a_1 \in A_{s_1}, \ldots, a_n \in A_{s_n}$.
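The three clauses of Definition 4 can be illustrated on a finite carrier. The following is only a minimal sketch under our own assumptions: the function name `abstract_predicate`, the tags `"AA"`, `"AE"`, `"E01A"`, the encoding of ⊥ as `None`, and the sample predicate and partition are hypothetical and do not come from the paper.

```python
from itertools import product

BOT = None  # hypothetical encoding of the undefined truth value ⊥


def abstract_predicate(p, blocks, kind):
    """Value of the quotient predicate on a tuple of equivalence classes.

    p      -- concrete predicate returning 0, 1 or BOT
    blocks -- one non-empty equivalence class per argument position
    kind   -- "AA" (∀∀), "AE" (∀∃) or "E01A" (∃^{0,1}∀)
    """
    # All truth values p takes on tuples of representatives.
    values = {p(*args) for args in product(*blocks)}
    if kind == "AA":
        # b if every tuple of representatives yields the same b in {0, 1}.
        return next(iter(values)) if len(values) == 1 and BOT not in values else BOT
    if kind == "AE":
        if values == {1}:
            return 1
        return 0 if 0 in values else BOT
    if kind == "E01A":
        if values == {0}:
            return 0
        return 1 if 1 in values and BOT not in values else BOT
    raise ValueError(kind)


# Sample data: carrier {0,...,5}, classes low/mid/high, p(x) = "x > 2".
p = lambda x: 1 if x > 2 else 0
low, mid, high = (0, 1), (2, 3), (4, 5)
table = {kind: [abstract_predicate(p, (c,), kind) for c in (low, mid, high)]
         for kind in ("AA", "AE", "E01A")}
```

On the mixed class `mid` the three abstractions disagree: the ∀∀ clause gives ⊥, the ∀∃ clause gives 0, and the $\exists^{0,1}\forall$ clause gives 1.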

The definition of the quotient algebra $A/\rho$ in all three cases in Definition 4 is consistent in the sense that the definition of the operations does not depend on the representatives. For the case of the function symbols one may consult [30]; the case of the logical symbols follows directly from Definition 4.

The terminology in Definition 4 comes from the quantifiers used to define the abstractions. For example, ∀∃-abstractions are defined by using the quantifier ∀ for the case of the truth value 1, and the quantifier ∃ for the case of the truth value 0 (except for “(∀i)”, which is common to all the abstractions).

The table in Fig. 2 gives a comparative picture of these abstractions. For example, if

$$\bot \in \{p^A(a'_1, \ldots, a'_n) \mid (\forall i)(a'_i \in [a_i])\}$$

and

$$\{0, 1\} \cap \{p^A(a'_1, \ldots, a'_n) \mid (\forall i)(a'_i \in [a_i])\} \neq \emptyset,$$

then $p^{A/\rho}([a_1], \ldots, [a_n])$ is $\bot$ for ∀∀-abstractions and $\exists^{0,1}\forall$-abstractions, but it can be 0 or $\bot$ for ∀∃-abstractions.

Fig. 2 ∀∀-, $\exists^{0,1}\forall$-, and ∀∃-abstractions

Example 3 Let $A$ be the algebra in Example 2. The abstraction $A/\rho$ in Example 1 is a ∀∀-abstraction because the predicate $\mathit{Isgrz}^{A/\rho}$ is defined in this way.

Definition 5 Let $A$ and $B$ be two $(\Sigma, \Sigma_L)$-algebras. A homomorphism from $A$ into $B$ is any function $h : A \to B$ such that

$$h_s(\sigma^A(a_1, \ldots, a_n)) = \sigma^B(h_{s_1}(a_1), \ldots, h_{s_n}(a_n))$$

and

$$p^A(b_1, \ldots, b_m) \sqsubseteq p^B(h_{s'_1}(b_1), \ldots, h_{s'_m}(b_m)),$$

for all $\sigma \in \Sigma_{s_1 \ldots s_n, s}$, $a_i \in A_{s_i}$, $p \in (\Sigma_L)_{s'_1 \ldots s'_m}$, and $b_j \in A_{s'_j}$, $1 \leq i \leq n$, $1 \leq j \leq m$.

We can easily remark that the definition of a homomorphism we adopted is an extension of the classical one to logical symbols. Since we did not include a carrier set for logical operations, we cannot apply $h$ to expressions beginning with logical symbols.

An injective (surjective, bijective, respectively) homomorphism is called a monomorphism (epimorphism, isomorphism, respectively). We shall write $A \cong B$ whenever $A$ and $B$ are isomorphic.

Remark 1 As we can see, both the definition of the quotient algebra and the definition of a homomorphism require a special treatment of the logical symbols. Including a carrier set for the range of the logical operations might appear to be a good solution for a uniform treatment. However, this can generate severe problems, such as the one described below.

Considering $\{0, 1, \bot\}$ as a carrier (of some sort) in Example 1, the definition of $=^{A/\rho}$ as a quotient operation (see Sect. 1) leads to

$$=^{A/\rho}([1], [1]) = [=^A(1, 1)] = [1]_b$$

and

$$=^{A/\rho}([1], [1]) = {=^{A/\rho}}([1], [2]) = [=^A(1, 2)] = [0]_b,$$

where the index $b$ specifies the fact that the equivalence classes are on the set $\{0, 1, \bot\}$ of truth values. For the operation $=^{A/\rho}$ to be well-defined, we have to consider $[0]_b = [1]_b$. That is, both truth values 0 and 1 are in the same equivalence class.

Now, we have

$$=^{A/\rho}([0], [1]) = [=^A(0, 1)] = [0]_b$$

and

$$=^{A/\rho}([0], [0]) = [=^A(0, 0)] = [1]_b = [0]_b.$$

Therefore, by the predicate $=^{A/\rho}$ we cannot distinguish between $[0]$ and $[1]$, which is contrary to what we wanted to capture by the abstraction.


Proposition 1 Let $A$ be a $(\Sigma, \Sigma_L)$-algebra and $\rho$ a ∀∀-abstraction of $A$. Then, the function $h : A \to A/\rho$ given by $h(a) = [a]$, for all $a \in A$, is a surjective homomorphism from $A$ into $A/\rho$.

Proof By definition, $h$ is surjective. Following a classical line we can prove that $h$ preserves the algebra’s operations [27]. Concerning the logical symbols we have:

– if $p^A(a_1, \ldots, a_n) = b \in \{0, 1\}$, then $p^{A/\rho}([a_1], \ldots, [a_n]) \in \{b, \bot\}$;

– if $p^A(a_1, \ldots, a_n) = \bot$, then $p^{A/\rho}([a_1], \ldots, [a_n]) = \bot$.

Therefore, $h$ is a surjective homomorphism from $A$ into $A/\rho$. □

In view of Proposition 1, ∀∀-abstractions always lead to natural homomorphisms between concrete data types and their abstractions. This proposition cannot be extended to the case of ∀∃-abstractions because $p^{A/\rho}([a_1], \ldots, [a_n])$ can be $\bot$ or 0 when $p^A(a_1, \ldots, a_n)$ is $\bot$. The proposition cannot be extended to the case of $\exists^{0,1}\forall$-abstractions because $p^{A/\rho}([a_1], \ldots, [a_n])$ can be $\bot$, 0 or 1 when $p^A(a_1, \ldots, a_n)$ is 0.
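The contrast between the ∀∀ case and the ∀∃ case can be checked mechanically on a small example. The sketch below is an illustration only: it is restricted to unary predicates, encodes ⊥ as `None`, and reads the information order as “$x \sqsubseteq y$ iff $x = y$ or $y = \bot$”, which is the reading suggested by the two cases in the proof of Proposition 1. All names (`info_leq`, `quotient_value`, `natural_map_is_homomorphic`) are hypothetical.

```python
BOT = None  # hypothetical encoding of ⊥


def info_leq(x, y):
    """Information order as assumed here: x ⊑ y iff x = y or y = ⊥."""
    return x == y or y is BOT


def quotient_value(p, block, kind):
    """Quotient value of a unary predicate p on one equivalence class."""
    values = {p(a) for a in block}
    if kind == "AA":      # ∀∀ clause of Definition 4
        return values.pop() if len(values) == 1 and BOT not in values else BOT
    if kind == "AE":      # ∀∃ clause of Definition 4
        if values == {1}:
            return 1
        return 0 if 0 in values else BOT
    raise ValueError(kind)


def natural_map_is_homomorphic(p, blocks, kind):
    """Check p^A(a) ⊑ p^{A/ρ}([a]) for every element a of every class."""
    return all(info_leq(p(a), quotient_value(p, block, kind))
               for block in blocks for a in block)


p = lambda x: 1 if x > 2 else 0
blocks = [(0, 1), (2, 3), (4, 5)]
ok_aa = natural_map_is_homomorphic(p, blocks, "AA")
ok_ae = natural_map_is_homomorphic(p, blocks, "AE")
```

On this data the ∀∀ check succeeds, while the ∀∃ check fails on the class `(2, 3)`: the element 3 satisfies `p`, but the quotient value of the class is 0.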

3.2 Property preservation

In order to be useful from a practical point of view, a data type reduction (abstraction) should be conservative or property-preserving with respect to a specific set of properties. We mention here three forms of property preservation frequently found in the literature [9, 43]:

– strong-preservation. An abstraction is strongly preserving if a set of properties with truth values true or false in the abstract system has corresponding properties in the concrete system with the same truth values;

– weak-preservation. An abstraction is weakly preserving if a set of properties true in the abstract system has corresponding properties in the concrete system that are also true;

– error-preservation. An abstraction is error preserving if a set of properties false in the abstract system has corresponding properties in the concrete system that are also false.

In this section we prove that the ∀∀-abstractions are strongly preserving with respect to formulas in $L(\Sigma, \Sigma_L, X)$, the ∀∃-abstractions are weakly preserving with respect to formulas in $L^+(\Sigma, \Sigma_L, X)$, and the $\exists^{0,1}\forall$-abstractions are error preserving with respect to formulas in $L^+(\Sigma, \Sigma_L, X)$.

We begin with some technical results. Given a congruence $\rho$ on $A$ and two assignments $\gamma \in \Gamma(X, A/\rho)$ and $\gamma' \in \Gamma(X, A)$, we shall write $\gamma' \in \gamma$ whenever $\gamma'(x) \in \gamma(x)$, for all $x \in X$.

Lemma 1 Let $\rho$ be a congruence on a $(\Sigma, \Sigma_L)$-algebra $A$. Then:

(1) $\gamma' \in \Gamma(X, A)$, for any $\gamma \in \Gamma(X, A/\rho)$ such that $\gamma' \in \gamma$;

(2) For any $\gamma' \in \Gamma(X, A)$ there exists $\gamma \in \Gamma(X, A/\rho)$ such that $\gamma' \in \gamma$.

Proof (1) follows directly from definitions.

(2) Given an assignment $\gamma'$ into $A$, define the assignment $\gamma$ into $A/\rho$ by $\gamma(x) = [\gamma'(x)]$, for all $x$. Clearly, $\gamma' \in \gamma$. □


Lemma 2 Let $A$ be a $(\Sigma, \Sigma_L)$-algebra, $\rho$ a congruence on $A$, $t$ a term, and $\gamma$ an assignment into $A/\rho$. Then,

$$I_{A/\rho}(t)(\gamma) = [I_A(t)(\gamma')],$$

for any $\gamma' \in \gamma$.

Proof Let $\gamma' : X \to A$ be an assignment such that $\gamma' \in \gamma$. We shall prove the statement of the lemma by structural induction on the term $t$:

– $t$ is a constant symbol $\sigma$. Then,

$$I_{A/\rho}(t)(\gamma) = \sigma^{A/\rho} = [\sigma^A] = [I_A(t)(\gamma')].$$

– $t = x \in X$. Then,

$$I_{A/\rho}(t)(\gamma) = \gamma(x) = [\gamma'(x)] = [I_A(t)(\gamma')]$$

(the second equality follows from the lemma’s hypothesis).

– $t = \sigma(t_1, \ldots, t_n)$. Assume the statement of the lemma holds true for $t_1, \ldots, t_n$. Then,

$$\begin{aligned}
I_{A/\rho}(t)(\gamma) &= \sigma^{A/\rho}(I_{A/\rho}(t_1)(\gamma), \ldots, I_{A/\rho}(t_n)(\gamma)) \\
&= \sigma^{A/\rho}([I_A(t_1)(\gamma')], \ldots, [I_A(t_n)(\gamma')]) \\
&= [I_A(t)(\gamma')].
\end{aligned}$$

□

Corollary 1 Let $A$ be a $(\Sigma, \Sigma_L)$-algebra, $\rho$ a congruence on $A$, and $\gamma$ an assignment into $A/\rho$. Then,

$$I_{A/\rho}(p(t_1, \ldots, t_n))(\gamma) = p^{A/\rho}([I_A(t_1)(\gamma_1)], \ldots, [I_A(t_n)(\gamma_n)]),$$

for any $\gamma_1, \ldots, \gamma_n \in \gamma$ and atomic formula $p(t_1, \ldots, t_n)$.

Proof From the definition of $I_{A/\rho}$ and Lemma 2. □

We have to remark that Lemma 2 and Corollary 1 do not depend on any definition of the logical operations in the quotient algebra. As a result, they hold true for any abstraction.

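Theorem 1 Let $A$ be a $(\Sigma, \Sigma_L)$-algebra, $\rho$ a ∀∀-abstraction of $A$, and $\varphi$ a formula. Then

$$I_{A/\rho}(\varphi)(\gamma) = b \Rightarrow (\forall \gamma' \in \gamma)(I_A(\varphi)(\gamma') = b),$$

for any $b \in \{0, 1\}$ and $\gamma \in \Gamma(X, A/\rho)$.

Proof We prove the property in the theorem by structural induction on $\varphi$, for all assignments $\gamma$ into $A/\rho$.

Let $\varphi = p(t_1, \ldots, t_n)$. Assume $I_{A/\rho}(\varphi)(\gamma) = b$. By the definition of $I_{A/\rho}$ and Corollary 1 we obtain

$$\begin{aligned}
I_{A/\rho}(p(t_1, \ldots, t_n))(\gamma) &= p^{A/\rho}(I_{A/\rho}(t_1)(\gamma), \ldots, I_{A/\rho}(t_n)(\gamma)) \\
&= p^{A/\rho}([I_A(t_1)(\gamma')], \ldots, [I_A(t_n)(\gamma')])
\end{aligned}$$

for all $\gamma' \in \gamma$. Then, by the definition of the ∀∀-abstraction,

$$p^A(I_A(t_1)(\gamma'), \ldots, I_A(t_n)(\gamma')) = b,$$

for all $\gamma' \in \gamma$, which leads to

$$I_A(p(t_1, \ldots, t_n))(\gamma') = b,$$

for all $\gamma' \in \gamma$.

Let $\varphi = \varphi_1 \vee \varphi_2$ and $b = 0$. Assume that the property in the theorem holds true for $\varphi_1$ and $\varphi_2$, and $I_{A/\rho}(\varphi)(\gamma) = b$. Then,

$$\begin{aligned}
I_{A/\rho}(\varphi_1 \vee \varphi_2)(\gamma) = 0 &\Rightarrow I_{A/\rho}(\varphi_1)(\gamma) = 0 = I_{A/\rho}(\varphi_2)(\gamma) \\
&\Rightarrow (\forall \gamma', \gamma'' \in \gamma)(I_A(\varphi_1)(\gamma') = 0 = I_A(\varphi_2)(\gamma'')) \\
&\Rightarrow (\forall \gamma' \in \gamma)(I_A(\varphi_1 \vee \varphi_2)(\gamma') = 0).
\end{aligned}$$

Analogously, one can discuss the case $b = 1$ as well as the cases $\varphi = \neg\varphi_1$ and $\varphi = \varphi_1 \wedge \varphi_2$.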

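Let $\varphi = (\exists x)\varphi_1$ and $b = 1$. Assume that the property in the theorem holds true for $\varphi_1$, and $I_{A/\rho}(\varphi)(\gamma) = b$. Then,

$$\begin{aligned}
I_{A/\rho}((\exists x)\varphi_1)(\gamma) = 1 &\Rightarrow (\exists a)(I_{A/\rho}(\varphi_1)(\gamma[x/[a]]) = 1) \\
&\Rightarrow (\exists a)(\forall \gamma' \in \gamma[x/[a]])(I_A(\varphi_1)(\gamma') = 1) \\
&\Rightarrow (\exists a)(\forall \gamma' \in \gamma)(I_A(\varphi_1)(\gamma'[x/a]) = 1) \\
&\Rightarrow (\forall \gamma' \in \gamma)(I_A((\exists x)\varphi_1)(\gamma') = 1)
\end{aligned}$$

($a$ and $x$ have the same sort in all the implications above). Similarly, one can discuss the case $b = 0$, where the existential quantifier over $a$ is replaced by a universal one, as well as the case $\varphi = (\forall x)\varphi_1$. □

Corollary 2 Let $A$ be a $(\Sigma, \Sigma_L)$-algebra, $\rho$ a ∀∀-abstraction of $A$, and $\varphi$ a formula. Then,

$$(\forall \gamma \in \Gamma(X, A/\rho))(\forall \gamma' \in \gamma)(I_A(\varphi)(\gamma') \sqsubseteq I_{A/\rho}(\varphi)(\gamma)).$$

Proof Let $\gamma \in \Gamma(X, A/\rho)$ and $\gamma' \in \gamma$. If $I_{A/\rho}(\varphi)(\gamma) = \bot$ then $I_A(\varphi)(\gamma')$ can be any truth value in $\{0, 1, \bot\}$. The corollary then follows directly from the definition of the relation $\sqsubseteq$ and Theorem 1. □

Corollary 3 ∀∀-abstractions of $(\Sigma, \Sigma_L)$-algebras are strongly preserving with respect to formulas in $L(\Sigma, \Sigma_L, X)$.

Proof Let $A$ be a $(\Sigma, \Sigma_L)$-algebra, $\rho$ a ∀∀-abstraction of $A$, $\varphi$ a formula, and $b \in \{0, 1\}$. Assume $I_{A/\rho}(\varphi)(\gamma) = b$, for all the assignments $\gamma$ into $A/\rho$.

Let $\gamma' : X \to A$ be an assignment. From Lemma 1 it follows that there is an assignment $\gamma$ into $A/\rho$ such that $\gamma' \in \gamma$. Then, by Theorem 1, $I_{A/\rho}(\varphi)(\gamma) = b$ implies $I_A(\varphi)(\gamma') = b$. Therefore, $I_A(\varphi)(\gamma') = b$ for all the assignments $\gamma'$ into $A$. This shows that $\rho$ is strongly preserving with respect to formulas in $L(\Sigma, \Sigma_L, X)$. □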


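Theorem 2 Let $A$ be a $(\Sigma, \Sigma_L)$-algebra, $\rho$ an abstraction of $A$, and $\varphi$ a formula in $L^+(\Sigma, \Sigma_L, X)$.

1. If $\rho$ is a ∀∃-abstraction then

$$I_{A/\rho}(\varphi)(\gamma) = 1 \Rightarrow (\forall \gamma' \in \gamma)(I_A(\varphi)(\gamma') = 1),$$

for all $\gamma \in \Gamma(X, A/\rho)$.

2. If $\rho$ is an $\exists^{0,1}\forall$-abstraction then

$$I_{A/\rho}(\varphi)(\gamma) = 0 \Rightarrow (\forall \gamma' \in \gamma)(I_A(\varphi)(\gamma') = 0),$$

for all $\gamma \in \Gamma(X, A/\rho)$.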

Proof Since negation does not occur in $\varphi$, the proof of this theorem follows exactly the same line as the proof of Theorem 1. □

Corollary 4 ∀∃-abstractions of $(\Sigma, \Sigma_L)$-algebras are weakly preserving with respect to formulas in $L^+(\Sigma, \Sigma_L, X)$.

Proof Let $\rho$ be a ∀∃-abstraction of a $(\Sigma, \Sigma_L)$-algebra $A$, and $\varphi \in L^+(\Sigma, \Sigma_L, X)$ a formula that is true in $A/\rho$. Then, $I_{A/\rho}(\varphi)(\gamma) = 1$, for all the assignments $\gamma$ into $A/\rho$. By Theorem 2(1) and Lemma 1(2) we obtain $I_A(\varphi)(\gamma') = 1$, for all the assignments $\gamma'$ into $A$. Therefore, $\varphi$ is true in $A$, showing that $\rho$ is weakly preserving. □

Corollary 5 $\exists^{0,1}\forall$-abstractions of $(\Sigma, \Sigma_L)$-algebras are error preserving with respect to formulas in $L^+(\Sigma, \Sigma_L, X)$.

Proof Let $\rho$ be an $\exists^{0,1}\forall$-abstraction of a $(\Sigma, \Sigma_L)$-algebra $A$, and $\varphi \in L^+(\Sigma, \Sigma_L, X)$ a formula that is false in $A/\rho$. Then, there is an assignment $\gamma$ into $A/\rho$ such that $I_{A/\rho}(\varphi)(\gamma) = 0$. By Theorem 2(2) we obtain $I_A(\varphi)(\gamma') = 0$, for all $\gamma' \in \gamma$. Therefore, $\varphi$ is false in $A$, showing that $\rho$ is error preserving. □

3.3 Example: McMillan’s approach

In [29], McMillan proposed a kind of data type reduction (abstraction) to be used with the SMV system. Even though his approach could be regarded as a particular case of Cousot’s abstract interpretation (see [29]), McMillan preferred to develop this new formalism closer to data types and practical applications. This is not an isolated case and it shows the need for an intermediate formalism with a high degree of generality that is nevertheless easy to apply in practice.

Our approach, proposed in the previous two sections, is as simple and elegant as it is general; it can be used to handle abstractions of data types defined in the most general way. This section, as well as the next three, will illustrate this.


McMillan’s approach We will use mainly the same notation as in [29]. Let $U$ be a set of values, $V$ a set of variables, $T$ a set of types, and $C$ a set of constructors (each of them having an arity). Define the set $L$ of formulas as being the set of ground terms over $C$.

A structure is a triple $M = (R, N, F)$, where $R : T \to \mathcal{P}(U)$ assigns a range of values to every type, $N$ is a set of denotations, and $F$ is an interpretation assigning a function $F(c) : N^{n_c} \to N$ to each constructor $c$ of arity $n_c$. The denotation of a formula $\phi \in L$ in a structure $M$ is defined inductively and denoted by $\phi^M$. It is assumed that each structure $M$ admits a pre-order $\leq$ on $N$ and $F(c)$ is a monotonic function with respect to $\leq$, for all constructors $c$.

A homomorphism from a structure $M = (R, N, F)$ into a structure $M' = (R', N', F')$ is a function $h : N \to N'$ satisfying

$$F'(c)(h(t_1), \ldots, h(t_n)) \leq' h(F(c)(t_1, \ldots, t_n)),$$

for all $c(t_1, \ldots, t_n) \in L$ ($\leq'$ is the pre-order on $N'$).

By structural induction and using the monotonicity we can easily obtain $\phi^{M'} \leq' h(\phi^M)$, for all $\phi \in L$ and homomorphisms $h$ from $M$ into $M'$.

Assume now that each structure $M$ contains a distinguished denotation $true$ (the denotation of valid formulas). A homomorphism $h$ from $M$ into $M'$ is truth preserving if $true' \leq' h(x)$ implies $x = true$. If $h$ is truth preserving, then

$$(\ast) \quad \phi^{M'} = true' \Rightarrow \phi^M = true,$$

for all formulas $\phi$.

The methodology described above can be used in connection with data type reductions as follows. Let us assume that $M'$ is obtained from $M$ by some abstraction technique. If there is a truth preserving homomorphism $h$ from $M$ into $M'$, then $(\ast)$ holds true. Therefore, proving that a formula $\phi$ holds in $M$ can be reduced to proving that $\phi$ holds in $M'$. This task could be easier because $M'$ is a reduced structure obtained from $M$.

The abstraction technique considered in [29] works as follows. Let $M = (R, N, F)$ be a structure, where

$$N = \{I \mid I : \{\gamma \mid \gamma : V \to U\} \to U\},$$

and let $r : T \to \mathcal{P}(U)$ be a function such that $r(s) \subseteq R(s)$, for all $s \in T$. Define

$$R_r(s) = \{\{a\} \mid a \in r(s)\} \cup \{R(s) - r(s)\}$$

and $N_r$ as $N$ but changing $U$ into $\mathcal{P}(U)$. The interpretation $F_r$ is not defined in the general case, but only in the case of the SMV operators, and it is required that every constructor $c$ be safe with respect to $r$, that is,

$$F(c(t_1, \ldots, t_n))(\gamma) \in F_r(c(t'_1, \ldots, t'_n))(\gamma'),$$

for all $t_1, t'_1, \ldots, t_n, t'_n$, $\gamma : V \to U$ and $\gamma' : V \to \mathcal{P}(U)$ such that $\gamma \in \gamma'$ and $F(t_i)(\gamma) \in F_r(t'_i)(\gamma')$, for all $1 \leq i \leq n$.

Then, $h_r : M \to M_r$ given by

$$h_r(x)(\gamma') = \{x(\gamma) \mid \gamma \in \gamma'\},$$

for all $x \in N$ and $\gamma' : V \to \mathcal{P}(U)$, is a truth preserving homomorphism from $M$ into $M_r$. Therefore, $(\ast)$ can be applied.


A universal algebra approach The approach presented above is a particular case of the results in Sect. 3.2. The set of types is just a set $S$ of sorts. The set of constructors used to define formulas is in our case the logically extended signature $(\Sigma, \Sigma_L)$. The set $L(\Sigma, \Sigma_L, X)$ of formulas we consider is incomparably larger than the set of formulas considered by McMillan. Moreover, we consider a signature $\Sigma$ used to define a concrete (and arbitrarily complex) data type. A structure is then a $(\Sigma, \Sigma_L)$-algebra $A = (A, \Sigma^A, \Sigma_L^A)$ which induces both assignments and formula interpretations.

A data type reduction $r$ (in McMillan’s approach) induces an $S$-sorted equivalence $\rho = (\rho_s \mid s \in S)$ given by

$$a \mathrel{\rho_s} b \Leftrightarrow (a = b \in r(s)) \vee (a, b \notin r(s)),$$

for all $a, b \in A_s$.

This equivalence should in fact be a congruence (this is not mentioned explicitly in [29] but it can be easily seen from the examples).

Since the homomorphism $h$ from $M$ into $M_r$ is truth-preserving, the structure $M_r$ now becomes a ∀∃-abstraction of $A$ by $\rho$. The definition we consider for the ∀∃-abstraction makes all the function and logical symbols safe with respect to $r$. Notice that there is no need to consider pre-orders on algebras.

Therefore, McMillan’s approach is a particular case of our approach.
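The equivalence induced by a data type reduction is easy to write down for a single finite sort. The following is a rough sketch only; the names `reduction_classes` and `related`, and the sample range and kept set, are hypothetical and are not taken from [29].

```python
def reduction_classes(domain, kept):
    """Partition a finite range R(s) the way R_r(s) does: one singleton
    class per value in r(s) (`kept`), plus one class for all other values.
    Unlike the set-theoretic definition, an empty remainder is omitted here."""
    classes = [frozenset([a]) for a in domain if a in kept]
    rest = frozenset(a for a in domain if a not in kept)
    if rest:
        classes.append(rest)
    return classes


def related(a, b, kept):
    """a rho_s b  iff  (a = b and a in r(s))  or  (a, b both outside r(s))."""
    return (a == b and a in kept) or (a not in kept and b not in kept)


domain = range(6)   # sample range R(s) = {0, ..., 5}
kept = {0, 1}       # sample reduction r(s) = {0, 1}
classes = reduction_classes(domain, kept)
```

Here `classes` consists of the singletons for 0 and 1 and a single class holding 2, 3, 4 and 5; two values are `related` exactly when they fall into the same class.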

3.4 Example: shape analysis

Shape analysis is a data flow analysis technique [32]. It is mainly used for complex analysis of dynamically allocated data, and it is based on representing the set of possible memory states (“stores”) that arise at a given point in the program by shape graphs. In such a graph, heap cells are represented by shape-graph nodes and, in particular, sets of “indistinguishable” heap cells are represented by a single shape-graph node. In the past two decades, many shape-analysis algorithms have been developed [1, 5, 21–23, 26, 34, 37, 40, 41]. The parametric framework for shape analysis introduced in [38] covers almost all of the work mentioned above.

We show that shape analysis as described in [38] can be obtained as a particular case of the results in Sect. 3.2.

Shape analysis In [38], the authors define a first order logic with transitive closure over a finite set $P = \{p_1, \ldots, p_n\}$ of predicate symbols. A 2-valued logical structure for this logic is defined as a couple $S = (U^S, I^S)$, where $U^S$ is a set of individuals and $I^S$ maps each predicate symbol $p$ of arity $k$ to a truth-valued function $I^S(p) : (U^S)^k \to \{0, 1\}$. Replacing the set $\{0, 1\}$ by the set $\{0, 1, \bot\}$ in the definition above we obtain a 3-valued logical structure.

2-valued logical structures are used to encode concrete stores as follows. Individuals represent memory locations in the heap, pointers from the stack into the heap are represented by unary predicates, and pointer-valued fields are represented by binary predicates. The property-extraction principle adopted in [38] is the following: by encoding stores as logical structures, questions about properties of stores can be answered by evaluating formulas. 3-valued logical structures are used to encode abstract stores.


The concrete store is related to the abstract store by truth-blurring embeddings. An embedding of a structure $S = (U^S, I^S)$ into a structure $S' = (U^{S'}, I^{S'})$ is any surjective function $f : U^S \to U^{S'}$ such that

$$I^S(p)(u_1, \ldots, u_k) \sqsubseteq I^{S'}(p)(f(u_1), \ldots, f(u_k)),$$

for any $k$, predicate symbol $p$ of arity $k$, and any $u_1, \ldots, u_k \in U^S$, where $\sqsubseteq$ is the information order on $\{0, 1, \bot\}$ (see Sect. 2).

Theorem 3 (Embedding Theorem [38]) Let $f$ be an embedding from a logical structure $S = (U^S, I^S)$ into a logical structure $S' = (U^{S'}, I^{S'})$. Then,

$$I^S(\varphi)(\gamma) \sqsubseteq I^{S'}(\varphi)(f \circ \gamma),$$

for any formula $\varphi$ and any complete assignment $\gamma$ for $\varphi$.

The embedding theorem provides a systematic way to use an abstract 3-valued structure $S$ to answer questions about properties of the concrete 2-valued structure that $S$ represents. It ensures that it is safe to evaluate a formula $\varphi$ on a single 3-valued structure $S$ instead of evaluating $\varphi$ in all 2-valued structures that can be embedded in $S$.

A universal algebra approach The approach in [38] can be easily seen as a particular case of the approach proposed in this paper. For each logical structure $S = (U^S, I^S)$, $U^S$ can be viewed as a uni-sorted $(\Sigma, \Sigma_L)$-algebra, where $\Sigma$ is the empty set and $\Sigma_L$ is a finite set of predicate symbols. $I^S$ is the interpretation function of formulas into the algebra $U^S$ (see Sect. 2).

The abstraction is driven by embeddings, which are surjective functions. As $\Sigma$ is empty, the equivalence relation induced by such a surjective function is a congruence. Also, the properties an embedding satisfies give rise to ∀∀-abstractions, and the embedding theorem follows easily from the results in Sect. 3.2. Moreover, in contrast to [38], the set of individuals in our approach is typed and enriched by typed operations; the predicate symbols are typed as well.

3.5 Example: predicate abstraction

Predicate abstraction, also called boolean abstraction or existential abstraction in [2], was introduced by Graf and Saidi in [17] to provide a method for the automatic construction of an abstract state graph of an infinite system using the PVS theorem prover. Since then, predicate abstraction has been studied thoroughly [11, 42, 43].

The main idea of predicate abstraction is to map concrete objects (states of a transition system, data of a data type, etc.) to “abstract objects” according to their evaluation under a finite set of predicates.

Let $P$ be a finite set of predicates over a set $A$. The set $P$ induces an equivalence relation $\rho_P$ on $A$ as follows:

$$a \mathrel{\rho_P} b \Leftrightarrow (\forall p \in P)(p(a) \text{ iff } p(b)),$$

for all $a, b \in A$.

The quotient set $A/\rho_P$ can be taken as the abstract system induced by $P$. If some operations are given on the set $A$ (e.g., a transition relation, data type operations, etc.) then these operations should be redefined to operate on the abstract system in a congruential way.

It is very clear that predicate abstraction is a particular case of the approach proposed in this paper.
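To make the equivalence $\rho_P$ concrete, the quotient set can be computed for a finite set by grouping elements according to the tuple of truth values they give to the predicates. This is only a small sketch; the function name `predicate_abstraction` and the sample predicates are hypothetical.

```python
from collections import defaultdict


def predicate_abstraction(elements, predicates):
    """Group elements by their truth-value signature under `predicates`.

    Two elements land in the same class iff every predicate agrees on them,
    which is exactly the relation rho_P described above."""
    classes = defaultdict(list)
    for a in elements:
        signature = tuple(bool(p(a)) for p in predicates)
        classes[signature].append(a)
    return dict(classes)


# Sample data: integers -3..3 abstracted by "is positive" and "is even".
is_positive = lambda x: x > 0
is_even = lambda x: x % 2 == 0
quotient = predicate_abstraction(range(-3, 4), [is_positive, is_even])
```

Each key of `quotient` is an abstract object (a vector of truth values) and each value lists the concrete objects it stands for; redefining operations on these classes in a congruential way is a separate step not covered by the sketch.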

3.6 Example: duplicating predicate symbols

The technique of duplicating predicate symbols is one of the most intensively used techniques in abstraction [3, 7, 10]. It is based on associating “copies” to each predicate symbol. Therefore, a formula $\varphi$ gets several versions depending on which predicate copies are used. Usually, two copies are associated with each predicate symbol, and two versions of a formula are used: one of them for validation, and the other one for refutation.

We describe in detail the duplicating predicate symbols technique used in [3] and we show that it can be easily obtained as a particular case of the abstraction methodology proposed in this paper.

In [3], Bidoit and Boisseau considered logically extended structures whose predicate symbols are valuated into $\{0, 1\}$, and associated a split signature $(\Sigma, \Sigma_{L,\oplus} \cup \Sigma_{L,\ominus})$ to each logically extended signature $(\Sigma, \Sigma_L)$, where $\Sigma_{L,\oplus}$ and $\Sigma_{L,\ominus}$ are obtained by indexing the predicate symbols $P \in \Sigma_L$ by $\oplus$ and, respectively, by $\ominus$. $P_\oplus$ and $P_\ominus$ have the same type as $P$.

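Let $A = (A, \Sigma^A, \Sigma_L^A)$ and $B = (B, \Sigma^B, \Sigma_L^B)$ be two $(\Sigma, \Sigma_L)$-algebras and $h$ be an epimorphism from the $\Sigma$-algebra $(A, \Sigma^A)$ into the $\Sigma$-algebra $(B, \Sigma^B)$. The canonical $(\Sigma, \Sigma_{L,\oplus} \cup \Sigma_{L,\ominus})$-structure associated to $A$ and $h$ is defined by

$$A^h = (B, \Sigma^B, \Sigma_{L,\oplus}^B \cup \Sigma_{L,\ominus}^B),$$

where:

– $P_\oplus^{A^h}(b_1, \ldots, b_n)$ iff $(\forall 1 \leq i \leq n)(\forall a_i \in h^{-1}(b_i))(P^A(a_1, \ldots, a_n))$;

– $P_\ominus^{A^h}(b_1, \ldots, b_n)$ iff $(\forall 1 \leq i \leq n)(\exists a_i \in h^{-1}(b_i))(P^A(a_1, \ldots, a_n))$,

for all $P \in \Sigma_L$ of type $s_1 \ldots s_n$ and $b_1 \in B_{s_1}, \ldots, b_n \in B_{s_n}$.

For any formula $\varphi$ over $(\Sigma, \Sigma_L)$ and $X$ define two formulas $\varphi_\oplus$ and $\varphi_\ominus$ over $(\Sigma, \Sigma_{L,\oplus} \cup \Sigma_{L,\ominus})$ and $X$ as follows:

– $P(t_1, \ldots, t_n)_\oplus = P_\oplus(t_1, \ldots, t_n)$;

– $P(t_1, \ldots, t_n)_\ominus = P_\ominus(t_1, \ldots, t_n)$;

– $(\varphi_1 \vee \varphi_2)_\oplus = (\varphi_{1\oplus} \vee \varphi_{2\oplus})$ and $(\varphi_1 \vee \varphi_2)_\ominus = (\varphi_{1\ominus} \vee \varphi_{2\ominus})$, and similarly for $\wedge$, $\forall$ and $\exists$;

– $(\neg\varphi)_\oplus = \neg(\varphi_\ominus)$ and $(\neg\varphi)_\ominus = \neg(\varphi_\oplus)$.

Now, one of the main results proved in [3] states that

$$A^h \models \varphi_\oplus \Rightarrow A \models \varphi$$

and

$$A^h \not\models \varphi_\ominus \Rightarrow A \not\models \varphi.$$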


These two implications give the correctness of the abstraction. Here, the abstraction should be understood in the sense that an element $b$ acts as an abstraction for $h^{-1}(b)$.

In order to show that the result above is a particular case of the abstraction methodology proposed in this paper we first have to remark the following:

– the predicate symbols, in the formalism above, are valuated into $\{0, 1\}$;

– the kernel $ker(h)$ of any epimorphism $h$ defines a congruence on $(A, \Sigma^A)$ and the quotient algebra $(A/ker(h), \Sigma^{A/ker(h)})$ is isomorphic to $B$. Conversely, any congruence $\rho$ on $(A, \Sigma^A)$ leads to an epimorphism from $(A, \Sigma^A)$ into $(A/\rho, \Sigma^{A/\rho})$. Therefore, we can consider congruences instead of epimorphisms in order to design abstractions;

– any formula $\varphi \in L(\Sigma, \Sigma_L, X)$ can be equivalently transformed into negation normal form, where negation is applied only to atomic formulas.

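Given a logically extended signature $(\Sigma, \Sigma_L)$ we define a new signature $(\Sigma, \Sigma_L \cup \Sigma'_L)$, where $\Sigma'_L$ is just a copy of $\Sigma_L$. In any $(\Sigma, \Sigma_L \cup \Sigma'_L)$-algebra $A$, the predicate symbol $P'$ will be interpreted as $\neg P$ is.

For any formula $\varphi \in L(\Sigma, \Sigma_L \cup \Sigma'_L, X)$ in negation normal form we define a new formula $\varphi' \in L^+(\Sigma, \Sigma_L \cup \Sigma'_L, X)$ by replacing “$\neg P$” by “$P'$” and “$\neg Q'$” by “$Q$”, for any $P$ and $Q$.

Directly from the above constructions we have

$$(\ast) \quad A \models \varphi \Leftrightarrow A \models \varphi',$$

for any formula $\varphi \in L(\Sigma, \Sigma_L, X)$. Consequently,

– if $\rho$ is a ∀∃-abstraction of $A$, then

$$A/\rho \models \varphi' \Rightarrow A \models \varphi$$

(from Corollary 4 and $(\ast)$);

– if $\rho$ is an $\exists^{0,1}\forall$-abstraction of $A$, then

$$A/\rho \not\models \varphi' \Rightarrow A \not\models \varphi$$

(from Corollary 5 and $(\ast)$).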

$\varphi'$ plays exactly the role of $\varphi_\oplus$ in the first case (of ∀∃-abstractions), and it plays exactly the role of $\varphi_\ominus$ in the second case (of $\exists^{0,1}\forall$-abstractions). In both cases, $A/\rho$ plays the role of $A^h$.

These results clearly emphasize the nature of the duplicating predicate symbols technique. Thus, this technique consists of two abstractions based on the same congruence. One of them is a ∀∃-abstraction, used for validation, and the other one is an $\exists^{0,1}\forall$-abstraction, used for refutation.
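The two copies of a predicate symbol in the canonical structure of [3] can be computed directly for a finite carrier. The sketch below is illustrative only: the function name `split_predicate`, the sample map `h` and the sample predicate are hypothetical, and `h` is assumed to be surjective onto the abstract values used, so that every fibre is non-empty.

```python
from itertools import product


def split_predicate(P, h, concrete, abstract_args):
    """Return (P_plus, P_minus) at the abstract tuple `abstract_args`.

    P_plus  holds iff P holds for all tuples of preimages under h;
    P_minus holds iff P holds for some tuple of preimages under h."""
    # Fibre h^{-1}(b) for each abstract argument b.
    fibres = [[a for a in concrete if h(a) == b] for b in abstract_args]
    truth = [bool(P(*args)) for args in product(*fibres)]
    return all(truth), any(truth)


concrete = range(6)        # sample carrier {0, ..., 5}
h = lambda a: a // 2       # sample surjection onto {0, 1, 2}
P = lambda x: x > 2        # sample 2-valued predicate
split = {b: split_predicate(P, h, concrete, (b,)) for b in (0, 1, 2)}
```

On the mixed abstract value 1 (fibre {2, 3}) the validation copy is false while the refutation copy is true, which is the situation where the two abstractions based on the same congruence give different information.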

3.7 Abstractions from abstractions

We might be interested in defining a reduced model starting from an already reduced one. This implies in fact the construction of a quotient algebra starting from a quotient algebra and a congruence. Such a two-step reduction should also be definable by a one-step reduction.


Fig. 3 θ is finer than ρ

Definition 6 Let $\theta$ and $\rho$ be two abstractions on a $(\Sigma, \Sigma_L)$-algebra $A$. The abstraction $\theta$ is finer than the abstraction $\rho$ if $\theta \subseteq \rho$ and

$$p^{A/\theta}([a_1]_\theta, \ldots, [a_n]_\theta) \sqsubseteq p^{A/\rho}([a_1]_\rho, \ldots, [a_n]_\rho),$$

for all $p$ of type $s_1 \ldots s_n$ and $a_1 \in A_{s_1}, \ldots, a_n \in A_{s_n}$.

Figure 3 gives a pictorial view of the property “finer than” between abstractions.

One can easily prove (see also [30]) that the binary relation $\rho/\theta$ given by

$$(\rho/\theta)_s = \{([a]_{\theta_s}, [b]_{\theta_s}) \mid (a, b) \in \rho_s\},$$

for all sorts $s$, is a congruence on $A/\theta$ whenever $\theta$ and $\rho$ are congruences on $A$ such that $\theta \subseteq \rho$.

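Theorem 4 Let $A$ be a $(\Sigma, \Sigma_L)$-algebra, and $\theta$ and $\rho$ abstractions of $A$ of the same type (i.e., both of them ∀∀, or ∀∃, or $\exists^{0,1}\forall$). If $\theta$ is finer than $\rho$ then there exists an abstraction $\delta$ of $A/\theta$, of the same type as $\theta$ and $\rho$, such that

$$(A/\theta)/\delta \cong A/\rho.$$

Proof Let $\theta$ and $\rho$ be abstractions of $A$ of the same type such that $\theta$ is finer than $\rho$. We consider the congruence $\delta$ given by $\rho/\theta$, and extend it to an abstraction of $A/\theta$ of the same type as $\theta$ and $\rho$. For example, if $\theta$ is an $\exists^{0,1}\forall$-abstraction of $A$, then

– $p^{(A/\theta)/\delta}([[a_1]_\theta]_\delta, \ldots, [[a_n]_\theta]_\delta) = 1$, if $p^{A/\theta}([b_1]_\theta, \ldots, [b_n]_\theta) \in \{0, 1\}$ for all $[b_1]_\theta \in [[a_1]_\theta]_\delta, \ldots, [b_n]_\theta \in [[a_n]_\theta]_\delta$ and there exist $[b_i]_\theta \in [[a_i]_\theta]_\delta$, for all $1 \leq i \leq n$, such that $p^{A/\theta}([b_1]_\theta, \ldots, [b_n]_\theta) = 1$;

– $p^{(A/\theta)/\delta}([[a_1]_\theta]_\delta, \ldots, [[a_n]_\theta]_\delta) = 0$, if $p^{A/\theta}([b_1]_\theta, \ldots, [b_n]_\theta) = 0$ for all $[b_1]_\theta \in [[a_1]_\theta]_\delta, \ldots, [b_n]_\theta \in [[a_n]_\theta]_\delta$;

– $p^{(A/\theta)/\delta}([[a_1]_\theta]_\delta, \ldots, [[a_n]_\theta]_\delta) = \bot$, otherwise.

Let $h : (A/\theta)/\delta \to A/\rho$ be given by

$$h([[a]_\theta]_\delta) = [a]_\rho,$$

for all $a \in A$.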


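Following a classical line (for example, [30]) one can easily prove that $h$ is a bijective homomorphism of $\Sigma$-algebras. Therefore, the following property remains to be proved:

$$p^{(A/\theta)/\delta}([[a_1]_\theta]_\delta, \ldots, [[a_n]_\theta]_\delta) \sqsubseteq p^{A/\rho}([a_1]_\rho, \ldots, [a_n]_\rho),$$

for all $p$ of type $s_1 \ldots s_n$ and $a_1 \in A_{s_1}, \ldots, a_n \in A_{s_n}$. But this property can be easily checked for each type of abstraction. For example, in the case of the ∀∀-abstractions, we obtain

$$\begin{aligned}
p^{(A/\theta)/\delta}([[a_1]_\theta]_\delta, \ldots, [[a_n]_\theta]_\delta)
&= \begin{cases} b, & \text{if } p^{A/\theta}([b_1]_\theta, \ldots, [b_n]_\theta) = b \text{ for all } [b_i]_\theta \in [[a_i]_\theta]_\delta \\ \bot, & \text{otherwise} \end{cases} \\
&= \begin{cases} b, & \text{if } p^A(a'_1, \ldots, a'_n) = b \text{ for all } [b_i]_\theta \in [[a_i]_\theta]_\delta \text{ and } a'_i \in [b_i]_\theta \\ \bot, & \text{otherwise} \end{cases} \\
&= \begin{cases} b, & \text{if } p^A(a'_1, \ldots, a'_n) = b \text{ for all } a'_i \in [a_i]_\rho \\ \bot, & \text{otherwise} \end{cases} \\
&= p^{A/\rho}([a_1]_\rho, \ldots, [a_n]_\rho),
\end{aligned}$$

for all $p$ of type $s_1 \ldots s_n$, $a_1 \in A_{s_1}, \ldots, a_n \in A_{s_n}$, and $b \in \{0, 1\}$. The other cases follow a similar line. □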

Theorem 4 is an extension of the second homomorphism theorem for classical universal algebras [30] to algebras defined by logically extended signatures. It shows us that, in order to pass from an abstraction $\theta$ to an abstraction $\rho$, one can abstract $A/\theta$ further by $\rho/\theta$.
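For ∀∀-abstractions of a unary predicate on a finite carrier, the agreement between the two-step and the one-step reduction can be checked directly. The sketch below is only an illustration under our own encoding: partitions are lists of `frozenset`s, ⊥ is `None`, and the names `forall_forall` and `two_step_equals_one_step` are hypothetical.

```python
BOT = None  # hypothetical encoding of ⊥


def forall_forall(values):
    """∀∀ clause: b if all values equal the same b in {0, 1}, else ⊥."""
    values = set(values)
    return values.pop() if len(values) == 1 and BOT not in values else BOT


def two_step_equals_one_step(p, theta, rho):
    """Compare (A/theta)/(rho/theta) with A/rho for a unary predicate p.

    theta and rho are partitions of the same carrier, theta refining rho."""
    # First step: quotient predicate on A/theta.
    p_theta = {blk: forall_forall(p(a) for a in blk) for blk in theta}
    for big in rho:
        # Class of rho/theta: the theta-classes contained in one rho-class.
        delta_class = [blk for blk in theta if blk <= big]
        two_step = forall_forall(p_theta[blk] for blk in delta_class)
        one_step = forall_forall(p(a) for a in big)
        if two_step != one_step:
            return False
    return True


p = lambda x: 1 if x > 2 else 0
theta = [frozenset({0}), frozenset({1}), frozenset({2, 3}), frozenset({4, 5})]
rho = [frozenset({0, 1}), frozenset({2, 3, 4, 5})]
agree = two_step_equals_one_step(p, theta, rho)
```

The check covers only the ∀∀ case worked out in the proof above; the other two abstraction types would need their own clause in place of `forall_forall`.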

4 Abstractions of ADT

In this section we generalize the results in Sect. 3 to abstract data types. An abstract data type (ADT, for short) for a signature $\Sigma$ is a class of $\Sigma$-algebras that is closed under isomorphism.³ An ADT is called monomorphic if its algebras are all isomorphic to each other; otherwise, it is called polymorphic.⁴ A specification may be viewed as a description of a class of objects by means of their properties. In a formal specification these properties are expressed as formulas in a logic. Hence, a specification of an ADT essentially consists of a set of formulas expressing the common properties of its algebras. Specifications are defined by a syntax and a semantics. The syntax fixes the “form,” and the semantics fixes the “meaning” of specifications. A specification is called monomorphic (polymorphic) if it defines a monomorphic (polymorphic) ADT. Specifications can be classified as atomic and composed. An atomic specification is essentially built up from scratch; it consists of a signature $\Sigma$ and a set of formulas in a logic $L$. Its semantics is defined as the class of all $\Sigma$-algebras that are models of this set of formulas. Three basic atomic specifications are loose specification, initial specification, and constructive specification.⁵ A composed specification is a specification written in a specification language. Starting from atomic specifications, the constructs of such a language allow one to build large specifications out of smaller ones.⁶

³ Informally, the closure under isomorphism corresponds to the fact that isomorphic algebras are “similar” in that they differ only by the nature of their carriers.

⁴ A monomorphic ADT stands for a single data type, whereas a polymorphic ADT may correspond to an “incomplete” specification.

4.1 Abstractions of initial specifications

An initial specification is a couple $Sp = (\Sigma, E)$, where $\Sigma$ is a signature and $E$ is a set of $\Sigma$-equations. The semantics of the specification $Sp$ is the monomorphic ADT

$$M(Sp) = \{A \mid A \cong T_{\Sigma,E}\}.$$

An abstraction of a $(\Sigma, \Sigma_L)$-algebra $A$ consists of a congruence $\rho$ and an interpretation of the logical symbols in $A/\rho$. Naturally, abstractions can be applied to abstract data types $M(Sp)$ by means of a representative of them, and $T_{\Sigma,E}$ is a suitable choice. The only problem is that no logical operation is defined on $T_{\Sigma,E}$. Therefore, what we have to do is to enrich specifications by logical symbols and to add logical operations to $T_{\Sigma,E}$ (or, in other words, to transform $T_{\Sigma,E}$ into a $(\Sigma, \Sigma_L)$-algebra).

Definition 7 An initial logically extended specification is a 4-tuple $Sp = (\Sigma, \Sigma_L, E, \Sigma_L^{T_{\Sigma,E}})$, where:

1. $(\Sigma, \Sigma_L)$ is a logically extended signature;

2. $E$ is a set of $\Sigma$-equations;

3. $\Sigma_L^{T_{\Sigma,E}}$ is a set of logical operations on $T_{\Sigma,E}$. It is defined as $\Sigma^{T_{\Sigma,E}}$ but with the difference that $p^{T_{\Sigma,E}}$ is a function from $(T_{\Sigma,E})_w$ into $\{0, 1, \bot\}$, for any $p \in (\Sigma_L)_w$.

Given an initial logically extended specification $Sp$, denote by $T_{\Sigma,\Sigma_L,E}$ the quotient algebra $(T_{\Sigma,E}, \Sigma^{T_{\Sigma,E}}, \Sigma_L^{T_{\Sigma,E}})$.

Definition 8 Let $Sp = (\Sigma, \Sigma_L, E, \Sigma_L^{T_{\Sigma,E}})$ be an initial logically extended specification. The semantics of $Sp$ is given by

$$M(Sp) = \{A \mid A \in Alg_{\Sigma,\Sigma_L} \land A \cong T_{\Sigma,\Sigma_L,E}\}.$$

It is easily seen that $T_{\Sigma,\Sigma_L,E}$ is a model of the set $E$ of equations. Moreover, this algebra is an initial algebra in the class $M(Sp)$.

5 Drawing up atomic specifications is sometimes called specification-in-the-small.

6 Drawing up composed specifications with the help of a specification language is sometimes called specification-in-the-large.


Proposition 2 Let $Sp = (\Sigma, \Sigma_L, E, \Sigma_L^{T_{\Sigma,E}})$ be an initial logically extended specification. Then, $T_{\Sigma,\Sigma_L,E}$ is an initial algebra in the class $M(Sp)$.

Proof Let $Sp' = (\Sigma, E)$. For each algebra $A = (A, \Sigma^A, \Sigma_L^A) \in M(Sp)$ define $A' = (A, \Sigma^A)$. The algebra $T_{\Sigma,\Sigma_L,E}$ contains all the operations of $T_{\Sigma,E}$ plus some logical operations on the carriers of $T_{\Sigma,E}$.

By the definition of $M(Sp)$, for every algebra $A \in M(Sp)$ there exists an isomorphism $h$ from $T_{\Sigma,\Sigma_L,E}$ into $A$. We know that $T_{\Sigma,E}$ is an initial algebra in $M(Sp') = \{B \mid B \in Alg_{\Sigma} \land B \cong T_{\Sigma,E}\}$ [27], so $h$ is also the unique homomorphism from $T_{\Sigma,E}$ into $A'$ (note that $A' \in M(Sp')$). Therefore, $h$ must be the unique homomorphism from $T_{\Sigma,\Sigma_L,E}$ into $A$. Hence, $T_{\Sigma,\Sigma_L,E}$ is initial in $M(Sp)$. □

If two $(\Sigma, \Sigma_L)$-algebras $A$ and $B$ are isomorphic, then a formula $\varphi$ is valid in $A$ iff it is valid in $B$. Let $\rho$ be an abstraction of $A$ and $h$ an isomorphism from $A$ into $B$. The image of $\rho$ by $h$ is a congruence on $B$. It naturally leads to an abstraction of $B$ of the same type as $\rho$. Moreover, $A/\rho$ and $B/h(\rho)$ are isomorphic. These simple facts allow us to make abstractions only on $T_{\Sigma,\Sigma_L,E}$ whenever we deal with specifications $Sp = (\Sigma, \Sigma_L, E, \Sigma_L^{T_{\Sigma,E}})$.

When an abstraction is specified by a set of equations we say that it is an equationally specified abstraction.

4.2 Examples of equationally specified abstractions

In this section we give three examples of equationally specified abstractions. First of all we make some notational conventions. The specifications will begin with the keyword LSpec, which is then followed by the name of the specification, declarations for sorts (sorts), operation symbols (opns), logical operation symbols (lopns), variables (vars), equations (eqns), and logical operations on the quotient term algebra (leqns). All the operation symbols are declared with both their arity, between “:” and “→”, and their sorts following the symbol “→”; for logical symbols we specify only the arity.

The elements (equivalence classes) of the quotient term algebra are denoted by $[\cdot]_Q$; the distinction between logical operation symbols and logical operations is made by marking all logical operations by the subscript “Q”.

An abstraction of a specification will begin with the keyword Abs of, which is then followed by the name of the specification, variables (vars), equations defining the abstraction (abs), and finally by the type (type) of the abstraction.

In all the examples below, Theorem 1 will be used extensively.

Abstracting natural numbers greater than 0. In Fig. 4, an initial logically extended specification of an elementary data type of natural numbers is given, where the predicate Isgrz is as in Example 1.

Let Nat be the term algebra defined by the specification in Fig. 4, and $Nat_Q$ its quotient by $=_E$, where $E$ is the set of equations. Let us assume now that we want to prove that the property

$$\varphi = (\forall x, y)(\mathit{Isgrz}(x) \lor \mathit{Isgrz}(y) \Rightarrow \mathit{Isgrz}(\mathit{Add}(x, y)))$$


Fig. 4 Specification of an elementary data type of natural numbers

Fig. 5 An abstraction of Nat

Fig. 6 Program Keeping-up

holds in $Nat_Q$. The abstraction in Fig. 5 treats the number 0 as an individual (that is, as $[[\mathit{Zero}]_Q]$, where the second pair of brackets stands for the equivalence class induced by the abstraction), and all the natural numbers greater than 0 as a whole (that is, as the equivalence class $[[\mathit{Succ}(\mathit{Zero})]_Q]$).

Therefore, the quotient algebra $Nat_Q/\rho$, where $\rho$ is the congruence induced by the abstraction in Fig. 5, has only two elements. It is straightforward to prove that $\varphi$ holds in this algebra. Since the abstraction is of type ∀∀, we deduce that $\varphi$ holds in $Nat_Q$.
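The two-element quotient can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: the equations of Fig. 4 and Fig. 5 are not reproduced, so the concrete carrier is taken to be the ordinary natural numbers, and the names `alpha`, `add_abs`, `isgrz_abs`, `phi_abs` and `commutes_with_add` are hypothetical. It encodes the two classes described above, the operation Add induced on them, and an exhaustive evaluation of the property over the abstract carrier.

```python
from itertools import product

# The two abstract elements: the class of Zero and the class of all n > 0.
ABSTRACT = ("zero", "pos")

def alpha(n: int) -> str:
    """Abstraction map: 0 stays on its own, every n > 0 collapses to 'pos'."""
    return "zero" if n == 0 else "pos"

def add_abs(a: str, b: str) -> str:
    """Add on the quotient: a sum is zero only when both arguments are zero."""
    return "zero" if a == b == "zero" else "pos"

def isgrz_abs(a: str) -> bool:
    """Abstract Isgrz: every member of 'pos' is > 0, the member of 'zero' is not."""
    return a == "pos"

def phi_abs() -> bool:
    """Evaluate Isgrz(x) or Isgrz(y) => Isgrz(Add(x, y)) on all abstract pairs."""
    return all(
        (not (isgrz_abs(a) or isgrz_abs(b))) or isgrz_abs(add_abs(a, b))
        for a, b in product(ABSTRACT, repeat=2)
    )

def commutes_with_add(bound: int = 30) -> bool:
    """Spot-check on a finite sample that alpha is compatible with addition."""
    return all(
        alpha(m + n) == add_abs(alpha(m), alpha(n))
        for m, n in product(range(bound), repeat=2)
    )
```

Only four abstract pairs have to be inspected, whereas the concrete statement quantifies over infinitely many pairs of natural numbers. The compatibility check is a finite sanity test, not a proof that the relation is a congruence.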

The program Keeping-up. Consider the program Keeping-up [28] given in Fig. 6. It consists of two processes P1 and P2. The process P1 repeatedly increments x, provided that x does not exceed y + 1. Similarly, the process P2 repeatedly increments y, provided that y does not exceed x + 1. The program satisfies the global safety property $\Box(|x - y| \le 1)$ [28].

In Fig. 7, an initial logically extended specification of this program is given. The sort vect(2) specifies a domain of 2-dimensional vectors of natural numbers. The elements of such a domain can be defined by a function symbol Cons : nat nat → vect(2). However, for the sake of simplicity, we prefer not to specify any such function symbol and to use the notation (x, y) instead of Cons(x, y).

Conv stands for “conversion” of boolean data to natural numbers, Leq stands for “less than or equal to”, and Add for “addition”. Trans gives the transition relation associated with this program. If (x, y) is obtained from (0, 0) by applying the


Fig. 7 Specification of the program in Fig. 6

Fig. 8 An abstraction of Keeping-up

transition relation finitely many times (in an obvious way), then we say that (x, y) is reachable.

The predicate GlobalSafety is used to model the global safety property we want to prove. This property can be described by the formula

$$\varphi = (\forall (x, y)\ \text{reachable})(x = y \lor x = \mathit{Succ}(y) \lor y = \mathit{Succ}(x))$$

or, equivalently,

$$\varphi = (\forall (x, y)\ \text{reachable})(\mathit{GlobalSafety}((x, y))).$$

By means of the equivalence

$$(\forall x, y)(|\mathit{Succ}(x) - \mathit{Succ}(y)| \le 1 \Leftrightarrow |x - y| \le 1)$$

we can derive the abstraction in Fig. 8. The abstraction leads to three equivalence classes on the set of all reachable vectors from the initial vector $[[(\mathit{Zero}, \mathit{Zero})]_Q]$, namely

$$[[(\mathit{Zero}, \mathit{Zero})]_Q],\quad [[(\mathit{Succ}(\mathit{Zero}), \mathit{Zero})]_Q],\quad [[(\mathit{Zero}, \mathit{Succ}(\mathit{Zero}))]_Q].$$


Fig. 9 Bakery algorithm

Now, it is straightforward to check that $\varphi$ holds in the abstract system. As the abstraction is of type ∀∀, $\varphi$ also holds in the original system.
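The three abstract classes can be reproduced by a short exploration. The Python sketch below is an illustration under assumptions, since Figs. 6 to 8 are not reproduced here: the guards are taken to be x < y + 1 for P1 and y < x + 1 for P2, and the abstraction is taken to identify (Succ(x), Succ(y)) with (x, y), which the equivalence above suggests. The names `normalise`, `successors`, `global_safety` and `abstract_reachable` are hypothetical.

```python
from collections import deque

def normalise(state):
    """Canonical representative of the class of (x, y): cancel common successors."""
    x, y = state
    m = min(x, y)
    return (x - m, y - m)

def successors(state):
    """One step of either process (guards are an assumption, see the lead-in)."""
    x, y = state
    if x < y + 1:          # P1 may increment x
        yield (x + 1, y)
    if y < x + 1:          # P2 may increment y
        yield (x, y + 1)

def global_safety(state):
    """The safety predicate |x - y| <= 1."""
    x, y = state
    return abs(x - y) <= 1

def abstract_reachable(initial=(0, 0)):
    """Breadth-first search over class representatives.

    Stepping from a representative is sound here because both guards depend
    only on the difference x - y, which normalisation preserves.
    """
    start = normalise(initial)
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in successors(current):
            rep = normalise(nxt)
            if rep not in seen:
                seen.add(rep)
                queue.append(rep)
    return seen
```

Under these assumptions the search terminates after visiting the representatives (0, 0), (1, 0) and (0, 1), and `global_safety` can be evaluated on each of them, even though the concrete state space is infinite.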

The bakery algorithm. We consider now the bakery algorithm as proposed by Lamport in [25] as a solution to the mutual exclusion problem. The problem is as follows. Consider n asynchronous processors communicating with each other via shared memory. Each processor runs a cyclic program consisting of two parts: a critical section and a non-critical section. The problem is to write a program so that the following conditions are satisfied:

– at any time, at most one processor may be in its critical section;
– each processor must eventually be able to enter its critical section (unless it halts);
– any processor may halt in its non-critical section.

No assumption can be made about the running speed of the processors.

Lamport’s solution is based on the bakery algorithm in which a customer receives a number upon entering the store. The holder of the lowest number is the next one served. If two customers receive the same number, then the one with the lowest name goes first.

In Fig. 9, a solution to this problem is described (only two processors are considered).

We mention that P1 and P2 may choose the same number, but this may happen only when x = y = 0. Otherwise, they will have different numbers.

In Fig. 10, an initial logically extended specification of this program is given (in order to simplify the notation we will write 0 instead of Zero, 1 instead of Succ(Zero), and 2 instead of Succ(Succ(Zero))). vect(5) denotes a sort for 5-dimensional vectors of natural numbers representing global states of the system (as with vect(2), no constructor will be provided for defining terms of this sort). The meaning of a vector (x, x′, y, y′, z) is the following:

– x and y are the numbers chosen by P1 and P2, respectively;
– x′ gives the local state of P1. x′ is 0 if P1 has not yet chosen any number, it is 1 if P1 has a number but is not in its critical section, and it is 2 if P1 is in its critical section. y′ has a similar meaning;
– z is a flag whose value is 0 when x = y, 1 when x > y, and 2 when x < y.

The initial state is (0, 0, 0, 0, 0). The transition relation is specified by three operation symbols, Trans for the initial case, Trans1 for P1, and Trans2 for P2. A state is reachable from the initial state if there exists a sequence of steps leading to the state, each step being performed by Trans or Trans1 or Trans2. The predicate


Fig. 10 Specification of the bakery algorithm

Fig. 11 An abstraction of the specification in Fig. 10

CriticalSection is true iff there exists a reachable state whose second and fourth coordinates are 2.

The system satisfies the following safety property

$$\varphi = (\forall (x, x', y, y', z)\ \text{reachable})(\neg \mathit{CriticalSection}(x, x', y, y', z)).$$

This property can be proved in an abstract system and then translated to the original system. An abstract system can be obtained in many ways. One of them is to use an equivalence relation driven by the values of x′ and y′, and another one is to use an equivalence relation driven by the variable z. The second case leads to three equivalence classes, in contrast with the first one, which leads to 9 equivalence classes. The abstraction corresponding to the second case is given in Fig. 11. It is easily seen that this abstraction leads to three equivalence classes on the set of all reachable vectors, $[[(1, 1, 0, 0, 1)]_Q]$, $[[(0, 0, 1, 1, 2)]_Q]$, and

$$[[(0, 0, 0, 0, 0)]_Q] = \{[(0, 0, 0, 0, 0)]_Q, [(1, 1, 1, 1, 0)]_Q, [(1, 1, 2, 1, 2)]_Q\},$$

and none of them contains a vector of the form $[(x, 2, y, 2, z)]_Q$.

Therefore, $\varphi$ holds in the abstract system and, as a conclusion, it holds in the original system.
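The idea of replacing unbounded ticket numbers by a comparison flag can be tried out on a small model. The Python sketch below is not the specification of Fig. 10 or the abstraction of Fig. 11, which are not reproduced here. It assumes the usual simplified two-process bakery (atomic choice x := y + 1, P1 enters when y = 0 or x ≤ y, P2 enters when x = 0 or y < x), and it uses a finer abstraction that keeps x′, y′ and z, chosen only because its transitions can be written down without the figures. All function names are hypothetical.

```python
from collections import deque

def concrete_successors(state):
    """Concrete steps on (x, pc1, y, pc2); pc: 0 idle, 1 waiting, 2 critical."""
    x, pc1, y, pc2 = state
    if pc1 == 0:
        yield (y + 1, 1, y, pc2)            # P1 takes a ticket
    elif pc1 == 1:
        if y == 0 or x <= y:
            yield (x, 2, y, pc2)            # P1 enters its critical section
    else:
        yield (0, 0, y, pc2)                # P1 leaves and drops its ticket
    if pc2 == 0:
        yield (x, pc1, x + 1, 1)            # P2 takes a ticket
    elif pc2 == 1:
        if x == 0 or y < x:
            yield (x, pc1, y, 2)            # P2 enters its critical section
    else:
        yield (x, pc1, 0, 0)                # P2 leaves and drops its ticket

def abstract_successors(state):
    """Abstract steps on (pc1, pc2, z); z: 0 if x == y, 1 if x > y, 2 if x < y.

    Relies on the invariant that a ticket is 0 exactly when its owner is idle.
    """
    pc1, pc2, z = state
    if pc1 == 0:
        yield (1, pc2, 1)                   # x := y + 1 gives x > y
    elif pc1 == 1:
        if pc2 == 0 or z in (0, 2):         # y == 0 or x <= y
            yield (2, pc2, z)
    else:
        yield (0, pc2, 0 if pc2 == 0 else 2)
    if pc2 == 0:
        yield (pc1, 1, 2)                   # y := x + 1 gives x < y
    elif pc2 == 1:
        if pc1 == 0 or z == 1:              # x == 0 or y < x
            yield (pc1, 2, z)
    else:
        yield (pc1, 0, 0 if pc1 == 0 else 1)

def alpha(state):
    """Abstraction map: forget the ticket values, keep their comparison."""
    x, pc1, y, pc2 = state
    z = 0 if x == y else (1 if x > y else 2)
    return (pc1, pc2, z)

def reachable(initial, successors, keep=lambda s: True):
    """Breadth-first search; `keep` can bound an infinite state space."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        current = queue.popleft()
        for nxt in successors(current):
            if keep(nxt) and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def mutual_exclusion(abstract_states):
    """No abstract state has both processes in their critical sections."""
    return all(not (pc1 == 2 and pc2 == 2) for pc1, pc2, _ in abstract_states)
```

The abstract system has at most 27 states, so `reachable((0, 0, 0), abstract_successors)` terminates, while the concrete exploration has to be cut off by a ticket bound passed through `keep`. Comparing `alpha` of the bounded concrete states with the abstract reachable set gives a finite sanity check of the abstraction, not a proof of its correctness.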


5 Conclusions

Abstraction techniques, often based on abstract interpretation [8], provide a method for symbolically executing systems using the abstract instead of the concrete domain. Familiar data-flow analysis algorithms are examples of abstract interpretation. In particular, abstract interpretation can be used for building the abstract state-space of the system. The abstraction is provided by the user and need not be dependent on the choice of properties to verify.

Many abstraction techniques have been proposed in the literature and many of them deal with data type reduction, that is, the reduction of a large (possibly unbounded) data type to a small one. Prominent examples include shape analysis [32, 38] and predicate abstraction [17]. In order to be useful from a practical point of view, a data type reduction (abstraction) should be conservative or property-preserving w.r.t. a specific set of properties.

In this paper we have tried to capture the essence of data type reduction. In order to do that we have modeled data types as universal algebras, and we have enriched signatures with logical symbols used to build formulas. Answers to questions about properties of data structures are obtained by evaluating such formulas. Then, an abstraction of a data type is just a quotient algebra together with a consistent definition of the logical operations of this quotient algebra. We have distinguished between several types of abstractions and we have obtained property preservation results for each case. As a conclusion, we were able to provide a general formalism for data type abstraction which can be considered as an umbrella for many abstraction techniques known from the literature, such as predicate abstraction, shape analysis, the technique of duplicating predicate symbols, McMillan’s approach [29], etc.

In the last section we have taken into consideration equationally specified data types. Here, the abstraction is specified by a set of equations and it is applied to the initial algebra of such specifications. Substantial examples, such as the Keeping-up program [28] and the bakery algorithm [25], have been discussed.

We believe that our formalism is quite elegant in the context of algebraic specification of data types. The abstractions are introduced in a very natural way, and the preservation results are discussed in detail. Our formalism is limited only by the logic we have considered: a first order logic under a 3-valued interpretation. Extensions of this formalism to temporal logics are necessary, and this is one of our future research topics.

Acknowledgements The authors want to thank Edmund Clarke for useful discussions regarding this paper while the first author was giving a talk at Carnegie-Mellon University on the same subject (http://www-2.cs.cmu.edu/~svc). We also want to thank two anonymous referees for their comments, which led to the improvement of the paper presentation.

References

1. Assmann, U., Weinhardt, M.: Interprocedural Heap Analysis for Parallelizing Imperative Programs. In: Giloi, W.K., Jahnichen, S., Shriver, B.D. (eds.) Programming Models for Massively Parallel Computers. IEEE Press, pp. 74–82 (1993)

2. Ball, Th., Podelski, A., Rajamani, S.K.: Boolean and Cartesian Abstraction for Model Checking C Programs, Technical Report MSR-TR-2000-115, Microsoft Research (2000)


3. Bidoit, M., Boisseau, A.: Algebraic Abstractions. In: 15th Workshop on Algebraic Development Techniques WADT’01, Lecture Notes in Computer Science 2267, 21–47 (2001)

4. Burch, J., Clarke, E., McMillan, K., Dill, D.: Symbolic Model Checking: 10^20 States and Beyond. In: Proceedings of the 5th Symposium on Logic in Computer Science (1990)

5. Chase, D., Wegman, M., Zadeck, F.: Analysis of Pointers and Structures. In: SIGPLAN Conference on Programming Languages, Design and Implementation, pp. 296–310 (1990)

6. Clarke, E.M., Grumberg, O., Peled, D.A.: Model Checking, MIT Press (2000)

7. Clarke, E.M., Grumberg, O., Long, D.E.: Model Checking and Abstraction, ACM Transactions on Programming Languages and Systems, pp. 1512–1542 (1994)

8. Cousot, P., Cousot, R.: Abstract Interpretation: a Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints. In: 4th ACM Symposium on Principles of Programming Languages, pp. 238–252 (1977)

9. Dams, D.: Abstract Interpretation and Partial Refinement for Model Checking, Ph.D. Thesis, Technische Universiteit Eindhoven (1996)

10. Dams, D., Gerth, R., Grumberg, O.: Abstract Interpretation of Reactive Systems, ACM Transactions on Programming Languages and Systems 19(2) (1997)

11. Das, S., Dill, D.L., Park, S.: Experience with Predicate Abstraction. In: Proceedings of the 11th International Conference on Computer Aided Verification CAV’99, Lecture Notes in Computer Science 1633, 160–171 (1999)

12. Dill, D.L., Drexler, A.J., Hu, A.J., Yang, C.H.: Protocol Verification as a Hardware Design Aid. In: Proceedings of the IEEE International Conference on Computer Design: VLSI in Computers and Processors, pp. 522–525 (1992)

13. Ehrig, H., Mahr, B.: Fundamentals of Algebraic Specification 1: Equations and Initial Semantics, Springer-Verlag (1985)

14. Ehrig, H., Mahr, B.: Fundamentals of Algebraic Specification 2: Module Specifications and Constraints, Springer-Verlag (1990)

15. Ehrig, H., Kreowski, H.-J.: Refinement and Implementation. In: Astesiano, E. et al. (eds.) Algebraic Foundations of Systems Specification, IFIP State-of-the-Art Report. Springer, pp. 201–242 (1999)

16. Ginsberg, M.: Multivalued Logics. A Uniform Approach to Inference in Artificial Intelligence, Computational Intelligence 4, 265–316 (1988)

17. Graf, S., Saidi, H.: Construction of Abstract State Graphs with PVS. In: Proceedings of the 9th International Conference on Computer Aided Verification, Lecture Notes in Computer Science 1254, 72–83 (1997)

18. The HOL System, Computer Laboratory, University of Cambridge, http://www.cl.cam.ac.uk/Research/HVG/HOL

19. Holzmann, G.J.: A Practical Method for Verifying Event-driven Software. In: Proceedings of the 21st International Conference on Software Engineering ICSE’99, pp. 597–607 (1999)

20. Holzmann, G.J.: The SPIN Model Checker. Primer and Reference Manual, Addison-Wesley (2003)

21. Horwitz, S., Pfeiffer, P., Reps, T.: Dependence Analysis for Pointer Variables. In: SIGPLAN Conference on Programming Languages, Design and Implementation, pp. 28–40 (1989)

22. Jones, N.D., Muchnick, S.: Flow Analysis and Optimization of Lisp-like Structures. In: Muchnick, S., Jones, N.D. (eds.) Program Flow Analysis: Theory and Applications, Prentice-Hall, pp. 102–131 (1981)

23. Jones, N.D., Muchnick, S.: A Flexible Approach to Interprocedural Data Flow Analysis and Programs with Recursive Data Structures. In: Symposium on Principles of Programming Languages, pp. 66–74 (1982)

24. Kurshan, R.P.: Computer-Aided Verification of Coordinating Processes: The Automata-Theoretic Approach. Princeton University Press (1994)

25. Lamport, L.: A New Solution of Dijkstra’s Concurrent Programming Problem, Communications of the ACM 17(8), 453–455 (1974)

26. Larus, J., Hilfinger, P.: Detecting Conflicts Between Structure Accesses. In: SIGPLAN Conference on Programming Languages, Design and Implementation, pp. 21–34 (1988)

27. Loeckx, J., Ehrich, H.-D., Wolf, M.: Algebraic Specification of Abstract Data Types. In: Abramsky, S., Gabbay, D.M., Maibaum, T.S.E. (eds.) Handbook of Logic in Computer Science, vol. 5, Clarendon Press, pp. 217–316 (2000)


28. Manna, Z., Pnueli, A.: The Temporal Logic of Reactive and Concurrent Systems: Specification, Springer-Verlag (1992)

29. McMillan, K.: Verification of Infinite State Systems by Compositional Model Checking, Research Report, Cadence Berkeley Labs (1999)

30. Meinke, K., Tucker, J.V.: Universal Algebra. In: Abramsky, S., Gabbay, D., Maibaum, T.S.E. (eds.) Handbook of Logic in Computer Science, vol. 1, Oxford University Press, Oxford, pp. 189–411 (1993)

31. Mitchell, J.: Foundations of Programming Languages, The MIT Press (1996)

32. Nielson, F., Nielson, H.R., Hankin, Ch.: Principles of Program Analysis, Springer-Verlag (1999)

33. Peled, D.A.: Software Reliability Methods, Springer-Verlag (2001)

34. Plevyak, J., Chien, A., Karamcheti, V.: Analysis of Dynamic Structures for Efficient Parallel Execution. In: Banerjee, U., Gelernter, D., Nicolau, A., Padua, D. (eds.) Languages and Compilers for Parallel Computing, Lecture Notes in Computer Science 768, Springer-Verlag, pp. 37–57 (1993)

35. The PVS Specification and Verification System, Computer Science Laboratory, SRI International, http://pvs.csl.sri.com

36. Saidi, H.: Model Checking Guided Abstraction and Analysis. In: Proceedings of the 7th International Static Analysis Symposium (2000)

37. Sagiv, M., Reps, Th., Wilhelm, R.: Solving Shape-Analysis Problems in Languages with Destructive Updating, ACM Transactions on Programming Languages and Systems 20(1), 1–50 (1998)

38. Sagiv, M., Reps, Th., Wilhelm, R.: Parametric Shape Analysis via 3-Valued Logic, ACM Transactions on Programming Languages and Systems 24(3), 217–298 (2002)

39. STeP: The Stanford Temporal Prover, http://www-step.stanford.edu

40. Stransky, J.: A Lattice for Abstract Interpretation of Dynamic (Lisp-like) Structures, Information and Computation 101(1), 70–102 (1992)

41. Wang, E.Y.-B.: Analysis of Recursive Types in an Imperative Language, Ph.D. Thesis, University of California, Berkeley (1994)

42. Visser, W., Park, S., Penix, J.: Using Predicate Abstraction to Reduce Object-oriented Programs for Model Checking. In: Proceedings of the 3rd ACM Workshop on Formal Methods in Software Practice, Portland (Oregon), pp. 3–12 (2000)

43. Visser, W., Park, S., Penix, J., Oh, P.: Abstracting Object-Oriented Programs for Model Checking, unpublished manuscript (2001)