A metamodeling language supporting subset and union properties


Marcus Alanen and Ivan Porres

TUCS Turku Centre for Computer Science
Department of Computer Science, Åbo Akademi University
Lemminkäisenkatu 14, FIN-20520 Turku, Finland
e-mail: {marcus.alanen,ivan.porres}@abo.fi

Abstract. The Meta Object Facility (MOF) 2.0 and the Unified Modeling Language Infrastructure introduce new language features such as subsets, (derived) unions and redefinitions, but without a precise semantics. This is a problem since they are used throughout the definition of the Unified Modeling Language (UML) 2.0. We give our understanding of these new language features by formalizing the structural constraints imposed by subsets and unions on metamodels and models using Liskov substitutability as the main criterion. We expect that this article provides a better understanding of the foundations of the MOF 2.0, which is necessary to define model transformation languages and tools.

1 Introduction

The OMG modeling standards are based on the concept of metamodeling. A metamodel is an artifact that defines the abstract syntax of a modeling language. A metamodeling language is then a computer language used to describe metamodels.

The constructs most often found in metamodeling languages such as the Meta Object Facility 1.4 (MOF) [14], the Eclipse Modeling Framework EMF [10] and even the Graph eXchange Language Metaschema [20] are strikingly similar since they are all based to some extent on the object-oriented software paradigm. In these approaches, a metamodel is defined as a collection of classes and properties while a model is an instance of such classes and properties. However, the recent MOF 2.0 [15] and UML 2.0 Infrastructure [16] metamodeling languages introduce several new concepts, mainly: subset properties, derived union properties and property redefinitions. These concepts are useful to define a new modeling language as an extension of an existing one. Unfortunately, very little is said in [15, 16] about the actual meaning of these new constructs. This is a critical omission since these concepts are heavily used in the definition of the Unified Modeling Language 2.0 [17].

We consider that a precise definition of a metamodel is necessary in order to construct tools to edit, query and transform models. In this article, we present a set-theoretic nonmetacircular formalization of a metamodeling language that supports what we consider to be the core structural features of MOF 2.0 and the UML 2.0 Infrastructure, including the new subset properties. Although this article presents a theoretical framework, we believe it represents an important contribution that can influence the implementation of model repositories and transformation tools for the UML 2.0 language.

2 MOF 2.0 and UML 2.0 Infrastructure as Metamodeling Languages

The purpose of MOF 2.0 and the UML 2.0 Infrastructure is to define new modeling languages that can be used in software and system development. In this section, we present the main concepts in these core languages informally, focusing on new features with respect to MOF 1.x. We also discuss a common rationale behind these features.

2.1 Classes and Properties

The main concepts used to describe the abstract syntax of a modeling language are classes and properties (also known as association ends). A class represents a concept in a modeling language such as a UML Use Case or a Transition in a Statechart, while a property represents a feature of such a concept such as the fact that a Use Case has a name or a Transition has an event trigger.

These classes and properties can be instantiated into a finite set of elements and slots that form a model. Each element, consisting of slots, has a class as its type and each slot has a property as its type. A slot keeps a collection of other model elements as its contents, and the type and number of these elements are constrained by its property. A property can optionally denote a composition of elements. An element can only be placed in a single composition slot at a time, called its owner. Finally, a slot can be ordered. In this case, the contents of a slot are represented by an ordered set. An element cannot occur twice in a slot.

The structure of a metamodel is often described visually using UML class diagrams, while models are represented using their own particular concrete syntax. However, we can always represent a model as an object diagram. As an example, the left part of Figure 1 shows a metamodel for a graph. This diagram shows two classes: Vertex and Edge, and four properties: from, to, outgoing and incoming. The properties from and to belong to Edge and have a multiplicity constraint of 1, i.e., each element of type Edge should have exactly one from Vertex and one to Vertex. The properties outgoing and incoming have a multiplicity of [0..∞], i.e., a Vertex can have any number of incoming and outgoing Edges.

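[Figure: (left) class diagram with the classes Vertex and Edge and the properties from, to, outgoing and incoming; (right) object diagram with the elements V1:Vertex, V2:Vertex, V3:Vertex, E1:Edge and E2:Edge, connected by arcs labeled from, to, outgoing and incoming]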

Fig. 1. (Left) Metamodel for a Graph; (Right) Example Model

Each property has another property as its opposite. Together they define an association that is represented as a single line. In the example, we have the from-outgoing and the to-incoming associations. At the model layer, this bidirectionality means that when a Vertex v has an Edge e in its outgoing slot, the Edge e will have Vertex v in its from slot. For reasons of conceptual simplicity we ignore attributes (which would not have an opposite property) as found in, for example, MOF and ECORE.

We can depict models as object diagrams. In this case, each element is depicted as a rectangle, while the contents of each slot are represented as directed arcs between nodes, labeled by the name of the slot. That is, we do not represent the slots in an object diagram but their contents. An example object diagram based on our metamodel for graphs is shown in the right part of Fig. 1.
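The bidirectional behavior of slots described in this section can be illustrated with a small Python sketch. It is only an illustration under our own assumptions: the `Element` class, the `OPPOSITE` table and the `link` helper are hypothetical names that do not appear in MOF or in this article, and the tiny model built at the end is not the model of Fig. 1.

```python
from collections import defaultdict

# Hypothetical encoding of the graph metamodel of Fig. 1:
# each property name maps to the name of its opposite property.
OPPOSITE = {"from": "outgoing", "outgoing": "from",
            "to": "incoming", "incoming": "to"}


class Element:
    """A model element; slots maps a property name to its contents."""

    def __init__(self, name, cls):
        self.name, self.cls = name, cls
        # dict keys act as an insertion-ordered set: no element occurs twice
        self.slots = defaultdict(dict)

    def __repr__(self):
        return f"{self.name}:{self.cls}"


def link(source, prop, target):
    """Put target into source's slot and source into the opposite slot of target."""
    source.slots[prop][target] = None
    target.slots[OPPOSITE[prop]][source] = None


v1, v2 = Element("V1", "Vertex"), Element("V2", "Vertex")
e1 = Element("E1", "Edge")
link(e1, "from", v1)  # also places E1 in the outgoing slot of V1
link(e1, "to", v2)    # also places E1 in the incoming slot of V2
link(e1, "to", v2)    # repeating an insertion changes nothing
```

Updating both slots inside a single helper is one simple way to keep a model bidirectional at all times; a repository could equally well validate bidirectionality after the fact.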

2.2 Language Extension

We have seen that classes and properties are the building blocks to define new modeling languages. However, we are often not interested in creating new modeling languages, but in extending an existing one. Let us assume that we wish to create a metamodel for bipartite graphs based on our metamodel for general graphs. This is an example, adapted from [19], of the definition of a new language based on an existing language. An initial metamodel for bipartite graphs is shown in Figure 2. In this new language, there are two types of vertices, blue and red, and two types of edges, red-to-blue and blue-to-red.

[Figure: class diagram with two vertex classes (blue and red) and two edge classes (red-to-blue and blue-to-red), connected by associations]

Fig. 2. Metamodel for a Bipartite Graph

The metamodel for a bipartite graph as it is shown in Fig. 2 cannot represent red-to-red or blue-to-blue edges, as we intended. However, this metamodel has no relation with the metamodel for general graphs shown in Fig. 1. This means that programs and model transformations that traverse and extract information on general graphs based on the initial metamodel will not work on bipartite graphs.

We consider that an important requirement for creating new extensions to an existing modeling language is Liskov Substitutability [12]. Generally, this means that programs, queries and transformations designed for a modeling language should work on models based on extensions of the original modeling language. As a consequence, a language extension should not be able to arbitrarily redefine or remove classes or properties from a language.

MOF 2.0 proposes mainly three extension mechanisms for metamodels: class specialization, property subsets and unions, and property redefinitions. Class specialization is similar to class inheritance in object-oriented languages. A specialized class inherits all the properties of its base classes, and it can define new properties. Subset and union properties are a mechanism to define the relationship between the properties in a specialized class and its base classes. Finally, property redefinition allows us to arbitrarily replace a property with another one.

We should note that MOF 2.0 and the UML 2.0 Infrastructure contain other concepts to define language extensions such as packages, package merges and package imports. However, they do not influence the relationship between model elements. Therefore, we study only the semantics of class specialization and subset properties since we consider that these are the main novelties in MOF 2.0 and the core mechanism for language extension.

In the rest of this section we discuss the new language extension mechanisms in more detail.

Class Specialization and Property Subsetting. Class specialization is represented diagrammatically as an edge between the base class and the specialized class with a triangular arrow head pointing to the base class. Property subsets can be represented by adding a label “{ subsets X }” next to a property, where X is the property being subsetted, or by connecting two associations with a specialization edge labeled “{ subsets }”. We can see an example of these two equivalent notations in Figure 3.

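[Figure: two class diagrams over the classes A, B, C and D with the properties a, b, c and d; on the left, subsetting is written as the labels “{ subsets a }” and “{ subsets b }” next to the properties; on the right, the two associations are connected by a specialization edge labeled “{ subsets }”]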

Fig. 3. Two Alternative Notations for Subsetting

The intuition behind the metamodel in Fig. 3 is as follows: An element of type C has two slots that correspond to properties b and d. Elements of type B can be inserted into slot b and elements of type D can be inserted into slots b and d. At any moment, the contents of the slot d should be a subset of the contents of the slot b; we say that slot d is a subset of slot b.

We can use specialization and subset properties to create a new metamodel for a bipartite graph for our running example. The classes Blue Vertex and Red Vertex will now be specializations of Vertex. The fromRed, toRed, fromBlue and toBlue properties will become subsets of the from and to properties, and similarly for incomingBR, incomingRB, outgoingBR and outgoingRB. The resulting metamodel is shown in Figure 4 while a model based on this metamodel is shown in Figure 5. The benefit is that, e.g., graph traversal algorithms which worked on the initial metamodel in Fig. 1 should still work for bipartite graphs when using the metamodel in Fig. 4. Although similar functionality could be accomplished with virtual methods in Edge and Vertex, overridden in the subclasses, subsets enable us to describe the same functionality in terms of data modeling instead of operational terms. Besides this simple example, the reader can find numerous and rather complex examples of subset properties in the UML 2.0 Superstructure [17].

[Figure: class diagram in which the vertex and edge classes of the bipartite graph specialize Vertex and Edge, and the properties of the bipartite graph are marked as subsets of from, to, outgoing and incoming]

Fig. 4. Metamodel for a Bipartite Graph as an Extension of the Metamodel for a General Graph

[Figure: object diagram with the elements V1:RedVertex, V2:BlueVertex, V3:BlueVertex, E1:RedBlueEdge and E2:RedBlueEdge; each arc carries both a general label and its subset label: outgoing/outgoingRB, from/fromRed, to/toBlue and incoming/incomingRB]

Fig. 5. Example Model for the Metamodel in Figure 4

Union Properties and Derived Unions. The last extension mechanism presented in the MOF 2.0 that we will discuss in this paper is union properties. In our approach, if a property has properties that subset it, it is a union property. It is not necessary to declare a property as a union, since a designer of a metamodel cannot know in advance if a new subset property will be defined in the future.

The UML 2.0 Infrastructure also introduced the concept of derived union. The standard states on page 126 that “This means that the collection of values denoted by the property in some context is derived by being the strict union of all of the values denoted, in the same context, by properties defined to subset it. If the property has a multiplicity upper bound of 1, then this means that the values of all the subsets must be null or the same.” In other words, a derived union property can be seen as the strict read-only union of its subsets. A slot with a property that is a derived union cannot contain elements that do not appear in any of its subsets.
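Read operationally, the contents of a derived union slot can be computed on demand from its subset slots. The sketch below assumes that the direct subsets are ordinary (non-derived) slots of the same element; the property names u, s1 and s2 and both helper functions are hypothetical.

```python
def derived_union(slots, direct_subsets, prop):
    """Contents of a derived union slot: the union of its direct subset slots."""
    contents = []
    for sub in direct_subsets[prop]:
        for element in slots[sub]:
            if element not in contents:  # a slot never holds an element twice
                contents.append(element)
    return contents


def respects_derived_union(stored, slots, direct_subsets, prop):
    """True if stored contains no element that is missing from every subset slot."""
    return set(stored) <= set(derived_union(slots, direct_subsets, prop))


direct_subsets = {"u": ["s1", "s2"]}          # s1 and s2 both subset u
slots = {"s1": ["x", "y"], "s2": ["y", "z"]}  # slots of a single element
union = derived_union(slots, direct_subsets, "u")
```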

Another way to define derived content is to create an arbitrary query operation. This has been done in the Eclipse Modeling Framework using so-called volatile attributes as explained in [8]. This way, the contents of a slot are defined by evaluating the associated query. The drawback is that there is no strict mathematical relationship between the derived property and any other properties. The benefit is that it does not restrict the metamodel creator in any way.

2.3 Language Extension Issues

There are some issues with language extension that immediately need to be taken into account.

Multiple Generalizations. We should note that a metamodeling language should support multiple inheritance since it is used extensively in MOF, as has been noticed by, e.g., Anneke Kleppe [11]. Multiple inheritance forms very complicated inheritance hierarchies, among them the diamond inheritance structure. This leads to a possibility where property subsetting also has a diamond (or even more complicated) structure. In Figure 6 an element of type D has four slots: a′, b′, c′ and d′, where d′ is a subset of b′ and c′, while b′ and c′ are subsets of a′. Although this is easily taken into account for defining the structural aspects of models and metamodels, it is problematic to define the semantics of operations on, e.g., slots, especially when properties are ordered. This has been investigated more thoroughly in the technical report [2].

Subset and Union Properties in the Same Class. Subset properties can be useful even when they are not used in combination with class specialization. That is, we can define a property and its subsets in the same class.

Property Subsetting Follows Class Specialization. Considering Figure 3 further, we investigate two fundamental cases in Figure 7. The first one, depicted on the left side of the figure, questions the validity of the scenario where class C does not subclass A even though it has a property which subsets another property from A.

Let us take an element $e_c$ of type C and an element $e_d$ of type D. Modifying the d slot of $e_c$, i.e., modifying $e_c.d$ (by inserting or removing $e_d$), should modify $e_c.b$ as well, due to subsetting. But the slot b does not exist in $e_c$. This is a contradiction and the conclusion must thus be that property subsetting “follows” class generalization. In our example, either class C needs to be a subclass of class A or the subsetting has to be removed.

[Figure: class diagram with a diamond-shaped class specialization hierarchy whose properties subset each other in a matching diamond structure]

Fig. 6. Diamond Inheritance and Diamond Subsetting

[Figure: two class diagrams over the classes A, B, C and D with the properties a, b, c and d; on the left, the properties are subsetted although C does not specialize A; on the right, only one of the two properties of the association is marked as a subset]

Fig. 7. (Left) Subsetting without Generalization; (Right) Subsetting Only One Property of an Association

The same remark is also stated in the UML 2.0 Infrastructure [16] on page 126: A property may be marked as a subset of another, as long as every element in the context of the subsetting property conforms to the corresponding element in the context of the subsetted property.

The Opposite of a Subset Property Should be a Subset. Finally, let us consider the metamodel on the right side of Figure 7. In this metamodel, the subset property c has an opposite property a which is not a subset.

It can easily be seen that this idea is not sound. Let us take an element $e_c$ of type C and an element $e_d$ of type D. Modifying $e_c.d$ necessarily modifies $e_d.c$ as well due to bidirectionality.¹ Due to subsetting between properties b and d, $e_c.b$ is also modified. Then due to bidirectionality, $e_d.a$ is also modified and the final effect is as if c subsetted a. We therefore claim that c needs to subset a, if for nothing else than documentation purposes.

¹ We have heard arguments against the bidirectionality assumption of slots. However, we will not delve into this matter in this article for reasons of conceptual simplicity in the framework.

There are several inconsistencies in the metamodels for MOF 2.0 and UML 2.0 where this rule is violated. Fortunately, the correction is simple: in our example, c needs to subset a.

2.4 Alternative Language Extension Mechanisms: Covariant Specialization and Redefinitions

There exist at least two other approaches to language extension: covariant specialization and property redefinition.

As an example of covariant specialization, let us assume that $e_c$ is an element of type C shown in Fig. 3, with subsetting replaced by covariant specialization. In such an environment, it is not possible to insert elements of type B into the b slot of $e_c$, only elements of type D into the d slot. The c-d association is a covariant specialization of the a-b association. The a-b association has been rendered obsolete (or read-only) in the context of element $e_c$.

Covariance is a subject that often comes up in the semantics of methods of object-oriented programming. Function parameter type contravariance and return type covariance are rather inconvenient in practical situations [9] and thus a type-unsafe function parameter type covariance is used for specialization. A similar argument also holds for element slots. Property subsetting aims to provide a new way to represent relationships between elements. It must nevertheless be noted that as Giuseppe Castagna has asserted [9], there are uses for a covariant environment when compared with a contravariant or invariant environment. Thus subsetting and covariance are not opposing but complementing constructs in object-oriented programming and thus in modeling.

The major difference between covariance and subsetting is that in a covariant environment, substituting an element of a specific type with an element which is a specialization of that type can result in programs no longer working. Subsetting allows the slots defined by the properties in a superclass to be used in an instance of a subclass.

Finally, MOF 2.0 also supports the concept of property redefinition. A property redefinition is an arbitrary replacement of the characteristics of a property in a subclass that overrides the subsetted property and renders it unusable in the subclass. There is however no other clear relationship to the subsetting property; the redefinition could be covariant or contravariant in some characteristics of the property, and otherwise compatible or incompatible in others. Using property redefinitions in a language extension breaks substitutability and therefore transformations and tools based on the original language.

3 A Simple Metamodeling Language

Based on the discussion in the previous section, we can now present the definition of a simple metamodeling language that contains the core concepts of MOF 2.0 and the UML 2.0 Infrastructure. Without loss of generality, we ignore classes that represent primitive datatypes such as integers, strings and enumeration values. They can be easily added, but we have to omit this description due to the page limit.

Our modeling framework acts as our metamodeling language. We represent all metamodels as a tuple $MM = (C, P, \mathit{generalizations}, \mathit{properties}, \mathit{characteristics})$, where $C$ is a set of classes, $P$ a set of properties and $C \cap P = \emptyset$. We define the generalizations of a class with the function $\mathit{generalizations} : C \to \mathcal{P}(C)$. The direct generalization relation is defined as $g \stackrel{\mathrm{def}}{=} \{(c_1, c_2) \cdot c_2 \in \mathit{generalizations}(c_1)\}$. We require the directed graph representing the class generalization relation $(C, g)$ to be acyclic. Also, we denote by $\subseteq_c$ the extended generalization between classes that is defined as the reflexive transitive closure of the generalization relation: $\subseteq_c \stackrel{\mathrm{def}}{=} g^*$. The relation $\subseteq_c$ is a partial order since it is reflexive and transitive by definition and antisymmetric due to the fact that the generalization graph is acyclic.
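The acyclicity requirement and the extended generalization relation translate directly into graph algorithms. In the following sketch a dictionary stands in for the function generalizations; the function names and the example classes (taken from the bipartite graph example) are our own illustrative choices.

```python
def is_acyclic(generalizations):
    """Depth-first search for a cycle in the direct generalization graph (C, g)."""
    visiting, done = set(), set()

    def visit(c):
        if c in done:
            return True
        if c in visiting:  # c was reached again while still being explored
            return False
        visiting.add(c)
        ok = all(visit(parent) for parent in generalizations[c])
        visiting.discard(c)
        if ok:
            done.add(c)
        return ok

    return all(visit(c) for c in generalizations)


def ancestors(generalizations, c):
    """All classes c2 such that c is related to c2 by the closure g*."""
    seen, stack = {c}, [c]
    while stack:
        for parent in generalizations[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_subclass(generalizations, c1, c2):
    """The extended generalization relation between classes."""
    return c2 in ancestors(generalizations, c1)


gens = {"Vertex": set(), "BlueVertex": {"Vertex"}, "RedVertex": {"Vertex"}}
```

Because the closure is reflexive, `is_subclass(gens, "Vertex", "Vertex")` holds as well.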

Each property has a class as an owner, and this fact is indicated by the function $\mathit{owner} : P \to C$. What properties belong to a given class is given by the function $\mathit{properties} : C \to \mathcal{P}(P)$ such that $(\forall c \in C \cdot \mathit{properties}(c) = \{p \cdot c = \mathit{owner}(p)\})$. The effective properties of a class are those defined by the class itself and transitively by any of its generalizations.

Finally, the characteristics of a property represent constraints on the elements that can be placed in the slots corresponding to that property. In this paper, we define $\mathit{characteristics} \stackrel{\mathrm{def}}{=} (\mathit{lower}, \mathit{upper}, \mathit{opposite}, \mathit{ordered}, \mathit{composite}, \mathit{derived}, \mathit{supersets})$ as a tuple of functions detailing the properties further:

– $\mathit{lower} : P \to \mathbb{Z}_0^+ \setminus \{\infty\}$ represents the lower multiplicity constraint of a property (0, 1, 2, . . . , excluding infinity).

– $\mathit{upper} : P \to \mathbb{Z}^+$ represents the upper multiplicity constraint (1, 2, . . . , ∞).

– $\mathit{opposite} : P \to P$ is a bijective function that yields the opposite of a property. The opposite of a property cannot be itself: $(\forall p \in P \cdot \mathit{opposite}(p) \neq p)$. A property is the opposite of its opposite: $(\forall p \in P \cdot \mathit{opposite}(\mathit{opposite}(p)) = p)$.

– $\mathit{ordered} : P \to \mathbb{B}$ is true if a property is ordered.

– $\mathit{composite} : P \to \mathbb{B}$ is true if a property is composite.

– $\mathit{derived} : P \to \mathbb{B}$ is true if a property is derived.

– $\mathit{supersets} : P \to \mathcal{P}(P)$ represents the set of properties of which a property is a subset. The graph representing the property superset relation $(P, \{(p, q) \cdot q \in \mathit{supersets}(p)\})$ must be acyclic.

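We denote subsetting between properties by the $\subseteq_p$ relation, i.e., $\subseteq_p \stackrel{\mathrm{def}}{=} \{(p, q) \cdot q \in \mathit{supersets}(p)\}^*$. We can write $\subseteq$ instead of $\subseteq_p$ or $\subseteq_c$ without ambiguity, since one is a relation over classes whereas the other is a relation over properties. We also define $a \subset b \stackrel{\mathrm{def}}{=} a \subseteq b \land a \neq b$.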

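The expression $s \parallel t$ means $\neg(s \subseteq t) \land \neg(t \subseteq s)$, i.e., there is no order defined between $s$ and $t$. Finally, we denote by $r \lessdot s$ that $r$ is a direct subset of $s$, i.e., $r \lessdot s \stackrel{\mathrm{def}}{=} r \subset s \land \neg(\exists u \cdot r \subset u \subset s)$.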

Based on the discussion in the previous section, we should consider three additional constraints over the structure of a metamodel. First, a property can subset another property only from the reflexive transitive superclass closure of its owner, i.e., $(\forall p, q \in P \cdot p \subseteq_p q \Rightarrow \mathit{owner}(p) \subseteq_c \mathit{owner}(q))$. Given this it can be shown, following a similar argument as for $\subseteq_c$, that $\subseteq_p$ is a partial order. Also, the opposite of a subset property should be a subset: $(\forall p, q \in P \cdot p \subseteq_p q \Rightarrow \mathit{opposite}(p) \subseteq_p \mathit{opposite}(q))$.
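Both structural constraints on subsetting can be checked mechanically. The sketch below encodes the metamodel of Fig. 3 with plain dictionaries, under our reading of the text that b is owned by A, a by B, d by C and c by D. The function names and the tuple format of the reported violations are hypothetical, and the two faulty variants are only meant in the spirit of the two cases of Fig. 7.

```python
def closure(direct, x):
    """Reflexive transitive closure of a relation given as a dict of direct successors."""
    seen, stack = {x}, [x]
    while stack:
        for y in direct.get(stack.pop(), ()):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


def violations(generalizations, owner, opposite, supersets):
    """List the pairs (p, q), p a subset of q, that break one of the two constraints."""
    found = []
    for p in owner:
        for q in closure(supersets, p):
            # subsetting follows class specialization
            if owner[q] not in closure(generalizations, owner[p]):
                found.append(("owner", p, q))
            # the opposite of a subset property is a subset
            if opposite[q] not in closure(supersets, opposite[p]):
                found.append(("opposite", p, q))
    return found


owner = {"a": "B", "b": "A", "c": "D", "d": "C"}
opposite = {"a": "b", "b": "a", "c": "d", "d": "c"}
gens = {"C": {"A"}, "D": {"B"}}
subs = {"d": {"b"}, "c": {"a"}}

valid = violations(gens, owner, opposite, subs)
no_generalization = violations({"D": {"B"}}, owner, opposite, subs)
one_sided = violations(gens, owner, opposite, {"d": {"b"}})
```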

We can now define a metamodel simply as any nonempty finite subset of the set of classes $C$, $MM_x \subseteq C$. Note that generalization of classes and the opposite of a property can be defined across several metamodels.

3.1 Models

We define a model as the tuple $M = (E, \mathit{type}, \mathit{slots}, S, \mathit{property}, \mathit{elements})$ where $E$ is a finite set of elements and $S$ is a finite set of slots. Each element in $E$ has a type defined by a class in a metamodel, $\mathit{type} : E \to C$. Each slot has one element as its owner and this fact is represented by the function $\mathit{slotowner} : S \to E$. For convenience, we also define the slots of a given element with the function $\mathit{slots} : E \to \mathcal{P}(S)$, where $(\forall e \in E \cdot \mathit{slots}(e) \stackrel{\mathrm{def}}{=} \{s \cdot e = \mathit{slotowner}(s)\})$. Each slot corresponds to a property as defined by the function $\mathit{property} : S \to P$.

There is a slot for each effective property in the type of an element. This notion is captured in the following two model constraints:

Constraint 1 Valid slots in element (1): $(\forall e \in E \cdot (\forall s \in S \cdot s \in \mathit{slots}(e) \Rightarrow \mathit{type}(e) \subseteq_c \mathit{owner}(\mathit{property}(s))))$

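Constraint 2 Valid slots in element (2): $(\forall e \in E \cdot (\forall c \in C \cdot \mathit{type}(e) \subseteq_c c \Rightarrow (\forall p \in \mathit{properties}(c) \cdot (\exists! s \in \mathit{slots}(e) \cdot \mathit{property}(s) = p))))$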
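Taken together, Constraints 1 and 2 say that an element has exactly one slot for each effective property of its type and no other slots. A minimal sketch of this check for a single element follows, reusing the dictionary encoding of Fig. 3 assumed earlier (b owned by A, d owned by C, C specializing A); all function names are hypothetical.

```python
def ancestors(generalizations, c):
    """The class c together with all of its direct and indirect generalizations."""
    seen, stack = {c}, [c]
    while stack:
        for parent in generalizations.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def effective_properties(generalizations, owner, cls):
    """Properties owned by cls itself or, transitively, by one of its generalizations."""
    supers = ancestors(generalizations, cls)
    return {p for p, c in owner.items() if c in supers}


def slots_are_valid(generalizations, owner, element_type, slot_properties):
    """slot_properties lists the property of every slot of one element."""
    expected = effective_properties(generalizations, owner, element_type)
    # comparing sorted lists also rejects two slots for the same property
    return sorted(slot_properties) == sorted(expected)


owner = {"a": "B", "b": "A", "c": "D", "d": "C"}
gens = {"C": {"A"}, "D": {"B"}}
```

With this encoding, an element of type C is valid only with exactly the two slots b and d.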

Slots consist of elements. The function $\mathit{elements} : S \to (E, \prec)$ returns a totally ordered set of elements of its argument slot $s$ if $\mathit{ordered}(\mathit{property}(s))$ is true, otherwise $\mathit{elements} : S \to \mathcal{P}(E)$ returns an unordered set of elements. The slot thus describes the connection from its owner element to the elements in the slot. There is no actual ordering defined between the elements in an ordered slot; they merely have an assigned position in $s$. We stress that no matter whether a slot is ordered or not, an element can never occur twice in it.

We need to define the relation between a slot and its subsets. The slot subsetting relation is defined by $\subseteq_s \stackrel{\mathrm{def}}{=} \{(r, s) \cdot \mathit{slotowner}(r) = \mathit{slotowner}(s) \land \mathit{property}(r) \subseteq_p \mathit{property}(s)\}^*$. A slot can only contain elements of the type allowed by its property:

Constraint 3 Type correctness: $(\forall s \in S \cdot (\forall e \in \mathit{elements}(s) \cdot \mathit{type}(e) \subseteq_c \mathit{owner}(\mathit{opposite}(\mathit{property}(s)))))$

Because each property has an opposite property, the elements in a slot owned by an element $e$ will each have a slot which contains $e$. This feature is called bidirectionality and must never be violated. In other words, if $e$ has a slot $s$ which contains $e'$, then $e'$ must also have a specific slot $s'$ which contains $e$.

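Constraint 4 Slot bidirectionality: $(\forall s \in S \cdot (\forall e' \in \mathit{elements}(s) \cdot (\exists! s' \in S \cdot \mathit{slotowner}(s') = e' \land \mathit{opposite}(\mathit{property}(s')) = \mathit{property}(s) \land \mathit{slotowner}(s) \in \mathit{elements}(s'))))$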

We define the size of a slot to be the number of elements in that slot: $(\forall s \in S \cdot \#s \stackrel{\mathrm{def}}{=} \#\mathit{elements}(s))$. The size of a slot is constrained by the multiplicity bounds of its property:

Constraint 5 Multiplicity: $(\forall s \in S \cdot \mathit{lower}(\mathit{property}(s)) \leq \#s \leq \mathit{upper}(\mathit{property}(s)))$

An element cannot be in two composite slots which exist in different partial orders, and the composition relation must be acyclic.

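Constraint 6 Only in one composite slot: $(\forall e \in E \cdot \neg(\exists r, s \cdot \mathit{slotowner}(r) = \mathit{slotowner}(s) \land (\mathit{property}(r) \parallel \mathit{property}(s)) \land \mathit{composite}(\mathit{property}(r)) \land \mathit{composite}(\mathit{property}(s)) \land e \in \mathit{elements}(r) \land e \in \mathit{elements}(s)))$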

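Constraint 7 Composition is acyclic: $(\forall e_1, \ldots, e_n, e_{n+1} \in E, \exists s_1, \ldots, s_n \in S \cdot (\forall i \cdot 1 \leq i \leq n \Rightarrow \mathit{slotowner}(s_i) = e_i \land \mathit{composite}(\mathit{property}(s_i)) \land e_{i+1} \in \mathit{elements}(s_i)) \Rightarrow e_1 \neq e_{n+1})$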

The contents of a slot $s$ subsetting another slot $t$ must be a subset of the contents of $t$. Also, MOF [15] tells us on page 59 that “The slot’s values are a subset of those for each slot it subsets.” For ordered slots, we also wish to preserve order, i.e., when elements occur in a specific order in $s$, they should occur in the same order in $t$, although $t$ might contain more elements in between. We denote $a \prec_x b$ if element $a$ precedes element $b$ in a specific ordered slot $x$.

The last three constraints are specific to derived, unordered and ordered slots with respect to property subsetting. We call these constraints the inherent subsetting rules.

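Constraint 8 Derived slots: $(\forall p \in P \cdot \mathit{derived}(p) \Rightarrow (\forall t \in S \cdot \mathit{property}(t) = p \Rightarrow \mathit{elements}(t) \setminus \bigcup \{\mathit{elements}(s) \cdot s \lessdot t\} = \emptyset))$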

Constraint 9 Unordered slots: $(\forall r, s \in S \cdot r \subseteq_s s \land \neg \mathit{ordered}(s) \Rightarrow \mathit{elements}(r) \subseteq \mathit{elements}(s))$

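Constraint 10 Ordered slots: $(\forall x, y \in E, r, s \in S \cdot x \in \mathit{elements}(r) \land y \in \mathit{elements}(r) \land x \prec_r y \land r \subseteq_s s \land \mathit{ordered}(s) \Rightarrow x \in \mathit{elements}(s) \land y \in \mathit{elements}(s) \land x \prec_s y)$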
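For a single pair of slots, Constraints 9 and 10 reduce to a containment test and, for an ordered superset slot, a test that relative order is preserved. The sketch below assumes that slot contents are given as Python lists without duplicates; the function name is hypothetical.

```python
def preserves_subsetting(sub, sup, ordered):
    """Check the subsetting rules for a slot sub that subsets a slot sup."""
    if not set(sub) <= set(sup):
        return False  # some element of sub is missing from sup
    if not ordered:
        return True   # unordered superset slot: containment is enough
    # ordered superset slot: elements of sub keep their relative order in sup
    positions = [sup.index(x) for x in sub]
    return positions == sorted(positions)
```

For example, with sup = [a, b, c], the subset slot [a, c] is accepted, whereas [c, a] is accepted only when sup is unordered.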

4 Model Conformance to a Metamodel

In the previous section we have described the structure of models and metamodels and we have defined a number of constraints that can be used to determine when a model is valid in our modeling framework. These constraints can be implemented in a metamodel-independent modeling repository and work as an invariant that any model transformation should preserve.

We can now also define if a given model $M = (E, \mathit{type}, \mathit{slots}, S, \mathit{property}, \mathit{elements})$ is in the modeling language defined by a metamodel $MM_x$. This question can be asked in two slightly different ways. First, we can ask whether $M$ is a direct instance of $MM_x$ and not of an extension of $MM_x$. In this case, $M$ is an instance of $MM_x$, if it is a valid model and the type of each element in $M$ is in $MM_x$:

$M$ is an instance of $MM_x$ if $M$ is valid and $(\forall e \in E \cdot \mathit{type}(e) \in MM_x)$

On the other hand, we can also allow $M$ to be an instance of either $MM_x$ or an extension of it. In this case, we say that $M$ conforms to $MM_x$ if $M$ is a valid model and the type of each element in $M$ is a subclass of a class in $MM_x$:

$M$ conforms to $MM_x$ if $M$ is valid and $(\forall e \in E \cdot (\exists c \in MM_x \cdot \mathit{type}(e) \subseteq_c c))$
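The difference between the two notions can be seen on the running example. The sketch below assumes that the validity of the model has already been established by other means and only compares element types against the metamodel. The function names are hypothetical; the class names follow Fig. 5.

```python
def ancestors(generalizations, c):
    """The class c together with all of its direct and indirect generalizations."""
    seen, stack = {c}, [c]
    while stack:
        for parent in generalizations.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_instance(element_types, metamodel):
    """Every element type is itself a class of the metamodel."""
    return all(t in metamodel for t in element_types)


def conforms(element_types, metamodel, generalizations):
    """Every element type is a (possibly indirect) subclass of a metamodel class."""
    return all(ancestors(generalizations, t) & metamodel for t in element_types)


graph_mm = {"Vertex", "Edge"}
gens = {"RedVertex": {"Vertex"}, "BlueVertex": {"Vertex"},
        "RedBlueEdge": {"Edge"}}
bipartite_types = ["RedVertex", "BlueVertex", "RedBlueEdge"]
```

A bipartite model therefore conforms to the general graph metamodel without being a direct instance of it.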

4.1 Inconsistent Metamodels

We have seen that different property characteristics such as multiplicity, composition or subsetting allow us to define rich and extensible metamodels. However, we should consider whether there are combinations of these characteristics that define inconsistent metamodels, i.e., metamodels that contain one or more classes that cannot be instantiated in any valid model. To avoid this situation, we may define additional constraints for metamodels.

Multiplicities. Since the number of elements in a slot is bounded by the multiplicity characteristics of its property, we require that the lower value in a multiplicity range should not exceed the upper value: $(\forall p \in P \cdot \mathit{lower}(p) \leq \mathit{upper}(p))$. Otherwise, the metamodel is inconsistent.

Also, it can easily be seen that the upper limit of a property subsetting another property must be lower than or equal to the upper limit of the other property. This can be formalized with: $(\forall p \in P \cdot (\forall q \in \mathit{supersets}(p) \cdot \mathit{upper}(p) \leq \mathit{upper}(q)))$. The justification for this constraint can be seen with slots $r$ and $s$ such that $r \subset s$, $\mathit{property}(r) = p$, $\mathit{property}(s) = q$, $\mathit{upper}(p) > \mathit{upper}(q)$ and by filling the slot $r$ with elements so that $\#r = \mathit{upper}(p)$. Then $\#s \geq \#r = \mathit{upper}(p) > \mathit{upper}(q) \Rightarrow \#s > \mathit{upper}(q)$, which violates the upper limit of $q$.
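Both multiplicity constraints on metamodels are simple to check. In the sketch below, `math.inf` stands for an unbounded upper limit; the function name and the format of the reported problems are hypothetical, and the multiplicities given to fromRed and outgoingRB are assumptions made for illustration, since they cannot be read off Fig. 4 here.

```python
import math


def multiplicity_violations(lower, upper, supersets):
    """Report inconsistent ranges and subset properties with too large an upper limit."""
    problems = []
    for p in sorted(upper):
        if lower[p] > upper[p]:
            problems.append(("range", p))
        for q in sorted(supersets.get(p, ())):
            if upper[p] > upper[q]:
                problems.append(("upper", p, q))
    return problems


lower = {"from": 1, "fromRed": 1, "outgoing": 0, "outgoingRB": 0}
upper = {"from": 1, "fromRed": 1,
         "outgoing": math.inf, "outgoingRB": math.inf}
supersets = {"fromRed": {"from"}, "outgoingRB": {"outgoing"}}

ok = multiplicity_violations(lower, upper, supersets)
# Hypothetical faulty variant: fromRed would allow two elements, from only one.
bad = multiplicity_violations(lower, {**upper, "fromRed": 2}, supersets)
```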

There are no restrictions on the lower limits of the properties, since more elements can always be inserted into a slot until its size is at least the greatest lower limit among its transitive sub- and supersets.
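The two multiplicity constraints above can be checked mechanically. The following Python sketch is only an illustration: the `Prop` class, its field names and the example property names are assumptions of the sketch, and an unbounded upper limit is represented by `math.inf`. Checking the directly subsetted properties is enough, because ≤ is transitive.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Prop:
    name: str
    lower: int = 0
    upper: float = math.inf  # math.inf stands for an unbounded upper limit
    supersets: list = field(default_factory=list)  # names of directly subsetted properties


def multiplicity_errors(props):
    """Return a list of messages for violated multiplicity constraints."""
    by_name = {p.name: p for p in props}
    errors = []
    for p in props:
        # Constraint 1: lower(p) <= upper(p)
        if p.lower > p.upper:
            errors.append(f"{p.name}: lower {p.lower} > upper {p.upper}")
        # Constraint 2: upper(p) <= upper(q) for every superset q of p
        for q_name in p.supersets:
            q = by_name[q_name]
            if p.upper > q.upper:
                errors.append(
                    f"{p.name}: upper {p.upper} exceeds upper {q.upper} of {q.name}"
                )
    return errors


# Hypothetical properties: 'part' subsets 'member'.
ok = [Prop("member", 0, math.inf), Prop("part", 1, 5, ["member"])]
bad = [Prop("member", 0, 2), Prop("part", 0, 3, ["member"])]

print(multiplicity_errors(ok))   # []
print(multiplicity_errors(bad))  # one message about 'part'
```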

Composition. In Figure 8, we see different cases with composite and noncomposite properties. Cases (1) and (2) are quite self-explanatory. Case (3) can be considered legal by discounting the composition at the C-D association without any loss of information, since any elements owned via the C-D association must also be owned via the A-B association. Case (4) is illegal since any elements of types C and D that are connected at the C-D association are also connected at the A-B association, thereby creating a cyclic composition and violating a model constraint. Therefore, the following metamodel constraint should be defined:

(∀p ∈ P · composite(p) ⇒ ¬(∃q ∈ P · p ⊂ q ∧ composite(opposite(q))))

We can find examples of the first three cases in the UML 2.0 metamodel, all in Figure 73 of [16]. Case (1) can be found in the association-memberEnd association, case (2) in the owningAssociation-ownedEnd association and case (3) in the class-ownedAttribute association.
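The composition constraint can likewise be sketched as a small check over the properties of a metamodel. In this Python illustration, all names are hypothetical: `composite` is the set of composite property names, `opposite` maps a property to its opposite association end, and `supersets` maps a property to the properties it directly subsets. The sketch follows subsetting transitively, which is an assumption about how p ⊂ q is read here.

```python
def all_supersets(name, supersets):
    """Return every property that `name` subsets, directly or transitively."""
    seen, stack = set(), list(supersets.get(name, ()))
    while stack:
        q = stack.pop()
        if q not in seen:
            seen.add(q)
            stack.extend(supersets.get(q, ()))
    return seen


def composition_violations(composite, opposite, supersets):
    """Return pairs (p, q) where p is composite, p subsets q,
    and the opposite end of q is composite as well."""
    return sorted(
        (p, q)
        for p in composite
        for q in all_supersets(p, supersets)
        if opposite.get(q) in composite
    )


# Hypothetical ends: 'cd_end' subsets 'ab_end'; 'ab_opposite' is the other end of A-B.
opposite = {"ab_end": "ab_opposite", "ab_opposite": "ab_end"}
supersets = {"cd_end": ["ab_end"]}

# Composition in opposite directions, as in the kind of cycle that case (4) describes.
print(composition_violations({"cd_end", "ab_opposite"}, opposite, supersets))
# Composition only at the subsetting end: no violation is reported.
print(composition_violations({"cd_end"}, opposite, supersets))
```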

[Figure: panels (1) to (4)]

Fig. 8. Subsetting with Composite and Noncomposite Properties

5 Related Work

Several others have formalized metamodeling languages and model layers. For example, Thomas Baar has defined the CINV language [6] using a set-theoretic approach, but our approach is more general in that we also support e.g. generalizations. The benefit of a set-theoretic approach is that it avoids a metacircularity whereby one (partially) needs to understand the language to be able to learn the language. José Álvarez, Andy Evans and Paul Sammut describe such a static object-oriented metacircular modeling language [4], and the Metamodeling Language Calculus by Tony Clark, Andy Evans and Stuart Kent is another very sophisticated one. Akehurst, Kent and Patrascoiu present in [1] the structure of a metamodel and its semantics using OCL. Our rationale for not using OCL is that it depends on a metacircular modeling framework.

However, our main contribution comes from the definitions of property subsets in a language with multiple inheritance, which neither metamodeling nor traditional object-oriented language descriptions explain. Several authors use association inheritance without defining exact semantics, and some say that it denotes covariance. An example of this covariant specialization is the multilevel metamodeling technique called VPM by Varró and Pataricza [19], which also limits itself to single inheritance. We argue that subsetting is not the same concept as covariant specialization, and requires different semantics.

Carsten Amelunxen, Tobias Rötschke and Andy Schürr are the authors of the MOFLON tool [5] inside the Fujaba framework [13]. MOFLON claims to support subsetting, but no description of the formal semantics being used is included. It is not clear whether their tool works in the context of subsets between ordered slots, or with diamond inheritance with subsetting.

Markus Scheidgen presents an interesting discussion of the semantics of subsets in the context of creating an implementation of MOF 2.0 in [18]. To our knowledge, this has been so far the most thorough attempt to formalize subsetting. However, it is not clear whether the work supports diamond subsets or ordered sets, both of which are used in the UML 2.0 Infrastructure.

The object-oriented and database research communities are also researching a similar topic, although it is called relationship or association inheritance, or first-class relationships. In [7], Bierman and Wren present a simplified Java language with first-class relationships. In contrast to our work, they do not support multiple inheritance, bidirectionality or ordered properties; all of these constructs are common in modeling and in the UML 2.0 specification. However, relationship links are explicitly represented as instances, and they can have additional data fields (just like the AssociationClass of UML). As the authors have noticed, the semantics of link insertion and deletion is not without problems.

Albano, Ghelli and Orsini present in [3] a relationship mechanism for a strongly-typed object-oriented database programming language. It also handles links as relationship instances, but without additional data fields. Multiple inheritance is supported, but ordered slot contents are not.

We have implemented a metamodeling tool called Coral with the features described in this article. Coral is open source and available at http://mde.abo.fi. The basic model operations over models containing subset properties are defined in [2]. Unfortunately, we know of no other modeling tools that support subsets as extensively as discussed in this article.

6 Conclusions

There are several new property characteristics described in MOF 2.0: subsets, (derived) unions and redefinitions. However, the OMG modeling standards do not describe these concepts in detail, not even informally, and therefore they cannot be applied in practice. In this article, we have formalized a simple modeling framework that supports subsets and derived unions. It discusses the relevant model constraints that must be upheld by any operations on models. We have argued that arbitrary property redefinitions are not a safe extension mechanism and therefore we have not included this concept in our approach.

There are some limitations in the work presented in this article. Subsetting as proposed is restricted to slots with unique elements. Slots where the same element can occur several times (bags) are not considered. Although bags can be defined in MOF 2.0, they are not used in the definition of UML 2.0. Also, we assume that a subset property should have the same ordering characteristic as its union property. However, we notice that it is also possible to mix these characteristics such that an ordered slot may be a subset of an unordered slot. This extension is simple, since it only requires weakening the model constraints for slots.

In conclusion, we consider that there is an imminent need in the modeling community to standardize on one intuitive explanation and a rigorous formalization of subsets and derived unions, so that tools based on MOF 2.0 and UML 2.0 can be implemented and be interoperable. This paper presents a proposal in this direction that we hope can help other researchers and tool developers to define a common understanding of MOF 2.0 and UML 2.0.

Acknowledgments

The authors would like to thank Patrick Sibelius for insightful discussions. Marcus Alanen would like to acknowledge the financial support of the Nokia Foundation.

References

1. D. H. Akehurst, S. Kent, and O. Patrascoiu. A relational approach to defining and implementing transformations between metamodels. Springer International Journal on Software and Systems Modeling, 2(4):215–239, December 2003.

2. Marcus Alanen and Ivan Porres. Subset and union properties in modeling languages. Technical Report 731, TUCS, Dec 2005.

3. Antonio Albano, Giorgio Ghelli, and Renzo Orsini. A Relationship Mechanism for a Strongly Typed Object-Oriented Database Programming Language. In Proceedings of the 17th Conference on Very Large Databases, Morgan Kaufman pubs. (Los Altos CA), Barcelona, 1991.

4. José Álvarez, Andy Evans, and Paul Sammut. MML and the Metamodel Architecture. In Jon Whittle, editor, WTUML: Workshop on Transformation in UML 2001, April 2001.

5. Carsten Amelunxen, Tobias Rötschke, and Andy Schürr. Graph Transformations with MOF 2.0. In Holger Giese and Albert Zündorf, editors, Fujaba Days 2005, September 2005.

6. Thomas Baar. Metamodels without Metacircularities. L’Objet, 9(4):95–114, 2003.

7. Gavin Bierman and Alisdair Wren. First-class relationships in an object-oriented language. In Workshop on Foundations of Object-Oriented Languages (FOOL 2005), January 2005.

8. Frank Budinsky, David Steinberg, Ed Merks, Raymond Ellersick, and Timothy J. Grose. Eclipse Modeling Framework. Addison Wesley Professional, August 2003.

9. Giuseppe Castagna. Covariance and Contravariance: Conflict without a Cause. ACM Transactions on Programming Languages and Systems, 17(3):431–447, May 1995.

10. EMF Development team. The Eclipse Modeling Framework website. http://www.eclipse.org/emf.

11. Anneke Kleppe, April 2003. Discussion on the mailing-list [email protected].

12. Barbara Liskov. Keynote address - data abstraction and hierarchy. SIGPLAN Not., 23(5):17–34, May 1988.

13. Ulrich A. Nickel, Jörg Niere, and Albert Zündorf. Tool demonstration: The FUJABA environment. In Proceedings of the 22nd International Conference on Software Engineering (ICSE), pages 742–745. ACM Press, 2000.

14. OMG. Meta Object Facility, version 1.4, April 2002. Document formal/2002-04-03. Available at http://www.omg.org/.

15. OMG. MOF 2.0 Core Final Adopted Specification, October 2003. Document ptc/03-10-04. Available at http://www.omg.org/.

16. OMG. UML 2.0 Infrastructure Specification, September 2003. Document ptc/03-09-15. Available at http://www.omg.org/.

17. OMG. UML 2.0 Superstructure Specification, August 2003. Document ptc/03-08-02. Available at http://www.omg.org/.

18. Markus Scheidgen. On Implementing MOF 2.0—New Features for Modelling Language Abstractions. July 2005. Available at http://www.informatik.hu-berlin.de/~scheidge/.

19. Dániel Varró and András Pataricza. VPM: A visual, precise and multilevel metamodeling framework for describing mathematical domains and UML. Journal of Software and Systems Modeling, 2(3):187–210, October 2003.

20. Andreas Winter, Bernt Kullbach, and Volker Riediger. An Overview of the GXL Graph Exchange Language. In Revised Lectures on Software Visualization, International Seminar, pages 324–336, London, UK, 2002. Springer-Verlag.