Collaborative Building of an Ontology of Key Performance Indicators

Collaborative Building of an Ontology of KeyPerformance Indicators

Claudia Diamantini, Laura Genga, Domenico Potena, and Emanuele Storti

Dipartimento di Ingegneria dell’InformazioneUniversità Politecnica delle Marche

via Brecce Bianche, 60131 Ancona, Italy{c.diamantini,l.genga,d.potena,e.storti}@univpm.it

Abstract. In the present paper we propose a logic model for the repre-sentation of Key Performance Indicators (KPIs) that supports the con-struction of a valid reference model (or KPI ontology) by enabling theintegration of definitions proposed by different engineers in a minimaland consistent system. In detail, the contribution of the paper is as fol-lows: (i) we combine the descriptive semantics of KPIs with a logicalrepresentation of the formula used to calculate a KPI, allowing to makethe algebraic relationships among indicators explicit; (ii) we discuss howthis representation enables reasoning over KPI formulas to check equiva-lence of KPIs and overall consistency of the set of indicators, and presentan empirical study on the efficiency of the reasoning; (iii) we present aprototype implementing the approach to collaboratively manage a sharedontology of KPI definitions.

1 Introduction

Modern networked business paradigms require models offering a common groundfor interoperability and collaboration. Reference models have been establishedfor supply chains and other networked organizations [1,2], that include the de-scription of business processes structure and performances as described in alibrary of Key Performance Indicators (KPIs). These models are typically builtas a “consensus view” over the partners’ own views, which is reached mostly byhuman interaction. In the present paper we propose a logic model for the repre-sentation of Key Performance Indicators (KPIs) that supports the constructionof a valid reference model (or KPI ontology) by enabling the integration of defi-nitions proposed by different engineers in a minimal and consistent system. Theapproach is characterised by the representation of what we call the compositionalsemantics of KPIs, i.e. the meaning of KPIs is composed from the meaning oftheir subparts. As a matter of fact, KPIs are measures generated by means ofoperations like aggregation and algebraic composition. The aggregative struc-ture of KPIs is captured by the multidimensional model [3], whose formalizationis well studied in the Literature. The composite structure of KPIs refers to thecalculation formula by which an indicator is calculated as a function of other

indicators, for instance in the form of a ratio (e.g. the acceptance rate, or the ra-tio between income and investments, known as Return On Investment or ROI).The starting point of the paper is that the composite structure of an indicatoris at least as fundamental as aggregation to capture its semantics, and deservespecial attention.

To illustrate and motivate the approach, let us consider a scenario where ashared dictionary of KPIs is collaboratively managed. Knowledge Engineers andPerformance Managers belonging to networked organizations decide to collectKPI definitions for future reference use or for interoperability purposes. Eachautonomous entity proposes KPI terms, definitions and properties, then a con-sensus is achieved in order to decide the dictionary entries. Some typical usecases can be recognised in this process:

– Check of KPIs’ identity: this activity allows to establish if a KPI at handalready exists in the dictionary. Two KPIs can be defined to be identicalif they provide the same value for the same real phenomenon, expressedby some transactional data. Then, although a “sameAs” relationship can beestablished by analysing terms and definitions, the ultimate semantics of theindicator is given by its calculation formula.

– Introduction of a new KPI: this is the basic activity allowing the incremen-tal building of the dictionary. Inserting a new KPI should not introduceinconsistencies in the dictionary, hence this activity is naturally related toconsistency checking and enforcement mechanisms. For instance, trivial in-consistency can be generated when introducing the income defined as theratio between ROI and investments, when the ROI definition, as given be-fore, already exists in the ontology.

– Update and deletion of existing KPIs: as the result of coordination andcooperation, KPIs introduced by a single entity can be revised or even deletedfrom the dictionary. Update and deletion require proper mechanisms forinconsistency management as above.

– Browsing: it is the fundamental activity for the analysis of KPI properties.– Searching: a dictionary is typically explored to look up KPI definitions.

Smart search engines allow to quickly check the existence of a KPI. Be-sides traditional keyword-based search, the structure of the formula can beused as a key for searching.

– Consensus management and versioning: it refers to the organization of col-laborative work, in particular mechanisms for the convergence towards acommon uniform view and for tracing and roll-back of previous work.

In this scenario, the proposal of the present paper is to provide advancedsupport for the above mentioned use cases by leveraging on the representationand manipulation of the formula defining a KPI. In particular, the original con-tributions of the paper are:

– we combine the descriptive semantics of KPIs with their compositional se-mantics into an ontology called KPIOnto. The ontology introduces a logicalrepresentation of KPIs and their formulas, allowing to make the algebraicrelationships among indicators explicit;

– we introduce a framework for reasoning over KPI formulas. The theory in-cludes a set of facts describing KPIs as well as a set of rules formalizing thebasic mathematical axioms for equation resolution and formula manipula-tion. On top of these, a set of rules implements high-level semantic reasoningfor KPI identity and equivalence checking, ontology consistency checking,formula and dependencies inference. Experiments on execution time demon-strate the feasibility of the approach;

– through a prototype we also show how semantic reasoning supports the flex-ible management of a shared ontology of KPI definitions.

The rest of the paper is organised as follows: in the next Section we briefly reviewthe relevant Literature. Sections 3 and 4 introduce the model and related reason-ing services respectively. Section 5 is devoted to the evaluation of the approach.Section 6 provides some concluding remarks and discusses future work.

2 Related Work

As observed in [4,5], the development of enterprise models is of particular impor-tance for collaborative networked organizations, as the basis to align people, busi-ness processes and technology capabilities, and for the development of methodsand tools for better decision-making. The Supply Chain Operations Referencemodel (SCOR) [1] and the Value Reference Model (VRM)1 are the two most com-prehensive and widely adopted reference models for supply chain management.Both include a dictionary of KPIs, that is a list of KPI definitions, propertiesand relations with goals and processes. Many other independent dictionaries orlibraries were introduced by researcher and international public bodies2. Thesedictionaries and libraries witness the attention toward a systematisation and or-ganisation of the huge amount of existing KPIs, but cannot be considered as KPImodels due to their informal nature. Formal models of indicators were recentlyproposed [6,7,8] in the context of the performance-oriented view of organizations.Description logics and first-order sorted predicate logics are used to express onan axiomatic basis the relations among indicators, by means of predicates likecausing, correlated, aggregation_of, however no compositional semanticsis taken into account in these models. Furthermore, the models are conceived forthe definition of KPIs in a single process-oriented enterprise, and the issue of con-sistency management is not taken into account. In [9,10] the notion of compositeindicator is introduced. Composite indicators are represented in a tree structure,and their calculation with full or partial specification of the formula linking theindicator to its components is also discussed. Related to this work, in [11] theGoal-oriented Requirement Language is enriched with the concept of KPI, andmetadata information is added to express the formula. Being in the context ofconceptual models for requirement engineering, these papers are concerned withlanguages and methods for elicitation of specific requirements and do not deal1 http://www.value-chain.org/en/cms/19602 See http://kpilibrary.com/ for a comprehensive list of KPIs and related sources

with general reference models and their consistency. Moreover, formula repre-sentation does not rely on logic-based languages, hence reasoning is limited toad-hoc modules for direct formula evaluation, and no inference mechanism andformula manipulation is enabled.

In the data warehouse field, semantic representations were recently pro-posed, mainly with the aim to reduce the gap between the high-level businessview of indicators and the technical view of data cubes, supporting the au-tomatic generation of customised data marts and OLAP analysis. While theontological representation of the aggregation structure of KPIs is largely studied[12,13,14,15,16,17], the compound nature and consequent dependency among in-dicators is much less explored. In [18] the definition of formulas is specified byusing a proprietary script language. [19] defines an indicator ontology based onMathML Content to explicitly define the formula of an indicator in order toexchange business calculation definitions and allow for their automatic linkingto specific data warehouse elements through semantic reasoning. The approachadopted has strong similarities with our previous work [20]. These approachesare mostly targeted at single enterprises. Despite the increasing interest in col-laboration and networking there is still a lack of work in addressing data cubes indistributed and collaborative contexts. A proposal in this respect is [21], whereinteroperability of heterogeneous and autonomous data warehouses is taken intoaccount. Again, although a preliminary attempt to include formulas in the frame-work, the compositional semantics of KPIs is missed.

Finally, regarding ontology building, several contributions highlighted theimportance of collaborative work, as revealed by the development of dedicatedmethodologies and tools [22,23]. In this respect, a well-known issue is consistencymanagement during collaborative ontology building. Indeed, the more users con-currently edit the ontology, the more it is likely that a change makes the sharedontology inconsistent. Hence, proper mechanisms to solve inconsistencies needto be defined, to choose which changes to accept. In many cases, such a verifi-cation is mainly left to communication and coordination among users; anyway,some work exists which intends to provide a more effective support for such atask. For example in [24], the authors introduce OSHOQP (D), a language ofthe DL family for modular ontology building and describe an editor based oncommon wiki systems for the development of modular ontologies representedin such a language. In [25], the authors propose the SCOOP platform for thecollaborative building of general-purpose knowledge bases. The platform checksfor redundancies or conflicts generated by new axioms, providing mechanisms ofconflict resolution. These platforms provide support services that are similar tothe one proposed in this paper for KPI ontologies.

3 KPIOnto: an Ontology of Key Performance Indicators

From the analysis of dictionaries cited in the previous Section and the multidi-mensional model of data warehouse domain, we derived the main properties forthe development of KPIOnto, an ontology devoted to formally describe indica-

Fig. 1. KPIOnto: a fragment of the Indicator taxonomy.

tors. The core of the ontology is composed by a set of disjoint classes, that aredetailed in the following: Indicator, Dimension and Formula.

Indicator It is the key class of KPIOnto, and its instances (i.e., indicators)describe the metrics enabling performance monitoring. Properties of an indi-cator include name, identifier, acronym, definition (i.e., a plain text giving adetailed description of its meaning and usage), compatible dimensions (see nextparagraph), formula, unit of measurement chosen for the indicator (i.e., boththe symbol and the description, given by referring to the Measurement UnitsOntology3), business object and aggregation function:

Indicator ≡ ∀ hasDimension.Dimension ⊓∀ hasFormula.Formula ⊓ (=1 hasFormula) ⊓∀ hasUnitOfMeasure.UoM ⊓ (=1 hasUnitOfMeasure) ⊓∀ hasBusObj.BusinessObject ⊓ (=1 hasBusObj) ⊓∀ hasAggrFunction.AggrFun ⊓ (=1 hasAggrFunction)

Following the VRM model, in KPIOnto a taxonomy of KPIs (see Figure 1)is established. However, multiple-categorization is allowed by the ontology, thussupporting more efficient indexing and searching functionalities. As an example,Figure 2 shows the main properties of the KPI PersonnelTrainingCosts.

Dimension A dimension is the coordinate/perspective along which a metric iscomputed. Following the multidimensional model, a dimension is usually struc-tured into a hierarchy of levels, where each level represents a different way ofgrouping elements of the same dimension [3]. For instance, Organization canbe grouped by virtual enterprise, enterprise, department, business unit, projectteam and person, while the Time dimension can be arranged in years, semesters,quarters, months and weeks. Each level is instantiated in a set of elements knownas members of the level, e.g. the company “ACME” or the month “2013-01”. Inorder to describe dimensions in KPIOnto, our approach is similar to the oneproposed in [26]. In particular, the top class is Dimension, which is composed ofa set of subclasses, one for each specific dimension, like Organization Dimension,Time Dimension, and so forth. Levels are represented as disjoint primitive sub-classes of the dimension (or the set of dimensions) they belong to, whereas each3 Measurement Units Ontology: http://idi.fundacionctic.org/muo/muo-vocab.

Fig. 2. KPIOnto: properties of indicator PersonnelTrainingCosts.

member is an instance of these classes, and the dimension’s hierarchy is repre-sented by a transitive partOf+ relation. An example is shown in the following:

VE ⊑ OrganizationDimension

Enterprise ⊑ OrganizationDimension ⊓ ∀partOf.VE ⊓(=1 partOf)

Department ⊑ OrganizationDimension ⊓ ∀partOf.Enterprise⊓(=1 partOf)

VirEnt1:VE, ACME:Enterprise, R&D:DepartmentpartOf(R&D,ACME), partOf(ACME,VirtEnt1)

from which the reasoner can infer partOf(R&D,VirEnt1). See also Figure 2,where an excerpt of the hierarchies for Organization and Time dimensions isshown. This ontological description enables usual OLAP operations over KPIs,namely roll-up and drill-down. The roll-up has a direct correspondence withthe partOf relationship, while the drill-down implies an opposite path w.r.t.that described by the partOf. Due to the existence of possible branched dimen-sions (e.g. Person→Department→Enterprise and Person→BusinessUnit→Enterprise), the derivation of this path is not straightforward. In our pro-posal, the drill-down of a member M at level L1 is the set of instances of thelower level (L2), which belongs to the class described by the following expres-sion: ∃ partof.M ⊓ L_2. Referring to the example of Figure 2, we have that∃ partof.{ACME}={R&D,Team_A}, so ∃ partof.{ACME}⊓ Department ={R&D} isthe drill-down of ACME along the branch Person→Department→Enterprise.

Formula A KPI can be either an atomic or a compound datum, built by com-bining several lower-level indicators. Dependencies of compound KPIs on itsbuilding elements are defined by means of algebraic operations, that is a Formulacapable to express the semantics of an indicator. A formula describes the way

the indicator is computed [27], and is characterised by the aggregation function,the way the formula is presented, the semantics (i.e., the mathematical meaning)of the formula, and references to its components, which are in turn (formulasof) indicators. Given a set {f1, ..., fn} of symbols of atomic indicators and a set{opa1

1 , ..., opann } of operators (in which aj is the arity of the j-th operator), we

define a well-formed indicator formula as a finite construction obtained from therecursive definition given below:

– fj is a well-formed indicator (e.g. NumberOfInternalIdeas or CycleTime);– if {F1, ..., Fk} are well-formed indicator formulas then opak(F1, ..., Fk) is a

well-formed indicator formula.

If we refer to algebraic operators like {+,−, ∗, /,^} then well-formed indi-cators can be represented by mathematical expressions in prefix notation, like+(−(A,B), ∗(C,D)). Formulas can be graphically represented by a forest of dis-joint lattices, where each node is a KPI and its children are the operands of itsformula, while leaves are atomic KPIs. The level of a node is defined by 1 + theminimum number of connections required to reach a root node; if more than oneroot is reachable, then the longest path is considered.

The formal representation and manipulation of the structure of a formula isessential in collaborative networked organizations to check inconsistencies amongindependent indicator definitions, reconcile indicators values coming from differ-ent sources, and provide the necessary flexibility to indicators management.

4 Reasoning-based Support Functionalities

Reasoning about KPIs is mainly based on the capability to manipulate formu-las according to strict mathematical axioms, like commutativity, associativityand distributivity of binary operators, and properties of equality needed to solveequations. Therefore, in order to define KPI reasoning functionalities we repre-sent both KPIOnto formulas and mathematical axioms in a common logic-basedlayer in Logic Programming (LP). While the descriptive properties of indicatorsand dimension are represented in OWL2-RL (based on a subset of OWL-DL)that can be directly mapped to LP, indicator formulas are implemented by usingtwo predicates, as follows:

– formula(Y,Expression,K) is a LP fact representing the formula related toan instance of the Indicator class in the KPIOnto. Y is the indicator name,Y=Expression is the formula of the indicator written using the infix notation,and K is a constant identifying whether Y is an atomic indicator or not. InKPIOnto an indicator is atomic if its formula is not defined on the basis ofother indicators.

– equation(Equation,X) represents a generic equation in the variable X.When a new indicator has to be added to the knowledge base, at first it isinserted using the equation predicate, and in this form it goes through cor-rectness and consistency checks, as explained below. If the check is positivea new formula fact is added to the LP theory, and hence to the ontology.

In the following subsections we introduce the main functionalities for reason-ing about KPIs, grouped by the tasks they are used for: Formula manipulation,which include formula rewriting and equation solving (Subsection 4.1); and Con-sistency Check, with functions to check if a new indicator is equivalent to orcoherent with others, useful for ontology management (Subsection 4.2).

4.1 Formula Manipulation

Manipulation of mathematical expressions is performed by specific predicatesfrom PRESS (PRolog Equation Solving System) [28], which is a formalizationof algebra in Logic Programming for solving symbolic, transcendental and non-differential equations. Its code can be represented as axioms of a first ordermathematical theory and the running of the program can be regarded as in-ference in such a theory. More in detail, given that predicates and functions inPRESS represent relationships among expressions of algebra, it can be inter-preted as a Meta-Theory of algebra. The predicates in which it is organised canmanipulate an equation to achieve a specific syntactic effect (e.g. to reduce theoccurrences of a given variable in an equation) through a set of rewriting rules.

The most interesting, in the context of this work, are those devoted to For-mula Manipulation, an essential reasoning functionality that consists in walkingthrough the graph of formulas to derive relations among indicators, and rewrit-ing a formula accordingly. We consider two main types of predicates: for thesimplification and for the resolution of equations, on which all the otherreasoning functionalities are built. The former are used both to compact andto improve the rendering of a formula, by individuating and collecting equiva-lent terms (e.g., a ∗ b and b ∗ a, or a ∗ a and a2, or a ∗ 1/b and a/b), and bydeleting unnecessary brackets (e.g. [()]=()). The main predicates of this kind aresimplify_term, which is used to simplify equations, and simplify_solution

that rewrites the final solution in a more understandable form.The second type of predicates enables the symbolic resolution of equations

by applying mathematical properties (e.g., commutativity, factorization), andproperties of equality. The number and kind of manipulations the reasoner isable to perform depend on the mathematical axioms we describe by means oflogical predicates. The resolution of the equation Equation in a given variableX starts from the predicate solve_equation, that is defined as follows:

solve_equation(Equation,X,X=Solution) :-

solve_equation_1(Equation,X,X=SolutionNotSimplified),

simplify_solution(SolutionNotSimplified,Solution).

where Solution is the solution of the equation.The predicate solve_equation_1 handles different kinds of equations, e.g.

linear and polynomial ones4. For lack of space, here we report only the methodfor solving linear equations, as follows:

4 At the present stage of the project, the formulas of all KPIs described in the KPIOntoare linear and polynomial equations, like the majority of indicators available inconsidered models (e.g., SCOR, VRM, Six-Sigma)

solve_equation_1(LeftMember=RightMember,X,Solution) :-simplify_term(LeftMember-RightMember,LeftMember1),

single_occurrence(X,LeftMember1=0),!,

position(X,LeftMember1=0,[Side|Position]),

isolate(Position,LeftMember1=0,X=Solution).

The resolution of linear equations is based on the isolation method, consistingin manipulating the equation and trying to isolate the variable X on the leftside of the equation; the right side is the requested solution. To this end theequation is firstly simplified by means of the simplify_term predicate, thenthe single_occurrence predicate is verified to check whether the equation islinear or not. The single_occurrence(X,LeftMem1=0) predicate is true if thevariable X is only in one term of LeftMember1. Note that after the simplification,terms of the same degree are grouped together, so if the equation is linear theterm of first degree of X is present only once. The other predicates are verified toactually implement the isolation method. The position predicate returns theside and list of positions of the variable X in the equation. For instance, the sideand position of x in the equation 3*(1/x)-5*y=0 are respectively “1” (left side)and “1,2,2”, i.e. the first position with respect to the minus operator, the secondargument of the product, and finally the second argument of the division. Atthis point, predicates for isolation are used:

isolate([N|Position],Equation,IsolatedEquation) :-isolax(N,Equation,Equation1),

isolate(Position,Equation1,IsolatedEquation).

isolate([],Equation,Equation).

The set of predicates isolax are needed to move a term from a side toanother, for instance multiplying or adding the same term to both sides. Intotal, in order to perform the formula manipulation functionalities more than900 predicates are used in the theory.

4.2 Consistency Check

An indicator is consistent with the ontology if three conditions are satisfied: itsdefinition must be unique in the ontology, its formula must not be equivalentto a formula already in KPIOnto and its formula must not contradict any otheralready defined formulas. The approach we adopt is such that after every inser-tion of an indicator, the ontology is guaranteed to be consistent. In this way,enterprises always have available an ontology containing valid and usable infor-mation. These characteristic is very important considering the domain of theontology: indicators are used to make decisions, often strategic. In the following,we discuss predicates we introduced to check these conditions.

Identical The first Consistency Check predicate is introduced to avoid insertingtwice the same definition in the ontology. The predicate is defined as follows:

identical(Equation,X,X=Solution) :-solve_equation(Equation,X,X=Solution),

formula(X,S,_),

Solution=S.

The reasoner firstly rewrites the Equation so that the variable X is only onthe left side. Then it searches the whole theory for a formula having the samestructure of the rewritten Equation. If such a formula exists, the predicate istrue, and the formula that is identical to the given Equation is returned.

Equivalence During KPI elicitation and ontology population, it is useful toindividuate and to manage duplications. In our scenario, two KPIs are duplicatedif their formulas are equivalent. If duplicates are found, various policies can beimplemented, either by merging the duplicates leaving only one definition foreach KPI in the ontology or by allowing multiple definitions by introducing asort of sameAs property among KPIs formulas. The second choice is useful toexplicitly represent alternatives, to link a formula to the enterprise that actuallyuses it, and to reduce the time to make inference.

Given two formulas F and G, they are equivalent if F can be rewritten as G, andvice versa, by exploiting Formula Manipulation functionalities. The equivalencecheck is implemented by means of the following predicate:

equivalence(Equation,X,Y) :-expand_equation(Equation,ExpandedEquation),

solve_equation(ExpandedEquation,X,X=Solution),

formula(Y,S,_), expand_equation(Y=S,Y=ES),

solve_equation(Y=ES,Y,Y=Solution2), Solution=Solution2.

where Y is the formula in the ontology that is equivalent to the Equation inthe variable X.

In order to avoid ambiguity, the check of equivalence is made at level of atomicindicators, where indicators are defined without referring to other indicators.Hence, an equation at this level can not be further expanded.To this end, theexpand_equation predicate recursively replaces each term in Equation with itsformula, until we obtain an ExpandedEquation that is formed only by atomicindicators. After this transformation the variable X could also be at the rightside, then the solve_equation is called to rewrite the equation. The same isdone for each formula in KPIOnto. If after these rewritings a formula Y existssuch that it is equal to the equation, then the equivalence predicate is satisfiedand the equivalent formula is returned.

Incoherence When new indicators are added to the ontology or existing indi-cators are updated or deleted, it is needed to check that such changes do notcontradict the ontology. To this end incoherence is introduced, that is basedon the idea that equivalence relations must be preserved between indicators andformulas. As a consequence, a formula F1 for a new indicator I1 is coherent withthe ontology if, whenever there exists a formula F for an indicator I already de-fined in the KPIOnto, such that F1 can be rewritten as F, then the equivalenceI≡I1 does not contradict the ontology. Otherwise we have an incoherence. Inthe Prolog theory, we introduced the incoherence predicate as follows:

incoherence(Equation,X,Y=S) :-expand_equation(Equation,ExpandedEquation),

solve_equation(ExpandedEquation,X,X=Solution),

formula(Y,S,_), expand_equation(Y=S,Y=ES),

solve_equation(Y=ES,X,X=Solution2), Solution \= Solution2.

Given an equation (X=Equation), the predicate returns whether the equa-tion is coherent with the ontology and, in case, the reason for the incoherence.The incoherence predicate is based on the definition of equivalence, as de-scribed above. Given the set of formulas in the ontology that can be expandedand rewritten as functions of X, each element that is different from the givenEquation (after the appropriate expansion) leads to incoherence and is returned.We have an incoherence also when, after a manipulation, the new equation as-sumes the form 0=0, i.e. the following clause is true:

incoherence(Equation,X,incosistent) :-expand_equation(Equation,ExpandedEquation),

solve_equation(ExpandedEquation,X,0=0).

Finally, when the new equation cannot be expanded but a non-atomic formulafor X exists, then the new equation contradicts the definition of such a formula.In other cases the coherence is verified.

incoherence(Equation,X,X=S) :-\+expand_equation(Equation,ExpandedEquation),formula(X,S,branch_node),!.

5 Evaluation

The goal of this Section is to provide some evidence on the efficiency and ef-fectiveness of the proposed approach. Efficiency is evaluated by reporting theexecution times of the main predicates on a real case study. Effectiveness isdemonstrated by discussing how the use cases of the collaborative maintenancescenario depicted in Section 1 are supported by a prototype using these predi-cates.

5.1 Performance Evaluation

The goal of this Subsection is to evaluate the performances of the logical the-ory implemented in Prolog. To this end, we measured the execution time ofeach predicate over the real-world ontology produced in BIVEE, an EU-fundedproject for the development of a platform to enable Business Innovation in Vir-tual Enterprise Environments5.

This ontology takes into account 356 KPIs for monitoring both productionand innovation activities, and it has been tested on two Virtual Enterprises in5 http://www.bivee.eu/

which two end-user partners are involved in. The ontology is characterised bya lattice of formulas with 281 connected nodes and 75 disconnected ones; thelatter represent indicators that are part of the ontology, but are not used tocompute other indicators. Indeed the ontology is a set of lattices with differentlevels (from 1 to 5) and operands (from 2 to 4). On average, each lattice has 3.14levels and there are 2.67 operands per indicator.

With respect to the predicates described in Section 4, the experimenta-tion focused only on those involving the most complex operations: identical,equivalence and incoherent, that are used for the consistency check of aformula. The others were left out of the test; in fact solve_equation andsimplify_* predicates are internally executed by those previously mentioned.

Test was carried on by evaluating the execution time of each predicate foreach connected node of the graph. Note that we discarded the 75 disconnectednodes for which no formula manipulation is needed, hence avoiding an under-estimation of execution times. In order to reduce variability in the computa-tion due to the operating system processes that could be running at the sametime on the machine, each predicate was executed 10 times and the relatedrunning times were averaged. The average time of each predicate was furtheraveraged over all nodes in the graph. Hence, in the following we return one aver-age value for each predicate with the related standard deviation in brackets. Inparticular for each formula(X,Exp,K) in the ontology we evaluated the follow-ing predicates: identical(X=Exp,X,S), equivalence(newPI=Exp+1,newPI,Y),and incoherence(newPI=Exp,newPI,S); where newPI is a new label which doesnot exist in the ontology. The introduction of the new label implies that the for-mula newPI=Exp+1 can not be equivalent to any formula in the ontology. Sim-ilarly, the formula newPI=Exp does not introduce inconsistencies. Using thesepredicates, we ensure that all formulas in the theory were taken into account,hence evaluating the worst situation. Experiments were carried on an Intel XeonCPU 3.60GHz with 3.50GB memory, running Windows Server 2003 SP 2.

As results, obtained the following execution times: 4 (±1.1) ms for identical,197 (±11.5) ms for equivalence and 201 (±10.5) ms for incoherence. Results showthat identical exhibits an almost constant execution time. This outcome is justi-fied by the fact that such a predicate actually performs only a unification, whichis independent from the level and number of operands of the node in the lattice.The results of the last two predicates depend on two main components: the timerequired to find the formula of a KPI in the ontology, and the time needed toevaluate the expand_equation predicate; the execution time of other predicatesare irrelevant. The former component depends on the number of KPIs in theontology, hence it is constant in our experiment. The latter component dependson the topology of the lattices, and in particular depends on the characteristicsof the sub-lattice of each node, i.e. the level of the node at hand and the numberof operands of every node in its sub-lattice. As a matter of facts, let us consideran indicator I at level L whose sub-lattice is formed by nodes with F operandseach: the evaluation of equivalence and incoherence for I involves FL formulasthat are needed to evaluate the expand_equation predicate. The higher variabil-

Fig. 3. Average execution time (ms) of the incoherence and equivalence predicates foreach pair (level, average number of operands).

ity of incoherence and equivalence predicates is due to this component. Figure 3reports the dependence of incoherence and equivalence predicates on the topol-ogy of the lattice. In particular, the histogram reports for each pair <level of thenode, average number of operands in the sub-lattice of the node> the averageexecution times of incoherence and equivalence predicates computed over everyKPI in the BIVEE ontology. It is noteworthy that changes in execution timesare noticeable only for levels higher than 4.

5.2 The Prototype

To support end-users in the management of KPIOnto in the context of BIVEEproject, we developed a web application named KPIOnto Editor6.

In order to show the support provided by the system, we refer to the followingcase study in which two enterprises, ACME and ACME2, are collaborating topopulate the ontology. Let us suppose that the collaboration has produced anontology formed by 11 KPIs linked by the following formulas:

– PersonnelCosts=NumHours*HourlyCost*(Overhead+1)

– Costs=TravelCosts+PersonnelCosts

– PersonnelTrainingCosts=HourlyCost*PersonnelTrainingTime

– TeachCosts=PersonnelTrainingTime*HourRate

– InvestmentInEmplDevelopment=PersonnelTrainingCosts+TeachCost

The following subsections describe use cases introduced in Section 1 togetherwith the relative support provided.

Introduction of a new KPI A new KPI can be introduced only if it isconsistent with the ontology, i.e. its formula must not be identical to anotherformula in the KPIOnto, it must not be equivalent to a formula already inontology and it must not contradict any other already defined formulas. To thisend, the formula of the new KPI is checked by using consistency predicate in

6 http://boole.dii.univpm.it/kpieditor

this order: identical, equivalence and incoherent. Only if these three arefalse the new KPI can be introduced.

To give an example, let us introduce two new indicators (1) ROI and (2)TotCostsEmpTrain, currently not defined, which are aimed to measure respec-tively the Return On Investment related to new ideas generated by the organi-zation and the total internal costs invested for training of employers:

ROI=ExpectedMarketImpact/Costs

TotCostsEmpTrain=TeachCosts+PersonnelTrainingTime*HourlyCost

Since there are no formulas for ROI and TotCostsEmpTrain in the ontology,the identical predicate returns false and the first step of consistency checkis passed. As concerns equivalence, the predicate is false for ROI, meaningthat its formula is not equivalent to any other formula in the ontology. On thecontrary, by using the formula about PersonnelTrainingCosts and after someformula manipulation, the reasoner discovers that TotCostsEmpTrain is struc-turally equivalent to InvestmentInEmplDevelopment. Hence, either the two en-terprises can refer to this last one to annotate their data without introducing anew indicator, or a sameAs relation linking the two KPIs can be defined. Finally,since incoherence for ROI returns false, the indicator is added to the ontologytogether with the atomic indicator ExpectedMarketImpact which had not beendefined yet in the ontology.

Figure 4 shows the interface used to define a new KPI. The form allows toinsert both the descriptive and the structural definition of a KPI. It is builtto limit possible syntactic errors, like mistyped dimension’s name or a formulareferring to undefined KPIs. In the lower part of the Figure, the box is devotedto support the definition of the mathematical formula; the user composes theformula by using a simple equation editor provided with a tool for searchingKPIs to include in the formula.

Update of an existing KPI This functionality is implemented in a way sim-ilar to the previous one. The formula of the KPI to update is temporarily re-moved from the ontology, then the procedure for introducing a new KPI is ex-ecuted. If the ontology with the updated formula remains consistent then weproceed with the update, otherwise the old formula is restored. For instance, letus consider that an user requires to update PersonnelTrainingTime, commit-ting a manifest error when typing the formula (inserting a sum instead of a di-vision): PersonnelTrainingTime=PersonnelTrainingCosts+HourlyCost. Thenew formula directly contradicts the definition of PersonnelTrainingCost to-gether with every other higher-level indicators depending on it, i.e. in this case isincoherent with InvestmentInEmplDevelopment and the ROI added just before.More complex cases can be managed as well, whose discussion is omitted for thelack of space.

The interface implementing this functionality is the same as that presentedin Figure 4, but the form is pre-compiled with values from the ontology.

Deletion of a KPI The procedure used to delete a KPI is more complex: wecan remove a KPI only if it does not make the ontology inconsistent. Hence,

Fig. 4. KPIOnto Editor: form for proposing a new indicator.

either the KPI is a root-node, so it is not part of any other formulas, or for eachformula in which the KPI appears we can infer at least a new formula which doesnot contains KPI. Since all inferable formulas of an indicator are equivalent, wereplace formulas containing the KPI to delete with one of the inferred formulasnot containing the KPI. Then, the indicator is removed from the ontology.

The procedure is transparent to the user and does not require a specificinterface. In the case the KPI can not be deleted, the system returns formulasthat can not be rewritten.

Other functionalities The search by term is simply implemented by queryingthe descriptive part of the ontology, and the search by formula is enabled by uni-fication of the requested pattern with all formulas in the ontology. For instance,searching for the pattern PersonnelTrainingTime*X, where X is a variable, thesystem will return the formulas of PersonnelTrainingCosts and TeachCosts.

In order to obtain details of a specific indicator, formula or member, theKPIOnto Editor provides two panels dynamically generated with informationretrieved from the ontology (see Figure 5 and Figure 6). The formula reportedin the indicator panel can be interactively browsed: clicking on a KPI, this isreplaced with its formula; moving on the name of the KPI, a tooltip with itsdescription is also shown. As for members, together with descriptive information,a graphical representation of the inferred partOf hierarchy is provided, as shown

Fig. 5. KPIOnto Editor: details of a KPI.

Fig. 6. KPIOnto Editor: details of a member.

in Figure 6. The two buttons in the upper-left corner of both Figures enable theupdate and deletion functionalities.

6 Conclusions

In this work we presented a novel model for the formalization of KPI proper-ties, which focuses on characterisation of formulas and reasoning over them. Theformal theory allows high-level reasoning supporting most of the activities fore-seen in a scenario of collaborative creation and management of a KPI ontology.The feasibility of the approach has been demonstrated by an assessment of theefficiency of reasoning mechanisms on a real case and by the implementationof a prototype enlightening the support for the user. Although reasoning mech-anisms ensure that ontology is always consistent, in order to correctly defineKPIs, the user needs an adequate knowledge of the structure of the ontologyand processes to be measured. Hence the prototype has been designed to beused by Knowledge Engineers, which should be properly trained within the col-laborating organizations. At present a Wiki collaborative model is assumed for

consensus management and versioning, which is proven to be a powerful modelto leverage collective intelligence and effectively integrate diverse points of viewin networked environments for unstructured textual contents. A study of theactual appropriateness of this model in the context of business and performancemanagers can be in order. Much work can be made to empower and extend thepresent proposal. Among the most promising directions, there is a tighter in-tegration of descriptive properties of KPIs with their compositional properties,e.g. by combining subsumption and formula manipulation in order to deduceproperties of abstract KPI classes. Another interesting direction is the analysisof similarities among formulas, in order to explain the difference among a set ofindicators or define similarities classes.

Acknowledgement

This work has been partly funded by the European Commission through theICT Project BIVEE: Business Innovation in Virtual Enterprise Environment(No. FoF-ICT-2011.7.3-285746).

References

1. Council, S.C.: Supply chain operations reference model. SCC (2008)2. Ellmann, S., Eschenbaecher, J.: Collaborative Network Models: Overview and

Functional Requirements. In: Virtual Enterprise Integration: Technological andOrganizational Perspectives. IGI Global (2005) 102–123

3. Kimball, R., Ross, M.: The Data Warehouse Toolkit: The Complete Guide toDimensional Modeling. 2nd edn. John Wiley & Sons, New York, NY, USA (2002)

4. Seifert, M., Eschenbaecher, J.: Predictive performance measurement in virtual or-ganisations. In Camarinha-Matos, L.M., ed.: Emerging Solutions for Future Man-ufacturing Systems. Volume 159 of IFIP International Federation for InformationProcessing. Springer US (2005) 299–306

5. Camarinha-Matos, L.M., Afsarmanesh, H.: A comprehensive modeling frameworkfor collaborative networked organizations. Journal of Intelligent Manufacturing 18(2007) 529–542

6. Popova, V., Sharpanskykh, A.: Modeling organizational performance indicators.Information Systems 35 (2010) 505–527

7. del Río-Ortega, A., Resinas, M., Ruiz-Cortés, A.: Defining process performanceindicators: An ontological approach. In: Proceedings of the 2010 InternationalConference on On the Move to Meaningful Internet Systems - Volume Part I.OTM’10, Berlin, Heidelberg, Springer-Verlag (2010) 555–572

8. del Río-Ortega, A., Resinas, M., Cabanillas, C., Ruiz-Cortés, A.: On the definitionand design-time analysis of process performance indicators. Information Systems38 (2013) 470–490

9. Barone, D., Jiang, L., Amyot, D., Mylopoulos, J.: Composite indicators for businessintelligence. In Jeusfeld, M.A., Delcambre, L.M.L., Ling, T.W., eds.: ER. Volume6998 of Lecture Notes in Computer Science., Springer (2011) 448–458

10. Horkoff, J., Barone, D., Jiang, L., Yu, E., Amyot, D., Borgida, A., Mylopoulos, J.:Strategic business modeling: representation and reasoning. Software & SystemsModeling (2012)

11. Pourshahid, A., Richards, G., Amyot, D.: Toward a goal-oriented, business intelli-gence decision-making framework. In Babin, G., Stanoevska-Slabeva, K., Kropf, P.,eds.: MCETECH. Volume 78 of Lecture Notes in Business Information Processing.,Springer (2011) 100–115

12. Neumayr, B., Schütz, C., Schrefl, M.: Semantic enrichment of olap cubes: Multi-dimensional ontologies and their representation in sql and owl. In: On the Moveto Meaningful Internet Systems: OTM 2013 Conferences. Volume 8185 of LectureNotes in Computer Science. Springer Berlin Heidelberg (2013) 624–641

13. Niemi, T., Toivonen, S., Niinimäki, M., Nummenmaa, J.: Ontologies with semanticweb/grid in data integration for olap. International Journal of Semantic WebInformation Systems 3 (2007) 25–49

14. Huang, S.M., Chou, T.H., Seng, J.L.: Data warehouse enhancement: A semanticcube model approach. Information Sciences 177 (2007) 2238–2254

15. Priebe, T., and Pernul, G.: Ontology-Based Integration of OLAP and InformationRetrieval. In: Proc. of DEXA Workshops. (2003) 610–614

16. Lakshmanan, L.V.S., Pei, J., Zhao, Y.: Efficacious data cube exploration by se-mantic summarization and compression. In: VLDB. (2003) 1125–1128

17. Nebot, V., Berlanga, R.: Building data warehouses with semantic web data. De-cision Support Systems 52 (2012) 853–868

18. Xie, G., Yang, Y., Liu, S., Qiu, Z., Pan, Y., Zhou, X.: EIAW: Towards a Business-Friendly Data Warehouse Using Semantic Web Technologies. In: The SemanticWeb. Volume 4825 of Lecture Notes in Computer Science. Springer Berlin Heidel-berg (2007) 857–870

19. Kehlenbeck, M., Breitner, M.H.: Ontology-based exchange and immediate applica-tion of business calculation definitions for online analytical processing. In: Proc. ofthe 11th International Conference on Data Warehousing and Knowledge Discovery.DaWaK ’09, Berlin, Heidelberg, Springer-Verlag (2009) 298–311

20. Diamantini, C., Potena, D.: Semantic enrichment of strategic datacubes. In: Proc.of the 11th Int. Workshop on Data Warehousing and OLAP, ACM (2008) 81–88

21. Golfarelli, M., Mandreoli, F., Penzo, W., Rizzi, S., Turricchia, E.: OLAP QueryReformulation in Peer-to-peer Data Warehousing. Information Systems 37 (2012)393–411

22. Missikoff, M., Navigli, R., Velardi, P.: The usable ontology: An environment forbuilding and assessing a domain ontology. In: Proceedings of the First InternationalSemantic Web Conference on The Semantic Web. (2002) 39–53

23. Tudorache, T., Noy, N.F., Tu, S., Musen, M.A.: Supporting collaborative ontologydevelopment in Protégé. Springer (2008)

24. Bao, J., Honavar, V.: Collaborative ontology building with wiki@nt. a multi-agentbased ontology building environment. In: Proceedings of the 3rd InternationalWorkshop on Evaluation of Ontology-based Tools. (2004) 1–10

25. Pease, A., Li, J.: Agent-mediated knowledge engineering collaboration. In: Agent-Mediated Knowledge Management. Springer (2004) 405–415

26. Neumayr, B., Schrefl, M.: Multi-level conceptual modeling and owl. In: Proceedingsof the ER 2009 Workshops on Advances in Conceptual Modeling - ChallengingPerspectives, Berlin, Heidelberg, Springer-Verlag (2009) 189–199

27. Diamantini, C., Potena, D.: Thinking structurally helps business intelligence de-sign. In D’Atri, A., al., eds.: Information Technology and Innovation Trends inOrganizations. Physica-Verlag HD (2011) 109–116

28. Sterling, L., Bundy, A., Byrd, L., O’Keefe, R., Silver, B.: Solving symbolic equa-tions with press. J. Symb. Comput. 7 (1989) 71–84

Collaborative Building of an Ontology of Key Performance Indicators

Documents

Transcript of Collaborative Building of an Ontology of Key Performance Indicators