Reliability assessment of web applications

17
Reliability Assessment of WEB Applications V. S. Alagar O. Ormandjieva Department of Computer Science Concordia University Montreal, Quebec H3G 1M8, Canada Phone: +1(514) 848-7810 alagar,ormandj @cs.concordia.ca February 18, 2002 Abstract The paper discusses a formal approach for specifying time-dependent Web applications and proposes a Markov model for reliability prediction. Measures for predicting reliability are calculated from the formal architectural specification and system configuration descriptions. Keywords: reliability prediction, software measurement, Markov model. 1 Introduction The reliability of a software system is defined in [IEEE90] as the ability to perform the required functionality under stated conditions for specified period of time. In this paper the software system under discussion is a Web-based system. Web is a large and complex distributed system whose heterogeneous components interact in various ways to achieve the result of an application. Often, the performance of an application initiated at a site is rated as good if the server at that site is robust and links are not broken. Such a rating does give a subjective qualitative assessment, but does not provide a scientific quantitative measurement of the reliability of the site. This paper proposes a methodology for an assessment of quality of Web application components through reliability prediction, when a formal model of the Web application could be specified in an Objected-oriented formalism. Many techniques exist to test and statistically analyze traditional software. However, these methods can not be readily applied to a Web environment. In a recent paper Kallepalli and Tian [KT2001] have surveyed the characteristics of Web applications and usage and proposed a statistical testing method for Web applications. Their approach relies on usage and failure information collected in the log files. Web failure is defined as the inability to correctly deliver information or documents required by Web users. Based on this definition of failure, they classify types of failures and provide a method for testing source or content failures. We complement their work by offering a formal time-constrained model of the Web on which testing and reliability analysis can be done. This work is supported by grants from Natural Sciences and Engineering Research Council, Canada and Concordia University Graduate Fellowships.

Transcript of Reliability assessment of web applications

Reliability Assessment of WEB Applications�V. S. Alagar O. Ormandjieva

Department of Computer ScienceConcordia University

Montreal, Quebec H3G 1M8, CanadaPhone: +1(514) 848-7810falagar,[email protected]

February 18, 2002

Abstract

The paper discusses a formal approach for specifying time-dependent Web applications and proposesa Markov model for reliability prediction. Measures for predicting reliability are calculated from theformal architectural specification and system configuration descriptions.Keywords:reliability prediction, software measurement, Markov model.

1 Introduction

Thereliability of a software system is defined in [IEEE90] as the ability to perform the required functionalityunder stated conditions for specified period of time. In thispaper the software system under discussion isa Web-based system. Web is a large and complex distributed system whose heterogeneous componentsinteract in various ways to achieve the result of an application. Often, the performance of an applicationinitiated at a site is rated as good if the server at that site is robust and links are not broken. Such a rating doesgive a subjective qualitative assessment, but does not provide a scientific quantitative measurement of thereliability of the site. This paper proposes a methodology for an assessment of quality of Web applicationcomponents through reliability prediction, when a formal model of the Web application could be specifiedin an Objected-oriented formalism.

Many techniques exist to test and statistically analyze traditional software. However, these methodscan not be readily applied to a Web environment. In a recent paper Kallepalli and Tian [KT2001] havesurveyed the characteristics of Web applications and usageand proposed a statistical testing method forWeb applications. Their approach relies on usage and failure information collected in the log files.Webfailure is defined as the inability to correctly deliver informationor documents required by Web users.Based on this definition of failure, they classify types of failures and provide a method for testingsourceor contentfailures. We complement their work by offering a formal time-constrained model of the Web onwhich testing and reliability analysis can be done.�This work is supported by grants from Natural Sciences and Engineering Research Council, Canada and Concordia UniversityGraduate Fellowships.

Quality assurance and reliability assessment for Web applications should focus on the prevention of Webfailures or the reduction of chances for such failures. Consequently, we contend that early reliability as-sessment is necessary for the reduction of testing efforts,and for ensuring a level of operational reliability.We propose an early analysis on the formal architecture model of the Web application. This uses a Markovmodel, which can adapt to changing system configurations that satisfy the architectural design. We mayview Web applications as Markov systems, in which state changes occur with certain probabilities. Fromthe Markov model of an application, we can calculate a predictive measurement of reliability. Markovmatrices for individual Web components can be constructed from log files, the source used for statisticaltesting in [KT2001]. We give methods for calculating Markovmatrices for synchronously interacting Webcomponents from the Markov matrices of individual components. Synchrony hypothesis is that two Webcomponents that interact on a shared message will change their status (states and associated information)simultaneously. In a typical application, several Web components collaborate to achieve a task. It is impor-tant to assess the reliability of every collaboration in an application. We provide a method to compute thereliability of the whole system from the reliability measures of the collaborations in the system.

2 Web and Markov Models: Basic Concepts

The Web is a large network of interconnected components. Conceptually, it is a graph where each vertex(node) is a computer system providing an interface to the other nodes in the network. A Web applicationis multi-layered, with the user at the top of the layer and theinformation source at the bottom of the layer.A user interacts with the nodes in the Web through a browser, which has links to the home pages at theinterfaces of the nodes in the network. The sever at a node provides the services for controlled navigationfor accessing and retrieving information from the information sources at that node. We model the Webcomponents asUser, Browser, and Serverclasses. Objects, instantiated from the classes, interactin ameaningful way through messages. The behavior of objects ina class is captured by ahierarchical labeledtransition system with finite number of states. A state represents an operational high-level unit. A transitionbetween two states may be labeled by a message shared by objects of different classes, or by an event internalto the object. A state, when complex, is itself a hierarchical labeled transition system, with its substates andtransitions defined by the buttons and the whistles specific to that state. That is, a transition from a substate� to another substate�0 of an object is implicitly labeled by the link name on the pageassociated with�.

Some Web applications may put time constraints on the navigation paths within their systems. This istypical when secure information or time-varying information is to be made available. For instance, considerthe home page of a hypothetical on-line brokerage systemHOBS. A user may be able to reach the homepage ofHOBS using a browser. However, the user must be authenticated to get services at the site. Onceauthenticated, the user may be authorized to do one or more business activity at each state of the serverobject. A state may include secure information such asuser-id, account-number, andaccount-balance.Typical states of theHOBS system where time may plays a role can beStock Trading, Mutual Fund Trading,Account Overview, Positions, Quotes and Research, Financial Planning, andServices.

The architecture ofHOBS design may allow the user to explore the substates of a state or change to adifferent state as specified by the links on the current page.For instance, the hierarchy of stages rooted atthe stateStock Tradingmay impose timing constraints on the information displayedat its substates. Afterreviewing an order at one of its substates, the user may be allowed tochange, reviewor cancelthe order.

2

If the activity at the statereviewis not completed within a certain amount of time, the backward transitionmay be disabled. There are two reasons for this: (1) the page contains secure information, and (2) theinformation, such as stock price is a time-dependent value.Another instance where time plays a role iswhen the user fails to interact for a certain period of time ina state. The system, after waiting for a periodof time, may force a new log in session. These instances illustrate the failure of the system to deliver theinformation requested by the user. However, this type of failure is not a fault of the system. The behavior ofthe system deviates from the user-expected behavior of the system, yet the system behaves according to thetime-constrained functionality imposed by system requirements. In order to model such applications andtheir reliability our formal model includes time constraints.

2.1 Markov models

Markov models are one of the most powerful tools available toengineers and scientists for analyzing com-plex systems. Analysis of Markov models yield results for both the time-dependent evolution of the systemand the steady state properties of the system. The Markov property states that given the current state of thesystem, the future evolution of the system is independent ofits history.

The Markov model of a Web component may be represented by a state diagram. The states represent thestages in the Web component that are observable to the users and the transitions between states have assignedprobabilities. The probabilities are calculated from the usage and related failure information collected in thelog file that maintains the Web site. We may use this data as initial transition probabilities. An algebraicrepresentation of a Markov model is a matrix, calledtransition matrix, in which the rows and columnscorrespond to the states, and the entrypij in thei-th row, j-th column is the transition probability for beingin statej at the stage following statei. We use transition matrix representation in reliability calculationalgorithms.

2.2 Discussion

Initial transition probabilities, obtained from various sources including log files and other subjective opinionsof experts can not be used for predicting the reliability of the system. We contend that the reliability shouldbe calculated from thesteady stateof the Markov system. A steady state or equilibrium state is one in whichthe probability of being in a state before and after transitions is the same as time progresses. Computing thesteady state vector for the transition matrix of a large system is hard. However, as in our approach, when thesystem is modularly constructed it seems possible to partition the system into smaller components, whichmight reduce the complexity of computing steady state vectors.

The formal model of the Web that we discuss in the next sectionis based on timed labeled transitionsystem semantics. From the state machine description of a Web component, it is possible to construct theMarkov machine corresponding to that model.

The organization of this paper is as follows. A formal model of the Web is given in Section 3. Section 4formally describes the method of modeling the Web application as a Markov system. Section 5 presents thereliability prediction measures. Section 6 concludes the paper with a discussion on our ongoing researchdirections.

3

3 Formal Model

Web is areactivesystem, characterized by the following two important properties:� stimulus synchronization:the Web (process) always reacts to a stimulus from its environment;� response synchronization:the time elapsed between a stimulus and its response is acceptable to therelative dynamics of the environment, so that the environment is still receptive to the response.

In addition, certain timing constraints are inherent in thedesign of many Web components. Hence, we maycharacterize Web as a real-time reactive system. Real-timeconstraints are strictly enforced in security relatedinformation browsing and retrieval. When security is related to safety, such real-time constraints arehardrequirements, in the sense that defaulting it would lead to dire consequences. In characterizing Web as real-time reactive systems, we have taken a stronger view than thetraditional one, where the Web is regarded asan interactive system. For many applications such a soft view is sufficient. However, when time constraintsand secure transactions are part of the Web application, ours is a more appropriate characterization. Themajor distinction between the two views is in the available synchronization mechanism: an interactivesystem will wait for an input from its environment; whereas,a reactive system is fully responsible forsynchronization with its environment. That is the underlying reason why certain stages in secure transactionscannot be accessed through backward navigation.

A Timed Reactive Objected-oriented Modelfor the development of real-time reactive systems is dis-cussed by Alagar et. al [AAM98]. We model the Web using this formalism. Abstractly, we model a Webcomponent as a class parameterized with port types. A port type is associated with asignature, a finite setof messages that can occur at a port of that type. We use the notation e! to emphasize thate is an outputmessage, and writee? to emphasize thate is an input message. A class may include attributes of two kinds:port identifies, and data types such asinteger, set, list, andqueue.

An object of the classA[L℄, whereL is the list of port types, is created by instantiating each port typein L by a finite number of ports and assigning the ports to the object. Any message defined for a port typecan be received or sent through any port of that type. For instance,A1[p1; p2 : �P ; q1; q2; q3 : �Q℄, andA2[r1 : �P ; s1; s2 : �Q℄ are two objects of the classA[�P;�Q℄. The objectA1 has two portsp1 andp2of type @P , and three portsq1, q2, q3 of type @Q. Both p1 andp2 can receive or send messages of type@P ; the portsq1, q2, andq3 can receive and send messages of type @Q. Sometimes the port parameters ofobjects are omitted in our discussion below. Messages may also have parameters with basic types.

An incarnationof an objectAi is a copy ofAi with a name different from the name of any other objectin the system, and with its port types renamed, if necessary.Several incarnations of the same object canbe created and distinguished by theirids. Lettingids to be positive integers,A1[1℄, A1[2℄ are two distinctincarnations of the objectA1. Every incarnation of an object retains the same port interfaces. For instanceA1[1℄[a1; a2 : �P ; b1; b2; b3 : �Q℄ andA1[2℄[p1; p2 : �P ; q1; q2; q3 : �Q℄ are two distinct incarnationsof the objectA1[p1; p2 : �P ; q1; q2; q3 : �Q℄. The contexts and behavior of incarnations of an objectAi are in general independent. The context for the incarnationAi[k℄ is defined by the set of applicationsin which it can participate. Hence the context of an incarnation effectively determines the objects withwhom it can interact and the messages it can use in such an interaction. For instance, the incarnationsA1[1℄[a1; a2 : �P ; b1; b2; b3 : �Q℄ andA1[2℄[p1; p2 : �P ; q1; q2; q3 : �Q℄ can be plugged into two distinct

4

configurations for two distinct applications in a system. Inthe rest of the paper we use the term object tomean incarnation as well.

The behavior of objects in a class is specified by a finite statemachine, augmented with state hierarchy,logical assertions and timing constraints for transitions. A complex state is an encapsulation of a statehierarchy, and hence another finite state machine, with an initial state, and which can include other complexstates. In our model, Web objects communicate using a synchronous message passing mechanism. Anexternal event in the system is either an input or an output event, which can only occur at an instanceof a specific port type. Events label the transitions betweenstates. Logical assertions on the attributesspecify a port condition, an enabling condition, and a post condition on each transition. Local clocks aredefined to enforce time constraints associated with a transition. Both time constraints and functionality areencapsulated in an object.

An abstract model of a Web system is specified as a collection of interacting Web components. AWeb component is an object instantiated from a generic class. A pair of objects in this collection interactsynchronously through shared messages. These messages occur at thecompatibleports. Two ports in asystem are compatible if the set of input messages at one portis equal to the set output messages at the otherport. A port link connects two compatible ports. A port link is an abstraction of communication mechanismbetween the objects associated with the ports. Since the signature of ports are well-defined, the port linkseffectively determine the set of all valid messages that canbe exchanged among the objects in a subsystem.

3.1 Operational Semantics

Web objects communicate through messages. A message from anobject to another object in the system iscalled asignaland is represented by a tuplehei; pi; tii, denoting that the eventei occurs at timeti, at a portpi. Thestatusof an object at any timeti is the tuple(�;~a;R), where the current state� is a simple state,~a isthe assignment vector for attributes, andR is the vector of outstanding reactions. Acomputational stepof anobject occurs when the object with status(�;~a;R), receives a signalhei; pi; tii and there exists a transitionspecification that can change its status. A computation of an objectA is a sequence, possibly infinite,

of alternating statuses and signals,OS0 he0;p0;t0i! OS1 he1;p1;t1i! : : :. Typically, the Web system is non-terminating; consequently, a computation is in general an infinite sequence. The set of all computationsof an objectA is denoted byComp(A). The computation of the Web system is an infinite sequence ofsystem statuses and signals that effect status changes [AAM98]. A period is a finite subsequence of the Webcomputation such that it starts with some initial state and finishes with its next appearance in the computationsequence.

3.2 A Simple Model of the Web

We abstract the multi-layered architecture of Web applications into three Web components:User, Browser,andServer. This abstraction, although is simple, is quite expressiveand sufficient to illustrate the reliabilitycalculation. Extension of our approach to more complex and detailed models are not difficult.

In our model, we assume that several users (clients) may use abrowser independently and concurrentlyto access information from a server. For simplicity, we assume that one browser is associated with a server.Once again, this restriction is only for the sake of simplicity of exposition, and can be generalized. A user

5

Server<<GRC>>

@S

events : Set = {Permit?,StopPermit?}

<<PortType>>

@G

events : Set = {Permit!,StopPermit!}

<<PortType>> Browser

<<DataType>> inSet : Set[@P,PSet]

<<GRC>>

User

<<PortType>> cr : @C

<<GRC>>

@P

events : Set = {Get?,Exit?}

<<PortType>>

@C

events : Set = {Get!,Exit!}

<<PortType>>

Figure 1: Class Diagram for User, Browser and Server Entities.

chooses the server of his choice and initiates a request to a server. That is, the user sends a message tothe corresponding browser, which then commands the server to allow the connection. When the last userrequesting access to a server disconnects, the browser commands the server to close. During this period,the user- browser-server interaction must work without fault. The security (expressed as a safety property)requires that the operation of the system satisfies certain timing constraints, the server remains open, andprovides the requested information (not violating time constraints) during every period of transaction.

A high-level class structure diagram of the model in UML-based notation is shown in Figure 1. TheUserclass has one port type with signaturefGet!; Exit!g. TheBrowserclass has two port types, @P withsignaturefGet?; Exit?g, and @G with signaturefPermit!; StopPermit!g. TheServerclass has one porttype with signaturefPermit; StopPermitg. The figure shows that a port type, modeled as a class, hasan aggregation relationship with the class for which it is intended. An association relationship betweencompatible port types is shown. A port identifier is declaredas a variable of type @C in Userclass, and avariable of typeSet is introduced inBrowserclass. Time constraints and functionality of objects of classesare described in statechart diagrams. A formal specification includes structural and behavioral information.

User Model

The statechart diagram forUser is shown in Figure 2(a). The significant states of aUser object areidle,toAccess, access, leave. At any instant, a user is in one of these states. In theIdle state, a user has notinitiated any request. To access the server, the user sends the eventGet to the browser used by it in stateIdle, and changes his state totoAccess. In statetoAccess, the attributecr is set topid, the identifier of theport whereGetoccurs. This transition is the constraining transition fortwo time constraints, labeledTCvar1andTCvar2. Within 2 to 4 units of time of outputting the request (specified by TCvar1), the user accessesthe server. That is, the user changes his state toaccessby initiating the internal eventIn. The stateleaveis reached when the user has retrieved the information requested, and this happens within 6 units of time(specified by TCvar2) from the instant the user requested access to the server. The user sends the messageExit to the browser and reaches the initial state. The formal specification of theUser class is shown inFigure 2(b).

6

S1: idle

S3: accessS4: leave

Exit[ pid=cr && true && TCvar2<6 ]

S2: toAccess

In[ true && true && TCvar1>2 AND TCvar1<4 ]In[ true && true && TCvar1>2 AND TCvar1<4 ]

Out

Get / cr’=pid && TCvar1=0 AND TCvar2=0

(a)UserStatechart

Class User [@R]Events: Get!@R, Out, Exit!@R, InStates: *idle, access, leave, toAccessAttributes: cr:@CTraits:Attribute–Function:

idle�> fg; access�> fg;leave�> fg; toAccess�> fcrg;

Transition–Specifications:R1:<idle,toAccess>; Get(true); true=>cr0=pid;R2:<access,leave>; Out(true); true=> true;R3:<leave,idle>; Exit(pid = cr); true=> true;R4:<toAccess,access>; In(true); true=> true;

Time–Constraints:TCvar2: R1, Exit, [0, 6],fg;TCvar1: R1, In, [2, 4],fg;

end

(b) Userclass Specification

Figure 2:UserClass

C2: activate

C4: deactivate C3: monitor

Get[ NOT(member(pid,inSet)) && true ] / inSet’=insert(pid,inSet)

C1: idle

Get[ NOT (member(pid,inSet)) &&true ] / inSet’=insert(pid,inSet)

Get / inSet’=insert(pid,inSet) &&TCvar1=0

Permit[ true && true && TCvar1>0 AND TCvar1<1 ]

Exit[ member(pid,inSet) && size(inSet)=1 ] / inSet’=delete(pid,inSet) && TCvar2=0

Exit[ member(pid,inSet) && size(inSet)>1 ] / inSet’=delete(pid,inSet)

StopPermit[ true && true && TCvar2>0 AND TCvar2 < 1 ]

(a) BrowserStatechart

Class Browser [@P, @Y]Events: Permit!@Y, Get?@P, StopPermit!@Y, Exit?@PStates: *idle, activate, deactivate, monitorAttributes: inSet:PSetTraits: Set[@P,PSet]Attribute–Function:

activate�> finSetg; deactivate�> finSetg;monitor�> finSetg; idle�> fg;

Transition–Specifications:R1:<activate,monitor>; Permit(true);

true=> true;R2:<activate,activate>; Get(NOT(member(pid,inSet)));

true=> inSet0 = insert(pid,inSet);R3:<deactivate,idle>; StopPermit(true);

true=> true;R4:<monitor,deactivate>; Exit(member(pid,inSet));

size(inSet) = 1=> inSet0 = delete(pid,inSet);R5:<monitor,monitor>; Exit(member(pid,inSet));

size(inSet)> 1=> inSet0 = delete(pid,inSet);R6:<monitor,monitor>; Get(!(member(pid,inSet)));

true=> inSet0 = insert(pid,inSet);R7:<idle,activate>; Get(true);

true=> inSet0 = insert(pid,inSet);Time–Constraints:

TCvar1: R7, Permit, [0, 1],fg;TCvar2: R4, StopPermit, [0, 1],fg;

end

(b) Browserclass Specification

Figure 3:Browserclass

7

G1: idle G2: toOpen

G3: toClose G3: opened

Allow[ true && true && TCvar1>0 AND TCvar1 < 1 ]

StopPermit / true && TCvar2=0

Permit / true && TCvar1=0

DisAllow[ true && true && TCvar2 >1 AND TCvar2< 2 ]

(a)ServerStatechart

Class Server [@S]Events: Permit?@S, Allow, DisAllow, StopPermit?@SStates: *Idle, toClose, toOpen, openedAttributes:Traits:Attribute–Function:

Idle�> fg; toClose�> fg;toOpen�> fg; opened�> fg;

Transition–Specifications:R1:< Idle,toOpen>; Permit(true); true=> true;R2:<toOpen,opened>; Allow(true); true=> true;R3:<toClose,Idle>; DisAllow(true); true=> true;R4:<opened,toClose>; StopPermit(true); true=> true;

Time–Constraints:TCvar1: R1, Allow, [0, 1],fg;TCvar2: R4, DisAllow, [1, 2],fg;

end

(b) Serverclass Specification

Figure 4:Serverclass

Browser Model

The statechart diagram forBrowseris shown in Figure 3(a). ABrowserobject can be in one of four states:idle, activate, monitor, deactivate. In its initial stateIdle the object receives theGet message from a userobject. In response, it synchronously changes its state toactivateand includes the identifier of the port wherethe message was received in its attributeinSet. In activatestate, the browser object may either receive theGetmessage from another user object or may send the messagePermit to the server object associated withit. In the former case, it includes thepid where the message was received to its attributeinSet, and stays inthe same state. In the later case, it changes its state tomonitorwithin 1 time unit from the instant it receivedthe firstGetmessage. In statemonitor three possible situations arise:

1. The object receives theGetmessage from another user object. The response is identicalto its responsefor theGetmessage in stateactivate.

2. The object receives theExit message from a user. In response, it removes the user object from inSet,and as a result of this deletion ifinSetis empty (signifying that there are no more users) it changesitsstate todeactivateor it stays in the same state.

Within 1 and 2 units of time of reachingdeactivatestate, it sends the messageStopPermitto the server andchanges its state toidle.

Server Model

The statechart diagram forServeris shown in Figure 4(a). AServerobject can be in one of four states:idle,toOpen, opened, toClose. Initially the server is inidle state. Upon receiving the eventPermit, it changes itsstate synchronously with theBrowserobject and goes totoOpenstate. Within one unit of time of receiving

8

thePermitevent, theServerobject initiates the internal eventAllow and reaches the stateopened. It staysin that state until receiving the eventStopPermitfrom the Browserobject. Within 1 and 2 units of timeof receivingStopPermit, theServerwill return to idle state fromtoClosestate. The formal specificaiton isshown in Figure 4(b).

Server1 : Serveruser1 : User

Browser1 : Browser@P1 : @P

@S1 : @S@C1 : @C

@G1 : @G

(a) Collaboration Diagram - Simple System

SCS UserServerBrowserIncludes:Instantiate:

Server1::Server[@S:1];User1::User[@C:1];Browser1::Browser[@P:2, @G:1];

Configure:Browser1.@G1:@G<–> Server1.@S1:@S;Browser1.@P1:@P<–> User1.@C1:@C;

end

(b) System Specification - Simple System

user2:User

@C2: @C @C3: @C @C4: @C @C5: @C @C6: @C

@P6: @P@P5: @P@P4: @P@P3: @P@P2: @P@P1: @P

Browser2:Browser

@G2: @G

@S2: @S

Server2:Server

@S1: @S

@G1: @G

Browser1:Browser

user1:User

@C1: @C

user3:User user4:User user5:User

Server1:Server

(c) Collaboration Diagram - Complex System

SCS UserServerBrowserIncludes:Instantiate:

Server1::Server[@S:1];Server2::Server[@S:1];User1::User[@C:1];User2::User[@C:1];User3::User[@C:2];User4::User[@C:1];User5::User[@C:1];Browser1::Browser[@P:3, @G:1];Browser2::Browser[@P:3, @G:1];

Configure:Browser1.@G1:@G<–> Server1.@S1:@S;Browser2.@G2:@G<–> Server2.@S2:@S;Browser1.@P1:@P<–> User1.@C1:@C;Browser1.@P2:@P<–> User2.@C2:@C;Browser1.@P3:@P<–> User3.@C3:@C;Browser2.@P4:@P<–> User3.@C4:@C;Browser2.@P5:@P<–> User4.@C5:@C;Browser2.@P6:@P<–> User5.@C6:@C;

end

(d) System Specification - Complex System

Figure 5: System Configuration Models and Specifications

A system configuration specification defines objects instantiated from the three classes and their inter-actions. Figure 5(a) is a collaboration diagram in UML stylefor a linear system with one user object, oneserver object, and one server object. The formal specification for this system is shown in Figure 5(b). Amore complex system, that is non-linear, consisting of five users, two browsers and two servers is shownin Figure 5(c). In this configuration,user3is allowed to access both browsers, while the other user objectsinteract with only one browser. The formal specification forthis subsystem configuration is shown in Figure

9

5(d). In both specifications, there is no included subsystem. In theInstantiate section, objects are created,and in theConfigure section compatible ports of objects are linked. The behavior of a system configurationspecification can be simulated by applying the operational semantics to the system starting in some initialconfiguration.

4 Markov Models

We construct the Markov model of a Web system in three steps. In the first step we construct the Markovmodels for Web objects. In the second step we construct the Markov models for every pair of interactingobjects in the system configuration specification. Finally in the third step we construct the Markov modelfor the fully configured system.

4.1 Step 1: Markov Models for Objects

We associate with each Web object in the architecture another finite state machine, called itsMarkovmodel.The states in the Markov model of an object are the states of the object in the formal design. A transitionbetween two states in the Markov model is defined only if thereexists at least one transition between thosestates in the statechart of the object. For instance, the states and transitions ofUserobject Markov model arethe same as those inUserstatechart but for labels and constraints. In the absence ofstatistical informationgathered by experts on the usage and failure, we will assume that all the external events have equal proba-bility in each state. For the transition from statei to statej in the Markov model, a fixed probabilitypij ofit going into statej at the next time step is calculated as follows:

1. The initial probabilities for all the transitions in the state machine of the reactive object are calculated.The algorithm for calculating such probabilities for a state is based on the following assumptions: 1)all external events that can happen at the state have the sameprobability; 2) all internal events that canhappen at the state have the same probability, and (3) these are in general different.

2. In case there is more than one transitionfl1; : : : ; lng of the same type (shared/internal) from statei tostatej , then the above mentioned transitions are substituted by one whose probability isP = 1 � (1� Pfl1g)� : : : (1� Pflng)

3. The probabilities of all the transitions for a state have to sum to1.

The Markov models and transition probabilities forUser, Browser, andServerobjects are shown in Figure 6.

4.2 Step 2: Markov Model for Object Pairs

The interaction between two objects is due to shared events.We compute the state machine for an interactingpair of objects and compute the Markov model with transitionprobabilities from the transitions at each stateof the product machine.

10

p

p

p

p

12

23

34

41

S2

S3S4

S1

S1 S2 S3 S4

S4

S3

S2

S1 0 010

0

0

0 0

0 0

000

1

1

1

(a) User Class

p

p

p

p

12

23

34

41

G2

G3G4

G1

G1 G2 G3 G4

G4

G3

G2

G1 0 010

0

0

0 0

0 0

000

1

1

1

(b) Browser Class

p

pp

12

2341

C2

C3C4

C1

C1 C2 C3 C4

C4

C3

C2

C1 0 010

0

0

0

0

0001

1/2 1/2

3/4 1/4

p22

p33

p34

(c) Server Class

Figure 6: Markov Models

Algorithm for Transition Matrix for the Synchronous Product Machine

LetE1 andE2 be the sets of internal events in the statechartsP andQ of interacting objects, andF denotethe set of shared events. LetM1 andM2 be the transition matrices forP andQ. LetR be the synchronizedproduct machine ofP andQ. Algorithm SPM computes the transition matrixM of R by first computingthe synchronous product machineR, and next determining the transition probabilities for transitions in eachstate ofR. If all the transitions in a state are labeled by internal events or if all of them are labeled byshared events the probabilities are obtained by normalizing the probabilities in their respective machines.However, if both internal events and shared events occur at the state, the probabilities for the shared eventsare calculated first, and the remaining measure is distributed to transitions labeled by internal events.

11

Algorithm SPM

Step 1. p = 1; == row sum

Step 2. x1 = fej e is a shared event occurring at statei (P ) and at statej (Q) gx2 = fej e is an internal event occurring at statei (P )g [ fej e is an internal event occurring at statej (Q) gStep 3. If x1 6= ; == calculate probabilities for transitions due to shared events

thenNF = 0 (Normalization Factor);set1 = ;;Step 3.1 For each evente 2 x1 find the (set of) statesi0 (P ) andj0 (Q) such thati e! i0 , j e! j0Step 3.2 y = y \ fi0 ; j0g, if fi0 ; j0g =2 yStep 3.3 NF = NF +M1[i; i0 ℄�M2[j; j0 ℄Step 3.4 M [(i; i0); (j; j0 )℄ = M1[i; i0 ℄�M2[j; j0 ℄Step 3.5 set1 = set1 \ (j; j0 )

Step 4. If x2 6= ; == calculate probabilities for transitions due to internal events

thenNF 0 = 0 (Normalization Factor);set2 = ;;Step 4.1 For each evente 2 x2, if e 2M1 then

find the statei0 (M1) such thati e! i0 ;y = y \ fi0 ,jg if fi0 ,jg =2 y;M [(i; j)(i0 ; j)℄ = M1[i; i0 ℄; NF 0 = NF 0 +M [(i; j)(i0 ; j)℄;set2 = set2 \ (i; i0)else

find the statej0 (M2) such thatj e! j0 ;y = y \ fi; j0g if fi; j0g =2 y;M [(i; j)(i; j0 )℄ = M2[j; j0 ℄; NF 0 = NF 0 +M [(i; j)(i; j0 )℄;set2 = set2 \ (j; j0)Step 5. If x1 = ; ^ x2 = ;, the(i; j) row is deleted fromMStep 6.If x1 = ; ^ x2 6= ;

For each(i0 ; j0) 2 set2 do M [(i; j); (i0 ; j0)℄ = M [(i; j); (i0 ; j0)℄NF 012

S2,C3

S4,C3

0S1,C1

1

S1,C4

000001S1,C4

1

0

1

1

1

0

0

0

0

0

0

0

0

S4,C3 S3,C3

S2,C3S2,C2

S1,C4

S2,C2

S3,C3

S4,C3

0S1,C1

S1,C1 S2,C2 S2,C3 S3,C3

0

0 0

0 0 0

0

0

0

0

0

0

0

0

61

56 45

34

2312

P

P

P

P

P

P

Figure 7: Markov Model and State Transition Matrix for Synchronous Product of User and Browser

Step 7. If x1 6= ; ^ x2 = ;For each(i0 ; j0) 2 set1 do M [(i; j); (i0 ; j0)℄ = M [(i; j); (i0 ; j0)℄NF

Step 8. If x1 6= ; ^ x2 6= ; do

For each(i0 ; j0) 2 set2 doM [(i; j); (i0 ; j0)℄ = (1�NF )�M [(i; j); (i0 ; j0)℄NF 0Step 9. To fill in the matrixM with 0 where there are no entries

The Markov model and transition probability matrix for the synchronous product ofUser and Browserobjects is shown in Figure 7.

4.3 Step 3: Markov Model for a System

A partitioning method given in [O2002], is the basis of our discussion in this section. A system configu-ration, when partitioned, produces two types of subsystem components: (1) linear subsystem configuration,as shown in Figure 8(a), and (2) non-linear subsystem configuration as shown in Figure 9(a).

4.4 Case 1: Linear System

In a linear system, objects synchronize in the past. Ifo1; : : : ; on are objects in the linear system andM1; : : : ;Mn are respectively their transition matrices, then the transition matrix M of the linear systemis computed as follows:

1. ComputeM = M1 M2 (Apply Algorithm SPM)

2. for j = 3 to n computeM = M Mj (Apply Algorithm)

The Markov model and the transition matrix for the linear system (Figure 5(a)) are shown in Figure 8(b).

13

User ServerBrowserIni

synchronize

synchronize

(a) Linear Architecture

fS4 C3 G3S1 C1 G1 S2 C2 G1

S1 C1 G4

S1 C1 G1

S2 C2 G1

S3 C3 G3S2 C3 G3S2 C3 G2

S1 C4 G3

S4 C3 G3 S4 C3 G3

S3 C3 G3

S2 C3 G3

S2 C3 G2

S1 C1 G4S1 C1 G4

S1 C4 G3

S1 C4 G3

S1 C1 G1

S2 C2 G1

S2 C3 G2

S2 C3 G3 S3 C3 G3

P

P

P

P

P

12

23

34

45

56

67

89

P78P

P

0 0 0

0 0 0 0 01

1

1

1

0

0 0 0 0 0

00000

0 0 0 0 0

000000

1

0 0 0 0 00 0

00001

0

0

0

0

0

00

00

0

0

0

10

0

00

(b) Markov Model

Figure 8: User-Browser-Server Linear Model

4.5 Case 2: Non-linear System

In a non-linear system, as in Figure 9(a), several objects interact with an object. These interactions may beinitiated at different times. The synchronous product machine dynamically changes as and when users joinor leave the system, and hence the transition probability matrices also change, and should be recomputed.For one scenario, the synchronous product of a non-linear system and its corresponding Markov modelare illustrated in Figure 9(b). In the state machine diagram, interpret eachSi as a vector, with numberof components equal to the number of users in the system. For simplicity of discussion we assume thatusers join the system one at a time, but don’t leave the system. Let 0; k1; k2; : : : ; kn�1 be the intervals ofsuccessive arrivals of users. That is, forj > 1 the j-th user joins the systemkj�1(> 0) time units afterj � 1-st user joined the system. LetM1 be the transition matrix of the Markov model for the linear systemcomposed fromUser1; Browser1; Server1, andMk1 be its transition matrix at time stepk. The transitionmatrix forM(n), n � 2, users interacting with one browser and one server is calculated as follows:M(2) = Mk11 LM1M(3) = [M(2)℄k2LM1

...M(j) = [M(j�1)℄kj�1LM1, 2 � j � n

...

In the above calculation, the symbolL

denotes the direct product operator for matrices. The justificationfor direct product computation is based on the observations:

14

User User UserUser1 2 3 n

Server

Browser

(a) Non-Linear Architecture

S1 C3 G3

S1 C4 G3

S1 C4 G3

S2 C3 G2

S2 C3 G3

S3 C3 G3

S1 C4 G3S1 C3 G3 S1 C1 G4S4 C3 G3S3 C3 G3S2 C3 G3

S1 C1 G1

S2 C2 G1S1 C1 G1

S4 C3 G3

S1 C1 G4

S2 C2 G1

S2 C3 G2

S1 C3 G3

S3 C3 G3S2 C3 G3

S2 C3 G2

S2 C2 G1

S1 C1 G1 S1 C1 G4

S4 C3 G3

P

P

PP

P12

23

34

45

8991

P68

P44

P

P66

P77

74

P56

55

67

33P

P22

P

PP

0 0 0 0 0 0 0

0 0 0 0 0 0 0

11

0

0 0 0 0 0 0

000000

0 0 0 0 0

0000

000

00

00 0

00 0

0

00

0

1

0 0 0 0 000

0

0

0

0

0

1/61/2

0

1/2 1/2

1/2 1/2

1/2 1/2

1/2 1/2

1/2 1/20

1/3

(b) Markov Model

Figure 9: User-Browser-Server Non-Linear Model� the eventGet? from a newUserobject can come when the current system configuration is in any oneof its states in which theBrowserobject can receive it, (i.e;Get? is not time constrained); that isM1for the newUser-Browser-Serverinteraction is independent of[M(j�1)℄kj�1 , and� when a new user joins the system there will be a three-fold increase in the size of the transition matrix

For the non-linear system in Figure 5(c), assume that users join the system,one at a time, at times0, 2, 4, 5.So,k1 = 2; k2 = 2; k3 = 1. The transition matrix of the system at different time points are shown below:

At time 0 or 1: (1 user):M1At time 2: (2 users):M(2) = M21 LM1At time 4: (3 users):M(3) = [M(2)℄2LM1At time 5: (4 users):M(4) = [M(3)℄LM1

Let us consider the general case whenr > 1 users simultaneously join the system, say when there arej � 1users in the system. It is easy to see that the transition matrix for the new configuration withr+ j � 1 usersisM(r+j�1) = [M(j�1)℄kj�1LM (r)1 ; whereM (r)1 is the direct productM1LM1L : : : M1, takenr times.Whenr users leave the system, the transition matrix is computed asfollows: Let there bej(� 2) users inthe system whenr(1 < r � j) users leave. Ifr = j, then the transition matrix is not defined. Ifr < j,there arej � r users left in the system. Ifd is the interval of time that elapsed between the latest time whenthere werej � r users in the system and the current instant, then the new transition probability matrix is

15

[M(j�r)℄d. The rationale is that the transition probability matrixM(j�r) for j � r users have evolved overdtime steps.

5 Reliability Measures

The reliability prediction for a system configuration composed fromk reactive objects is defined as the levelof certainty quantified by the sourceex ess� entropy:Reliability(Subsystem) = kXi=1 Hi � HwhereH = �Pi viPj pij log pij is a level of uncertainty of the Markov system correspondingto asubsystem;vi is asteady state distribution vectorfor the corresponding Markov system and thepij valuesare the transition probabilities.Hi is a level of uncertainty in a Markov system corresponding toa reactiveobject. For a transition matrixP the steady state distribution vectorv satisfies the propertyvP = v. Thelevel of uncertaintyH is related exponentially to the number of paths that are ”statistically typical” of theMarkov system. Thus, higher entropy value implies that moresequences must be generated in order toaccurately describe the asymptotic behavior of the Markov system. We illustrate the calculation of ourreliability measure on two configurations of the case study shown in Figure 8 and 9.Reliability(Figure 8) = HUser +HServer +HBrowser � HFigure 8whereHUser = HServer = HFigure 8 = 0. For calculatingHBrowser we will need the the steady vectorof the Browser: vBrowser = f:125; :25; :5; :125g. Then,HBrowser = :25 + :15 = 0:4. Therefore,Reliability(Figure 8) = 0:4.

We calculate the reliability for Figure 9 at time stepk = 0:Reliability(Figure 9) = HUser + HServer + HBrowser � HFigure 9, whereHUser = HServer =0; HBrowser = 0:4, andHFigure 9= (v2 + v3 + v4 + v5 + v7)� log 12 + v6� (34 log 34 + 14 log 14) > 0.

Therefore,Reliability(Figure 8) > Reliability(Figure 9). The above measurement data collected ontwo different configurations for the case study given above,tests the consistency of the reliability measures.

The reliability prediction for a system is defined as the least reliability measure value among itsm subsys-tems:Reliability(System) = minfReliability(Subsystemi)gmiWe chose the minimum value due to the safety-critical character of the real-time reactive systems. Highervalue of reliability measure implies less uncertainty present in the model, and thus higher level of softwarereliability.

The Markov model of a configured system changes when the system undergoes change. The calculationof the Markov matrix for the reconfigured system would allow to compare the systems based on reliabilityprediction. If the system configurationCj�1 changes to the configurationCj , we need to calculate thereliability of the configurationCj and compare it with the reliability of the configurationCj�1:Reliability(Cj�1) = minfReliability(Si)gmi ;

16

whereSi is a subsystem ofCj�1, andReliability(Cj) = minfReliability(S0i)gmi ;whereS0i is a subsystem ofCj. If Reliability(Cj) � Reliability(Cj�1), then the uncertainty present in thereconfigured system is less than the uncertainty that existed in the current system. The reliability measure-ment will allow the reconfigured system to be deployed. However, ifReliability(Cj) < Reliability(Cj�1),then there is more uncertainty present in the reconfiguration. This would suggest to determine the subsys-tem(s) ofCj that are responsible for lowering the overall reliability.

6 Conclusions and Research Directions

The main result of this paper is a formal approach to calculate the reliability of a time-dependent Webapplication. The Web model discussed in the paper is simple,yet representative of the different Web layers.The model can be generalized to include more Web components:� a browser linked to several servers,� users interacting withAgents, who in turn interact with browsers/servers, and� servers protected by firewalls, and hence a model of firewall will have to be included as well.

In a practical setting, the number of Web components and their interactions will be large. There are also otherfactors such as resource constraints, load factor, and communication complexity. From a reliability point ofview, we require a good formal model which takes these factors into account. In the formal model proposedin this paper the load factor and communication delays can bebrought in as synchronization constraints, andresources can be modeled within each class (such as the Set inBrowser class) and timing constraints may beimposed on database transactions. Calculation of transition probabilities for large evolving configurationsinvolves multiplying fairly large matrices. The density ofthe transition probability matrix of a systemdepends on the number of transitions in the product matrix, which due to synchronization constraints, mightbe sparse. The sparsity of the matrix and the availability ofvery fast powering and multiplication algorithmsfor matrices may be used to speed up reliability calculationfor changing configurations.

One of our goals is to empirically evaluate the reliability model. This is one aspect of our ongoing studyin metrics and measurements for real-time reactive systems.

References

[AAM98] V.S. Alagar, R. Achuthan, D. Muthiayen.TROMLAB: A Software Development Environment forReal-Time Reactive Systems.Technical Report, (first version 1996, revised 1998), Concordia Univer-sity, Montreal, Canada.

[IEEE90] IEEE Standard Glossary of Software Engineering Terminology. IEEE Std 610.12.1990.

[KT2001] Chaitanya Kallepalli, Jeff Tian.Measuring and Modeling Usage and Reliability for StatisticalWeb Testing.IEEE Transactions on Software Engineering, Nov. 2001 (Vol.27, No.11), pp.1023–1036.

[O2002] Olga Ormandjieva.Quality Measurement for Real-Time Reactive SystemsPh.D thesis, Departmentof Computer Science, Concordia University, Montreal, Canada, January 2002.

17