An Efficient Method for Architecture-based Reliability Evaluation for Evolving Systems with Changing Parameters

Indika Meedeniya and Lars Grunske

Faculty of ICT, Swinburne University of Technology, Hawthorn, VIC 3122, Australia

{imeedeniya,lgrunske}@swin.edu.au

Abstract: Probabilistic models are widely used in architecture-based reliability prediction for software-intensive systems. However, in most cases it is computationally expensive to compute the reliability metrics and to re-compute them once the system has evolved or is used in a different environment. In this paper, we introduce an efficient computation method for Discrete Time Markov Chain based abstractions, which computes reliability metrics once, and we provide an incremental technique to recompute these metrics in case of a single change in the reliability evaluation model. As a result, fast and efficient reliability computation can be provided for scenarios like design-time architecture optimization and run-time adaptation. An experimental validation of the new method shows a significant improvement in terms of computation time required to re-evaluate an evolved architecture.

I. INTRODUCTION

Architecture-based reliability predictions are important to

make informed decisions in the early stages of software

development projects. One of the key motivations is that

the use of reliability predictions gives systems and software

architects quantitative rationales to decide between design

alternatives [23]. This provides a basis for the architects

to use design space exploration mechanisms to find better

architectures. In addition, reliability prediction also helps

project managers in planning project activities such as risk

mitigation, resource allocation in testing phase, and effort

estimation.

As current service-based and adaptive systems no longer have a static architecture, runtime reliability evaluations for these evolving architectures are becoming more important. To support these runtime evaluations, reliability models need to be “kept alive at runtime” (KAMI) [13] to allow for an evaluation of the system reliability at runtime.

In software reliability prediction, Discrete Time Markov Chain (DTMC) based abstractions are one of the prevalent formalizations [15], [21], originally proposed by L. R. Cheung [8]. Wang et al. [45] have demonstrated the use of Cheung’s model for different styles of software architectures, while Gokhale et al. [16] have used the DTMC based mathematical formulation as the basis for their development of hierarchical reliability evaluations. Goseva-Popstojanova et al. [21] illustrated extensive use of the model in their reliability evaluation survey, and they have also elaborated the applicability of the model with uncertainty analysis using the “method of moments” [20] and Monte Carlo simulation [19]. The significant applicability of Cheung’s model in addressing persistent challenges in architecture evaluation is also evident in much recent work on reliability and performance prediction of component based systems [5], [7], [10], [17], [32], [40], [42].

Because DTMC based evaluations rely on complex matrix operations, including matrix inversions with complexity O(n^3), the time required to evaluate an architecture increases significantly as the architectural model grows. Additionally, for each change of the architecture, the reliability evaluation model has to be completely evaluated again. In this paper, we aim to reduce the time needed for the repeated evaluations. Our approach is to use the results of previous computations and compute the impact of a given change as opposed to complete re-evaluation. We call the technique Delta (Δ) evaluation. The

main contribution presented in this paper is the formal

derivation of simplified re-computation for Cheung’s model

with a single parameter change by applying the principles

and appropriate formalizations in linear algebra.

The application of the described technique has benefits

in a series of scenarios. As an example, the technique has

a positive impact in automatic design space exploration as

described in [1], [2], [9], [22], [29], [36], [37], [38], [39].

In these approaches, a large number of candidate architec-

tures are generated by an optimization algorithm and these

candidate architectures need to be evaluated in order to be

compared. Since these candidate architectures are generated

by applying small changes, our Δ evaluation technique is

applicable and enables searching in larger design spaces.

Another key motivation for the Δ evaluation is that it can be used to cater for the excessive computations in uncertainty analysis. Goseva-Popstojanova et al. [18] have investigated analytical and simulation-based methods for uncertainty analysis, and have confirmed that Monte Carlo simulation based methods scale better than the method of moments. One major drawback of Monte Carlo simulation based uncertainty analysis is that it needs to

2010 21st International Symposium on Software Reliability Engineering

1071-9458/10 $26.00 © 2010 IEEE

DOI 10.1109/ISSRE.2010.19


create a large number of sample variations of an architecture from the probability distributions of parameters, which is computationally expensive [6], [18], [35]. The reliability models need to be re-evaluated for each variant. The Δ evaluation technique presented in this paper helps to address this problem by significantly reducing the computation overhead in re-evaluation.

The technique can also be applied to reduce the compu-

tational overhead in runtime architecture evaluation because

a complete evaluation is often impractical due to limited

resources. Such runtime architecture evaluations are required

because monitoring shows that a component reliability or the

operational profile has been changed [13], [25] or a relia-

bility trend has been detected [30], [31]. To avoid complete

evaluation at runtime, results of design time evaluations need

to be stored with the running system, and our Δ evaluation

can then be applied efficiently in the event of an architectural

change.

In summary, in this paper, we derive a fast and efficient

method to recompute reliability metrics for architecture

specifications using Cheung’s model. We validate the ef-

ficiency with a series of experiments, based on the required

CPU cycles for the computation with growing model sizes.

The rest of the paper is organized as follows: Section II

introduces the reader to the concepts of reliability evaluation

with Cheung’s model and recalls some relevant matrix

operations. Based on these concepts, a novel incremental

reliability evaluation technique is introduced in Section III.

The results of an experimental validation of the developed

technique are presented in Section IV. Section V compares

this approach to related work. Finally, Section VI concludes

the paper and highlights directions for future research.

II. PRELIMINARIES

A. Cheung’s Model of Reliability Prediction

In this section we will recall the basic concepts of

Cheung’s reliability prediction model [8] and provide a brief

definition of DTMCs.

Discrete Time Markov Chain (DTMC): A DTMC is a

tuple (S, P) where S is a finite set of states and P : S × S → [0, 1] is the transition probability matrix. A DTMC is called

absorbing when at least one of its states has no outgoing

transition [43].

Architecture: The program flow graph of a terminating

application has a single entry and a single exit node. A

terminating application is an application that operates on

demand, and a single run of software that corresponds to a

terminating execution can be clearly identified. This model

can easily be extended to support multiple initial nodes and

multiple final states by introducing super-initial and super-final states [45]. The transfer of control among modules can be described by an absorbing DTMC with transition probability matrix P = [p_ij], where p_ij denotes the probability that the jth module is called after executing the ith module.

Figure 1: Control flow graph and corresponding DTMC for Cheung’s model.

Failure behavior: An assumption of Cheung’s model

is that the components fail independently and the reliability

of the component i is characterized by the probability Ri

that the component performs its function correctly, i.e., the

component produces the correct output and transfers control

to the next component without a failure.

Reliability evaluation: Two absorbing states C and F are added, representing the correct output and failure, respectively, and the transition probability matrix P is modified accordingly (the modified matrix is again written P below). The original transition probability pij

between the components i and j is modified into Ripij ,

which represents the probability that the module i produces

the correct result and the control is transferred to component

j. From the final (exit) state n, a directed edge to state

C is created with transition probability Rn to represent

the correct execution. The failures of a component i are

considered by creating a directed edge to failure state F with

transition probability (1 − Ri). This process integrates the

failure behavior of the components to the functional behavior

described in the original control flow. Thus, a DTMC defined

with transition probability matrix P is considered as a

composite model of the software system [16]. Figure 1

illustrates a control flow graph of a software system and

the corresponding DTMC when the two states C and F are

added. The reliability of the program is the probability of

reaching the absorbing state C of the DTMC.

Let Q be the matrix obtained from P by deleting rows

and columns corresponding to the absorbing states C and

F. Q is called the generator matrix of the DTMC. $Q^k(1,n)$ represents the probability of reaching state n from state 1 through k transitions. From initial state 1 to final state n, the number of transitions k may vary from 0 to infinity.

It can be proved that the infinite summation converges as follows [8]:

\[ S = I + Q + Q^2 + Q^3 + \dots = \sum_{k=0}^{\infty} Q^k = (I - Q)^{-1} \]

The matrix S is called the fundamental matrix of the

DTMC, and S(i,j) represents the expected number of visits

to the state j starting from state i before it is absorbed.

Cheung [8] introduced an architecture based reliability esti-

mation method in which the reliability of the overall system

can be computed from S as:

\[ R_s = S(1,n)\, R_n \]

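Example: For the DTMC given in Figure 1, the modified transition probability matrix is:

\[
P = \begin{array}{c|ccccc}
    & c_1 & c_2 & c_3 & F & C \\ \hline
c_1 & 0 & R_1 p_{1,2} & 0 & (1-R_1) & 0 \\
c_2 & 0 & 0 & R_2 p_{2,3} & (1-R_2) & 0 \\
c_3 & 0 & 0 & 0 & (1-R_3) & R_3 \\
F   & 0 & 0 & 0 & 0 & 0 \\
C   & 0 & 0 & 0 & 0 & 0
\end{array}
\]

By deleting the rows and columns of P that correspond to the nodes C and F,

\[
Q = \begin{bmatrix} 0 & R_1 p_{1,2} & 0 \\ 0 & 0 & R_2 p_{2,3} \\ 0 & 0 & 0 \end{bmatrix}
\]

Let $R_1 = 0.8$, $R_2 = 0.9$, $R_3 = 0.85$, $p_{1,2} = 1$ and $p_{2,3} = 1$. Then,

\[
Q = \begin{bmatrix} 0 & 0.8 & 0 \\ 0 & 0 & 0.9 \\ 0 & 0 & 0 \end{bmatrix},
\quad \text{and} \quad
S = (I - Q)^{-1} = \begin{bmatrix} 1 & 0.8 & 0.72 \\ 0 & 1 & 0.9 \\ 0 & 0 & 1 \end{bmatrix}
\]

As a result, the system reliability is $R_s = S(1,3)\, R_3 = 0.72 \times 0.85 = 0.612$.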
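The computation in this example can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' Java implementation: the helper names `invert` and `system_reliability` are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is an assumption made here so that the result can be compared without rounding.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square matrix exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    # Build the augmented matrix [M | I] with exact rational entries.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick a row with a non-zero pivot (StopIteration if M is singular).
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def system_reliability(Q, R_exit):
    """Return R_s = S(1,n) * R_n, where S = (I - Q)^-1."""
    n = len(Q)
    A = [[Fraction(int(i == j)) - Fraction(Q[i][j]) for j in range(n)]
         for i in range(n)]
    S = invert(A)
    return S[0][n - 1] * R_exit


# Parameters of the example: R1 = 0.8, R2 = 0.9, R3 = 0.85, p12 = p23 = 1.
R1, R2, R3 = Fraction("0.8"), Fraction("0.9"), Fraction("0.85")
Q = [[0, R1 * 1, 0],
     [0, 0, R2 * 1],
     [0, 0, 0]]
Rs = system_reliability(Q, R3)
print(float(Rs))  # 0.612
```

The full inversion performed here is the O(n^3) step that the incremental technique of Section III is designed to avoid on re-evaluation.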

B. Basic Matrix Operations

In this section we describe a few fundamental matrix

operations, that are required for the understanding of the

rest of the paper.

Minors: Given an n× n matrix X = (xij), the minor

of the element xij is the (n−1)×(n−1) matrix that results

when the ith row and the jth column of X are deleted [11].

The minor of the element xij will be denoted as Mij .

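Example: If we define X as:

\[
X := \begin{bmatrix} x_{1,1} & x_{1,2} & x_{1,3} \\ x_{2,1} & x_{2,2} & x_{2,3} \\ x_{3,1} & x_{3,2} & x_{3,3} \end{bmatrix} \tag{1}
\]

then the minors are defined as follows:

\[
\text{minor of element } x_{1,1} = M_{1,1} = \begin{bmatrix} x_{2,2} & x_{2,3} \\ x_{3,2} & x_{3,3} \end{bmatrix},
\qquad
\text{minor of element } x_{2,1} = M_{2,1} = \begin{bmatrix} x_{1,2} & x_{1,3} \\ x_{3,2} & x_{3,3} \end{bmatrix}
\]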

Determinant: The determinant is a special number

associated to any square matrix, i.e. a matrix with the same

number of rows and columns. The determinant of a matrix

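X is denoted det(X) or |X|.

Definition 1: The determinant of a 1 × 1 matrix is the element itself.

Definition 2: The determinant of any n × n square matrix X is recursively defined as

\[ |X| = \sum_{i=1}^{n} \left\{ (-1)^{i+1} \times x_{1,i}\, |M_{1,i}| \right\} \]

where $|M_{1,i}|$ denotes the determinant of the minor of element $x_{1,i}$.

Example: For a 2 × 2 matrix $Y := \begin{bmatrix} a & b \\ c & d \end{bmatrix}$,

\[ |Y| = (-1)^2 \times a|d| + (-1)^3 \times b|c| = ad - bc \]

For the 3 × 3 matrix X introduced in (1),

\[
\begin{aligned}
|X| &= \begin{vmatrix} x_{1,1} & x_{1,2} & x_{1,3} \\ x_{2,1} & x_{2,2} & x_{2,3} \\ x_{3,1} & x_{3,2} & x_{3,3} \end{vmatrix} \\
&= x_{1,1} \begin{vmatrix} x_{2,2} & x_{2,3} \\ x_{3,2} & x_{3,3} \end{vmatrix}
 - x_{1,2} \begin{vmatrix} x_{2,1} & x_{2,3} \\ x_{3,1} & x_{3,3} \end{vmatrix}
 + x_{1,3} \begin{vmatrix} x_{2,1} & x_{2,2} \\ x_{3,1} & x_{3,2} \end{vmatrix} \\
&= x_{1,1}x_{2,2}x_{3,3} - x_{1,1}x_{2,3}x_{3,2} - x_{1,2}x_{2,1}x_{3,3}
 + x_{1,2}x_{2,3}x_{3,1} + x_{1,3}x_{2,1}x_{3,2} - x_{1,3}x_{2,2}x_{3,1}
\end{aligned}
\]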
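Definitions 1 and 2 translate directly into a recursive function. The sketch below is illustrative only; the names `minor` and `det` are hypothetical, and indices are 0-based, so the sign $(-1)^{i+1}$ of the 1-based definition becomes `(-1) ** i`. The recursion has factorial cost and is meant to mirror the definition rather than to be used on large matrices.

```python
def minor(X, i, j):
    """Return X without row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(X) if k != i]


def det(X):
    """Determinant by cofactor expansion along the first row."""
    if len(X) == 1:
        # Definition 1: the determinant of a 1 x 1 matrix is its element.
        return X[0][0]
    # Definition 2: alternate signs across the first row.
    return sum((-1) ** i * X[0][i] * det(minor(X, 0, i))
               for i in range(len(X)))


print(det([[1, 2], [3, 4]]))                    # -2  (ad - bc)
print(det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -3
```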

Cofactor elements: Given an n × n matrix X = (x_ij), the cofactor $C_{ij}$ of the element $x_{ij}$ is defined by

\[ C_{ij} = (-1)^{i+j} |M_{ij}| \]

Example: For X introduced in (1),

\[
C_{1,1} = (-1)^{(1+1)} |M_{1,1}| = \begin{vmatrix} x_{2,2} & x_{2,3} \\ x_{3,2} & x_{3,3} \end{vmatrix},
\qquad
C_{2,1} = (-1)^{(2+1)} |M_{2,1}| = (-1) \times \begin{vmatrix} x_{1,2} & x_{1,3} \\ x_{3,2} & x_{3,3} \end{vmatrix}
\]

Transpose: The transpose of a matrix X is a matrix

formed from X by interchanging the rows and columns

such that row i of matrix X becomes column i of the

transposed matrix. The transpose of X is denoted by $X^T$.

Example: For X introduced in (1),

\[
X^T = \begin{bmatrix} x_{1,1} & x_{2,1} & x_{3,1} \\ x_{1,2} & x_{2,2} & x_{3,2} \\ x_{1,3} & x_{2,3} & x_{3,3} \end{bmatrix}
\]

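Matrix inversion: The matrix X has an inverse if and only if $|X| \neq 0$. When $|X| \neq 0$, the inverse is given explicitly by

\[ X^{-1} = \frac{C^T}{|X|} \tag{2} \]

where $C^T$ represents the transpose of the matrix of cofactors of X (i.e. $C_{(i,j)}$ is the cofactor of $x_{ij}$). Therefore,

\[ X^{-1}_{ij} = \frac{C_{ji}}{|X|} = \frac{(-1)^{(j+i)} |M_{j,i}|}{|X|} \tag{3} \]

Example: For X introduced in (1),

\[
X^{-1}_{1,2} = \frac{(-1)^{(1+2)} |M_{2,1}|}{|X|}
= \frac{(-1) \times \begin{vmatrix} x_{1,2} & x_{1,3} \\ x_{3,2} & x_{3,3} \end{vmatrix}}
{\begin{vmatrix} x_{1,1} & x_{1,2} & x_{1,3} \\ x_{2,1} & x_{2,2} & x_{2,3} \\ x_{3,1} & x_{3,2} & x_{3,3} \end{vmatrix}}
\]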
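Equations (2) and (3) can be checked numerically with a small sketch. The helper names below are hypothetical and the code uses 0-based indices and exact fractions; it is meant to illustrate the cofactor formula, not to serve as an efficient inversion routine. The test matrix is A = I − Q from the example in Section II-A.

```python
from fractions import Fraction


def minor(X, i, j):
    """Return X without row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(X) if k != i]


def det(X):
    """Cofactor expansion along the first row; the empty matrix has det 1."""
    if not X:
        return 1
    return sum((-1) ** j * X[0][j] * det(minor(X, 0, j))
               for j in range(len(X)))


def inverse(X):
    """Entry (i, j) of the inverse is the cofactor C_ji divided by |X|."""
    d = det(X)
    if d == 0:
        raise ValueError("matrix is singular")
    n = len(X)
    return [[Fraction((-1) ** (i + j) * det(minor(X, j, i))) / d
             for j in range(n)] for i in range(n)]


A = [[1, Fraction("-0.8"), 0],
     [0, 1, Fraction("-0.9")],
     [0, 0, 1]]
S = inverse(A)
print([[float(v) for v in row] for row in S])
# [[1.0, 0.8, 0.72], [0.0, 1.0, 0.9], [0.0, 0.0, 1.0]]
```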

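Sylvester’s determinant theorem [27]: This theorem states that for A, an m × n matrix, and B, an n × m matrix,

\[ \det(I_m + AB) = \det(I_n + BA) \]

where $I_m$ and $I_n$ are the m × m and n × n identity matrices, respectively. If we assume a column vector c and a row vector r, each with m components, the formula allows the quick calculation of the determinant of a matrix that differs from the identity matrix by a matrix of rank 1: $|I_m + cr| = 1 + rc$. More generally, for any invertible m × m matrix X,

\[ |X + cr| = |X| (1 + r X^{-1} c) \tag{4} \]

Example: For X introduced in (1), let us add

\[
\Delta X := \begin{bmatrix} 0 & 0 & \delta \\ 0 & 0 & 0 \\ 0 & 0 & \gamma \end{bmatrix}
\quad \text{such that } X' = X + \Delta X.
\]

Note that $x'_{1,3} = x_{1,3} + \delta$ and $x'_{3,3} = x_{3,3} + \gamma$. The matrix ΔX can be expressed in cr format (multiplication of a column and a row matrix),

\[
\Delta X = \begin{bmatrix} 0 & 0 & \delta \\ 0 & 0 & 0 \\ 0 & 0 & \gamma \end{bmatrix}
= \begin{bmatrix} \delta \\ 0 \\ \gamma \end{bmatrix} \times \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}
\]

For clarity in the derivation, let us define $Y = X^{-1}$ as

\[
Y := \begin{bmatrix} y_{1,1} & y_{1,2} & y_{1,3} \\ y_{2,1} & y_{2,2} & y_{2,3} \\ y_{3,1} & y_{3,2} & y_{3,3} \end{bmatrix}
\]

Then, the determinant of the new matrix X′ can be derived as follows:

\[
\begin{aligned}
|X'| &= |X + \Delta X| \\
&= \left| \begin{bmatrix} x_{1,1} & x_{1,2} & x_{1,3} \\ x_{2,1} & x_{2,2} & x_{2,3} \\ x_{3,1} & x_{3,2} & x_{3,3} \end{bmatrix}
 + \begin{bmatrix} 0 & 0 & \delta \\ 0 & 0 & 0 \\ 0 & 0 & \gamma \end{bmatrix} \right| \\
&= |X| \left( 1 + \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}
 \begin{bmatrix} y_{1,1} & y_{1,2} & y_{1,3} \\ y_{2,1} & y_{2,2} & y_{2,3} \\ y_{3,1} & y_{3,2} & y_{3,3} \end{bmatrix}
 \begin{bmatrix} \delta \\ 0 \\ \gamma \end{bmatrix} \right) && \text{[From (4)]} \\
&= |X| \left( 1 + \delta\, y_{3,1} + \gamma\, y_{3,3} \right) && \text{[Matrix multiplication]}
\end{aligned}
\]

In the last step, the row vector $[0\ 0\ 1]$ selects the third row of Y, and the column vector then weights its first and third entries by δ and γ.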
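Equation (4) can be verified numerically on a concrete matrix. The sketch below is illustrative only: the matrix X and the values of δ and γ are arbitrary choices made here, not values from the paper, and the helper names are hypothetical. It compares the determinant of X + cr computed directly with the right-hand side of (4).

```python
from fractions import Fraction


def minor(X, i, j):
    return [row[:j] + row[j + 1:] for k, row in enumerate(X) if k != i]


def det(X):
    if not X:
        return Fraction(1)
    return sum((-1) ** j * X[0][j] * det(minor(X, 0, j))
               for j in range(len(X)))


def inverse(X):
    d, n = det(X), len(X)
    return [[Fraction((-1) ** (i + j) * det(minor(X, j, i))) / d
             for j in range(n)] for i in range(n)]


# Arbitrary invertible test matrix and rank-one change (assumed values).
X = [[Fraction(v) for v in row] for row in [[2, 1, 0], [0, 1, 3], [1, 0, 1]]]
delta, gamma = Fraction(1, 2), Fraction(1)
c = [delta, Fraction(0), gamma]           # column vector
r = [Fraction(0), Fraction(0), Fraction(1)]  # row vector

X_new = [[X[i][j] + c[i] * r[j] for j in range(3)] for i in range(3)]
Y = inverse(X)
r_Y_c = sum(r[i] * Y[i][j] * c[j] for i in range(3) for j in range(3))

lhs = det(X_new)                   # |X + cr| computed directly
rhs = det(X) * (1 + r_Y_c)         # |X| (1 + r X^-1 c), equation (4)
# Closed form for this particular c and r: only row 3 of Y contributes.
closed_form = det(X) * (1 + delta * Y[2][0] + gamma * Y[2][2])
print(lhs, rhs, closed_form)       # 13/2 13/2 13/2
```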

III. INCREMENTAL EVALUATION

In this section we derive a Δ evaluation technique to

be applied to Cheung’s reliability evaluation model. We

assume an initial DTMC model of an architecture and apply

a change to the model. For this change, we illustrate the

formulation of an efficient computation method to determine

reliability metrics of an architecture specification.

Let Q be the generator matrix of the DTMC. As described in Section II-A, the system reliability is $R_s = S_{1,n} R_n$, where $S = (I - Q)^{-1}$. Let

\[ A := I - Q \tag{5} \]

and let $C_{ij}$ denote the cofactor of the element $a_{ij}$. From (2) and (3) it follows that

\[ S = (I - Q)^{-1} = \frac{C^T}{|I - Q|} = \frac{C^T}{|A|} \tag{6} \]

\[ S_{1,n} = \frac{C^T_{1,n}}{|A|} = \frac{C_{n,1}}{|A|} = \frac{(-1)^{n+1} |M_{n,1}|}{|A|} \tag{7} \]

where $M_{n,1}$ represents the minor of the element $a_{n,1}$.

Let

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{(n-1)1} & a_{(n-1)2} & \cdots & a_{(n-1)n} \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix},
\quad \text{and} \quad
B = \begin{bmatrix}
a_{12} & \cdots & a_{1n} \\
a_{22} & \cdots & a_{2n} \\
\vdots & & \vdots \\
a_{(n-1)2} & \cdots & a_{(n-1)n}
\end{bmatrix}.
\]

Note that B has been obtained by removing the nth row and the 1st column from A. From the definition of minors described in subsection II-B, it can be identified that B is equal to the minor $M_{n,1}$ of A. Therefore, $|M_{n,1}| = |B|$. As a result, equation (7) can be expressed as

\[ S_{1,n} = \frac{(-1)^{n+1} |B|}{|A|} \tag{8} \]

Suppose the transitions of the system have been changed and the change is expressed by ΔQ such that the new matrix Q′ is Q′ = Q + ΔQ. The new system reliability is $R'_s = S'_{1,n} R_n$, where $S' = (I - Q')^{-1}$.

The change matrix ΔQ can be expressed as the product of a column vector c and a row vector r such that ΔQ = cr. Since A′ = I − Q′ = A − ΔQ,

\[ A' = A + \Delta A = A + c'r', \quad \text{where } c'r' = -cr. \]

Similarly, B′ = B + c″r″, where c″ is obtained by removing the nth element from c′ and r″ by removing the 1st element from r′ (equivalently, c″r″ is c′r′ without its nth row and 1st column). From (4) it follows that

\[ |A'| = |A + c'r'| = |A| (1 + r' A^{-1} c') \tag{9} \]

\[ |B'| = |B + c''r''| = |B| (1 + r'' B^{-1} c'') \tag{10} \]

From (8) we derive

\[ S'_{1,n} = \frac{(-1)^{n+1} |B'|}{|A'|} = \frac{(-1)^{n+1} |B| (1 + r'' B^{-1} c'')}{|A| (1 + r' A^{-1} c')} \tag{11} \]

\[ \phantom{S'_{1,n}} = S_{1,n}\, \frac{1 + r'' B^{-1} c''}{1 + r' A^{-1} c'} \tag{12} \]

Example: Let $Q_{n \times n}$ be the generator matrix of a DTMC model when absorbing states are removed, and let A be defined as in (5). Suppose one element ($q_{2,3}$) of Q is modified. For example, if the reliability of a component that has a single outgoing transition has been changed, two elements in the transition matrix will be updated. Because the absorbing states are removed in obtaining Q, only one element in Q will be changed. According to (5), the corresponding change in A is $a'_{2,3} = a_{2,3} - \delta$.

With our matrix notation of A′ = A + ΔA, ΔA can be defined as

\[
\Delta A = \begin{bmatrix}
0 & 0 & 0 & \cdots & 0 \\
0 & 0 & -\delta & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 0 \\
0 & 0 & 0 & \cdots & 0
\end{bmatrix}
\]

This can be expressed as a multiplication of column and row vectors (i.e. ΔA = c′r′), and c′, r′ can be obtained as follows.

\[
c' = \begin{bmatrix} 0 \\ -\delta \\ 0 \\ \vdots \\ 0 \end{bmatrix}_{n \times 1}
\qquad
r' = \begin{bmatrix} 0 & 0 & 1 & \cdots & 0 \end{bmatrix}_{1 \times n}
\]

Note that c′ is column 3 of ΔA and r′ has been obtained by replacing −δ with 1 in the 2nd row of the matrix in order to obtain ΔA after multiplication. Then,

\[ r' A^{-1} c' = A^{-1}_{3,2} \times (-\delta) \tag{13} \]

In (13), it should be noted that all other elements in $A^{-1}$ apart from $A^{-1}_{3,2}$ are multiplied with 0.

Similarly, by removing the elements corresponding to the 1st column and the nth row of ΔA, the corresponding change vectors for B (i.e. B′ = B + c″r″) can be determined. Removing the nth element of c′ leaves its non-zero entry in position 2, while removing the 1st element of r′ shifts its non-zero entry from position 3 to position 2:

\[
c'' = \begin{bmatrix} 0 \\ -\delta \\ 0 \\ \vdots \\ 0 \end{bmatrix}_{(n-1) \times 1}
\qquad
r'' = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \end{bmatrix}_{1 \times (n-1)}
\]

Then, similar to (13),

\[ r'' B^{-1} c'' = B^{-1}_{2,2} \times (-\delta) \tag{14} \]

From (13), (14) and (12) it follows that

\[ S'_{1,n} = S_{1,n}\, \frac{1 + B^{-1}_{2,2} \times (-\delta)}{1 + A^{-1}_{3,2} \times (-\delta)} \tag{15} \]

The results obtained in the above derivation can be generalized to any single element modification ($\delta_{ij}$) in the transition matrix Q,

\[ S'_{(1,n)} = S_{(1,n)}\, \frac{1 - \delta_{ij} \times B^{-1}_{(j-1,i)}}{1 - \delta_{ij} \times A^{-1}_{(j,i)}} \tag{16} \]

where A = (I − Q) and B is obtained by deleting the nth row and the 1st column from A. It should be noted that $A^{-1}$ and $B^{-1}$ are the inverses obtained from the original matrix A irrespective of the change $\delta_{ij}$. The conventional computation of $S'_{(1,n)}$ requires the inversion of the new transition matrix (i.e. $(I - Q')^{-1}$) and comparatively, the final result obtained in (16) has simplified the computational requirements to a few basic arithmetic operations.

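Example: Let us reuse the example depicted in Figure 1 and the parameters $R_1 = 0.8$, $R_2 = 0.9$, $R_3 = 0.85$, $p_{1,2} = 1$, $p_{2,3} = 1$. Then,

\[
Q = \begin{bmatrix} 0 & 0.8 & 0 \\ 0 & 0 & 0.9 \\ 0 & 0 & 0 \end{bmatrix},
\quad
A = (I - Q) = \begin{bmatrix} 1 & -0.8 & 0 \\ 0 & 1 & -0.9 \\ 0 & 0 & 1 \end{bmatrix},
\quad \text{and} \quad
B = \begin{bmatrix} -0.8 & 0 \\ 1 & -0.9 \end{bmatrix}.
\]

The corresponding inverse matrices are as follows:

\[
A^{-1} = \begin{bmatrix} 1 & 0.8 & 0.72 \\ 0 & 1 & 0.9 \\ 0 & 0 & 1 \end{bmatrix},
\quad \text{and} \quad
B^{-1} = \begin{bmatrix} -1.250 & 0 \\ -1.389 & -1.111 \end{bmatrix}.
\]

For the initial setting, $S_{(1,3)} = A^{-1}_{(1,3)} = 0.72$. Now, let us change an element of the original transition matrix Q. Assume $Q_{(1,2)}$ has been changed from 0.8 to 0.85, i.e. $\delta_{1,2} = 0.05$ with i = 1 and j = 2. Then from (16),

\[
S'_{(1,3)} = S_{(1,3)}\, \frac{1 - \delta \times B^{-1}_{(2-1,1)}}{1 - \delta \times A^{-1}_{(2,1)}}
= 0.72 \times \frac{1 - (0.05 \times -1.25)}{1 - (0.05 \times 0)}
= 0.765
\]

Note that the value is identical to the result of $(I - Q')^{-1}_{(1,3)}$ with the new generator matrix

\[
Q' = \begin{bmatrix} 0 & 0.85 & 0 \\ 0 & 0 & 0.9 \\ 0 & 0 & 0 \end{bmatrix},
\quad \text{i.e.,} \quad
(I - Q')^{-1} = \begin{bmatrix} 1 & 0.85 & 0.765 \\ 0 & 1 & 0.9 \\ 0 & 0 & 1 \end{bmatrix}
\]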
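The worked example can be reproduced with a short sketch of formula (16). This is not the authors' Java code: the function names are hypothetical, indices i and j are 1-based as in the text, and exact fractions are an assumption made here for a clean comparison. One detail is added as an assumption derived from the construction of B: if the changed element lies in row n or column 1 of Q, it is not part of B, so the numerator factor is taken as 1. The sketch also presumes that B is invertible, which by (8) requires $S_{(1,n)} \neq 0$.

```python
from fractions import Fraction


def invert(matrix):
    """Exact Gauss-Jordan inversion (raises StopIteration if singular)."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def delta_update(S_1n, A_inv, B_inv, i, j, delta):
    """Formula (16) for Q[i][j] += delta, with 1-based i and j."""
    n = len(A_inv)
    denominator = 1 - delta * A_inv[j - 1][i - 1]
    if i <= n - 1 and j >= 2:
        # Element (i, j) of A is element (i, j-1) of B.
        numerator = 1 - delta * B_inv[j - 2][i - 1]
    else:
        # Assumption: row n and column 1 are not part of B.
        numerator = 1
    return S_1n * numerator / denominator


n = 3
Q = [[0, Fraction("0.8"), 0], [0, 0, Fraction("0.9")], [0, 0, 0]]
A = [[int(r == c) - Q[r][c] for c in range(n)] for r in range(n)]
B = [row[1:] for row in A[:-1]]          # drop nth row and 1st column
A_inv, B_inv = invert(A), invert(B)      # computed once, then reused

S_old = A_inv[0][n - 1]
S_new = delta_update(S_old, A_inv, B_inv, 1, 2, Fraction("0.05"))

# Cross-check against a full re-inversion with the changed matrix.
Q_new = [row[:] for row in Q]
Q_new[0][1] += Fraction("0.05")
A_new = [[int(r == c) - Q_new[r][c] for c in range(n)] for r in range(n)]
S_full = invert(A_new)[0][n - 1]
print(float(S_old), float(S_new), float(S_full))  # 0.72 0.765 0.765
```

After the two inverses are available, each re-evaluation costs a constant number of arithmetic operations, which is the source of the gain reported in Section IV.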

In summary, we have derived that the critical element of the system reliability calculation, i.e. $S'_{(1,n)}$, can be obtained by evaluating the simplified formula in (16) instead of performing the conventional computation of $(I - Q')^{-1}$, which has the complexity of O(n^3). The derivation presented in this paper is limited to single element changes

in the DTMC model. With this limitation, the formulation

is directly applicable to reliability evaluations with respect

to component reliability changes that only affect a single

element in the generator matrix. The derivation can be

extended to more general cases of multiple element changes

in a row by defining the ΔQ = cr as follows: r corresponds

to the row vector of changes while c is a column vector that

contains zeros except the one that points to the relevant row.

A complete derivation of general case is not presented in this

paper, and remains for future work. In the following section,

we experimentally validate the significance of this gain in

reliability evaluation of practical sized models.
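As a hedged illustration of the row extension mentioned above, the following sketch applies equation (12) directly with c′ = −e_i (a column of zeros with −1 in row i) and r′ equal to the row of changes. This is not the complete derivation, which the paper leaves to future work; it is only a direct instantiation of (12) under the same assumptions (B invertible, rank-one change). The function name, the example matrix and the change values are hypothetical.

```python
from fractions import Fraction as F


def delta_update_row(S_1n, A_inv, B_inv, i, row_delta):
    """Apply (12) when row i (1-based) of Q changes by row_delta."""
    n = len(A_inv)
    # 1 + r' A^-1 c' with c' = -e_i and r' = row_delta.
    denominator = 1 - sum(d * A_inv[j][i - 1] for j, d in enumerate(row_delta))
    if i <= n - 1:
        # r'' drops the 1st entry of r'; c'' drops the nth entry of c'.
        numerator = 1 - sum(d * B_inv[j - 1][i - 1]
                            for j, d in enumerate(row_delta) if j >= 1)
    else:
        numerator = 1  # row n is not part of B
    return S_1n * numerator / denominator


# Hypothetical 3-state model: Q = [[0, 0.4, 0.4], [0, 0, 0.9], [0, 0, 0]].
# Inverses of A = I - Q and of B (A without row 3 and column 1), precomputed.
A_inv = [[F(1), F(2, 5), F(19, 25)],
         [F(0), F(1), F(9, 10)],
         [F(0), F(0), F(1)]]
B_inv = [[F(-45, 38), F(10, 19)],
         [F(-25, 19), F(-10, 19)]]

# Shift 0.1 of transition probability mass in row 1 from state 3 to state 2.
row_delta = [F(0), F(1, 10), F(-1, 10)]
S_new = delta_update_row(A_inv[0][2], A_inv, B_inv, 1, row_delta)
print(float(S_new))  # 0.75
```

For this example, a direct computation with the changed matrix gives $S'_{(1,3)} = 0.3 + 0.5 \times 0.9 = 0.75$, which agrees with the incremental result.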

IV. EXPERIMENTAL RESULTS AND DISCUSSION

A series of reliability evaluation test cases have been

performed with different problem sizes. For each case, a

transition matrix is generated to satisfy the requirements

of an absorbing DTMC. Then a random position of the

transition matrix is modified, and the corresponding change

of the system reliability (i.e. S(1,n)Rn) is computed using

the conventional computation (i.e. by (I − Q)−1) and the

presented incremental evaluation approach. The size of the

transition matrix is varied from 20 to 1000, and 100 experiments have been carried out for each size. The experiments have been implemented in Java. The CPU cycles required for each computation are recorded for comparison, as a more processor-independent measure than execution times (see footnote 1).

Figure 2: Growth of CPU demand with the problem size. Panels: (a) Conventional Evaluation and (b) Incremental Evaluation, each plotting CPU cycles against the size of the matrix (n).

Figure 2 illustrates the comparative growth of CPU cycles with respect to the size of the problem for the two methods of computation. It should be highlighted that the Y axis scale in Figure 2a extends up to 5 × 10^9 while the corresponding values in Figure 2b are less than 6000. In addition to the significantly lower CPU demand of the new approach compared to the conventional computation, it can be seen that the conventional reliability computation exhibits steep, superlinear growth (consistent with the O(n^3) cost of matrix inversion), while the processing required in the incremental method grows far more slowly with the problem size.

Detailed statistical indexes of the two methods and the gain achieved by using the Δ evaluation method are presented in Table I. The quartiles being very close to the average and the small standard deviations of the results confirm that the gain is stable and promising. It can also be seen that as the size of the model grows, the new method provides a significant computational advantage over the conventional method. The final section in the table indicates the statistics of the difference in the values obtained by the two computation methods, as evidence of the accuracy of the method. It should be noted that the maximum difference, on the order of 10^-15, is due to the rounding of floating point numbers in the computation. Since the conventional evaluation with matrix inversion has more floating point operations, we believe that our results have fewer floating point errors.

Table I: Statistical indexes of experimental results (columns give the matrix size n)

| Measure | Statistic | 20 | 40 | 60 | 80 | 100 | 200 | 400 | 600 | 800 | 1000 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Conventional Evaluation (CPU cycles) | Min | 52,660 | 238,787 | 710,914 | 1,490,762 | 2.92×10^6 | 2.39×10^7 | 1.84×10^8 | 6.23×10^8 | 2.08×10^9 | 4.79×10^9 |
| | 1st Quartile | 54,319 | 253,069 | 728,794 | 1,583,878 | 2.95×10^6 | 2.41×10^7 | 1.85×10^8 | 6.32×10^8 | 2.09×10^9 | 4.82×10^9 |
| | Average | 55,249 | 273,275 | 748,734 | 1,613,477 | 3.00×10^6 | 2.45×10^7 | 1.89×10^8 | 6.52×10^8 | 2.12×10^9 | 4.85×10^9 |
| | 3rd Quartile | 55,192 | 301,889 | 742,971 | 1,611,797 | 2.99×10^6 | 2.43×10^7 | 1.93×10^8 | 6.61×10^8 | 2.15×10^9 | 4.87×10^9 |
| | Max | 71,239 | 551,816 | 1,254,210 | 2,337,029 | 3.85×10^6 | 2.74×10^7 | 2.14×10^8 | 7.49×10^8 | 2.22×10^9 | 4.95×10^9 |
| | St. Dev | 2,663 | 41,685 | 63,716 | 92,448 | 1.21×10^5 | 8.76×10^5 | 6.89×10^6 | 2.92×10^7 | 4.06×10^7 | 3.82×10^7 |
| Δ Evaluation (CPU cycles) | Min | 1,746 | 1,746 | 1,816 | 1,537 | 1,816 | 3,003 | 4,051 | 4,400 | 4,400 | 4,260 |
| | 1st Quartile | 1,816 | 1,816 | 1,885 | 1,886 | 2,095 | 3,632 | 4,400 | 4,679 | 4,732 | 4,679 |
| | Average | 1,868 | 1,916 | 1,939 | 2,023 | 2,227 | 3,929 | 4,538 | 4,885 | 4,852 | 4,866 |
| | 3rd Quartile | 1,886 | 1,956 | 1,973 | 2,095 | 2,322 | 4,190 | 4,679 | 4,959 | 4,976 | 5,029 |
| | Max | 2,374 | 2,374 | 2,375 | 3,073 | 3,562 | 4,959 | 5,518 | 10,197 | 5,518 | 5,797 |
| | St. Dev | 78 | 110 | 108 | 205 | 285 | 397 | 230 | 602 | 223 | 258 |
| Gain (CPU cycles) | Min | 50,704 | 236,972 | 709,028 | 1,488,947 | 2.91×10^6 | 2.39×10^7 | 1.84×10^8 | 6.23×10^8 | 2.08×10^9 | 4.79×10^9 |
| | 1st Quartile | 52,451 | 251,236 | 726,838 | 1,581,939 | 2.95×10^6 | 2.41×10^7 | 1.85×10^8 | 6.32×10^8 | 2.09×10^9 | 4.82×10^9 |
| | Average | 53,380 | 271,358 | 746,795 | 1,611,455 | 3.00×10^6 | 2.45×10^7 | 1.89×10^8 | 6.52×10^8 | 2.12×10^9 | 4.85×10^9 |
| | 3rd Quartile | 53,307 | 300,055 | 741,050 | 1,609,877 | 2.99×10^6 | 2.43×10^7 | 1.93×10^8 | 6.61×10^8 | 2.15×10^9 | 4.87×10^9 |
| | Max | 69,353 | 549,650 | 1,251,905 | 2,334,584 | 3.85×10^6 | 2.74×10^7 | 2.14×10^8 | 7.49×10^8 | 2.22×10^9 | 4.95×10^9 |
| | St. Dev | 2,658 | 41,657 | 63,663 | 92,365 | 1.21×10^5 | 8.76×10^5 | 6.89×10^6 | 2.92×10^7 | 4.06×10^7 | 3.82×10^7 |
| Floating point err. | Min | −1.33×10^-15 | −8.88×10^-16 | −1.78×10^-15 | −1.33×10^-15 | −1.22×10^-15 | −1.11×10^-15 | −3.11×10^-15 | −2.22×10^-15 | −4.22×10^-15 | −3.77×10^-15 |
| | 1st Quartile | −2.22×10^-16 | −2.22×10^-16 | −4.44×10^-16 | −2.22×10^-16 | −2.22×10^-16 | −3.33×10^-16 | −6.66×10^-16 | −4.44×10^-16 | −1.22×10^-15 | −1.11×10^-15 |
| | Average | −1.33×10^-17 | 4.33×10^-17 | −8.55×10^-17 | 3.77×10^-17 | 3.44×10^-17 | 1.37×10^-16 | −1.03×10^-16 | 1.48×10^-16 | −2.33×10^-16 | −1.63×10^-16 |
| | 3rd Quartile | 2.22×10^-16 | 3.33×10^-16 | 2.22×10^-16 | 3.33×10^-16 | 3.33×10^-16 | 5.55×10^-16 | 4.44×10^-16 | 8.05×10^-16 | 1.03×10^-15 | 6.66×10^-16 |
| | Max | 6.66×10^-16 | 1.11×10^-15 | 1.55×10^-15 | 1.55×10^-15 | 1.78×10^-15 | 2.00×10^-15 | 3.11×10^-15 | 2.89×10^-15 | 2.44×10^-15 | 4.00×10^-15 |
| | St. Dev | 3.64×10^-16 | 3.74×10^-16 | 5.41×10^-16 | 5.52×10^-16 | 5.38×10^-16 | 6.13×10^-16 | 1.07×10^-15 | 1.07×10^-15 | 1.53×10^-15 | 1.58×10^-15 |

Footnote 1: Complete code and experimental data are available at http://www.ict.swin.edu.au/personal/imeedeniya/experiments/deltacheung/

V. RELATED WORK

Reliability evaluation based on software architectures is an

active research area and several different models have been

developed over the past decades. Comprehensive surveys

on the existing approaches can be found in [15], [21],

[28]. The research in this paper complements these existing

reliability evaluation approaches by providing an efficient means to re-evaluate an architecture in the event of parameter changes of the model or the occurrence of re-configurations. The presented method is based on Cheung’s model [8], and can also be used in a similar way to his approach to sensitivity analysis. However, the use of the first order partial derivative of system reliability with respect to component reliability (i.e. $\partial R / \partial R_i$)

only provides an indication of how the system reliability

would change for a small change in component reliability

from its original estimation. This method of sensitivity

calculation lacks power in obtaining the bounds of system

reliability considering an allowable variance in component

reliability, which is a common practical requirement. Goseva-Popstojanova et al. [20] presented an analytical method called the method of moments, which addresses the above issue by extending the analysis with variance. Cortellessa et al. [10] have applied the sensitivity analysis considering the effects of error propagation among components. Our work complements the above works by enabling the variance analysis by

means of efficient computation with respect to an analytical

solution of traditional approaches which is computationally

expensive for larger systems. We presented a Δ evaluation

technique which efficiently calculates an impact of a change.

As a result, it permits parameter sweeps at significantly

lower cost and enables bottleneck analysis to identify most

sensitive components to the system reliability. To the best

of our knowledge, the specific application of linear algebra

as presented in this paper for efficient reliability calculation

in software architecture domain is novel.

Currently most architecture based reliability evaluation

models are based on Cheung’s models and consequently

our research also complements and supports these more

advanced models. As an example, Wang et al. [44], [45] pre-

sented the use of Cheung’s model for reliability evaluation in

different architectural styles, which can directly benefit from

our approach. Gokhale et al. [16] have used the model as the

basis for their extension to hierarchical reliability evaluation.

Goseva-Popstojanova et al. [21] illustrated extensive use of

the model in their reliability evaluation survey, and have

also elaborated the applicability of the model with the

extensions of “methods of moments” [20]. A scenario based

extension of the Cheung’s model has been presented by

Rodrigues et al. [41] which enables the construction of high

level system model using characteristics of basic message

sequence charts. Significant applicability of Cheung’s model

in addressing persistent challenges in architecture evaluation

is also evident in many recent approaches on reliability

and also performance prediction of component-based sys-

tems [5], [7], [10], [32], [40], [42].

Outside of the reliability prediction research, approaches

on parametric probabilistic model checking [12], [26], [34]

are closely related. These approaches aim to obtain the

probability of reaching a specific state when the underlying

DTMC is parametric. Lanotte et al. [34] have used the

parametric DTMCs to check for maximum or minimum

reachability of states with respect to parameters, such as

transition probabilities. The regular expression approach of

Daws [12] to computing reachable probability has been

complemented by Hahn et al. [26] with the introduction

of partial function evaluations which are more efficient.

Our approach of Δ evaluation exhibits a synergy with the

domain of parametric model checking as it is applicable to

effectively computing reachability with respect to a change

in the Markov model as well.

VI. CONCLUSIONS AND FUTURE WORK

In this paper, we have introduced an efficient proce-

dure to re-compute architecture-based reliability metrics

with Cheung’s model when small changes in the architec-

ture are made. We presented the formal derivation of the

simplified reliability computation with respect to a given

change in the model. This procedure is called Δ evaluation,

and is based on obtaining a computational advantage by

reusing previous evaluation results. We use matrix theory

for the simplification of reliability computations, especially

Sylvester’s determinant theorem. In this paper, the derivation

is focused on single element changes in the transition matrix,

which limits the applicability to single component reliability

changes. An experimental evaluation shows that the time

needed for reliability re-evaluation is significantly reduced.

This reduction depends on the size of the underlying Markov

model. Due to the exponential grow in the conventional

reliability calculation, the comparative advantage of our

technique significantly increases when it is applied to larger

models.
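The idea of reusing a previous evaluation after a single-element change can be illustrated with a minimal sketch. It is not the derivation from this paper: it assumes a generic square matrix A (for example I − Q in a Cheung-style model) and uses the rank-one form of Sylvester’s determinant theorem, det(A + δ·e_i·e_jᵀ) = det(A)·(1 + δ·(A⁻¹)_ji). The function names `inverse_and_det` and `delta_det` are hypothetical, and exact fractions are used only to make the comparison with a full recomputation exact.

```python
from fractions import Fraction


def inverse_and_det(matrix):
    """Gauss-Jordan elimination with exact fractions.

    Returns (inverse, determinant) of a square matrix given as a list of rows.
    Raises ValueError if the matrix is singular.
    """
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det  # a row swap flips the sign of the determinant
        p = aug[col][col]
        det *= p  # the determinant is the signed product of the pivots
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    inverse = [row[n:] for row in aug]
    return inverse, det


def delta_det(det_a, inv_a, i, j, delta):
    """Determinant of A after adding `delta` to element (i, j).

    Reuses the cached determinant and inverse of the unchanged A, so no
    new elimination is needed (assumption: A itself is invertible).
    """
    return det_a * (1 + delta * inv_a[j][i])


# Illustrative 3x3 matrix A = I - Q; the numbers are made up for this sketch.
A = [
    [Fraction(1), Fraction("-0.6"), Fraction("-0.3")],
    [Fraction(0), Fraction(1), Fraction("-0.8")],
    [Fraction(0), Fraction(0), Fraction(1)],
]
inv_A, det_A = inverse_and_det(A)           # evaluated once
updated = delta_det(det_A, inv_A, 2, 0, Fraction("-0.1"))  # cheap re-evaluation
print(updated)
```

The cached inverse describes the unchanged matrix only; after a change is accepted it would have to be refreshed (for instance with a Sherman-Morrison update) before the next Δ step.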

In our future work, we aim to provide the complete derivation for multiple changes in a row of the transition matrix. To extend this approach, appropriate c and r vectors need to be selected. Specifically, the row vector r has to describe the applied changes, and the column vector c needs to be filled with zeros except for the one entry that points to the row vector, as defined in Section III. This extension will improve the applicability of our approach to a broader range of change scenarios.

Furthermore, we aim to transfer the presented Δ evaluation technique to other reliability and performance prediction methods that use Discrete Time Markov Chain (DTMC) based abstractions [7], [21], [32], [40], [42], [45]. Based on the results presented in this paper, the benefits of applying the Δ evaluation technique to these prediction methods are evident. Specifically, the new technique will improve time efficiency in runtime evaluation and design space exploration. The Δ evaluation technique could also be applied to fault tree analysis, which likewise evaluates systems under uncertainty [14]. A further research direction is to apply Δ evaluation to probabilistic model checking problems [4], [33]. This would support the current research on parametric model checking [26], [34] and improve the performance of model checking experiments with variables. As a result, “what if” studies such as pFMEA [3], [24], which are based on probabilistic model checking, will benefit, since they require multiple model checking runs with similar probabilistic models.
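As a rough illustration of the row-change extension mentioned at the start of this future-work discussion, the following worked identity shows how a column vector c and a row vector r could enter the determinant update. The notation is an assumption made for this sketch and is not taken from Section III: A denotes the matrix whose determinant is needed, e_i the i-th unit column vector, and δ_k the change applied to the k-th element of row i.

```latex
% Assumed notation: A is n x n and invertible, c = e_i selects the changed row,
% r holds the changes applied to that row.
\begin{align}
  c &= e_i, \qquad r = (\delta_1, \dots, \delta_n) \\
  \det(A + c\,r) &= \det(A)\,\det\bigl(I_n + A^{-1} c\, r\bigr) \\
                 &= \det(A)\,\bigl(1 + r\,A^{-1} c\bigr)
                    && \text{(Sylvester's determinant theorem)} \\
                 &= \det(A)\,\Bigl(1 + \sum_{k=1}^{n} \delta_k\,(A^{-1})_{k i}\Bigr)
\end{align}
```

With only one non-zero δ_j, the sum collapses to a single term, δ_j·(A⁻¹)_ji, which corresponds to the single-element case.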

ACKNOWLEDGEMENT

This original research was proudly supported by the Commonwealth of Australia, through the Cooperative Research Center for Advanced Automotive Technology (project C4-501: Safe and Reliable Integration and Deployment Architectures for Automotive Software Systems).

REFERENCES

[1] A. Aleti, S. Bjornander, L. Grunske, and I. Meedeniya, “ArcheOpterix: An extendable tool for architecture optimization of AADL models,” in Model-based Methodologies for Pervasive and Embedded Software (MOMPES). ACM and IEEE Digital Libraries, 2009, pp. 61–71.

[2] A. Aleti, L. Grunske, I. Meedeniya, and I. Moser, “Let the ants deploy your software - an ACO based deployment optimisation strategy,” in Automated Softw. Eng. (ASE), 2009, pp. 505–509.

[3] H. Aljazzar, M. Fischer, L. Grunske, M. Kuntz, F. Leitner-Fischer, and S. Leue, “Safety analysis of an airbag system using probabilistic FMEA and probabilistic counterexamples,” in Sixth International Conference on the Quantitative Evaluation of Systems (QEST’09). IEEE Computer Society, 2009, pp. 299–308.

[4] C. Baier, B. R. Haverkort, H. Hermanns, and J.-P. Katoen, “Model-checking algorithms for continuous-time Markov chains,” IEEE Transactions on Software Engineering, vol. 29, no. 6, pp. 524–541, 2003.

[5] S. Becker, L. Grunske, R. Mirandola, and S. Overhage, “Performance prediction of component-based systems - a survey from an engineering perspective,” in Architecting Systems with Trustworthy Components, ser. Lecture Notes in Computer Science (LNCS), vol. 3938. Springer, 2006, pp. 169–192.

[6] H.-G. Beyer and B. Sendhoff, “Robust optimization - a comprehensive survey,” Computer Methods in Applied Mechanics and Engineering, vol. 196, no. 33-34, pp. 3190–3218, 2007.

[7] F. Brosch and B. Zimmerova, “Design-Time Reliability Prediction for Software Systems,” in Proceedings of the International Workshop on Software Quality and Maintainability (SQM’09), 2009, pp. 70–74.

[8] R. C. Cheung, “A user-oriented software reliability model,” IEEE Transactions on Software Engineering, vol. 6, no. 2, pp. 118–125, 1980.

[9] D. W. Coit and A. E. Smith, “Reliability optimization of series-parallel systems using a genetic algorithm,” IEEE Transactions on Reliability, vol. 45, no. 2, pp. 254–260, 1996.

[10] V. Cortellessa and V. Grassi, “A modeling approach to analyze the impact of error propagation on reliability of component-based systems,” in Component-Based Software Engineering (CBSE’07), vol. 4608, 2007, pp. 140–156.

[11] P. J. Davis, The mathematics of matrices. Blaisdell Pub. Co., 1984.

[12] C. Daws, “Symbolic and parametric model checking of discrete-time Markov chains,” in Theoretical Aspects of Computing (ICTAC’04), ser. Lecture Notes in Computer Science, vol. 3407. Springer, 2004, pp. 280–294.

[13] I. Epifani, C. Ghezzi, R. Mirandola, and G. Tamburrelli, “Model evolution by run-time parameter adaptation,” in International Conference on Software Engineering (ICSE’09). IEEE Computer Society, 2009, pp. 111–121.

[14] M. Forster and M. Trapp, “Fault tree analysis of software-controlled component systems based on second-order probabilities,” in Proceedings of the 20th International Symposium on Software Reliability Engineering (ISSRE ’09). Washington, DC, USA: IEEE Computer Society, 2009, pp. 146–154.

[15] S. S. Gokhale, “Architecture-based software reliability analysis: Overview and limitations,” IEEE Trans. Dependable Sec. Comput., vol. 4, no. 1, pp. 32–40, 2007.

[16] S. S. Gokhale and K. S. Trivedi, “Reliability prediction and sensitivity analysis based on software architecture,” in International Symposium on Software Reliability Engineering (ISSRE’02). IEEE Computer Society, 2002, pp. 64–78.

[17] K. Goseva-Popstojanova, M. Hamill, and R. Perugupalli, “Large empirical case study of architecture-based software reliability,” in International Symposium on Software Reliability Engineering (ISSRE’05). IEEE Computer Society, 2005, pp. 43–52.

[18] K. Goseva-Popstojanova, M. Hamill, and X. Wang, “Adequacy, accuracy, scalability, and uncertainty of architecture-based software reliability: Lessons learned from large empirical case studies,” in International Symposium on Software Reliability Engineering (ISSRE’06). IEEE Computer Society, 2006, pp. 197–203.

[19] K. Goseva-Popstojanova and S. Kamavaram, “Assessing uncertainty in reliability of component-based software systems,” in International Symposium on Software Reliability Engineering (ISSRE’03). IEEE Computer Society, 2003, pp. 307–320.

[20] K. Goseva-Popstojanova and S. Kamavaram, “Software reliability estimation under uncertainty: Generalization of the method of moments,” in 8th IEEE International Symposium on High-Assurance Systems Engineering (HASE 2004). IEEE Computer Society, 2004, pp. 209–218.

[21] K. Goseva-Popstojanova and K. S. Trivedi, “Architecture-based approach to reliability assessment of software systems,” Performance Evaluation, vol. 45, no. 2-3, pp. 179–204, 2001.

[22] L. Grunske, “Identifying “good” architectural design alternatives with multi-objective optimization strategies,” in International Conference on Software Engineering, ICSE. ACM, 2006, pp. 849–852.

[23] L. Grunske, “Early quality prediction of component-based systems - A generic framework,” Journal of Systems and Software, vol. 80, no. 5, pp. 678–686, 2007.

[24] L. Grunske, R. Colvin, and K. Winter, “Probabilistic model-checking support for FMEA,” in Fourth International Conference on the Quantitative Evaluation of Systems (QEST’07). IEEE Computer Society, 2007, pp. 119–128.

[25] L. Grunske and P. Zhang, “Monitoring probabilistic properties,” in Proceedings of the 7th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT International Symposium on Foundations of Software Engineering, FSE/ESEC 2009, H. van Vliet and V. Issarny, Eds. ACM, 2009, pp. 183–192.

[26] E. M. Hahn, H. Hermanns, and L. Zhang, “Probabilistic reachability for parametric Markov models,” in SPIN: Model Checking Software, 16th International SPIN Workshop, ser. Lecture Notes in Computer Science, vol. 5578. Springer, 2009, pp. 88–106.

[27] D. A. Harville, Matrix Algebra From a Statistician’s Perspective. Springer, 2008.

[28] A. Immonen and E. Niemela, “Survey of reliability and availability prediction methods from the viewpoint of software architecture,” Software and System Modeling, vol. 7, no. 1, pp. 49–65, 2008.

[29] S. Islam, R. Lindstrom, and N. Suri, “Dependability driven integration of mixed criticality SW components,” in International Symposium on Object/component/service-oriented Real-time distributed Computing, ISORC. IEEE Computer Society, 2006, pp. 485–495.

[30] K. Kanoun, M. Kaaniche, and J.-C. Laprie, “Qualitative and quantitative reliability assessment,” IEEE Software, vol. 14, no. 2, pp. 77–87, 1997.

[31] K. Kanoun and J.-C. Laprie, “Software reliability trend analyses from theoretical to practical considerations,” IEEE Trans. Software Eng., vol. 20, no. 9, pp. 740–747, 1994.

[32] H. Koziolek and F. Brosch, “Parameter dependencies for component reliability specifications,” in 6th International Workshop on Formal Engineering approaches to Software Components and Architectures (FESCA), ser. Electronic Notes in Theoretical Computer Science, vol. 253, no. 1. Elsevier, 2009, pp. 23–38.

[33] M. Z. Kwiatkowska, G. Norman, and D. Parker, “Probabilistic model checking in practice: case studies with PRISM,” SIGMETRICS Performance Evaluation Review, vol. 32, no. 4, pp. 16–21, 2005.

[34] R. Lanotte, A. Maggiolo-Schettini, and A. Troina, “Parametric probabilistic transition systems for system design and analysis,” Formal Aspects of Computing, vol. 19, no. 1, pp. 93–109, 2007.

[35] M. Marseguerra, E. Zio, and L. Podofillini, “Multiobjective spare part allocation by means of genetic algorithms and Monte Carlo simulation,” Reliability Engineering and System Safety, vol. 87, no. 3, pp. 325–335, 2005.

[36] A. Martens and H. Koziolek, “Automatic, model-based software performance improvement for component-based software designs,” in 6th International Workshop on Formal Engineering approaches to Software Components and Architectures (FESCA). Elsevier, 2009.

[37] N. Medvidovic and S. Malek, “Software deployment architecture and quality-of-service in pervasive environments,” in Workshop on the Engineering of Software Services for Pervasive Environments, ESSPE. ACM, 2007, pp. 47–51.

[38] M. Nicholson, “Selecting a topology for safety-critical real-time control systems,” Ph.D. dissertation, Department of Computer Science, University of York, 1998.

[39] Y. Papadopoulos and C. Grante, “Evolving car designs using model-based automated safety analysis and optimisation techniques,” Journal of Systems and Software, vol. 76, no. 1, pp. 77–89, 2005.

[40] R. Reussner, H. W. Schmidt, and I. Poernomo, “Reliability prediction for component-based software architectures,” Journal of Systems and Software, vol. 66, no. 3, pp. 241–252, 2003.

[41] G. N. Rodrigues, D. S. Rosenblum, and S. Uchitel, “Using scenarios to predict the reliability of concurrent component-based software systems,” in Fundamental Approaches to Software Engineering (FASE’05), ser. Lecture Notes in Computer Science, M. Cerioli, Ed., vol. 3442. Springer, 2005, pp. 111–126.

[42] V. S. Sharma and K. S. Trivedi, “Quantifying software performance, reliability and security: An architecture-based approach,” Journal of Systems and Software, vol. 80, no. 4, pp. 493–509, 2007.

[43] K. S. Trivedi, Probability and Statistics with Reliability, Queuing, and Computer Science Applications. Englewood Cliffs, New Jersey: Prentice-Hall, 1982.

[44] W.-L. Wang, D. Pan, and M.-H. Chen, “Architecture-based software reliability modeling,” Journal of Systems and Software, vol. 79, no. 1, pp. 132–146, 2006.

[45] W.-L. Wang, Y. Wu, and M.-H. Chen, “An architecture-based software reliability model,” in Pacific Rim International Symposium on Dependable Computing (PRDC’99). IEEE Computer Society, 1999, pp. 143–150.
