Performability analysis of clustered systems with rejuvenation under varying workload

Performance Evaluation 64 (2007) 247–265

www.elsevier.com/locate/peva


Dazhi Wang a,∗, Wei Xie b, Kishor S. Trivedi c

a Department of Computer Science, Duke University, Durham, NC 27708, United States

b Bank of America, 9 West 57th Street, New York, NY 10019, United States

c Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, United States

Received 27 August 2005; received in revised form 11 March 2006

Available online 20 July 2006

Abstract

This paper develops time-based rejuvenation policies to improve the performability measures of a cluster system. Three rejuvenation policies, namely standard rejuvenation, delayed rejuvenation and mixed rejuvenation, are designed to improve the cluster’s performability under varying workload. Analytic models are built to evaluate these three policies. Since deterministic transitions are used in this paper and analytical models based on homogeneous continuous-time Markov chains (CTMC) do not allow non-exponential distributions, we utilize deterministic and stochastic Petri nets (DSPN), in which the underlying stochastic process is a Markov regenerative process (MRGP), to capture both exponential and deterministic distributions. System performability measures under these three rejuvenation policies are derived based on the DSPN models. We show that the mixed rejuvenation policy achieves the maximum performability among the three policies, which results in a 12% improvement in the system throughput in the example shown in this paper. The delayed rejuvenation is better than the standard rejuvenation with respect to the optimal job blocking probability and system throughput. For longer rejuvenation-triggering intervals, the standard rejuvenation yields a better result than delayed rejuvenation, while for shorter rejuvenation-triggering intervals the delayed rejuvenation policy outperforms the standard rejuvenation policy.

© 2006 Elsevier B.V. All rights reserved.

Keywords: Clustered system; Performability; Software rejuvenation; Stochastic Petri net

1. Introduction

Fast development of new technologies has led to a large number of critical commercial applications on the Internet. As users become dependent on these services, service failure or interruption can cause great loss for service providers. Therefore high availability as well as high performance have become increasingly important to satisfy more demanding quality of service (QoS) requirements. A widely adopted technique to significantly improve system availability and performance is clustering [10,29,30]. A cluster is a set of servers and related resources that act like a single system and provide high availability, load balancing and parallel processing. These servers (also known as nodes) are usually identical. If one node fails, another can act as a backup. Compared to the expensive high availability systems with proprietary tightly coupled hardware and software, cluster systems use commercially available computers networked in a loosely-coupled fashion, and provide high availability and performance in a cost-effective way.

∗ Corresponding author. E-mail addresses: [email protected] (D. Wang), [email protected] (W. Xie), [email protected] (K.S. Trivedi).

0166-5316/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.peva.2006.04.002

Although cluster systems can achieve much higher availability than a single server, their availability still largely depends on the availabilities of individual nodes in the cluster. Moreover, not all node failures are covered by fault tolerant mechanisms implemented in the cluster system. Unpredictable or undetectable node failures may cause the whole system to malfunction due to imperfect fault coverage [38] in the cluster and the consequences could be very serious. It has been reported that software faults and failures result in more outages in large computer systems than hardware faults (e.g., see [1,16,33]) and they cause huge economic losses or risk to human lives (e.g., see [26]). A large percentage of the software failures is due to software aging [3,19]. These software failures can be attributed to a gradual depletion in the resources needed by the running applications. Typical causes of aging are memory bloating and leaking, data corruption, storage fragmentation, accumulation of round-off errors, unreleased network/database connections and so on. It should be clear that software aging is closely related to workload, that is, the heavier the workload is, the faster the depletion of resources [36].

To counteract software aging, [17] proposed software rejuvenation as a proactive fault management technique aimed at postponing/preventing crash failures and performance degradation. It involves occasionally stopping the running software, cleaning its internal state and/or its environment and restarting it. This technique has been applied to individual nodes [2,11] as well as to cluster systems [5]. In cluster systems, because of the node-level redundancy, when one node is taken offline for rejuvenation, other nodes can be configured to continue the service. Therefore the node offline time is not counted as down time if handled properly in cluster systems, although the system capacity is temporarily reduced during this period. For cluster rejuvenation, [35] showed that using software rejuvenation can significantly improve the availability of cluster systems. Castelli et al. [5] developed algorithms to predict software aging and carry out rejuvenation on IBM xSeries servers. Rejuvenation in cable modem termination systems is analyzed using a Markovian model in [21]. However, most of the previous research concentrated on the impact of rejuvenation on cluster availability, not on system capacity and performability. It also did not consider the workload and failure rate variations caused by user behavior patterns. For instance, the servers will have heavier workload and shorter times to failure during the peak period. Since rejuvenation does introduce planned outages to individual nodes, it could be beneficial to differentiate between peak period and offpeak period in designing the rejuvenation strategy. Thus in contrast to the work in [5,21] and [35], we focus on the performability modeling of the cluster system with software rejuvenation, taking into consideration the impact of varying workload on the system performability. To deal with varying workload, new rejuvenation policies in addition to the standard policy are developed. Furthermore, in contrast with previous efforts that used Markovian models, we use deterministic and stochastic Petri nets and develop an interesting solution method.

In our previous work [37], system throughput under two rejuvenation policies (standard and delayed rejuvenation) was studied for cluster systems with varying workload. In this paper we extend our previous work in the following aspects:

1. The mixed rejuvenation policy is proposed in addition to standard and delayed rejuvenation to further improve the system performability under changing traffic patterns. Performability measures under these three rejuvenation policies are thoroughly compared, and the factors that influence the gain of each rejuvenation policy are identified and evaluated.

2. High fidelity models are developed to characterize system behavior. We analyze the system performability using the hierarchical modeling approach [32], in which various performability measures can be calculated, in contrast to the simplified model in our previous work, which focuses only on system throughput. Moreover, we take into account the impact of workload on system performance as well as on the server failure rate.

3. Modeling techniques based on deterministic and stochastic Petri net (DSPN) are applied in this paper for more accurate analysis. We assume that most distributions in our model are exponential. It is unreasonable, however, to assume that the rejuvenation-triggering interval and durations of peak and offpeak periods are exponentially distributed. For the deterministic rejuvenation-triggering interval, we take recourse to the device of stages and approximate the deterministic time to trigger rejuvenation by a 20-stage Erlang distribution, while for the deterministic transitions between peak and offpeak periods, we keep them as they are. Therefore the overall model becomes a Markov regenerative process (MRGP) [18]. In order to facilitate the construction of the MRGP, we resort to a higher level graphical formalism known as the deterministic and stochastic Petri net (DSPN) [20,24] with exactly one deterministic transition enabled at any time. We adopt the solution techniques in [13,15] so that the DSPN can be solved by transient analysis of the subordinated Markov chains. Stochastic Petri net package (SPNP), a software tool developed by our research group, is used for this purpose.

The rest of the paper is organized as follows. In Section 2, three rejuvenation policies are described for cluster systems with varying workload. In Section 3, we elaborate the corresponding DSPN models and analysis techniques. Section 4 presents the numerical results and comparisons for the proposed rejuvenation policies. Section 5 gives the concluding remarks.

2. Rejuvenation policies for clustered systems

We consider a cluster system with n identical nodes. These nodes could be either n physical boxes or n processes running in a single box, as long as they have the same configurations and may operate independently. The overall service is not interrupted if k or more out of n nodes are operational. E-business infrastructure, Web server clusters and application server clusters are examples of such cluster systems. Techniques such as session replication ensure the independent operability of each node in Web server clusters, that is, when one node is down, the other nodes can pick up its user sessions and resume the processing.

With no rejuvenation action applied to the software system, internal errors may naturally accumulate, available resources may be exhausted, and consequently unplanned outages may occur. This phenomenon is known as software aging, and it directly depends on workload. Following [17], we assume that each node has four basic states: robust state, failure-prone state, failure state, and rejuvenation state. To avoid the unexpected shutdown of individual nodes, rejuvenation actions are carried out according to certain policies. When a node is selected for rejuvenation, it is brought down and the total cluster system capacity is reduced. This is undesirable during the peak period when the workload is heavy.

In this paper, three rejuvenation policies for the n-node cluster system under varying workload are studied:

• Policy-A: Standard rejuvenation. Similar to the time-based rejuvenation in [5], policy-A is a simple and straightforward rejuvenation strategy:
  – In both peak period and offpeak period, the rejuvenation is triggered immediately after Tr time units have passed since the last rejuvenation epoch or the recovery from system failure. The rejuvenation is carried out node by node, taking one node in the cluster offline at a time, rejuvenating it, then bringing it back online, until all nodes are rejuvenated. Tr is called the rejuvenation trigger interval or time to trigger rejuvenation in this paper. This interval is fixed without regard to the current workload, and its value can be optimized for a given objective function.

• Policy-B: Delayed rejuvenation. Bringing functioning nodes down during the peak period for maintenance obviously compromises system capacity and throughput. Therefore, we attempt to achieve a higher system throughput by considering a modified rejuvenation policy:
  – In offpeak period, the rejuvenation policy is the same as policy-A.
  – In peak period, all nodes are only scheduled for rejuvenation if time Tr has passed since the last rejuvenation epoch or the last recovery from system failure. Nodes scheduled for rejuvenation still operate as usual; the actual rejuvenation starts as soon as the next offpeak period begins.

• Policy-C: Mixed rejuvenation. This policy combines the standard rejuvenation and the delayed rejuvenation policies. As shown in later sections, sometimes rejuvenation during the early part of the peak period helps improve the system throughput, while for rejuvenation scheduled during the latter part of the peak period, holding it off until the next offpeak period can achieve better performance. Therefore we propose the following rejuvenation policy:
  – In offpeak period, the rejuvenation policy is the same as policy-A.
  – In peak period, rejuvenation for each node will be scheduled if Tr time units have passed since the last rejuvenation epoch or the last recovery from system failure. Suppose the rejuvenation is scheduled at time s ∈ [0, T1], where T1 is the duration of the peak period. If s is less than a fixed value s0 ∈ [0, T1], rejuvenation will be carried out. Otherwise rejuvenation will be delayed until the next offpeak period begins. Therefore the rejuvenation will be done if scheduled early in the peak period, while it will be delayed if scheduled late in the peak period. The fixed value s0 is dependent on Tr, and in later sections we will show how to compute it.
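The decision that each policy takes when the Tr clock expires can be summarized in a few lines of code. The following is only an illustrative sketch: the names `Policy` and `rejuvenate_now` are hypothetical and do not come from the paper or from SPNP, and the sketch covers only the trigger decision, not the node-by-node rejuvenation itself.

```python
from enum import Enum


class Policy(Enum):
    STANDARD = "A"  # rejuvenate as soon as the clock expires
    DELAYED = "B"   # never start rejuvenation during the peak period
    MIXED = "C"     # start only during the first s0 time units of the peak


def rejuvenate_now(policy: Policy, in_peak: bool, s: float, s0: float) -> bool:
    """Return True if rejuvenation starts when the Tr clock expires.

    s  -- time elapsed since the start of the current peak period
          (ignored in the offpeak period)
    s0 -- threshold of the mixed policy, with 0 <= s0 <= T1
    A False result means the nodes are only scheduled and rejuvenation
    starts when the next offpeak period begins.
    """
    if not in_peak:
        return True  # all three policies behave like policy-A off peak
    if policy is Policy.STANDARD:
        return True
    if policy is Policy.DELAYED:
        return False
    return s < s0  # mixed: early in the peak -> rejuvenate, late -> delay
```

In this sketch, setting s0 = 0 makes the mixed policy behave exactly like the delayed policy, while s0 = T1 makes it behave like the standard policy for every s < T1.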



(a) Time-based rejuvenation. (b) Condition-based rejuvenation.

Fig. 1. State transition diagram for different rejuvenation policies.

Policy-A, policy-B and policy-C are all variants of time-based rejuvenation policies, which means the rejuvenation is performed after a certain amount of time elapses, without having to know the state of each node. In other words, in this paper we do not consider condition-based rejuvenation policies, which make rejuvenation decisions based on the observation and detection of the current state of each node. Compared to the time-based approach, condition-based rejuvenation has certain advantages in improving rejuvenation efficiency. However, it also adds extra cost to the system due to system monitoring and aging estimation.

3. Rejuvenation model

3.1. Basic rejuvenation model

First we introduce the basic rejuvenation model for a system with only one node. Fig. 1(a) shows the state transition diagram with the time-based rejuvenation policy. As a comparison, Fig. 1(b) shows the state transition diagram with the condition-based rejuvenation policy. The system has four states: up, failure-prone, down and rejuvenation state, which are represented in Fig. 1 as UP, FP, DN and RE, respectively. The transition time distribution from the UP state to the FP state is F1(t) and the transition time distribution from the FP state to the DN state is F2(t). As mentioned in the previous section, the time-based rejuvenation is to rejuvenate the system after Tr ∈ (0, ∞) time units have elapsed since the last rejuvenation or recovery from system failure, without considering whether or not the system is in the failure-prone state. Therefore the system can change state either from UP to RE or from FP to RE. For condition-based rejuvenation, in contrast, the transition to the RE state can occur only when the system is in the FP state, since the system can monitor whether or not it is in the FP state, thus avoiding unnecessary rejuvenation from the UP state.

In Fig. 1, we assume that all transition times are exponentially distributed except for the transitions to the RE state, which are deterministic. Therefore the model in Fig. 1(a) becomes a Markov regenerative process [12], and the model in Fig. 1(b) becomes a semi-Markov process [8]. In order to solve the Markov regenerative process in Fig. 1(a), we choose to approximate the deterministic transition using an r-stage Erlang distribution, so that the resulting process becomes a homogeneous continuous-time Markov chain (CTMC). However, the state space of the CTMC also grows with r due to the approximation. Moreover, if we wish to model a cluster with n nodes where each node has UP, FP, DN and RE states, the overall state space becomes unmanageable if we build the CTMC by hand. In order to avoid the manual construction of a high fidelity model, we may resort to a higher level formalism based on stochastic Petri nets (SPN) [9,28]. However, to model the periodic workload, Markovian Petri nets [7] are not powerful enough to capture the deterministic transitions between peak period and offpeak period. Hence we use the deterministic and stochastic Petri net (DSPN) [24] to model the clustered systems under varying workload and different rejuvenation policies.
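To make the difference between Fig. 1(a) and Fig. 1(b) concrete, the sketch below computes the state of a single node at the instant the time-based clock expires. It is an illustration under simplifying assumptions that go beyond the text: F1 and F2 are exponential with rates `lam1` and `lam2` (with `lam1 != lam2`), the node starts in UP at time 0, and no repair takes place before time t. The function name and the numerical rates are hypothetical.

```python
import math


def state_probs_at(t: float, lam1: float, lam2: float) -> dict:
    """Probabilities of UP, FP and DN at time t for a node that starts in UP.

    Assumes exponential UP->FP (rate lam1) and FP->DN (rate lam2),
    lam1 != lam2, and no repair or rejuvenation before t.
    """
    p_up = math.exp(-lam1 * t)
    # Solution of dP_FP/dt = lam1 * P_UP - lam2 * P_FP with P_FP(0) = 0.
    p_fp = lam1 / (lam2 - lam1) * (math.exp(-lam1 * t) - math.exp(-lam2 * t))
    return {"UP": p_up, "FP": p_fp, "DN": 1.0 - p_up - p_fp}


# Hypothetical rates (per hour) and trigger interval, for illustration only.
probs = state_probs_at(t=100.0, lam1=1 / 240, lam2=1 / 720)
```

Under these assumptions, `probs["UP"]` is the chance that a time-based rejuvenation at Tr = t hits a node that has not yet aged, which is exactly the rejuvenation that a condition-based policy would skip.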

3.2. Introduction to Petri nets

The deterministic and stochastic Petri net is an extension of the Petri net (PN) [31], which is a high level description language for formally specifying complex systems. A PN is a bipartite directed graph with two types of nodes: places and transitions. Each place may contain an arbitrary (natural) number of tokens. For a graphical representation, places are depicted as circles, transitions are shown as bars and tokens are indicated by dots or integers in the places. Each transition may have zero or more input arcs, coming from its input places, and zero or more output arcs, going to its output places. A transition is enabled if all of its input places have at least as many tokens as required by the multiplicities of the corresponding input arcs. When enabled, a transition can fire and will remove from each input place and add to each output place the number of tokens corresponding to the multiplicities of the input/output arcs. A marking depicts the state of a PN, which is characterized by the vector of tokens in all the places. With respect to a given initial marking, the reachability set is defined as the set of all markings reachable through any possible firing sequence of transitions, starting from the initial marking.
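The enabling and firing rule described above can be written down directly. The following minimal sketch represents a marking as a dictionary from place names to token counts and a transition by its input and output arc multiplicities; the helper names `is_enabled` and `fire`, and the toy place names, are hypothetical and are not part of SPNP.

```python
from typing import Dict

Marking = Dict[str, int]  # place name -> number of tokens


def is_enabled(marking: Marking, inputs: Dict[str, int]) -> bool:
    """A transition is enabled if every input place holds at least as many
    tokens as the multiplicity of the corresponding input arc."""
    return all(marking.get(place, 0) >= mult for place, mult in inputs.items())


def fire(marking: Marking, inputs: Dict[str, int],
         outputs: Dict[str, int]) -> Marking:
    """Return the marking reached by firing an enabled transition."""
    if not is_enabled(marking, inputs):
        raise ValueError("transition is not enabled in this marking")
    new = dict(marking)
    for place, mult in inputs.items():
        new[place] = new.get(place, 0) - mult   # remove tokens from inputs
    for place, mult in outputs.items():
        new[place] = new.get(place, 0) + mult   # add tokens to outputs
    return new


# Toy net: one transition moves a server from place "up" to place "fprone".
m0 = {"up": 2, "fprone": 0}
m1 = fire(m0, inputs={"up": 1}, outputs={"fprone": 1})
```

Repeatedly applying `fire` to every enabled transition, starting from the initial marking, enumerates the reachability set.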

To study the performance and dependability issues of systems, a firing delay is associated with each transition in the PN. This delay specifies the time that the transition remains enabled until it can actually fire. If all firing delays are exponentially distributed, the resulting net is called a stochastic Petri net (SPN) [9,28].

Generalized stochastic Petri nets (GSPNs) [25] extend the SPNs by allowing zero firing time for some transitions. Transitions with exponentially distributed firing times are called timed transitions, while the transitions with zero firing times are called immediate transitions. A marking in a GSPN is called vanishing if at least one immediate transition is enabled in it; otherwise it is called a tangible marking. For a given GSPN, an extended reachability graph is generated with the markings of the reachability set as the nodes and some stochastic information attached to the arcs, thus connecting the markings to each other. Under the condition that only a finite number of transitions can fire in finite time with non-zero probability, it can be shown that a given extended reachability graph can be reduced to a homogeneous continuous-time Markov chain (CTMC) [25]. GSPN also introduces inhibitor arcs. An inhibitor arc from a place to a transition disables the transition if the place contains at least as many tokens as the cardinality of the inhibitor arc. Graphically, an inhibitor arc is represented by a line terminated with a small circle.

In order to make more compact models of complex systems, several extensions are made to GSPN, leading to the stochastic reward net (SRN) [7]. In an SRN, each tangible marking can be assigned one or more reward rates. One of the most important features of SRN is its ability to allow extensive marking dependency. Parameters such as the firing rate of the timed transitions, the multiplicities of input/output arcs and the reward rate in a marking can be specified as functions of the number of tokens in any place in the SRN. Another important characteristic of SRN is the ability to express complex enabling/disabling conditions through guard functions. This can greatly simplify the graphical representations of complex systems. For an SRN, all the output measures are expressed in terms of the expected values of the reward rate functions. To get the performance and reliability/availability measures of a system, appropriate reward rates are assigned to its SRN. An SRN is automatically transformed into a Markov reward model (MRM), and the steady-state and/or transient numerical solution of the MRM then produces the required measures of the original SRN.
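As an illustration of how a reward rate assignment turns into an output measure, the sketch below solves a small Markov reward model in steady state and evaluates the expected reward rate as the sum of reward rates weighted by the steady-state probabilities. It is not the SPNP implementation: the solver `steady_state`, the three-state UP/FP/DN chain, and the numerical rates are all assumptions chosen for illustration.

```python
def steady_state(Q):
    """Solve pi * Q = 0 with sum(pi) = 1 for an irreducible CTMC generator Q."""
    n = len(Q)
    # Transpose the balance equations and replace the last one by the
    # normalisation condition sum(pi) = 1.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    A[n - 1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[i] / A[i][i] for i in range(n)]


# Hypothetical single node: UP -> FP -> DN -> UP with rates lam1, lam2, mu.
lam1, lam2, mu = 1.0, 2.0, 4.0
Q = [[-lam1, lam1, 0.0],
     [0.0, -lam2, lam2],
     [mu, 0.0, -mu]]
reward = [1.0, 1.0, 0.0]  # reward rate 1 when the node can serve, 0 when down

pi = steady_state(Q)
expected_reward = sum(r * p for r, p in zip(reward, pi))  # availability
```

With the reward rates chosen here the expected reward rate is the steady-state availability of the node; assigning, for example, a service rate to each state instead would yield a throughput measure from the same solution.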

If the stochastic PN contains only immediate transitions and timed transitions with exponentially distributed firing time, the underlying stochastic process is a homogeneous continuous-time Markov chain (CTMC). If the Petri net model contains at least one timed transition with generally distributed firing time, the underlying stochastic process becomes non-Markovian. However, for most models there are certain time points embedded in the underlying stochastic process at which the past history of the stochastic process is summarized in the current state, that is, the future evolution only depends on the present state entered when the regeneration point occurs. These points are called regeneration points, and the stochastic process that satisfies this property is called a Markov regenerative process (MRGP) [18]. The Petri net that generates the MRGP is called a Markov regenerative stochastic Petri net (MRSPN) [6]. If all general transitions are deterministic, this special class of MRSPN is called deterministic and stochastic Petri net (DSPN) [20,24]. Techniques such as Markov renewal theory [18] or supplementary variables [14] can be applied to solve DSPN models numerically. The deterministic transitions can also be approximated by Erlang distributions [23], in which case the DSPN becomes a Markovian stochastic Petri net and the solution techniques for Markov chains can be applied. We use this technique in this paper to approximate the deterministic rejuvenation-triggering interval by a 20-stage Erlang distribution. The deterministic switching time between peak and offpeak periods, however, is kept as it is; hence our model is a DSPN.
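How closely an Erlang distribution mimics a deterministic delay can be judged from its first two moments. The sketch below matches the mean of an r-stage Erlang distribution to a deterministic delay and reports the resulting coefficient of variation, which is 0 for a truly deterministic delay. The function name and the value used for the trigger interval are hypothetical; only the choice r = 20 comes from the text.

```python
import math


def erlang_approximation(t_r: float, r: int):
    """Stage rate, mean and coefficient of variation of an r-stage Erlang
    distribution whose mean matches a deterministic delay t_r."""
    stage_rate = r / t_r               # rate of each exponential stage
    mean = r / stage_rate              # equals t_r by construction
    variance = r / stage_rate ** 2     # equals t_r**2 / r
    cv = math.sqrt(variance) / mean    # equals 1 / sqrt(r)
    return stage_rate, mean, cv


# Hypothetical trigger interval of 100 time units with the 20 stages
# used in the paper.
stage_rate, mean, cv = erlang_approximation(t_r=100.0, r=20)
```

For r = 20 the coefficient of variation is 1/sqrt(20), roughly 0.22. Increasing r brings the approximation closer to the deterministic delay, at the price of a larger Markov chain.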

3.3. Model description

Fig. 2 shows the rejuvenation model for a clustered system. The upper-right part in Fig. 2 is the clock to trigger rejuvenation. The rejuvenation clock is triggered every Tr time units, and is modeled by the deterministic transition Tdet with constant firing time Tr. When Tdet fires, and if the immediate transition Tpolicy is enabled at that time, the token in Pclock will be moved to place Pstartrejuv, indicating the beginning of a rejuvenation activity. For standard rejuvenation, Tpolicy is always enabled, while for delayed rejuvenation, Tpolicy is disabled during the peak period, and for mixed rejuvenation, Tpolicy is enabled for the initial time duration of length s0 and disabled thereafter in the peak period. After the rejuvenation finishes, Treset will fire to return a token back to Pclock, thus beginning the next rejuvenation cycle. In order to make the model solvable by SPNP, we approximate the deterministic transition Tdet by an r-stage Erlang distribution, as shown in Fig. 3. This is achieved by storing r tokens in place Pclock, and replacing Tdet by an exponentially distributed timed transition Terlang with firing rate r/Tr. At the same time, we change the multiplicities to r for the output arc of Treset and the input arc of Tpolicy.

Fig. 2. Rejuvenation model for clustered system with periodic workload.

Without rejuvenation, each server in the clustered system has three states: up, failure-prone, and failed. Tokens in Pup represent the number of up servers in the cluster, while tokens in Pfprone represent the number of failure-prone servers. Tfprone and Tnodefail represent the state transitions from up to failure-prone and from failure-prone to failed, respectively. When a server fails, with probability c it goes to place Pnfdet and can be repaired successfully through the transition Trepair; with probability 1 − c the failure is not covered and leads to system failure. A system failure can also be caused by a common-mode failure, which is represented by transition Tcmode. When fewer than k of the n nodes are operational, this is also treated as a system failure. The entire system can be repaired through transition Tsysrepair. If a system failure occurs, immediate transitions T1–T7 fire and empty all tokens in the corresponding places through the variable cardinality input arcs into T1–T7, whose values are shown in Fig. 2. When the system is repaired, n tokens are restored into place Pup.


Fig. 3. Erlang approximation of the deterministic transition.

To carry out time-based rejuvenation, Tdet (Terlang in the approximation model) is enabled as long as the system is up and Pclock is not empty. After Tr time units have passed, if Tpolicy is enabled at that time, a token is put into Pstartrejuv, which enables immediate transitions Timm1 and Timm2 and starts the rejuvenation activity. Each time, the system randomly picks an up server or a failure-prone server, takes it offline, rejuvenates it, and then brings it back online. Firing of Timm1 means an up server is chosen, while firing of Timm2 means a failure-prone server is chosen. Trejuv1 models the time needed to rejuvenate an up server, and Trejuv2 models the time needed to rejuvenate a failure-prone server. This process is repeated until all servers in the cluster are rejuvenated. While the system is under rejuvenation, a rejuvenated server may become failure-prone and eventually fail. This is modeled by transitions from Prejuved to Pfpronerej and from Pfpronerej to Pnodefail, where tokens in Prejuved represent rejuvenated servers, and those in Pfpronerej represent rejuvenated servers that have become failure-prone during the rejuvenation. After all up servers and failure-prone servers have finished rejuvenation, immediate transition Treset is enabled, which returns the token in Pstartrejuv to Pclock (r tokens into Pclock for the approximation model), and all tokens in Prejuved and Pfpronerej are put into Pup and Pfprone, respectively (achieved through variable cardinality input/output arcs into/out of transitions Tflush and Tfpflush). This finishes the current system rejuvenation, and the next rejuvenation cycle begins. If a system failure occurs during the rejuvenation-triggering interval, the clock is reset through the firing of transition T8, as shown in the approximation model of Fig. 3, which moves all tokens in Perlang back into Pclock. This is indicated by variable cardinality input/output arcs into/out of T8. When the system is recovered, Tdet (Terlang in the approximation model) is enabled again by the guard function gavail.

The bottom-right part of Fig. 2 models the transitions between the peak periods and offpeak periods. A token in Ppeak means the system is in a peak period, and a token in Poffpeak means it is in an offpeak period. Transitions from peak to offpeak and from offpeak to peak are represented by the firing of the deterministic transitions Tpeak and Toffpeak, respectively. Client requests arrive at different rates in peak periods and in offpeak periods. Because Tpeak and Toffpeak remain deterministic, the net is still a DSPN even after the Erlang approximation of transition Tdet.

The guard functions and arc multiplicities are shown in Fig. 2. The multiplicity of an arc is 1 if not specified.
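The enabling rule for Tpolicy described in the text can be sketched as a small guard function. This is only an illustration, not the guard defined in Fig. 2: the function name, the policy labels and the `time_in_peak` argument are hypothetical, and treating the boundary instant `time_in_peak == s0` as disabled is an assumption.

```python
def tpolicy_enabled(policy: str, in_peak: bool, time_in_peak: float, s0: float) -> bool:
    """Hypothetical guard for Tpolicy under the three rejuvenation policies.

    policy       -- "standard", "delayed" or "mixed" (labels chosen for this sketch)
    in_peak      -- True if the workload is currently in the peak period
    time_in_peak -- time elapsed since the current peak period started
    s0           -- length of the initial peak sub-interval in which the
                    mixed policy still allows rejuvenation
    """
    if policy == "standard":
        return True                      # always enabled
    if policy == "delayed":
        return not in_peak               # disabled during the peak period
    if policy == "mixed":
        # enabled in offpeak, and in the first s0 time units of the peak
        return (not in_peak) or time_in_peak < s0
    raise ValueError(f"unknown policy: {policy}")


# Example: 3 h into a peak period, with s0 = 2 h
decisions = {p: tpolicy_enabled(p, True, 3.0, 2.0)
             for p in ("standard", "delayed", "mixed")}
```

With this rule, a mixed policy with s0 ≤ 0 behaves like delayed rejuvenation, and one with s0 larger than the peak duration behaves like standard rejuvenation.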

Transition rates

Since a busy server is more likely to fail than an idle server, we capture this behavior by varying the server's transition rates from up to failure-prone and from failure-prone to failed as follows. Suppose there are x requests in a cluster with i servers, and the requests are allocated to the servers in a round-robin manner. Then for each server the firing rate fx of a failure-related transition (either from up to failure-prone or from failure-prone to failed) is set to

$$
f_x =
\begin{cases}
f_{\mathrm{idle}}, & \text{if } x = 0 \\
f_{\mathrm{busy}}, & \text{if } x \ge i \\
f_{\mathrm{idle}} \cdot (1 - x/i) + f_{\mathrm{busy}} \cdot x/i, & \text{if } 0 < x < i
\end{cases}
\tag{1}
$$

where fidle is the rate of the corresponding transition when the server is idle, and fbusy is the rate of the corresponding transition when the server is busy. The transition from up to failure-prone and the transition from failure-prone to failed may have different fbusy and fidle values.
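A minimal sketch of Eq. (1) follows. The function name and argument names are illustrative; the example values are the βidle and βbusy rates of Table 3 (per hour) with i = 4 servers.

```python
def failure_rate(x: int, i: int, f_idle: float, f_busy: float) -> float:
    """Per-server rate f_x of a failure-related transition, following Eq. (1).

    x -- number of requests in the cluster
    i -- number of servers sharing the requests (must be positive)
    """
    if i <= 0:
        raise ValueError("i must be positive")
    if x <= 0:
        return f_idle                    # no requests: every server is idle
    if x >= i:
        return f_busy                    # at least one request per server
    busy_fraction = x / i                # fraction of servers holding a request
    return f_idle * (1 - busy_fraction) + f_busy * busy_fraction


# Example: rates for x = 0..5 requests with i = 4 servers
rates = [failure_rate(x, 4, 1 / 72, 1 / 24) for x in range(6)]
```

The rate interpolates linearly between the idle and busy values and saturates at the busy value once every server holds at least one request.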

Given a marking of the DSPN model in Fig. 2, the number of operational servers in the cluster can be determined using Eq. (4), and the pmf of the number of requests under that marking can be computed from the corresponding M/M/i/m+i queuing model. Therefore the average transition rate for each of the failure-related transitions under


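Table 1
Rates for some transitions in the rejuvenation model

| Transition | Rate |
| --- | --- |
| Tfprone | $\left[\beta_{\mathrm{idle}} \cdot p_0 + \sum_{x=1}^{i}\left(\beta_{\mathrm{idle}} \cdot \frac{i-x}{i} + \beta_{\mathrm{busy}} \cdot \frac{x}{i}\right) \cdot p_x + \sum_{x=i+1}^{m+i} \beta_{\mathrm{busy}} \cdot p_x\right] \cdot \#P_{\mathrm{up}}$ |
| Tfpronerej | $\left[\beta_{\mathrm{idle}} \cdot p_0 + \sum_{x=1}^{i}\left(\beta_{\mathrm{idle}} \cdot \frac{i-x}{i} + \beta_{\mathrm{busy}} \cdot \frac{x}{i}\right) \cdot p_x + \sum_{x=i+1}^{m+i} \beta_{\mathrm{busy}} \cdot p_x\right] \cdot \#P_{\mathrm{rejuved}}$ |
| Tnodefailrej | $\left[\gamma_{\mathrm{idle}} \cdot p_0 + \sum_{x=1}^{i}\left(\gamma_{\mathrm{idle}} \cdot \frac{i-x}{i} + \gamma_{\mathrm{busy}} \cdot \frac{x}{i}\right) \cdot p_x + \sum_{x=i+1}^{m+i} \gamma_{\mathrm{busy}} \cdot p_x\right] \cdot \#P_{\mathrm{fpronerej}}$ |
| Tnodefail | $\left[\gamma_{\mathrm{idle}} \cdot p_0 + \sum_{x=1}^{i}\left(\gamma_{\mathrm{idle}} \cdot \frac{i-x}{i} + \gamma_{\mathrm{busy}} \cdot \frac{x}{i}\right) \cdot p_x + \sum_{x=i+1}^{m+i} \gamma_{\mathrm{busy}} \cdot p_x\right] \cdot \#P_{\mathrm{fprone}}$ |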

that marking can be written as

$$
f = \sum_{x=0}^{m+i} f_x \cdot p_x \tag{2}
$$

where px is the probability that there are x requests in the system.

Using Eqs. (1) and (2), the transition rates for the DSPN in Fig. 2 are calculated in Table 1. βidle (βbusy) is the transition rate from the up state to the failure-prone state when the server is idle (busy), while γidle (γbusy) is the transition rate from the failure-prone state to the failed state when the server is idle (busy).
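The marking-dependent rates of Table 1 can be sketched as the average of Eq. (1) over a request pmf, multiplied by the token count of the input place. The function names are illustrative, and the three-point pmf in the example is made up for the sketch; in the model the pmf comes from the M/M/i/m+i queue.

```python
def average_rate(p, i, f_idle, f_busy):
    """Eq. (2): average of the per-server rate f_x of Eq. (1) over the pmf p.

    p[x] is the probability of x requests in the system, x = 0..m+i.
    """
    total = 0.0
    for x, px in enumerate(p):
        if x == 0:
            fx = f_idle
        elif x >= i:
            fx = f_busy
        else:
            fx = f_idle * (1 - x / i) + f_busy * x / i
        total += fx * px
    return total


def marking_dependent_rate(p, i, f_idle, f_busy, tokens):
    """Table 1 pattern: average per-server rate times the tokens in the input place."""
    return average_rate(p, i, f_idle, f_busy) * tokens


# Hypothetical pmf over 0..2 requests with i = 2 servers, both of them up
example_pmf = [0.5, 0.25, 0.25]
rate_tfprone = marking_dependent_rate(example_pmf, 2, 1 / 72, 1 / 24, tokens=2)
```

The same two functions cover all four rows of Table 1 by swapping the β values for γ values and changing which place supplies `tokens`.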

3.4. Performability analysis

Integrating system performance and dependability in a single model often causes largeness and stiffness problems [4]. In order to compute performance measures for the cluster system under failures and rejuvenations, we use the hierarchical modeling approach in which the upper model describes the dependability of the clustered system, while the lower model characterizes the system performance under each state of the dependability model [27,32]. Since the time scale of performance-related events is at least two orders of magnitude separated from that of reliability-related events, we can obtain the steady-state performance measures for each state of the dependability model and assign these values as reward rates to that state. The overall system performance is computed as

$$
\mathit{Perf} = \sum_{c \in \Omega} r_c \cdot \pi_c \tag{3}
$$

where πc is the probability that the MRGP underlying the DSPN model in Fig. 2 is in state c, rc is the system performance index under state c, and Ω is the state space of the MRGP.
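Eq. (3) is a probability-weighted sum of per-state rewards. The sketch below illustrates it on a made-up three-state example; the state probabilities and reward values are hypothetical and are not results from the model.

```python
def expected_reward(pi, r):
    """Eq. (3): sum over states c of r_c * pi_c."""
    if len(pi) != len(r):
        raise ValueError("pi and r must have the same length")
    return sum(rc * pc for rc, pc in zip(r, pi))


# Hypothetical states: all servers up, degraded, system failed
state_probabilities = [0.90, 0.08, 0.02]
throughput_rewards = [3000.0, 2500.0, 0.0]   # requests/s; 0 in the failed state
perf = expected_reward(state_probabilities, throughput_rewards)
```

Because failed and degraded states contribute their own (lower) rewards, the result reflects dependability as well as performance.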

The performance of the clustered system is modeled as a shared M/M/i/m+i queue as shown in Fig. 4, where i is the number of servers in the system, and m is the maximum queue length. Its parameters are determined by the corresponding marking of the DSPN model in Fig. 2. For marking c of the DSPN model, i is computed as

$$
i = i(c) = \#P_{\mathrm{up}}(c) + \#P_{\mathrm{rejuved}}(c) + \#P_{\mathrm{fprone}}(c) + \#P_{\mathrm{fpronerej}}(c). \tag{4}
$$

The request arrival rate λ is computed as

$$
\lambda = \lambda(c) =
\begin{cases}
\lambda_{\mathrm{peak}}, & \text{if } \#P_{\mathrm{peak}}(c) = 1 \\
\lambda_{\mathrm{offpeak}}, & \text{if } \#P_{\mathrm{offpeak}}(c) = 1.
\end{cases}
$$

Due to the influence of software aging on the system performance, the service rates for up servers and failure-prone servers are different, denoted by µup and µfp respectively. Therefore the equivalent service rate µ of the M/M/i/m+i queue in marking c of the DSPN is computed as

$$
\mu = \mu(c) = \frac{\mu_{\mathrm{up}} \cdot \left(\#P_{\mathrm{up}}(c) + \#P_{\mathrm{rejuved}}(c)\right) + \mu_{\mathrm{fp}} \cdot \left(\#P_{\mathrm{fprone}}(c) + \#P_{\mathrm{fpronerej}}(c)\right)}{i}.
$$
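The mapping from a marking to the queue parameters (i, λ, µ) can be sketched as follows. The `Marking` record and its field names are hypothetical stand-ins for the token counts of the places named in the text; the numeric values in the example are the defaults of Table 3.

```python
from dataclasses import dataclass


@dataclass
class Marking:
    """Token counts of the places that determine the queue parameters."""
    up: int          # #Pup
    rejuved: int     # #Prejuved
    fprone: int      # #Pfprone
    fpronerej: int   # #Pfpronerej
    peak: bool       # True if #Ppeak = 1, False if #Poffpeak = 1


def queue_parameters(c, lam_peak, lam_offpeak, mu_up, mu_fp):
    """Return (i, lambda, mu) for marking c, following Eq. (4) and the two
    formulas for the arrival rate and the equivalent service rate."""
    i = c.up + c.rejuved + c.fprone + c.fpronerej
    lam = lam_peak if c.peak else lam_offpeak
    if i == 0:
        return 0, lam, 0.0               # no operational server in this marking
    mu = (mu_up * (c.up + c.rejuved) + mu_fp * (c.fprone + c.fpronerej)) / i
    return i, lam, mu


# Example: two up servers and two failure-prone servers during the peak period
params = queue_parameters(Marking(2, 0, 2, 0, True), 5000.0, 1000.0, 1500.0, 1000.0)
```

The equivalent service rate is the per-server average, so two up and two failure-prone servers give a rate halfway between µup and µfp.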




Fig. 4. An M/M/i/m+i queue.

Fig. 5. The state transition diagram of the D-subnet.

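Given an M/M/i/m+i queue with arrival rate λ and service rate µ, the steady-state probability vector p = (px) is [34]

$$
p_x = p_x(c) =
\begin{cases}
\dfrac{\dfrac{\lambda^x}{x! \cdot \mu^x}}{\displaystyle\sum_{k=0}^{i} \dfrac{\lambda^k}{k! \cdot \mu^k} + \sum_{k=i+1}^{i+m} \dfrac{\lambda^k}{i^{k-i} \cdot i! \cdot \mu^k}}, & 0 \le x \le i \\[3ex]
\dfrac{\dfrac{\lambda^x}{i^{x-i} \cdot i! \cdot \mu^x}}{\displaystyle\sum_{k=0}^{i} \dfrac{\lambda^k}{k! \cdot \mu^k} + \sum_{k=i+1}^{i+m} \dfrac{\lambda^k}{i^{k-i} \cdot i! \cdot \mu^k}}, & i < x \le m + i
\end{cases}
$$

where x represents the number of requests in the system. The blocking probability of the M/M/i/m+i queue is

$$
P_b(c) = p_{m+i}(c). \tag{5}
$$

The throughput is

$$
T_{put}(c) = \lambda(c) \cdot (1 - P_b(c)). \tag{6}
$$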

For a tangible marking c ∈ Ω of the MRGP underlying the DSPN, if the number of available cluster nodes as in Eq. (4) equals i, we set rc in Eq. (3) to the values in Eqs. (5) and (6) to get the overall blocking probability and system throughput, respectively. Note that these are performability (as opposed to just performance) measures, since the effect of down states and degraded states is taken into account by Eq. (3).
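The per-marking reward rates of Eqs. (5) and (6) can be sketched directly from the steady-state formula of the M/M/i/m+i queue. The function names are illustrative. The weights are built with a recurrence instead of explicit factorials and powers, which is an implementation choice of this sketch and is algebraically the same as the formula above.

```python
def mmim_pmf(lam, mu, i, m):
    """Steady-state pmf of an M/M/i/m+i queue (i servers, m waiting slots).

    Returns a list p with p[x] = probability of x requests, x = 0..m+i.
    """
    if i < 1 or m < 0 or lam <= 0 or mu <= 0:
        raise ValueError("need i >= 1, m >= 0, lam > 0, mu > 0")
    rho = lam / mu
    weights = [1.0]
    for x in range(1, m + i + 1):
        # unnormalized weight: rho^x / x!                 for x <= i
        #                      rho^x / (i^(x-i) * i!)     for x >  i
        weights.append(weights[-1] * rho / min(x, i))
    total = sum(weights)
    return [w / total for w in weights]


def blocking_probability(lam, mu, i, m):
    """Eq. (5): probability that the system is full."""
    return mmim_pmf(lam, mu, i, m)[-1]


def throughput(lam, mu, i, m):
    """Eq. (6): accepted request rate."""
    return lam * (1.0 - blocking_probability(lam, mu, i, m))


# Example with Table 3 defaults: 4 up servers in the peak period
pb_peak = blocking_probability(5000.0, 1500.0, 4, 256)
tput_peak = throughput(5000.0, 1500.0, 4, 256)
```

Markings that correspond to a system failure are not handled by this queue; as stated in the numerical results, they are assigned blocking probability 1 and throughput 0.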

3.5. Stationary analysis of the DSPN

Because Tpeak and Toffpeak are deterministic transitions, and exactly one of Tpeak and Toffpeak is enabled at any time, the model in Fig. 2 is a DSPN in which each marking has exactly one deterministic transition enabled. We view the overall DSPN as two subnets, as shown in Fig. 2: the D-subnet, which is cyclic and consists only of deterministic transitions; and the SRN-subnet, which contains exponential and immediate transitions. For each marking of the D-subnet, the evolution of the system is completely modeled by the SRN-subnet. Fig. 5 shows the state transition diagram of the D-subnet. The detailed system behavior in states Peak and Offpeak is modeled by the SRN-subnet in Fig. 2. Depending on whether the D-subnet is in state Peak or Offpeak, the SRN-subnet will have different parameters such as transition rates, guard functions, and reward rates.

In order to solve the DSPN using the transient solution of CTMC models, we adopt the analysis method in [13,15]. The idea is to solve the SRN-subnet iteratively as the D-subnet changes between peak period and offpeak period, using the computed probability vector at the end of the current period as the initial probability vector for the next period. Assume T1 and T2 are the durations of the peak period and the offpeak period respectively, and T = T1 + T2. Q1 and Q2 are the generator matrices for the subordinated Markov chains [20] of the SRN-subnet when the D-subnet is in the Peak state and the Offpeak state, respectively. Assume the initial probability vector for the SRN-subnet is π0, and the D-subnet is initially in the Peak state. Then the probability vector for the SRN-subnet at time $T_1^-$ can be written as

$$
\pi(T_1^-) = \pi_0 \cdot e^{Q_1 T_1}.
$$






At time point T1, when the D-subnet changes from the Peak state to the Offpeak state, if one or more immediate transitions in the SRN-subnet become enabled and fire, the state probability vector of the SRN-subnet at time $T_1^+$ becomes

$$
\pi(T_1^+) = \pi_0 \cdot e^{Q_1 T_1} \cdot \Delta_1
$$

where $\Delta_1 = (\delta_{ij})$ is the mapping matrix from states in the peak-time subordinated CTMC to states in the offpeak-time subordinated CTMC, and $\delta_{ij}$ is the probability that the firing of the immediate transitions will change the system from state i in the peak-time CTMC to state j in the offpeak-time CTMC. Similarly, at time point T = T1 + T2, the state probability vector of the SRN-subnet becomes

$$
\pi(T^+) = \pi_0 \cdot e^{Q_1 T_1} \cdot \Delta_1 \cdot e^{Q_2 T_2} \cdot \Delta_2
$$

where $\Delta_2 = (\delta'_{ij})$ is the mapping matrix from states in the offpeak-time CTMC to states in the peak-time CTMC. Let

$$
M = e^{Q_1 T_1} \cdot \Delta_1 \cdot e^{Q_2 T_2} \cdot \Delta_2 .
$$

Given an arbitrary time t, the state probability vector of the DSPN at time t is

$$
\pi(t) =
\begin{cases}
\pi_0 \cdot M^{\lfloor t/T \rfloor} \cdot e^{Q_1 \tau}, & \text{if } \tau \le T_1 \\
\pi_0 \cdot M^{\lfloor t/T \rfloor} \cdot e^{Q_1 T_1} \cdot \Delta_1 \cdot e^{Q_2 (\tau - T_1)}, & \text{if } T_1 < \tau \le T_1 + T_2
\end{cases}
\tag{7}
$$

where $\tau = t - T \cdot \lfloor t/T \rfloor$. We have assumed above that at time 0 the D-subnet is in the Peak state.

Eq. (7) gives the transient analysis for the DSPN model of Fig. 2. From this equation we can see that the model does not possess steady-state probabilities, since it has periodic behavior [22]. However, it is possible to obtain its stationary probabilities by computing the time-averaged limits representing the long-run proportion of time the DSPN spends in its states. That is, the stationary value for reward measure m can be written as

$$
m = \lim_{x \to \infty} \frac{1}{x} \int_0^x \pi(t) \cdot r(t)\, dt \tag{8}
$$

where r is the column vector of reward rates.

To carry out the stationary analysis, let $v_j = \pi(jT)$. From Eq. (7) we have

$$
v_j = \pi(jT) = \pi((j-1)T) \cdot M = v_{j-1} M.
$$

As j → ∞, vj converges to a fixed vector $v = \lim_{j \to \infty} \pi(jT)$; that is, the system behavior tends to be probabilistically identical in each period T as t goes to infinity. Therefore v satisfies

$$
v = v \cdot M. \tag{9}
$$

In order to compute the stationary reward m for the DSPN, from Eq. (8) we get

$$
m = \lim_{j \to \infty} \frac{\int_0^{jT} \pi(t) \cdot r(t)\, dt}{jT} = \lim_{j \to \infty} \frac{\int_{(j-1)T}^{jT} \pi(t) \cdot r(t)\, dt}{T}.
$$

By setting v as the initial probability vector, m becomes

$$
m = \lim_{j \to \infty} \frac{\int_{(j-1)T}^{jT} \pi(t) \cdot r(t)\, dt}{T} = \frac{\int_0^{T} \pi(t) \cdot r(t)\, dt}{T} = \frac{\int_0^{T_1} \pi(t)\, dt \cdot r_1 + \int_{T_1}^{T_1+T_2} \pi(t)\, dt \cdot r_2}{T} \tag{10}
$$

where r1 is the reward rate vector in the peak period, and r2 is the reward rate vector in the offpeak period. Therefore the computation of the stationary measures can be done in two steps: first use Eq. (9) to compute v, then set v as the initial probability vector and use Eq. (10) to compute the expected value of m over the interval [0, T].

The mixed rejuvenation policy divides the peak period [0, T1] into two sub-intervals: [0, s0] and [s0, T1]. In [0, s0] the transition Tpolicy needs to be enabled in order to carry out the rejuvenation, while in [s0, T1] Tpolicy should be disabled to delay the rejuvenation until the offpeak period. Therefore the SRN-subnet in Fig. 2 behaves differently in these two intervals, and the solution method presented above is modified to solve the model with the mixed rejuvenation policy. The change is simple: instead of having two states in the D-subnet, Peak and Offpeak, the Peak state is split into



Fig. 6. The state transition diagram of modified D-subnet.

Fig. 7. Algorithm for stationary analysis of DSPN.

two states: Peak1 and Peak2, where the duration of Peak1 is s0 and the duration of Peak2 is T1 − s0. Fig. 6 shows the state transition diagram for the modified D-subnet. When the D-subnet is in state Peak1, Tpolicy is enabled and the request arrival rate λ is set to λpeak. When the D-subnet is in state Peak2, Tpolicy is disabled and λ is set to λpeak. When the D-subnet is in state Offpeak, Tpolicy is enabled and λ is set to λoffpeak. By modifying the D-subnet and the corresponding parameters in the SRN-subnet, the same algorithm presented above can be applied to solve the model under the mixed rejuvenation policy. Fig. 7 shows the algorithm for stationary analysis of the DSPN. In this figure, peak1, peak2 and peak3 represent the SRN-subnets for the three states of the modified D-subnet in Fig. 6. set_init_prob(sn, vprob) sets vprob as the initial probability vector of the SRN sn. solve(sn, x) returns the state probability vector of the SRN sn at time x and the cumulative measure $\int_0^x \pi(t) \cdot r\, dt$, where π(t) is the state probability vector at t and r is the reward rate vector defined in sn. The stationary value of the measure m is computed according to Eq. (10).
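The two-step procedure of Eqs. (9) and (10) can be sketched on a toy model. This is not the SPNP algorithm of Fig. 7: the `solve` function below is a hypothetical pure-Python stand-in based on uniformization, the mapping matrices between phases are assumed to be identity matrices, and the two-state generators in the example are made up. The list of phases can hold two entries (Peak, Offpeak) or three (Peak1, Peak2, Offpeak).

```python
import math


def vec_mat(v, P):
    """Row vector times square matrix."""
    n = len(v)
    return [sum(v[a] * P[a][b] for a in range(n)) for b in range(n)]


def solve(pi0, Q, t, tol=1e-12, max_terms=100000):
    """Return (pi(t), integral of pi(s) ds over [0, t]) for a CTMC with generator Q.

    Uniformization; only suitable for small models with moderate rate * t.
    """
    n = len(pi0)
    rate = max(-Q[a][a] for a in range(n))
    if rate <= 0.0 or t <= 0.0:
        return list(pi0), [p * t for p in pi0]
    P = [[(1.0 if a == b else 0.0) + Q[a][b] / rate for b in range(n)]
         for a in range(n)]
    weight = math.exp(-rate * t)         # Poisson probability of 0 jumps
    cdf = weight
    term = list(pi0)                     # pi0 * P^k
    pi_t = [weight * x for x in term]
    cum = [(1.0 - cdf) * x / rate for x in term]
    for k in range(1, max_terms):
        if 1.0 - cdf <= tol:
            break
        term = vec_mat(term, P)
        weight *= rate * t / k           # Poisson probability of k jumps
        cdf += weight
        tail = max(0.0, 1.0 - cdf)
        for s in range(n):
            pi_t[s] += weight * term[s]
            cum[s] += tail * term[s] / rate
    total = sum(pi_t)
    return [x / total for x in pi_t], cum


def stationary_reward(phases, pi0, tol=1e-10, max_periods=10000):
    """phases: list of (Q, duration, reward_vector), visited cyclically.

    Step 1 iterates whole periods until v = v M (Eq. (9)).
    Step 2 starts from v and averages the reward over one period (Eq. (10)).
    """
    v = list(pi0)
    for _ in range(max_periods):
        nxt = v
        for Q, duration, _reward in phases:
            nxt, _cum = solve(nxt, Q, duration)
        change = max(abs(a - b) for a, b in zip(v, nxt))
        v = nxt
        if change < tol:
            break
    pi, accumulated, period = v, 0.0, 0.0
    for Q, duration, reward in phases:
        pi, cum = solve(pi, Q, duration)
        accumulated += sum(c * r for c, r in zip(cum, reward))
        period += duration
    return v, accumulated / period


# Toy example: state 0 = up, state 1 = down; failures are more frequent at peak
PEAK = [[-0.2, 0.2], [1.0, -1.0]]
OFFPEAK = [[-0.1, 0.1], [1.0, -1.0]]
UP_REWARD = [1.0, 0.0]
v, availability = stationary_reward(
    [(PEAK, 12.0, UP_REWARD), (OFFPEAK, 12.0, UP_REWARD)], [1.0, 0.0])
```

Passing a different reward vector per phase, for example per-state blocking probabilities computed with the peak and offpeak arrival rates, yields the corresponding time-averaged performability measure.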

4. Numerical results

Table 2 shows how the number of cluster nodes in the model influences the number of states in the underlying CTMC of the SRN-subnet as well as the number of non-zero entries in the corresponding generator matrix. As seen from the table, the number of states and the number of non-zero entries increase rapidly with the number of cluster nodes in the model. It is therefore impractical to produce the CTMC by hand. However, by using an SPN model and the Stochastic Petri Net Package (SPNP) analysis tool, the CTMC can be generated automatically and quickly.

The default parameters used in the model of the previous section are shown in Table 3. These parameters are for model illustration purposes only. SPNP is used to solve the model.

4.1. System availability

As defined earlier, the system is available if k out of n nodes in the cluster are up. Fig. 8(a) demonstrates the system availability with different s0 and Tr values, where s0 ≤ 0 corresponds to policy-B, s0 > Tpeak corresponds to policy-A, and 0 < s0 ≤ Tpeak corresponds to policy-C. Fig. 8(b) is the availability comparison of the clustered system under


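Table 2
Number of states and non-zero entries vs. number of nodes in the cluster

| Number of nodes n | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Number of states, k = 1 | 115 | 219 | 364 | 560 | 819 | 1155 | 1584 | 2124 | 2795 |
| Number of states, k = 2 | 65 | 169 | 314 | 510 | 769 | 1105 | 1534 | 2074 | 2745 |
| Nonzero entries, k = 1 | 380 | 823 | 1479 | 2414 | 3710 | 5465 | 7793 | 10824 | 14704 |
| Nonzero entries, k = 2 | 168 | 587 | 1243 | 2178 | 3474 | 5229 | 7557 | 10588 | 14468 |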

Table 3
Parameters used in the model

| Parameter | Default value | Comments |
| --- | --- | --- |
| λpeak | 5000 s⁻¹ | Request arrival rate during peak period |
| λoffpeak | 1000 s⁻¹ | Request arrival rate during offpeak period |
| µup | 1500 s⁻¹ | Service rate of an up server |
| µfp | 1000 s⁻¹ | Service rate of a failure-prone server |
| n | 4 | Total number of servers in the cluster |
| k | 2 | k out of n servers need to be up |
| m | 256 | Maximum queue length in the cluster |
| βidle | 1/72 h⁻¹ | Rate from up to failure-prone for single idle server |
| βbusy | 1/24 h⁻¹ | Rate from up to failure-prone for single busy server |
| γidle | 1/72 h⁻¹ | Rate from failure-prone to down for single idle server |
| γbusy | 1/24 h⁻¹ | Rate from failure-prone to down for single busy server |
| Tpeak | 12 h | Duration of peak period |
| Toffpeak | 12 h | Duration of offpeak period |

(a) Availability vs s0 and Tr . (b) Availability comparison.

Fig. 8. System availability with different rejuvenation policies.

rejuvenation policy-A, B and C. The x-axis is the rejuvenation-triggering interval Tr. For each rejuvenation-triggering interval Tr, the s0 that minimizes the job blocking probability is chosen for policy-C. As shown in the figure, the system availability under each policy is higher than the one without rejuvenation. Policy-A is better than policy-B and C for any rejuvenation-triggering interval. This is because policy-B does not allow rejuvenation during the peak period, and the peak period contributes a large part of the system unavailability. Since policy-C aims at reducing the job blocking probability, it does not result in a better availability than policy-A.


(a) Blocking probability vs s0 and Tr . (b) Blocking probability comparison.

Fig. 9. Job blocking probability with different rejuvenation policies.

In Fig. 8 and later figures we can see that the curves of policy-B and C begin to separate around Tr = 10 h. That is because when the rejuvenation-triggering interval is less than 10 h, delayed rejuvenation is always better than standard rejuvenation in reducing the job blocking probability, so the s0 chosen for policy-C makes it behave like policy-B. As Tr increases, policy-C begins to carry out rejuvenation in [0, s0] during the peak period. Therefore policy-C approaches the availability curve of policy-A faster than policy-B.

4.2. Blocking probability

A job is blocked either because the system is unavailable or because the number of requests has reached the maximum. Therefore for system failure states the blocking probability is 1, and for other states the blocking probability is computed through Eq. (5). The overall job blocking probability is computed using Eq. (10).

Fig. 9(a) shows the job blocking probability for various s0 and Tr values. Similar to Fig. 8(a), s0 ≤ 0 corresponds to policy-B, s0 > Tpeak corresponds to policy-A, and 0 < s0 ≤ Tpeak corresponds to policy-C. Fig. 9(b) shows the blocking probability comparison of policy-A, B and C. For each rejuvenation-triggering interval Tr, the s0 that minimizes the job blocking probability is chosen for policy-C. The x-axis is the rejuvenation-triggering interval, and the y-axis is the blocking probability. As shown in Fig. 9, software rejuvenation can greatly reduce the job blocking probability. Policy-C is the best for any rejuvenation-triggering interval. As Tr increases, all three curves converge to the same value. Policy-B outperforms policy-A when the rejuvenation-triggering interval Tr is small, while policy-A sometimes outperforms policy-B as Tr becomes larger. The reason is that under policy-A the system is rejuvenated frequently when Tr is small, hence there are not enough nodes to handle the fast incoming requests at peak times. This causes a large increase in job blocking probability compared to policy-B, which does not rejuvenate at peak times. As Tr increases, the number of rejuvenations in the peak period is reduced, so the blocking probability for policy-A decreases and converges to that of policy-B. In addition, rejuvenation in the peak period can bring failure-prone (degraded) servers back to the up (fully operational) state, thus improving the system service rate and reducing the job blocking probability.

For policy-A, the optimal rejuvenation-triggering interval is around 12 h, and for policy-B it is around 2.2 h. The optimal blocking probability under policy-A is 0.0174, while the one under policy-B is 0.0114.

4.3. Throughput

The throughput is 0 if the system is unavailable; for all the available states, the throughput for each state is computed by Eq. (6). Figs. 10 and 11 show the system throughput versus peak-hour request arrival rates while keeping the offpeak request arrival rate constant. The rejuvenation-triggering intervals for Figs. 10 and 11 are 12 h and 2.2 h, which are the optimal values for policy-A and policy-B respectively. Given the request arrival rate, the maximum throughput is solely determined by the job blocking probability according to Eq. (6). Therefore policy-C is still the


(a) Throughput. (b) Comparison of policy-A, policy-B and policy-C.

Fig. 10. System throughput for different rejuvenation policies, Tr = 12 h.

best among the three rejuvenation policies. As shown in the figures, the system without rejuvenation reaches its bottleneck much faster than the ones with rejuvenation, hence rejuvenation helps in improving the system throughput. When Tr = 12 h, the maximum throughput under policy-A (3250 requests/s) is higher than that under policy-B (3209 requests/s), and policy-C has the highest throughput (3254 requests/s). For Tr = 2.2 h, the curves of policy-B and policy-C overlap, and both are better than policy-A. The maximum throughput of policy-B and C is 3266 requests/s and that of policy-A is 3199 requests/s. The throughput without rejuvenation is 2905 requests/s. The throughput is improved by up to 12% under policy-B and C.
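The 12% figure follows from the throughput values quoted above. The short calculation below makes the arithmetic explicit, using only the reported maxima and the no-rejuvenation baseline; the second line is added here for comparison and uses the policy-A value at Tr = 12 h.

```latex
\begin{align}
  \text{policy-B and C, } T_r = 2.2\ \mathrm{h}: \quad
    \frac{3266 - 2905}{2905} &= \frac{361}{2905} \approx 0.124 \\
  \text{policy-A, } T_r = 12\ \mathrm{h}: \quad
    \frac{3250 - 2905}{2905} &= \frac{345}{2905} \approx 0.119
\end{align}
```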

4.4. Influence of peak period duration

In this section we study the impact of the peak period duration Tpeak on the optimal rejuvenation-triggering interval Tr that minimizes the job blocking probability. Fig. 12(a) shows the optimal Tr versus Tpeak under policy-A and policy-B respectively, while Fig. 12(b) shows the corresponding optimal job blocking probability under these two policies, as well as the one without rejuvenation. The impact on policy-C is not shown in this figure since its job blocking probability is determined by the pair (Tr, s0), not solely by Tr. Fig. 12 is obtained using the default parameters in Table 3 except for the peak period duration Tpeak and the offpeak period duration Toffpeak. We vary Tpeak from 5 h to 18 h (Toffpeak is changed accordingly). As seen from the figure, the optimal Tr becomes shorter as Tpeak becomes longer. That is because the heavy workload in peak hours causes the system to deteriorate faster than in offpeak hours. As Tpeak increases, the system needs to be rejuvenated more frequently. The optimal Tr under policy-B is much smaller than the optimal Tr under policy-A, since policy-B does not allow rejuvenation during peak hours, which in practice increases the average rejuvenation-triggering interval. The job blocking probability increases with Tpeak because jobs are blocked more often in peak hours than in offpeak hours. For both policies, the optimal Tr as well as the job blocking probability changes almost linearly with respect to the peak period duration.

4.5. Influence of node MTTF

The system mean time to failure (MTTF) is another factor that influences the optimal rejuvenation-triggering interval. Given a cluster system with a fixed number of nodes, the system MTTF is determined by the MTTF of each node. In this section we study the relationship between the optimal rejuvenation-triggering interval Tr and the MTTF of an individual node. As mentioned in Section 3.3, without rejuvenation each node has three states: up, failure-prone, and failed. The node MTTF is the mean time for the node to go from the up state to the failed state. It depends on the firing rates of Tfprone and Tnodefail in Table 1, which are in turn determined by parameters βbusy, βidle, γbusy and γidle in Table 3. To vary the node MTTF we multiply βbusy, γbusy, βidle and γidle by a factor α. In Fig. 13, α ranges


(a) Throughput.

(b) Comparison of policy-A, policy-B and policy-C.

Fig. 11. System throughput for different rejuvenation policies, Tr = 2.2 h.

from 0.3 to 2.5, thus the MTTF of each node decreases from 320 h to 38.4 h. Fig. 13(a) and (b) show the corresponding optimal rejuvenation-triggering interval and job blocking probability under policy-A and policy-B, respectively. The job blocking probability without rejuvenation is also shown in Fig. 13(b). As seen from the figure, the optimal Tr increases with the node MTTF, and its increase under policy-A is faster than under policy-B. The job blocking probability is reduced as the node MTTF increases, and the improvement is more significant when the node MTTF is small.
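The two MTTF endpoints are consistent with a simple inverse scaling in α. The check below assumes that multiplying all four rates by α divides the node MTTF by α; the implied baseline value at α = 1 is derived here and is not stated in the text.

```latex
\begin{align}
  \mathrm{MTTF}(\alpha) &= \frac{\mathrm{MTTF}(1)}{\alpha}, \\
  0.3 \times 320 &= 96 = 2.5 \times 38.4
  \quad\Longrightarrow\quad \mathrm{MTTF}(1) = 96 .
\end{align}
```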

4.6. Influence of performance degradation

As software aging often causes system performance degradation, the severity of degradation is an important factor in determining the optimal rejuvenation-triggering interval Tr. Fig. 14(a) shows the impact of performance degradation on the optimal Tr that minimizes the job blocking probability, and Fig. 14(b) shows the corresponding job blocking probabilities for policy-A and policy-B, as well as the one without rejuvenation. The figure is obtained using the default parameters in Table 3 except for the service rate of the failure-prone server µfp. We set µfp = α · µup, where α indicates the performance degradation severity of the failure-prone server; a smaller α represents more severe performance degradation. It is varied from 0.3 to 1 in Fig. 14. As seen from the figure, the optimal Tr increases with α,


(a) Optimal rejuv. interval. (b) Optimal job blocking prob.

Fig. 12. Influence of peak period duration under different rejuvenation policies.

(a) Optimal rejuv. interval. (b) Optimal job blocking prob.

Fig. 13. Influence of node MTTF under different rejuvenation policies.

and the Tr under policy-A increases much faster than under policy-B. For the job blocking probability, from Fig. 14(b) we can see that rejuvenation can greatly increase system performance when α is small, and policy-A outperforms policy-B under severe performance degradation. This is because the performance deteriorates so fast under such circumstances that rejuvenation during peak hours is necessary. As α increases, policy-B can achieve more performance improvement compared to policy-A and no rejuvenation.

5. Conclusions

We have discussed three rejuvenation policies for a cluster system under varying workload. In policy-A (standard rejuvenation), the rejuvenation is carried out as soon as the rejuvenation-triggering interval is reached, regardless of whether the system is in a peak period (high workload) or an offpeak period (low workload). In policy-B (delayed rejuvenation), by contrast, all nodes are merely scheduled for rejuvenation if the rejuvenation time is reached during a peak period, and the actual rejuvenation is started as soon as the next offpeak period starts. By postponing the rejuvenation to


(a) Optimal rejuv. interval.

(b) Optimal job blocking prob.

Fig. 14. Influence of performance degradation under different rejuvenation policies.

offpeak period in policy-B, we expect to see a higher overall system performance in certain circumstances. Policy-C is the combination of policy-A and policy-B. When the rejuvenation is scheduled early in the peak period, it is done immediately; otherwise the rejuvenation is delayed until the next offpeak period begins.

We have constructed DSPN models for the cluster system with the different rejuvenation policies under consideration and analyzed the system performability. Due to the complexity of the model, we use the SPNP tool to carry out the numerical solution. Based on the parameters chosen in this paper, we have shown that by aiming at reducing the job blocking probability, policy-C achieves the best result in job blocking probability and system throughput for any rejuvenation-triggering interval. Policy-A achieves the best system availability. Although policy-C does not outperform policy-A for system availability, the curve of policy-C approaches that of policy-A very quickly. Policy-B is likely to outperform policy-A under the optimal rejuvenation-triggering interval in terms of the expected system throughput and job blocking probability, although under certain conditions policy-A performs as well as or even better than policy-B. We also examined the influence of peak period duration, node MTTF and performance degradation severity on the optimal rejuvenation-triggering interval as well as the job blocking probability.



Dazhi Wang received the B.S. degree from University of Science and Technology of China in 2000. He is a Ph.D. candidate in the Department of Computer Science of Duke University. His research interests include BDD-based reliability analysis algorithms for combinatorial models, software rejuvenation and user-perceived service availability modeling.

Wei Xie is currently a Vice President of Bear Stearns & Co. Inc. Before joining Bear Stearns, he was in the MBS/ABS Trading Analytics group of Bank of America in New York. Prior to that, he worked for AT&T Research Labs, IBM, and NCR Corporation. Dr. Xie obtained his Ph.D. and M.S. from Duke University, and his B.S. from Tsinghua University.

Kishor S. Trivedi holds the Hudson Chair in the Department of Electrical and Computer Engineering at Duke University, Durham, NC. He has been on the Duke faculty since 1975. He is the author of a well-known text, Probability and Statistics with Reliability, Queuing and Computer Science Applications, published by Prentice-Hall; this text has been reprinted as an India edition, and the second edition of this book has just appeared. He has recently published two other books: Performance and Reliability Analysis of Computer Systems, published by Kluwer Academic Publishers, and Queueing Networks and Markov Chains, published by John Wiley. His research interests are in reliability and performance assessment of computer and communication systems. He has published over 300 articles and lectured extensively on these topics. He has supervised 35 Ph.D. dissertations. He is a Fellow of the Institute of Electrical and Electronics Engineers. He is a Golden Core Member of IEEE Computer Society.









