
The Effect of Server Reallocation Time in Dynamic Resource Allocation

M. Al-Ghamdi A.P. Chester J.W.J. Xue S.A. Jarvis∗

Abstract

It is common for Internet service hosting centres to dedicate server resources to different applications such that revenue is maximised through efficient use of the available resources. Dynamic resource allocation has been shown to provide a significant increase in total revenue through the reallocation of available resources in accordance with changes in the workload on each application's resources. In this paper multi-tiered enterprise applications are modelled as multi-class closed queuing networks, with each network station corresponding to an application tier. The effects of server reallocation time are evaluated through simulation. The experimental results demonstrate that total revenue generally decreases as reallocation time increases.

1 Introduction

Internet hosting centres are often used for the cost-effective hosting of enterprise applications. Typically these enterprise applications employ a multi-tier architecture, which provides a clear separation of roles between the tiers. Commonly a multi-tier architecture consists of three tiers: a client-facing web tier, an application tier for the application logic and a data persistence tier that usually comprises a relational database management system (RDBMS). At each tier servers may be clustered to provide high availability and improve performance.

An Internet hosting centre may host many multi-tier applications for its clients, each of which will have a separate service level agreement (SLA). The SLA defines the level of service agreed between the client and the hosting centre and may include performance and availability targets, with penalties to be paid if such targets are not met. It is in the interests of the service hosting centre to ensure that its SLAs are met so that it can maximise its revenue, whilst ensuring that its resources are well utilised.

In this work we model the typical enterprise system using a multi-class closed queuing network to compute the various performance metrics. The advantage of using an analytical model is that we can easily capture the different performance metrics and identify potential bottlenecks without running the actual system. The model can also react to parameter changes while the application is running (e.g. from the monitoring tools or system logs) and make dynamic server switching decisions to optimise pre-defined performance metrics [20].

Workloads for Internet services have been shown to be bursty, with large variations in demand [1], [3], [22]. Static resource allocation policies may not be able to handle large surges in traffic, leading to SLA violations and reduced revenues. Dynamic resource allocation systems have been shown to provide a significant increase in revenue in such environments by reallocating servers into a more profitable configuration [20].

∗Department of Computer Science, University of Warwick, {mhd}@dcs.warwick.ac.uk

There are several considerations in a dynamic resource allocation system. These include the decision interval, which is the time between evaluations of the policy, and the server reallocation time, which is the time taken to reallocate servers. In this paper we focus on how the time taken to reallocate servers between applications affects the total revenue obtained by the system.

Bottlenecks are resources that limit the overall performance of the system [5]. Because they have a significant impact on overall system performance, it is desirable to avoid them. The identification of bottlenecks is also important in tuning studies to evaluate the performance gains of different tuning alternatives [5]. Substantial research has therefore been conducted in bottleneck identification for multi-class closed queuing networks. However, it is non-trivial to predict or identify the system bottleneck, as it can shift between tiers according to changes in the workload mix and depends upon the number of jobs in the network [2]. In this paper we use the approach developed in [5], which uses convex polytopes for bottleneck identification in multi-class queuing networks.

The specific contributions of this paper are:

• to evaluate the effects of switching duration on a dynamic resource allocation system;

• to review the behaviour of two known switching policies in this context;

• to examine the impact of these effects on revenue maximisation.

The remainder of the paper is structured as follows. Section 2 reviews related work. Section 3 describes the model of the system and the revenue function. Section 4 outlines the idea of a system bottleneck and its identification methodology. Section 5 describes the admission control system and the two switching policies used. Section 6 presents a description of the experimental setup and results. Finally, section 7 concludes the paper.

2 Related Work

Revenue maximisation is a key goal of many dynamic resource allocation systems. In [18] the authors use priority queues to offer differentiated services to different classes of request to optimise company revenue. Different priorities are assigned to different requests based upon their contributions to the revenue.

The work in [14] focussed on maximising profits of best-effort requests when combined with requests requiring a specific quality of service (QoS) in a web farm. In [14] it is assumed that arrival rates of requests are static, whilst the arrival rates in our work are dynamic.

In [9] the authors attempt to maximise revenue by partitioning servers into logical pools and switching servers at runtime. This paper differs from [9] as we consider switching in a multi-tier environment.

Our work in [20] presents the two server switching policies used here: the proportional switching policy and the bottleneck-aware switching policy. The policies are discussed in detail in section 5. In our previous work a fixed switching time was selected based upon the real-world switching system we developed in [7]. This work extends our previous work by examining the effects of server reallocation time on the revenue achieved by the system.

The work in [6] examines the effectiveness of admission control policies in commercial web sites. A simple admission control policy was developed in our previous work [20] and is used again here.

Table 1: Notation used in this paper.

Symbol      Description
$S_{ir}$    Service time of job class-r at station i
$v_{ir}$    Visiting ratio of job class-r at station i
$N$         Number of service stations in QN
$K$         Number of jobs in QN
$R$         Number of job classes in QN
$K_{ir}$    Number of class-r jobs at station i
$m_i$       Number of servers at station i
$\phi_r$    Revenue of each class-r job
$\pi_i$     Marginal probability at centre i
$T$         System response time
$D_r$       Deadline for class-r jobs
$E_r$       Exit time for class-r jobs
$P_r$       Probability that class-r job stays
$X_r$       Class-r throughput before switching
$X'_r$      Class-r throughput after switching
$U_i$       Utilisation at station i
$t_s$       Server switching time
$t_d$       Switching decision interval time

3 Modelling Multi-tiered Internet Services and Revenue Functions

3.1 The System Model

A multi-tiered Internet service can be modelled using a multi-class closed queuing network [22][19]. The closed queuing network model used in this paper is illustrated in figure 1.

In a multi-class closed queuing network $S_{ir}$ represents the service time, which is defined as the average time spent by a class-r job during a single visit to station i, and $v_{ir}$ denotes the visiting ratio of class-r jobs to station i (the notation used in this paper is summarised in table 1).

Service demand $D_{ir}$ is defined in [13] as the sum of the service times at a resource over all visits to that resource during the execution of a transaction or request ($D_{ir} = S_{ir} \cdot v_{ir}$). The total population of the network ($K$) is defined as the sum of the populations of customers of each class r ($K_r$):

$$K = \sum_{r} K_r \tag{1}$$

In modern enterprise systems servers are often clustered together, so both -/M/1-FCFS and -/M/m-FCFS stations must be considered, since each tier in our model may use a cluster of servers. The mean response time of a class-r job at station i can be computed as follows [4],

$$T_{ir}(k) = \begin{cases} D_{ir}\left[1 + \sum_{r=1}^{R} K_{ir}(k - 1_r)\right], & m_i = 1 \\[6pt] \dfrac{D_{ir}}{m_i}\left[1 + \sum_{r=1}^{R} K_{ir}(k - 1_r) + \sum_{j=0}^{m_i - 2} (m_i - j - 1)\,\pi_i(j \mid k - 1_r)\right], & m_i > 1 \end{cases} \tag{2}$$

Figure 1: A Model of a Typical Configuration of a Cluster-based Multi-tiered Internet Service. (Nodes in the figure are labelled C, WS, AS and DS.)

Where,

• there are k jobs in the queuing network, for $i = 1, \ldots, N$ and $r = 1, \ldots, R$,

• $(k - 1_r) = (k_1, \ldots, k_r - 1, \ldots, k_R)$ is the population vector with one class-r job less in the system.

The mean system response time $T_i(k)$ is the sum of mean response time for each tier:

$$T_i(k) = \sum_{r=1}^{R} T_{ir}(k) \tag{3}$$

For the case of multi-server nodes ($m_i > 1$), it is necessary to compute the marginal probabilities. The marginal probability that there are j jobs ($j = 1, \ldots, (m_i - 1)$) at station i, given that the network is in state k, is given by [4],

$$\pi_i(j \mid k) = \frac{1}{j}\left[\sum_{r=1}^{R} \frac{v_{ir}}{S_{ir}}\, X_r(k)\, \pi_i(j - 1 \mid k - 1_r)\right] \tag{4}$$

The throughput of class-r jobs can be calculated using Little's law [13] by dividing the class-r population $k_r$ by the sum over all stations of the visiting ratio $v_{ir}$ multiplied by the mean response time $T_{ir}(k)$,

$$X_r(k) = \frac{k_r}{\sum_{i=1}^{N} v_{ir}\, T_{ir}(k)} \tag{5}$$

By applying Little's Law again with the Forced Flow Law [13], the mean queue length $K_{ir}$ is obtained by multiplying the throughput $X_r(k)$, the mean response time $T_{ir}(k)$, and the visiting ratio $v_{ir}$.

$$K_{ir}(k) = X_r(k) \cdot T_{ir}(k) \cdot v_{ir} \tag{6}$$

Where $K_{ir}(0, 0, \ldots, 0) = 0$, $\pi_i(0 \mid 0) = 1$, and $\pi_i(j \mid 0) = 0$; the system response time, throughput and mean queue length in each tier can be calculated after K iterations.
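The recursion over population vectors can be sketched in a few lines for the single-server case ($m_i = 1$) only. This is a minimal illustration rather than the simulator used in the paper: the function name `mva_single_server` is hypothetical, the response-time step uses the per-visit service time $S_{ir}$ in the standard textbook form of mean value analysis (an assumption about how the response-time equation is meant to be read), and the queue seen on arrival is summed over all classes. The example treats each tier as one server and uses a made-up population.

```python
from itertools import product


def mva_single_server(S, v, population):
    """Mean value analysis for a closed multi-class network of single-server stations.

    S[i][r]: mean service time per visit of class r at station i
    v[i][r]: visiting ratio of class r at station i
    population: tuple (K_1, ..., K_R) with the number of jobs in each class
    Returns (X, T, Q): class throughputs, per-visit response times, mean queue lengths.
    """
    n, R = len(S), len(population)
    zero = (0,) * R
    queue = {zero: [[0.0] * R for _ in range(n)]}  # K_ir(0, ..., 0) = 0
    X = [0.0] * R
    T = [[0.0] * R for _ in range(n)]
    # product() yields vectors in lexicographic order, so k - 1_r is solved before k.
    for k in product(*(range(K + 1) for K in population)):
        if k == zero:
            continue
        X = [0.0] * R
        T = [[0.0] * R for _ in range(n)]
        for r in range(R):
            if k[r] == 0:
                continue
            prev = queue[k[:r] + (k[r] - 1,) + k[r + 1:]]  # state with one class-r job fewer
            for i in range(n):
                # An arriving job waits for everything already queued at the station.
                T[i][r] = S[i][r] * (1.0 + sum(prev[i]))
            X[r] = k[r] / sum(v[i][r] * T[i][r] for i in range(n))  # Little's law
        queue[k] = [[X[r] * v[i][r] * T[i][r] for r in range(R)] for i in range(n)]
    return X, T, queue[tuple(population)]


# Hypothetical usage: Pool 1 service times and visiting ratios from Table 2
# (rows WS, AS, DS; columns silver, gold), with an invented population of 3 + 2 jobs.
S = [[0.07, 0.1], [0.03125, 0.1125], [0.05, 0.025]]
v = [[1.0, 0.6], [1.6, 0.8], [1.2, 0.8]]
X, T, Q = mva_single_server(S, v, (3, 2))
```

The number of population vectors grows with the product of the class populations, which is why a small number of job classes keeps the computation cheap.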

In multiclass product-form queuing networks, the per-class station utilisation $U_{ir}(k)$ can be computed using the following equation [15],

$$U_{ir}(k) = \frac{k_r D_{ir}}{\sum_{i} D_{ir}\left[1 + K_i(k - 1_r)\right]} \tag{7}$$

3.2 Modelling the Revenue Function

In [17] the session is defined as a sequence of requests of different types made by a single customer during a single visit to a site. Where a client request is met within the deadline the maximum revenue is obtained, while revenue obtained from requests which are not served within the deadline decreases linearly to zero, at which point the request exits the system.

Equation 8 defines the probability function of request execution in the system (denoted by $P(T_r)$) used in our model, where r, $D_r$, $T_r$, and $E_r$ represent the request, its deadline, its response time, and the time at which it is dropped from the system respectively.

$$P(T_r) = \begin{cases} 1, & T_r < D_r \\[4pt] \dfrac{E_r - T_r}{E_r - D_r}, & D_r \le T_r \le E_r \\[4pt] 0, & T_r > E_r \end{cases} \tag{8}$$

The first part of equation 8 states that the full revenue will be contributed by the request if it is processed before the deadline $D_r$. In the second part, the revenue gained by a request that misses its deadline is scaled by the time remaining before the request is dropped, $E_r - T_r$, divided by the difference between the drop time $E_r$ and the deadline $D_r$, so that it falls linearly from full revenue at $T_r = D_r$ to zero at $T_r = E_r$. The request gains no revenue when its response time $T_r$ is greater than the time at which the request exits the system $E_r$.
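The revenue weighting described here can be sketched as a small function, assuming revenue falls linearly from its full value at the deadline to zero at the exit time. The function name is hypothetical; the example values are the Pool 1 gold deadline (15 s), exit point (20 s) and revenue unit (10) from table 2.

```python
def execution_probability(response_time, deadline, exit_time):
    """Weight applied to a request's revenue, given its response time.

    Assumes exit_time > deadline. Returns 1.0 before the deadline, a value that
    falls linearly to 0.0 between the deadline and the exit time, and 0.0 after.
    """
    if response_time < deadline:
        return 1.0
    if response_time <= exit_time:
        return (exit_time - response_time) / (exit_time - deadline)
    return 0.0


# Hypothetical usage: a gold request in Pool 1 served halfway between the
# deadline and the exit point earns half of its revenue unit.
gold_revenue = 10 * execution_probability(17.5, deadline=15, exit_time=20)
```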

With respect to the probability of the request execution, the gained and lost revenue is calculated. The lost revenue function, which is denoted as $V^i_{loss}$, is calculated in equation 9, with the assumption that the servers are switched from pool i to pool j. Equation 10 is used to calculate the gained revenue $V^j_{gain}$. Note that because the servers are being switched, they cannot be used by either pool i or pool j during the switching process, and the time that the migration takes cannot be neglected. The revenue gain from the switching process is calculated over the switching decision interval time $t_d$ as shown in equation 10, where the switching decision interval time is greater than the switching time.

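$$V^i_{loss} = \sum_{r=1}^{R} X^i_r(k_i)\,\phi^i_r\,P(T_r)\,t_d - \sum_{r=1}^{R} X^{i'}_r(k_i)\,\phi^i_r\,P(T_r)\,t_d \tag{9}$$

$$V^j_{gain} = \sum_{r=1}^{R} X^{j'}_r(k_j)\,\phi^j_r\,P(T_r)\,(t_d - t_s) - \sum_{r=1}^{R} X^j_r(k_j)\,\phi^j_r\,P(T_r)\,(t_d - t_s) \tag{10}$$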

After calculating the gained and lost revenue using equations 9 and 10, servers may be switched between the pools. In this paper servers are only switched between the same tiers, and only when the revenue gain is greater than the revenue lost.
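The switching test can be illustrated with a short sketch. It assumes the per-class throughputs before and after the switch are already known (for example from the queuing model) and, as in equations 9 and 10, applies the same execution probabilities before and after the switch. All function names and numbers below are hypothetical; only the 60 second decision interval comes from the experiments in this paper.

```python
def revenue_rate(throughputs, revenues, probabilities):
    """Revenue earned per unit time: sum over classes of X_r * phi_r * P(T_r)."""
    return sum(x * phi * p for x, phi, p in zip(throughputs, revenues, probabilities))


def switching_decision(x_i, x_i_after, phi_i, p_i,
                       x_j, x_j_after, phi_j, p_j, t_d, t_s):
    """Return (switch, v_gain, v_loss) for moving servers from pool i to pool j."""
    # Pool i loses capacity for the whole decision interval t_d.
    v_loss = (revenue_rate(x_i, phi_i, p_i) - revenue_rate(x_i_after, phi_i, p_i)) * t_d
    # Pool j only benefits once the servers have arrived, i.e. for t_d - t_s.
    v_gain = (revenue_rate(x_j_after, phi_j, p_j) - revenue_rate(x_j, phi_j, p_j)) * (t_d - t_s)
    return v_gain > v_loss, v_gain, v_loss


# Hypothetical usage: identical throughput changes, two different switching times.
args = dict(x_i=[10.0, 5.0], x_i_after=[9.0, 4.5], phi_i=[2, 10], p_i=[1.0, 1.0],
            x_j=[8.0, 4.0], x_j_after=[10.0, 5.0], phi_j=[4, 20], p_j=[1.0, 1.0],
            t_d=60)
fast = switching_decision(t_s=5, **args)
slow = switching_decision(t_s=50, **args)
```

With these invented figures the same migration is worthwhile when it takes 5 seconds but not when it takes 50 seconds, because the gain is only accrued for the part of the decision interval that remains after the switch completes.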

4 Bottleneck

The work in [12] summarises a bottleneck resource as one for which: (i) short-term demand exceeds capacity; (ii) the work-in-process (WIP) inventory is at its maximum, which means that the number of waiting jobs in the queue $L = L(\lambda, \mu)$ is at its highest, where $\lambda$ denotes the arrival rate of jobs to a machine and $\mu$ represents its capacity; or (iii) production capacity is at its minimum relative to demand (i.e., the capacity utilisation, which is represented as $\rho = \lambda/\mu$, is at its maximum).
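Criterion (iii) can be illustrated with a simple utilisation check. This sketch is not the convex polytope method used in this paper: it is a plain substitute that applies the utilisation law per tier, dividing by the number of servers, and reports the tier with the highest value. The function names and the numbers in the example are hypothetical.

```python
def tier_utilisations(throughputs, demands, servers):
    """Per-server utilisation of each tier: U_i = sum_r X_r * D_ir / m_i.

    throughputs[r]: throughput of class r
    demands[i][r]:  service demand of class r at tier i
    servers[i]:     number of servers at tier i
    """
    return [sum(x * d for x, d in zip(throughputs, row)) / m
            for row, m in zip(demands, servers)]


def most_utilised_tier(throughputs, demands, servers):
    """Index of the tier with the highest utilisation, a bottleneck candidate."""
    utilisations = tier_utilisations(throughputs, demands, servers)
    return max(range(len(utilisations)), key=utilisations.__getitem__)


# Hypothetical usage: three tiers, two job classes.
candidate = most_utilised_tier(
    throughputs=[2.0, 1.0],
    demands=[[0.1, 0.2], [0.3, 0.1], [0.05, 0.05]],
    servers=[1, 2, 1],
)
```

A single maximum hides the case where two tiers are close to saturation at once, which is one reason the polytope-based approach is preferred for multi-class networks.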

A bottleneck in the system may shift between tiers according to changes in the workload mix and the number of jobs in the system [2]. Bottleneck identification should therefore be one of the first steps in any performance study; any system upgrade which does not remove the bottleneck(s) will have no impact on the system performance at high loads [16].

A significant amount of research has addressed the problem of bottlenecks [2] [5] [8] [10] [11]. The work in [2] [8] [11] studies bottleneck identification for multi-class closed product-form queuing networks for an infinite population, while [5] [10] study a large population.

Our work in [21] uses the convex polytopes approach to identify the bottleneck in two different pools for their chosen configuration using two classes of jobs (gold and silver). From the results we conclude that the bottleneck may occur at any tier and may shift between tiers; there is also a possibility that the system enters the crossover points region where more than one tier becomes a bottleneck. This method can compute the set of potential bottlenecks in a network with one thousand servers and fifty customer classes in just a few seconds.

5 Admission Control and Server Switching Policies

Overloading can cause a significant increase in the response time of requests, which leads to an obvious degradation in revenue. Admission control is a possible solution to the overloading problem. A simple admission control policy was developed in our previous work [20] and is used again in this research. The policy drops less valuable requests when the response time exceeds a threshold value.
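A threshold policy of this kind might look like the sketch below. The details are assumptions made for illustration: the function name, the choice of silver as the less valuable class (taken from the gold and silver classes used in the experiments) and the threshold value are not specified in the text.

```python
def admit(request_class, response_time, threshold, low_value_classes=frozenset({"silver"})):
    """Return False if the request should be dropped, True if it is admitted.

    Less valuable requests are dropped only while the observed response time
    exceeds the threshold; all other requests are always admitted.
    """
    overloaded = response_time > threshold
    return not (overloaded and request_class in low_value_classes)


# Hypothetical usage with an invented threshold of 8 seconds.
decisions = [admit("gold", 9.0, 8.0), admit("silver", 9.0, 8.0), admit("silver", 3.0, 8.0)]
```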

Due to the variation in demand for an online service, it is difficult to predict the workload in the future. In a statically allocated system, comprised of many static server pools, a high workload may exceed the capacity of the pool causing a loss in revenue, while lightly loaded pools may be considered as wasted resources if their utilisation is low. The policies which we examine here are the proportional switching policy (PSP) and the bottleneck-aware switching policy (BSP) that were developed in [21].

5.1 The Proportional Switching Policy

The proportional switching policy used here is shown in algorithm 1 and was first presented in [21]. This policy works by allocating servers at each tier proportionally according to workload, subject to an improvement in revenue.
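The proportional step for a single tier can be sketched as follows. The rounding rule and the constraint that each pool keeps at least one server are assumptions made for this illustration, as is the function name; the revenue check that gates the actual switch is not shown. The server totals in the example are the per-tier sums of the two pools in table 2, and the workloads are invented.

```python
def proportional_split(total_servers, workload_1, workload_2):
    """Split one tier's servers between two pools in proportion to their workloads.

    Returns (m1, m2) with m1 + m2 == total_servers and at least one server per pool.
    """
    if total_servers < 2:
        raise ValueError("need at least one server per pool")
    if workload_1 + workload_2 <= 0:
        raise ValueError("total workload must be positive")
    m1 = round(total_servers * workload_1 / (workload_1 + workload_2))
    m1 = max(1, min(total_servers - 1, m1))  # keep both pools alive
    return m1, total_servers - m1


# Hypothetical usage: 9 web, 25 application and 5 database servers in total,
# with pool 1 carrying twice the workload of pool 2.
allocation = [proportional_split(m, 100, 50) for m in (9, 25, 5)]
```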

5.2 The Bottleneck-aware Switching Policy

Many factors contribute to the performance of a system. The bottleneck-aware switching policy attempts to overcome some of the factors which impact the system negatively in order to improve on the results of both the static allocation and the proportional switching policy. The bottleneck-aware switching policy is a best-effort algorithm; it may not find the optimal server allocation.

The bottleneck identification phase applies if a bottleneck is detected in either of the pools. If a bottleneck is detected at the same tier in each pool, migrating servers at that tier will not remove the bottleneck. If a bottleneck exists within a single pool, servers are migrated to remove the bottleneck, subject to a revenue improvement.

The local search algorithm (algorithm 3) works when there is no bottleneck saturation in either pool. The algorithm uses a nested loop to evaluate server migrations starting from the web tier to the application tier before finally evaluating the database tier. The revenue gain is computed at each stage, with the highest revenue state being chosen.

Algorithm 1 The proportional switching policy
Input: $N$, $m_i$, $R$, $K_{ir}$, $S_{ir}$, $v_{ir}$, $\phi_r$, $t_s$, $t_d$
Output: Server configuration
for each i in N do
    $m^1_i / m^2_i = K^1 / K^2$
end for
calculate $V_{loss}$ and $V_{gain}$ using eq. 9 and eq. 10
if $V_{gain} > V_{loss}$ then
    do switching according to the calculations
    $S_{ir} \leftarrow S'_{ir}$
else
    server configuration remains the same
end if
return current configuration

Algorithm 2 The bottleneck-aware switching policy
Input: $N_r$, $m_i$, $R$, $K_{ir}$, $S_{ir}$, $v_{ir}$, $\phi_r$, $t_s$, $t_d$
Output: new configuration
while bottleneck saturation found in one pool do
    if found at same tier in the other pool then
        return
    else
        switch servers to the bottleneck tier
        $m_i \leftarrow m'_i$ and $S_{ir} \leftarrow S'_{ir}$
    end if
end while
search configurations using Algorithm 3
return current configuration

Due to the small number of classes in modern enterprise systems, solving the multi-class closed queueing network is very quick [21]. The time complexity of the algorithm is $O(m_0 m_1 m_2)$, where $m_0$, $m_1$, $m_2$ are the total number of web, application and database servers in all pools.

6 Experimental Setup and Results

In this paper we simulate two applications which are running as two logical pools. Each application is multi-tiered, with each tier being comprised of a cluster of servers. Each application also has two classes of request, gold and silver, which are used to represent the value of each request. The service time $S_{ir}$ and the visiting ratio $v_{ir}$ are chosen based on realistic values or from those supplied in supporting literature. Table 2 summarises the main experimental parameters which are used.

The focus of the experimentation is to investigate how the time taken to reallocate servers affects the revenue derived from the system. We have fixed the decision interval for the policies at 60 seconds, and experimented with reallocation times of 5 to 55 seconds. We have conducted these experiments under two inversely proportional workloads, as shown in figures 2 and 3.

Algorithm 3 The configuration search algorithm
Input: $N_r$, $m_i$, $R$, $K_{ir}$, $S_{ir}$, $v_{ir}$, $\phi_r$, $t_s$, $t_d$
Output: best configuration
Initialisation: compute $U^1_i$, $U^2_i$
while $U^1_0 > U^2_0$ do
    if $m^2_0 > 1$ then
        $m^2_0 \downarrow$, $m^1_0 \uparrow$; $S^2_{0r} \leftarrow S^{2'}_{0r}$
        while $U^1_1 > U^2_1$ do
            if $m^2_1 > 1$ then
                $m^2_1 \downarrow$, $m^1_1 \uparrow$; $S^2_{1r} \leftarrow S^{2'}_{1r}$
                while $U^1_2 > U^2_2$ do
                    if $m^2_2 > 1$ then
                        $m^2_2 \downarrow$, $m^1_2 \uparrow$; $S^2_{2r} \leftarrow S^{2'}_{2r}$; compute $V_{loss}$ using eq. 9
                        $S^1_{2r} \leftarrow S^{1'}_{2r}$; compute $V_{gain}$ using eq. 10
                        if $V_{gain} > V_{loss}$ then
                            store current configuration
                        end if
                        compute new $U^1_i$, $U^2_i$
                    end if
                end while
                similar steps for $U^1_2 < U^2_2$
                $S^1_{1r} \leftarrow S^{1'}_{1r}$; compute new $U^1_i$, $U^2_i$
            end if
        end while
        similar steps for $U^1_1 < U^2_1$
        $S^1_{0r} \leftarrow S^{1'}_{0r}$; compute new $U^1_i$, $U^2_i$
    end if
end while
similar steps for $U^1_0 < U^2_0$
return best configuration

Under workload one both policies show that revenue decreases as server reallocation time increases. An increase in reallocation time decreases the amount of time that the servers are available to service requests, thus reducing the revenue obtained.

Figure 4 shows the results of the proportional switching policy under workload one. The policy demonstrates a linear decrease in revenue as the reallocation time increases from 5 to 30 seconds. At 35 seconds there is a significant reduction in revenue, which then increases slightly as the reallocation time increases further.

The behaviour of the policy changes throughout the experiment, with the policy migrating many servers when the reallocation time is small and fewer as the time increases. This is due to the reallocation time being a consideration in the revenue gain as calculated by equation 10.

The proportional switching policy demonstrates an improvement in revenue over the static allocation at all reallocation durations. The use of admission control has no effect on the revenue generated by the policy.

The bottleneck-aware switching policy results are shown in figure 5. This policy most clearly shows a linear relationship between the reallocation time and the revenue generated. The linear relationship is preserved with or without the use of the admission control policy. The policy demonstrated significant improvements in revenue over a statically allocated system. Using the admission control policy the system generates less revenue, and our work in [21] outlines how aggressive admission control negatively affects a system under light load.

Table 2: The main experimental parameters.

                              Pool 1                Pool 2
Parameter            Tier     Silver    Gold        Gold      Silver
Service time (sec)   WS       0.07      0.1         0.05      0.025
                     AS       0.03125   0.1125      0.01      0.06
                     DS       0.05      0.025       0.0375    0.025
Visiting ratio       WS       1.0       0.6         1.0       0.8
                     AS       1.6       0.8         2.0       1.0
                     DS       1.2       0.8         1.6       1.6
Deadline (sec)                20        15          6         8
Exit point (sec)              30        20          10        12
Revenue unit                  2         10          20        4
Number of servers    WS       4                     5
                     AS       10                    15
                     DS       2                     3

Under the second workload, the proportional switching policy (figure 6) performs worse than a static allocation at all reallocation times with the exception of 55 seconds. It should be noted, however, that the maximum reduction in revenue is 4.77%.

Initially the policy behaves as expected, with a linear decrease in revenue. However, the revenue increases from 25 seconds as fewer servers are migrated, due to the reduced revenue gained from making further migrations. In this scenario the use of admission control has a slight negative impact on the revenue obtained through the use of the policy.

The bottleneck-aware switching policy is the best performing policy under workload two. It provides a significant improvement in the revenue generated by the system. The improvement decreases linearly up to 30 seconds, drops at 35 seconds, and then decreases linearly again. The large drop in revenue at 35 seconds is caused by an increase in the number of server migrations from 12 at a 30 second duration to 15 at a 35 second duration.

Under the second workload the use of the admission control policy enhances the revenue generated at all reallocation times; however, the enhancement is reduced when the reallocation time increases beyond 30 seconds.

The findings from these experiments suggest that:

1. the reduction in revenue due to increased reallocation intervals generally holds over all server switching policies;

2. minimising the reallocation interval is therefore crucial;

3. optimising the reallocation interval will be application specific; however, it may be possible to identify common traits (e.g. queue minimisation) that hold across a wide range of applications, which will provide a focus for future optimisation research.

Figure 2: Workload One. (Plot of workload against time in minutes for Application 1 and Application 2.)

7 Conclusion

In this paper we have modelled an Internet service provider as a collection of multi-class closed queueing networks, each of which represents a three-tier web application architecture with a cluster of servers at each tier. Our model supports the dynamic reallocation of servers at the same tier between pools.

We have evaluated the behaviour of two switching policies and compared them against a static allocation under a range of reallocation intervals, and observed that larger reallocation intervals have a negative impact on revenue.

The results under the first workload demonstrate a clear inverse relationship between reallocation interval and revenue, with revenue falling as the reallocation interval grows. Under the second workload the relationship is less well defined.

In the short term we will investigate the impact of other aspects of dynamic resource allocation. These include evaluating the decision interval, which was fixed at 60 seconds in this work, and examining the combined effects of reallocation and decision intervals. We will also evaluate possible optimisations to the switching process to ensure that the reallocation time is minimised.

Current policies evaluate the system retrospectively and make no predictions about the workload after making a migration. In future we hope to make predictions based on historical workload analysis to better guide policies' decisions.

References

[1] M. Arlitt and T. Jin. A Workload Characterization Study of the 1998 World Cup Web Site. IEEE Network, 14(3):30–37, 2000.

[2] G. Balbo and G. Serazzi. Asymptotic Analysis of Multiclass Closed Queueing Networks: Multiple Bottlenecks. Performance Evaluation, 30(3):115–152, 1997.

[3] P. Barford and M. Crovella. Generating Representative Web Workloads for Network and Server Performance Evaluation. SIGMETRICS Performance Evaluation Review, 26(1):151–160, 1998.

Figure 3: Workload Two. (Plot of workload against time in minutes for Application 1 and Application 2.)

Figure 4: Revenue Generated by the Proportional Switching Policy Under Workload One at Different Reallocation Times. (Plot of total revenue against switching time in seconds for NSP, PSP no A.C. and PSP with A.C.)

[4] G. Bolch, S. Greiner, H. de Meer, and K. Trivedi. Queueing Networks and Markov Chains: modelling and performance evaluation with computer science applications. Wiley, 2nd edition, 2006.

[5] G. Casale and G. Serazzi. Bottlenecks identification in multiclass queueing networks using convex polytopes. In 12th Annual Meeting of the IEEE Int'l Symposium on Modelling, Analysis, and Simulation of Comp. and Telecommunication Systems (MASCOTS), 2004.

[6] L. Cherkasova and P. Phaal. Session based admission control: a mechanism for peak load management of commercial web sites. IEEE Transactions on Computers, 51(6), 2002.

[7] A.P. Chester, W.J. Xue, L. He, and S.A. Jarvis. A system for dynamic server allocation in application server clusters. In International Symposium on Parallel and Distributed Processing with Applications, Sydney, Australia, December 2008.

Figure 5: Revenue Generated by the Bottleneck-aware Switching Policy Under Workload One at Different Reallocation Times. (Plot of total revenue against switching time in seconds for NSP, BSP no A.C. and BSP with A.C.)

Figure 6: Revenue Generated by the Proportional Switching Policy Under Workload Two at Different Reallocation Times. (Plot of total revenue against switching time in seconds for NSP, PSP no A.C. and PSP with A.C.)

[8] D.L. Eager and K.C. Sevcik. Bound hierarchies for multiple-class queueing networks. Journal of the ACM, 33(1):179–206, 1986.

[9] L. He, W.J. Xue, and S.A. Jarvis. Partition-based Profit Optimisation for Multi-class Requests in Clusters of Servers. In IEEE International Conference on e-Business Engineering, 2007.

[10] B.A. Huberman and S.H. Clearwater. Swing options: A mechanism for pricing peak in demand. Computing in economics and finance, HP Labs, 2005.

[11] Internet Traffic Archive, hosted at Lawrence Berkeley National Laboratory. http://ita.ee.lbl.gov/html/traces.html, 2008.

[12] S.R. Lawrence and A.H. Buss. Economic analysis of production bottlenecks. Mathematical Problems in Engineering, 1(4):341–363, 1995.

[13] M. Litoiu. A performance analysis method for autonomic computing systems. ACM Transactions on Autonomous and Adaptive Systems (TAAS), 2(1):3, 2007.

Figure 7: Revenue Generated by the Bottleneck-aware Switching Policy Under Workload Two at Different Reallocation Times. (Plot of total revenue against switching time in seconds for NSP, BSP no A.C. and BSP with A.C.)

[14] Z. Liu, M.S. Squillante, and J.L. Wolf. On maximizing service-level-agreement profits. ACM SIGMETRICS Performance Evaluation, 29(1):43–44, 2001.

[15] J.K. MacKie-Mason and H.R. Varian. Pricing congestible network resources. IEEE Journal on Selected Areas in Communications, 13(7):1141–1149, 1995.

[16] M. Marzolla and R. Mirandola. Performance prediction of web service workflows. In The Third International Conference on the Quality of Software-Architectures (QoSA), volume 4880, pages 127–144, Medford, MA, USA, 2007.

[17] D.A. Menasce. Using performance models to dynamically control e-business performance. In International Multiconference on Measurement, Modelling, and Evaluation of Computer-Communication Systems, pages 1–5, Aachen, Germany, 2001.

[18] D.A. Menasce, V.A. Almeida, R. Fonseca, and M.A. Mendes. Business-oriented resource management policies for e-commerce servers. Performance Evaluation, 42(2-3):223–239, 2000.

[19] J. Rolia, X. Zhu, M. Arlitt, and A. Andrzejak. Statistical service assurances for applications in utility grid environments. Modeling, Analysis and Simulation of Computer and Telecommunications Systems (MASCOTS), pages 247–256, 2002.

[20] W.J. Xue, A.P. Chester, L. He, and S.A. Jarvis. Dynamic resource allocation in enterprise systems. In International Conference on Parallel and Distributed Systems, Melbourne, Australia, December 2008.

[21] W.J. Xue, A.P. Chester, L. He, and S.A. Jarvis. Model-driven server allocation in distributed enterprise systems. In The 3rd International Conference on Adaptive Business Information Systems (ABIS'09), Leipzig, Germany, March 2009.

[22] J.Y. Zhou and T. Yang. Selective early request termination for busy internet services. In 15th International Conference on World Wide Web, 2006.