BrownMap: Enforcing Power Budget in Shared Data Centers



RI09016, 17 December 2009   Systems Management

IBM Research Report

BrownMap: Enforcing Power Budget in Shared Data Centers

Akshat Verma, Pradipta De, Vijay Mann, Tapan Nayak,

Amit Purohit, Gargi Dasgupta, Ravi Kothari

IBM Research Division

IBM India Research Lab

Vasant Kunj Institutional Area Phase-2

New Delhi - 110070, India.

IBM Research Division

Almaden - Austin - Beijing - Delhi - Haifa - T.J. Watson - Tokyo - Zurich

LIMITED DISTRIBUTION NOTICE: This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties). Copies may be requested from IBM T.J. Watson Research Center, Publications, P.O. Box 218, Yorktown Heights, NY 10598 USA (email: [email protected]). Some reports are available on the internet at http://domino.watson.ibm.com/library/CyberDig.nsf/home.

BrownMap: Enforcing Power Budget in Shared Data Centers

Akshat Verma, Pradipta De, Vijay Mann, Tapan Nayak, Amit Purohit, Gargi Dasgupta, Ravi Kothari

Abstract

Shared data centers that use virtualized servers as building blocks (e.g., clouds) are emerging as an exciting technology that provides a computational platform for customers without incurring a steep setup cost. The providers of a shared data center, however, need intelligent mechanisms to deal with bursts in resource requirements as well as failures that may significantly reduce resource availability. In this work, we investigate mechanisms to ensure that a shared data center can operate within a power budget, while maximizing the overall revenue earned by the provider. We present the BrownMap methodology, which is able to ensure that data centers can deal both with outages that reduce the available power and with surges in workload. BrownMap uses automatic VM resizing and Live Migration technologies to ensure that the overall revenue of the provider is maximized, while meeting the budget. We implement BrownMap on an IBM Power6 cluster and study its effectiveness using a trace-driven evaluation of a real workload. Both theoretical and experimental evidence are presented that establish the efficacy of BrownMap in maximizing revenue, while meeting a power budget for shared data centers.

1 Introduction

The inevitability of power outages in a power grid due to the complex nature of the grid is slowly being realized [4]. In the year 2007, more than 100 significant outages were reported in North America [21]. The gap between the creation of new power generation units and the increasing growth-driven demand in emerging economies makes the problem even more acute in developing countries. A 2008 report by Stratfor [29] indicates that the growth in GDP outpaces growth in power production in China by a factor of 5 and in India by a factor of 2. As a result of the power shortages, enterprises depend on back-up generators to deal with outages. However, the backup power available is typically much smaller than the demand, leading to another electrical condition called a power brown-out, i.e., a reduction in the power available to a data center.

A brownout is a temporary interruption of power service in which the electric power is reduced, rather than being cut as is the case with a blackout. For example, the voltage available to IT equipment may drop from 120 volts to 98 volts or less. Major metropolitan cities in the developed world also experience brownouts due to lower power availability. In the year 2007, 33 power outages in North America led to a reduction in the available power supply [21]. Overloads on the electrical system or natural calamities like storms can disrupt the distribution grid, triggering a brownout. These can last anywhere between minutes and a few hours depending on their severity. In developing countries like India, brownouts may last for days due to insufficient energy supply, especially during the summer season. In some cases, a brownout is actually deliberate: voltage reductions are undertaken when it is sensed that a disruption in the grid may lead to serious problems. Rather than instituting rolling blackouts, the electricity distribution company may temporarily reduce the voltage to some customers in an attempt to prevent a collapse of the grid and to allow reserves of power to accumulate again.

Some surveys [1] predict that unless corrective actions are taken, power failures and limits on power availability will affect data center operations at more than 90% of all companies within the next five years. Power grids have started differential pricing to deal with low-frequency problems. For example, the KSE grid in India charges Rs 570 per KiloWattHour (KwH) for any additional unit consumed over and above the allocated quota if the supply frequency dips below 49 Hz, whereas the regular charge at 50 Hz is around Rs 4/KwH [31]. Hence, managing the impact of a brown-out and recommending actions to deactivate appropriate services with the least financial impact to data center business is critical. The ability to gracefully deal with brownouts also gives data centers the opportunity to enhance their image as good corporate citizens.

Creating a power budget for an ensemble of blade [24] or rack servers [14] has been addressed earlier. The goal in these cases is to find the aggregate peak power consumption and use it as a budget. If the actual power exceeds the estimated budget, a throttling mechanism is used at each server. The emergence of virtualization in data centers makes such application-unaware server throttling of limited use. Multiple virtual machines running applications with different SLAs (and utility) may be co-located on a common server, and a server-based power management mechanism is unable to differentiate between these applications. Further, server throttling does not leverage virtual machine migration, which is an effective tool for power management. Finally, a brownout may significantly reduce the available power to a data center and, due to the limited dynamic power range of a server [33, 32], throttling may not be able to reduce the required amount of power.

1.1 Contribution

BrownMap is designed for data centers to deal with brownout scenarios, where a substantially lower power budget may be available due to a power outage. It can also help a data center deal with power surges that arise due to workload variability. The contribution of our work is two-fold:
(i) We present the design and implementation of a runtime BrownMap power manager that helps a shared data center deal with brownouts. The power manager uses a distributed monitoring and reconfiguration framework. Our design has minimal monitoring overheads and reconfigures the server cluster to meet the power budget in a very short duration. The BrownMap architecture is robust enough to deal with noise in monitored data and scales well with the size of the server cluster.
(ii) We present the BrownMap placement methodology to find the configuration that maximizes the utility earned by the applications, while meeting the power budget. The methodology uses a novel divide-and-conquer approach to break the hard power budgeting problem into Server Selection, VM Resizing and VM Placement sub-problems. We use an iterative procedure that leverages the nature of the power and utility curves to find a configuration that is close to the optimal for most practical scenarios. On a real testbed using production traces, we show that BrownMap meets the reduced power budget in the order of minutes and can bring the power down by close to 50% for a 10% drop in utility.

2 Model and Preliminaries

We now formally define the brownout problem addressed in this paper.

We consider a shared data center with N applications A_i hosted on M servers S_j. Each application is run in a dedicated virtual machine (VM) or logical partition (LPAR) and has a utility value that is a function of the resource allocated to the LPAR (Utility(x_i)). We use the terms VM and LPAR interchangeably in this work. The data center experiences a brownout for the next T hours with the available power reduced to P_B. The goal of the Power Manager is to re-allocate resources to each application, migrate applications (virtual machines) between servers, and switch servers to low power states in order to meet the power budget. Further, we need to ensure that the utility is maximized while meeting the budget. Formally, we need to find an allocation (or VM size) x_i for each application A_i and a mapping y_i^j on each server S_j such that

\arg\max_{x,y} \sum_{i=1}^{N} Utility(x_i)   (1)

subject to

\sum_{j=1}^{M} Power_j(x, y) < P_B, \qquad \sum_{i=1}^{N} y_i^j x_i \le C_j \;\; \forall S_j, \qquad \sum_{j=1}^{M} y_i^j = 1 \;\; \forall A_i   (2)

where Utility(x_i) is the utility earned by application A_i if it is assigned x_i resources, Power_j(x, y) is the power drawn by server S_j for the given resource assignment and placement, and C_j is the capacity of the server S_j. Various benchmarks exist to capture the capacity of various server models. In this work, we use the IDEAS RPE2 value of a server to denote its capacity [27].

2.1 Deriving VM Utility for multi-tier applications

Figure 1: Utility Computation for multi-tier applications

We have used a utility maximization framework to capture the importance of each application to the shared data center. The utility may be computed based on the revenue earned by the data center from each application. A utility-based framework is general enough to capture various other objectives like strict priority and fairness.

Our framework assigns resources to each VM and hence needs a utility function to be associated with individual virtual machines. However, many data centers run services composed of multiple applications, which may again be multi-tiered and run from two or more LPARs. A service can easily be broken down into its applications, and the utility for each application can be estimated. For example, an e-Commerce site would have applications for browsing and shopping. These applications would typically run on separate sets of VMs in a large data center for ease of application support and maintenance. Hence, it is relatively straightforward to assign utility to each application, which serves exactly one type of request. However, utility functions are still defined for the composite application and not for each tier separately. A typical 3-tiered application has a front-end web server, a middle-tier application server and a backend DB server. In order to apply our framework, we need to derive the utility functions for each application component (or VM) from the utility function of the composite application.

We have designed a proportional utility assignment method to derive the utility for each application component. Consider an example scenario with a 2-tiered application in Fig. 1. The UTILITY function captures the utility derived by the composite application for a given throughput. The RESOURCE function captures the resource consumption of each tier (or VM) for a certain application throughput. Resource functions for each tier can be obtained using monitored data that map resource utilization in each component VM to the application throughput. In cases where an application may have multiple types of requests, the functions are based on average estimates.

In order to apply our methodology, we derive the utility for each VM from the application UTILITY function and the RESOURCE function for the VM in the following manner. We divide the utility derived by the application among its components in proportion to the resource used. Hence, in Fig. 1, we set the utility of Component 1 at throughput T_i as (x_1 / (x_1 + x_2)) * U_i. The above assignment ensures that the sum of the derived utilities of all components of an application at a given throughput equals the actual utility of the application at that throughput.
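To make the proportional split concrete, the following Python sketch (our own illustration; the composite utility and per-tier resource functions below are assumed linear stand-ins for the monitored models) derives a per-tier utility value at a given throughput.

```python
def split_utility(throughput, composite_utility, resource_fns):
    """Split the composite application utility at a given throughput among
    its tiers, in proportion to the resource each tier consumes.
    composite_utility: throughput -> utility of the whole application
    resource_fns: list of per-tier functions, throughput -> resource used
    Returns one derived utility value per tier."""
    usages = [f(throughput) for f in resource_fns]
    total = sum(usages)
    u = composite_utility(throughput)
    # Each tier gets U * x_k / (x_1 + ... + x_n); the shares sum back to U.
    return [u * x / total for x in usages]

# Hypothetical 2-tier example: web tier and DB tier of one application.
if __name__ == "__main__":
    utility = lambda t: 3 * t / 129.0   # assumed composite utility curve
    web_cpu = lambda t: 0.004 * t       # assumed resource models (cores)
    db_cpu = lambda t: 0.006 * t
    print(split_utility(100, utility, [web_cpu, db_cpu]))
```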

For multi-tier applications, it is also desirable that theresource assigned to each component by any optimiza-tion methodology should be such that no one compo-nent becomes a bottleneck. To consider the example inFig. 1 again, if LPAR1 has a resource assignedx1 andLPAR2 has resource assigned greater thanx2, any ad-ditional resource assigned to LPAR2 abovex2 does notlead to an increase in the throughput of the application.This is because the application is limited by the lowestresource allocation among all components (in this caseRESOURCE1). We will later show (Sec. 4) how ouroptimization methodology satisfies this proportional as-signment property as well.

3 BrownMap Architecture

In this section, we describe the overall architecture of the BrownMap Power Manager.

The BrownMap Power Manager computes a new sizing and placement for the Virtual Machines (VMs) for a fixed duration termed the consolidation interval (e.g., 2 hours). The key modules in the BrownMap Power Manager, as shown in Fig. 2, are (i) Monitoring Engine, (ii) Workload Predictor, (iii) Profiling Engine, (iv) Placement Generator, and (v) Reconfig Manager. The Monitoring Engine periodically collects (a) system parameter values from each logical partition (LPAR) as well as the Virtual I/O partition present on each physical server in the shared data center and (b) application usage statistics, and stores them in the Monitor Log. The power management flow is orchestrated by a Controller. The Controller executes a new flow on an event trigger, which could be either a change in the power budget or the end of the previous consolidation interval. On receiving an event trigger, it invokes the Workload Predictor, Profiling Engine, Placement Generator, and Reconfiguration Manager in the given order to coordinate the computation of the new configuration and its execution. The main steps in the flow of the BrownMap technique are (i) estimation of the resource demand for each VM in the next consolidation interval by the Workload Predictor, (ii) update of the VM resource demands to account for VIO resource usage based on the profiles of each application by the Profiling Engine, (iii) re-sizing and placement of VMs based on their utility models and a power budget by the Placement Generator, and (iv) execution of the new configuration by the Reconfig Manager. We now describe each component of our architecture separately.

3.1 Monitoring Engine

The monitoring engine periodically collects resource usage statistics for all partitions, including the management partition, on a physical node. For IBM's Power Hypervisor (pHyp), the management partition is called the Virtual I/O (VIO) Server. The monitor agent on each partition collects utilization data for CPU, active memory, network traffic, and I/O traffic, and feeds it back to the monitoring engine. The data is sampled every 30 seconds, and periodically the aggregate log files are pushed to a Monitor Log, implemented as a relational database. The monitored data can be accessed from this database by the other modules. The script-based monitor agent running on each partition has low overhead, and consumes less than 0.1% of the resources allocated to a partition.
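As an illustration of what such an agent might look like, the following minimal Python sketch samples CPU, memory, network and disk counters every 30 seconds and appends them to a local CSV file; it assumes the psutil package and a flat file as stand-ins for the platform-specific collectors and the relational Monitor Log.

```python
import csv
import time
import psutil  # stand-in for the platform-specific collectors on each LPAR/VIO

SAMPLE_INTERVAL = 30  # seconds, matching the sampling period described above
FIELDS = ["ts", "cpu_pct", "mem_pct", "net_bytes", "io_bytes"]

def sample():
    """Collect one utilization sample for this partition."""
    net = psutil.net_io_counters()
    dsk = psutil.disk_io_counters()
    return {
        "ts": time.time(),
        "cpu_pct": psutil.cpu_percent(interval=None),
        "mem_pct": psutil.virtual_memory().percent,
        "net_bytes": net.bytes_sent + net.bytes_recv,
        "io_bytes": dsk.read_bytes + dsk.write_bytes,
    }

def run(log_path="monitor_log.csv", samples=10):
    """Append periodic samples; a real agent would push them to the Monitor Log DB."""
    with open(log_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if f.tell() == 0:
            writer.writeheader()
        for _ in range(samples):
            writer.writerow(sample())
            f.flush()
            time.sleep(SAMPLE_INTERVAL)

if __name__ == "__main__":
    run(samples=2)
```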

Monitoring the resource usage in a virtual partition is challenging because the resources dedicated to the partition can change dynamically under certain settings. For example, the number of CPUs allocated to a partition can vary over time. Therefore, the monitor agent must track the resources currently used by the partition, which is called the entitlement of the partition. The actual resource usage, for CPU and memory, is the percentage utilization with respect to the entitlement. Note that the CPU entitlement for each LPAR in a Power6 virtual environment can be a fraction of the total CPU pool available on the physical server. Many modern processors are also capable of scaling the voltage and frequency in order to optimize power consumption, which is known as Dynamic Voltage and Frequency Scaling (DVFS). If DVFS is enabled, the reported entitlement takes into account the scaled CPU frequency. Active memory statistics report changes in the memory utilization while an application is executing. Network statistics collect data on the number of bytes and packets transferred. I/O statistics collect data on disk activities, which could be storage attached over a SAN. The resource usage statistics of the VIO are similarly monitored and logged. It is important to note that the VIO performs work on behalf of individual LPARs, which should be accounted back to the LPAR. The profiling engine, discussed later, ensures that the LPAR resource usage captures the work done by the VIO on its behalf.

Figure 2: BrownMap Power Management Architecture

3.2 Workload Predictor

The goal of the Workload Predictor is to estimate the raw resource (CPU, memory, network, disk) demand for each VM in the next consolidation interval. It has been observed in data centers that some workloads exhibit nice periodic behavior and some do not follow any particular pattern [34]. Periodic workloads can be predicted over longer horizons with significant reliability using a long-term forecast. However, the prediction error increases rapidly for non-periodic workloads as the horizon increases. We use a two-pronged strategy in the design of our Predictor to handle both periodic and non-periodic workloads. We make a short-term as well as a long-term prediction and use both to estimate the resource usage.

A popular method for deciding periodicity is the auto-correlation function and the peaks in the magnitude spectrum [15, 3]. Once the periodicity for a workload is determined, we use it to make a long-term forecast. The short-term prediction is based on a polynomial approximation that minimizes the least-square error. We then use a weight parameter to weigh the long-term and short-term forecasts. We divide the usage history into a sequence of time periods and consider the last P periods for estimation. Our goal is to forecast the next K usage values based on the last n samples of the usage history, where n = P * p and p is the number of samples in the estimated time period. The resource demand at the (n+k)-th interval is predicted as

\hat{D}_{n+k} = (1-\alpha) \cdot \frac{1}{P} \sum_{i=1}^{P} y_{i \cdot p + k} + \alpha \cdot f_p(y_n, y_{n-1}, \ldots, y_{n-N_s-1}), \quad k = 1, \ldots, K   (3)

where y_{i \cdot p + k} are the corresponding usage values at the k-th sample of the i-th cycle, f_p is the short-term prediction based on the last N_s samples, and \alpha is the weight parameter. Note that \frac{1}{P} \sum_{i=1}^{P} y_{i \cdot p + k} and f_p(y_n, y_{n-1}, \ldots, y_{n-N_s-1}) represent the long- and short-term components of the forecasted value, respectively. We set \alpha to 1 for workloads without periodicity and to 0.5 otherwise. For the resource usage histories we considered, we found that a second-order polynomial is sufficient for a reasonable approximation, and the error increases as K increases. Finally, the default value of P is set such that the number of periods covers a week, which has been observed to be sufficient to determine periodicity in data centers [34].
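A minimal Python sketch of this two-pronged forecast, assuming evenly spaced samples and using numpy's least-squares polynomial fit for the short-term component f_p, could look as follows (parameter names are ours, not from the implementation).

```python
import numpy as np

def predict_demand(history, period, P, K, alpha, ns=10, order=2):
    """Forecast the next K demand values following Eqn. (3): a weighted sum of
    a long-term seasonal average over the last P periods and a short-term
    polynomial (least-squares) extrapolation over the last ns samples."""
    history = np.asarray(history, dtype=float)
    n = len(history)
    # Short-term component: fit a low-order polynomial to the last ns samples.
    recent = history[-ns:]
    t = np.arange(n - len(recent), n)
    coeffs = np.polyfit(t, recent, order)
    forecasts = []
    for k in range(1, K + 1):
        # Long-term component: average of the samples at the same phase of
        # the last P periods.
        phases = []
        for i in range(1, P + 1):
            idx = n - i * period + (k - 1)
            if 0 <= idx < n:
                phases.append(history[idx])
        long_term = np.mean(phases) if phases else history[-1]
        short_term = np.polyval(coeffs, n + k - 1)
        forecasts.append((1 - alpha) * long_term + alpha * short_term)
    return forecasts

# Example: a noisy daily pattern sampled hourly; alpha=0.5 for periodic workloads.
if __name__ == "__main__":
    t = np.arange(24 * 7)
    load = 50 + 30 * np.sin(2 * np.pi * t / 24) + np.random.normal(0, 2, len(t))
    print(predict_demand(load, period=24, P=7, K=4, alpha=0.5))
```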

3.3 Profiling Engine

I/O processing for virtual I/O adapters is performed by the virtualization layer (Virtual I/O Server or VIO in pHyp, and dom0 in Xen) on behalf of individual LPARs. This redirection leads to a CPU overhead for I/O processing [8] that needs to be accounted back to the LPAR performing the I/O. As an example, the CPU overhead in the VIO due to an LPAR with network activity of 600 KBps on an IBM JS-22 BladeCenter is around 4% of one 4.0 GHz Power6 core (Fig. 3(b)). The Profiling Engine provides an estimate of these VIO CPU overheads and accounts them to the LPAR in order to create the real resource demand for each LPAR.

The Profiling Engine uses a Profile Database that captures the relationship between hypervisor CPU overhead and the disk and network activity in an LPAR for each physical server type (the two curves in Fig. 3). This profile is created using calibration runs on each server model in the data center. During the power management flow, the Profiling Engine uses the Profile Database along with the network and disk activity demand provided by the Workload Predictor to estimate the VIO overhead due to each LPAR. The VIO overhead is then added to the raw CPU demand of the LPAR to estimate the real resource demand for each LPAR in the consolidation interval. The real demand is used by the Placement Engine for VM sizing and placement.

The second job of the Profiling Engine is to provide an estimate of the power for any configuration that the Placement Generator comes up with. The power drawn by a server cannot be accurately estimated from the expected CPU utilization of the server alone for heterogeneous applications, as it also depends on the nature of the application [32]. The Profiling Engine uses WattApp, a power meter that has been designed for heterogeneous applications [18]. WattApp uses power profiles for each application on a server, which are then used to estimate the power drawn by a server running a mix of applications. The individual power profile for each application is also stored in the Profile Database.
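A simplified sketch of the demand adjustment is shown below; the linear calibration slopes are hypothetical stand-ins for the per-server-model curves held in the Profile Database.

```python
def real_cpu_demand(raw_cpu_cores, net_kbps, disk_tps, profile):
    """Add the VIO CPU overhead attributable to an LPAR's network and disk
    activity to its raw CPU demand. 'profile' holds per-server-model
    calibration slopes (cores of VIO CPU per unit of activity), analogous
    to the curves stored in the Profile Database."""
    vio_overhead = (profile["cores_per_kbps"] * net_kbps +
                    profile["cores_per_tps"] * disk_tps)
    return raw_cpu_cores + vio_overhead, vio_overhead

# Hypothetical calibration for one blade model: purely illustrative numbers,
# scaled so that 600 KBps of network activity costs about 4% of a core.
js22_profile = {"cores_per_kbps": 0.04 / 600,
                "cores_per_tps": 0.001}

if __name__ == "__main__":
    demand, vio = real_cpu_demand(raw_cpu_cores=1.2, net_kbps=600,
                                  disk_tps=50, profile=js22_profile)
    print("real demand: %.2f cores (VIO share %.2f)" % (demand, vio))
```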

3.4 Placement Generator

The Placement Generator takes as input the predicted real resource demand (CPU, memory, I/O) and the application profiles from the Profiling Engine, utility models for the workloads, the previous allocation, and the user-specified power budget. Based on the resource demands and the utility accrued from each application, it computes a new placement map, which specifies which applications reside on which servers and what capacity (i.e., size) of the host server they occupy. Further, based on the application and server profiles, the VIO layer is also resized. The power consumed by this new placement map is within the power budget and maximizes the overall utility. The placement methodology implemented by the Placement Generator is detailed in Sec. 4.

3.5 Reconfiguration Manager

The Reconfiguration Manager takes as input the new LPAR entitlements and placement provided by the Placement Engine and moves the data center to this new configuration in the most efficient manner possible. LPAR migration is an expensive operation (1 to 2 minutes for active LPARs) and it is important to minimize the time taken to reconfigure the data center. Further, since LPARs may move both into and out of a server, it is possible that there may not be available resources for an LPAR moving in till some LPARs are moved out. This may create temporary resource capacity issues leading to failures during reconfiguration. The goal of the Reconfiguration Manager is to (a) minimize the total time taken to reconfigure and (b) avoid any temporary capacity failures due to migration. The Reconfiguration Manager spawns a new process for each server that has at least one LPAR being migrated from it. This allows the reconfiguration to scale with the number of servers involved in the reconfiguration. In order to overcome any capacity issues during migration, we reduce the allocations of LPARs before starting the migration. This ensures that there is no resource allocation failure when an LPAR is migrating into a physical node. The LPAR is resized back to its correct resource allocation as soon as the target server has finished migrating any LPARs that are moving out of it. Once a blade has no LPARs running on it, the Reconfiguration Manager either switches off the blade or moves it to Nap mode [13], if available on the server.
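The following Python sketch captures the shrink-migrate-restore flow with one worker per source server; the resize, migrate and power-down hooks are hypothetical stand-ins for the platform's DLPAR and Live Partition Mobility commands, and entitlements are restored only after all migrations finish, which simplifies the per-server restore described above.

```python
from concurrent.futures import ThreadPoolExecutor

def reconfigure(plan, vacated_servers, shrink_size,
                resize_lpar, migrate_lpar, power_down):
    """plan: list of dicts {'lpar', 'src', 'dst', 'target_size'}.
    resize_lpar(lpar, size), migrate_lpar(lpar, src, dst) and
    power_down(server) are hypothetical hooks into the virtualization manager."""
    # 1. Shrink every moving LPAR first so the destination never runs out of capacity.
    for m in plan:
        resize_lpar(m["lpar"], shrink_size)

    # 2. One worker per source server, mirroring the per-server processes above.
    def drain(src):
        for m in (x for x in plan if x["src"] == src):
            migrate_lpar(m["lpar"], m["src"], m["dst"])

    sources = {m["src"] for m in plan}
    if sources:
        with ThreadPoolExecutor(max_workers=len(sources)) as pool:
            list(pool.map(drain, sources))

    # 3. Restore the planned entitlements, then power down (or nap) the
    #    servers the plan leaves empty (determined by the caller).
    for m in plan:
        resize_lpar(m["lpar"], m["target_size"])
    for s in vacated_servers:
        power_down(s)
```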

4 BrownMap Sizing and Placement Methodology

We now present the BrownMap sizing and placement methodology implemented by the Placement Engine. The problem of allocating resources to each VM in order to meet a power budget, while maximizing the overall revenue (Eqn. 2), requires one to solve three problems at the same time: (i) selection of active servers that meet the power budget, (ii) resource allocation for each VM (x_i) on the available capacity of the servers, and (iii) placement of VMs on active physical servers (y_i^j). One may observe that the resulting problem is NP-hard, as bin packing is a simple case of this problem.

Figure 3: VIO Overhead with (a) LPAR Disk Activity and (b) LPAR network activity

In order to understand the problem, observe that the optimization problem evaluates two functions, namely Power and Utility, as a function of the VM sizes (x_i) and their placement (y_i^j). As noted in [32], power consumption by a server is not determined solely by the server's utilization but is also dependent on the nature of applications running on the server. Hence, estimating the power drawn by a heterogeneous mix of applications in a shared data center is a challenging problem. Since a closed form for one objective function, namely power, does not exist, off-the-shelf solvers cannot be used. The Utility function is again a derived measure and is dependent on the SLA parameter of the application. Typical utility functions would satisfy the law of diminishing marginal returns and may be approximated by a concave curve. The value of the SLA parameter depends on the resource assigned to the application and is typically a convex function (e.g., Fig. 5(b)). Hence, the utility curve as a function of the resource has potential points of inflection, making it a difficult problem to solve.

We now present an outline of our BrownMap methodology that solves this problem using an iterative procedure.

4.1 Outline of BrownMap

Figure 4: BrownMap Placement Algorithm Flow

The BrownMap methodology is based on a 'divide and conquer' philosophy. We divide this hard problem into three sub-problems, namely Server Selection, VM Sizing and VM Placement. One may note that the feasible space for each sub-problem depends on the solution of the other sub-problems. For example, the Server Selection problem can predict whether the power budget can be met only if the placement and sizing of the applications are known. To address this problem, we assume a fixed solution for the other two sub-problems while finding the best solution for each sub-problem. We then iterate over the solution till we can find a placement and sizing that meets the power budget. Our methodology leverages a recent power modeling work, WattApp [18], which is able to predict the power drawn by a heterogeneous mix of applications running on a shared data center, for convergence. The overall iterative flow of BrownMap, as shown in Fig. 4, consists of the following steps.

• Server Selection: In this step, we use the available power budget to pick the most power-efficient servers within the power budget for an average application. We leverage the Order Preservation property from an earlier work [32] that allows us to rank servers with respect to power efficiency in an application-oblivious manner. Once we identify the active servers, we compute the aggregate server capacity available to the applications.

• VM Sizing: This step takes the available capacity as input and computes the best VM size (x_i) for each application that maximizes the utility earned within the available capacity.

• VM Placement: In this step, we use the servers and their operating points and the VM sizes to compute the best placement (y_i^j) of VMs on servers. We use the history-aware iDFF placement method presented in an earlier work [32]. iDFF is a refinement of the First Fit Decreasing bin packing algorithm that also minimizes the number of migrations. For further details of this step, the reader is referred to [32].

• Iterate: If the placement method is able to place all the applications, we use WattApp [18] to estimate the power drawn. If the estimated power exceeds the budget or the applications cannot be placed in the previous step, we iterate from the server selection step with a reduced set of servers. A compact sketch of this loop is given after the list.
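The sketch below illustrates the iterative flow with the three sub-problem solvers and a WattApp-style power estimator passed in as callables; all interfaces are assumed for illustration and do not reproduce the actual implementation.

```python
def brownmap_place(power_budget, servers, vms,
                   select_servers, size_vms, place_vms, estimate_power,
                   budget_step=0.05, min_budget=0.0):
    """Iterate Server Selection -> VM Sizing -> VM Placement until the
    estimated power of the resulting configuration fits the budget.
    The four callables are stand-ins for the components described above."""
    target = power_budget
    while target > min_budget:
        active, capacity = select_servers(servers, target)   # efficient servers first
        sizes = size_vms(vms, capacity)                       # utility-maximizing sizes
        placement = place_vms(active, sizes)                  # history-aware packing
        if placement is not None and estimate_power(placement) <= power_budget:
            return placement
        # Could not place, or the application-aware estimate exceeds the budget:
        # retry with a tighter internal target (fewer servers / lower operating points).
        target *= (1.0 - budget_step)
    raise RuntimeError("no feasible placement within the power budget")
```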

4.2 Server Selection

We define the Server Selection problem as finding a subset of servers and their operating points such that the power drawn by the servers is within a power budget. Further, the total server capacity available for hosting VMs is the maximum possible within the power budget. Note that the power drawn by a server is not determined solely by the CPU utilization but also depends on the application mix running on the server [10, 26, 32, 33, 18]. Hence, the power drawn by a set of servers cannot be represented as a closed function of the capacity used, as it depends on factors other than server capacity. However, we have noted in [32] that an ordering can be established between different servers based on their power efficiency for any fixed application, and this ordering holds across applications.

We use the Ordering property to order servers and create a power vs. capacity curve (Fig. 5(a)). We replace the original curve with a convex approximation and find the server capacity that meets the power budget. The convex approximation is employed for the iterative convergence. We use this selection of servers and operating points for VM sizing and placement. Once we get an allocation of VMs to servers and their sizes, we estimate the actual power consumed by the servers using the WattApp meter [18]. If the estimated power drawn exceeds the power budget, we move down the convex power curve and iterate again.
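The selection step could be sketched as follows, ranking servers by capacity per watt (the application-oblivious ordering) and greedily accumulating capacity until the budget is exhausted; for simplicity the sketch charges each server its power at the chosen operating point rather than walking the convex power-capacity curve, and the field names are assumptions.

```python
def select_servers(servers, power_budget):
    """servers: list of dicts with 'name', 'capacity' (e.g. RPE2 units) and
    'peak_power' (watts at the chosen operating point). Servers are taken in
    decreasing capacity-per-watt order, which the Ordering property of [32]
    makes application-independent, until the budget is exhausted."""
    ranked = sorted(servers, key=lambda s: s["capacity"] / s["peak_power"],
                    reverse=True)
    active, used_power, capacity = [], 0.0, 0.0
    for s in ranked:
        if used_power + s["peak_power"] > power_budget:
            break
        active.append(s["name"])
        used_power += s["peak_power"]
        capacity += s["capacity"]
    return active, capacity

if __name__ == "__main__":
    farm = [{"name": "blade1", "capacity": 100, "peak_power": 250},
            {"name": "blade2", "capacity": 90,  "peak_power": 260},
            {"name": "blade3", "capacity": 80,  "peak_power": 255}]
    print(select_servers(farm, power_budget=520))
```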

4.3 VM Sizing

The VM Sizing problem takes as input an aggregate server capacity available to host a set of VMs. Further, for each VM, a model of utility as a function of the SLA parameter of the VM and a model of the SLA parameter versus the resource assigned to the VM are taken from the pre-processing step. We use the utility and resource models to create a model of utility versus resource allocated for each VM. We start VM Sizing by allocating to each VM the maximum resource required by it in the next consolidation interval. We then iteratively take away resources from the VM with the least slope of the Utility-Resource curve (i.e., the VM which has the least drop in utility for a unit decrease in resources allocated). To take the example in Fig. 5(b), we first take away resources from the first VM (with the least slope). In the next step, the third VM has the least slope and we reduce its resource allocation. The VM sizing method terminates when (a) the total capacity used by the VMs equals the capacity given by the server selection process and (b) the selected point on each curve is either a corner point or the slopes of all the non-corner points are equal (Fig. 5(b)). The first property ensures that we meet the power budget assuming the Server Selection process is accurate. The second property ensures that the overall utility drawn by the VMs within the power budget cannot be increased by making small changes. It is easy to see (proof omitted due to lack of space) that the resultant solution has the following optimality property.

Theorem 1 The VM Sizing method finds a resource allocation that is locally optimal. Further, if the utility versus capacity models for all VMs are concave, then the resource allocation is globally optimal.

We note that the VM sizing for one VM is independent of other VMs. Hence, if an application has multiple components, with each component hosted in a separate VM, the sizing may lead to unbalanced resource usage across the components. A desirable property is that each component of an application should be sized to achieve the same throughput. In order to achieve this, we add another condition to the convergence: if two VMs compete for a resource and have the same slope on the utility-capacity curve, we assign the resource to the VM with a lower achieved SLA value. This property, coupled with the utility breakup for multi-tier applications (Sec. 2.1), leads to the following proportional allocation property between VMs belonging to a multi-VM application.

Property 1 (Proportional Allocation Property) For a multi-tier application, the VM sizing allocates resources to all the components of an application in a way that they lead to the same application throughput.
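A discrete sketch of the sizing step is given below: every VM starts at its predicted demand, and resource is repeatedly taken, in small steps, from the VM whose utility curve currently has the smallest slope until the total fits the available capacity. The fixed step size is a simplification of the corner-point handling described above; names and curves are illustrative.

```python
def size_vms(demands, utility_fns, capacity, step=0.05):
    """demands: maximum resource wanted by each VM in the interval.
    utility_fns: per-VM functions resource -> utility (assumed non-decreasing).
    Greedily shrink the VM with the smallest marginal utility loss until
    the total allocation fits within 'capacity'."""
    alloc = list(demands)

    def slope(i):
        # Marginal utility lost by taking one 'step' of resource from VM i.
        return (utility_fns[i](alloc[i]) - utility_fns[i](alloc[i] - step)) / step

    while sum(alloc) > capacity:
        candidates = [i for i in range(len(alloc)) if alloc[i] >= step]
        if not candidates:
            break
        i = min(candidates, key=slope)
        alloc[i] -= step
    return alloc

if __name__ == "__main__":
    # Three VMs mirroring the 5:1:3 utility-per-core ratio used in Sec. 5.
    fns = [lambda x: 5 * x, lambda x: 1 * x, lambda x: 3 * x]
    print(size_vms([1.0, 1.0, 1.0], fns, capacity=2.0))
```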

4.4 Iterative Procedure

The iterative procedure takes the computed placement and uses the WattApp power meter to estimate the power consumed by the placement. If the estimated power meets the power budget, the Reconfiguration Manager is triggered to reconfigure the data center. If the estimated power is more than the budget, we iterate from Server Selection with a lower budget.

The Brownout problem has two optimization objectives: (i) power minimization and (ii) utility maximization. Minimization problems that have a convex objective function and maximization problems with a concave objective function lead to a fractional optimal solution easily. Hence, we have converted the power curve in the Server Selection problem to a convex approximation. If the utility-capacity curves for all the applications are concave, this implies that the iterative procedure converges to a solution that minimizes power and maximizes utility for a fixed server capacity. Further, the solution is also within the power budget and is close to the fractional optimal solution. We have formalized the above proof sketch to obtain the following result. The detailed proof is omitted for lack of space.

Figure 5: (a) Server Selection and (b) VM Sizing Method

Theorem 2 For any instance I of the brownout problem (Eqn. 1), consider the modified problem I′, which replaces the power function by its convex approximation. The BrownMap sizing and placement methodology leads to an integral approximation of the LP-relaxation of I′ for concave utility functions.

5 Experimental Evaluation

5.1 Prototype Implementation

We have implemented BrownMap to manage a cluster consisting of 8 IBM Power6 JS-22 blade servers. Each blade has 4 IBM Power6 4.0 GHz cores with 8 GB of RAM installed. The blades are connected to a Cisco network switch for network access. The blade servers are hosted on an IBM BladeCenter H chassis. The LPARs on each blade get their storage from an IBM DS4800 storage controller with 146 SAN disks. They use a 2 Gbps Qlogic Fiber Channel link to communicate with the SAN.

The BrownMap power manager is deployed on a dedicated management server. The management server is an IBM Power5 1.5 GHz machine with 1 GB of RAM running Linux kernel 2.6. In order to completely automate the management, the management server has password-less ssh enabled with all the LPARs and VIOs. Monitoring agents are deployed on all the LPARs and VIOs. The management server also talks to the BladeCenter Advanced Management Module to get power data about the managed blades.

5.2 Experimental Setup

We now describe our experimental setup to evaluate our BrownMap implementation.

5.2.1 Applications and Traces Used

We have deployed 2 different applications on our testbed. The first application deployed on our testbed is Rubis [25], which simulates an auction site like eBay. Rubis is a two-tiered application with a web front end and a backend database server. We denote the web tier as Rubis-Web and the database tier as Rubis-DB. Rubis executes the standard browse-and-buy mix that is supplied with the application. Our second application is daxpy, a BLAS-1 HPC application [9]. daxpy takes batch jobs as input and executes them. We run two variants of daxpy: daxpyH as a high priority application and daxpyL as a low priority application.

We have instrumented both applications and created models for throughput versus consumed CPU resource. The CPU resource is expressed in terms of the number of Power6 4.0 GHz cores for normalization across all LPARs (refer to Figure 6). We use three different utility models for the applications. daxpyH is given a utility function that makes it the highest priority application in the testbed. daxpyL has a utility function with the least utility per unit amount of resource consumed. We attach a utility function to Rubis that is intermediate between daxpyH and daxpyL. The exact utility functions for the three applications for a given throughput (ρ) are (i) daxpyH: Util(ρ) = 5ρ/33600, (ii) daxpyL: Util(ρ) = ρ/33600, and (iii) Rubis: Util(ρ) = 3ρ/129. A natural interpretation of the utility functions is that, for the same resource used, daxpyH, daxpyL and Rubis get utility in the ratio 5 : 1 : 3. Note that 33600 is the throughput of daxpy and 129 is the throughput of Rubis at 1 core of resource usage.

We have created drivers for each application that take a trace as input and create workload for the application in a way that simulates the utilization given by the traces, as shown in Figure 7. We use utilization traces collected from the production data center of a large enterprise. More details about the traces are available in an earlier work of ours [34]. Among the 9 traces we considered, 2 show periodic patterns with a time period of one day.

Figure 6: Resource Vs SLA Models for (a) Rubis and (b) daxpy

Figure 7: Traces Used for evaluation: Broken into two parts for better clarity

5.2.2 Competing Methodologies

We compared BrownMap against a few other methodologies that an administrator can use to deal with brownouts.

Baseline: The first methodology, termed Baseline, mimics the scenario where a Brownout Manager does not exist. The methodology is useful as it achieves the maximum utility without any resource restriction and can be used to compare with the utility achieved by other methodologies.

Server Throttling: This methodology throttles all the servers in the shared data center to enforce the power budget. However, this approach was unable to bring the power down significantly enough to meet the budget in most cases.

Proportional Switchoff: This methodology switches off enough servers to meet the power budget. Further, it reduces the resources allocated to the VMs in a proportional manner. Finally, it migrates LPARs from inactive blade servers to active blade servers.

We compare all the methodologies with respect to their ability to meet the power budget and the utility achieved while doing so. Further, we also study their throughput and migration overheads.

5.2.3 Experimental Timeline

The BrownMap prototype runs in three different phases, as shown in Figure 8. In the Monitoring Period, we collect monitoring data for the applications and the servers. Once sufficient historical data is available, we periodically run the BrownMap methodology and reconfigure the data center. Each consolidation interval can thus be broken down into a Reconfiguration Period, during which we transition to the new configuration. This is followed by the Evaluation Period, during which the configuration is allowed to run. Once the evaluation period is over, a new consolidation period starts.

Figure 8: Experimental Timeline

The traces evaluated had weekly periodicity, requiring the Monitoring Period to be 7 days or more. Further, the reconfiguration activity (VM resizing and migration) depends on a large number of factors and should be repeated multiple times for the results to be meaningful. As a result, each experimental run takes an inordinately long time. In order to speed up the experiments, we divided each run into different parts. The Monitoring Period was sped up using the already available trace data. The Reconfiguration Period was repeated multiple times in real time and each measure is a statistical average. The Evaluation Period was run in real time and the throughput obtained by each application was measured.

5.3 Experimental Results

We performed a large number of experiments ranging from a 2-server cluster to a 6-server cluster. We also investigated the performance of BrownMap with changes in utility functions and power budgets. We now report some of the important observations.

5.3.1 Prediction Accuracy

The BrownMap power manager depends on the prediction made by the Workload Predictor to determine the expected load in the next consolidation interval. The Workload Predictor makes an estimate for the next evaluation period, and the maximum workload during the consolidation period is taken as the demand of the LPAR. Hence, we first study the accuracy of the Workload Predictor used in BrownMap (Fig. 9(a)). An interesting observation is that the peak workload (max) during the evaluation period is a more stable metric and can be predicted with greater accuracy than the complete workload for the evaluation period. We observe that the error in predicting max is bounded by 10%, which is quite acceptable. We also observe that the prediction accuracy for both the time-varying workload during the evaluation period and the maximum workload improves with an increase in the monitoring data available, exceeding 95% for a history of 3 days. An interesting observation we make is that using a very low history for periodic traces may introduce more errors in long-term prediction than not using any history at all (0.5 days of history has higher error than 0 history in Fig. 9(b)).

Figure 9: (a) Error in Prediction of Demand and Peak Demand (b) Error in Workload Prediction with Monitoring Period (Short Term Prediction Period = 2.5 Hrs, Weight α = 0.3)

5.3.2 Comparative Evaluation

The first scenario we evaluated was on a 2-blade cluster. We created 6 LPARs on the blades and deployed two instances of daxpyH and daxpyL each. On the remaining two LPARs, we installed the Rubis-DB and Rubis-Web applications. The cluster during normal operation was consuming 500 watts. We simulate a brownout situation and send a trigger to the Controller that the budget had changed to 250 watts.

We study the new configuration executed by the Power Manager in Fig. 10(a). The Power Manager resizes the LPARs and moves them to blade 1, resulting in a drop in power. We observe the power drawn and the utility obtained using (i) BrownMap, (ii) Baseline and (iii) Proportional SwitchOff. The utility drawn by Baseline indicates the maximum utility that can be earned if no power management actions are taken. We observe that BrownMap is able to meet the power budget without any significant drop in utility (about a 10% drop from the maximum). On the other hand, Proportional SwitchOff incurs a 45% drop in utility from the maximum to meet the power budget.

To understand how BrownMap is able to reduce power significantly without incurring a significant drop in utility, we observe the throughput achieved by each application in Fig. 10(b). We note that BrownMap is able to keep the throughput of daxpyH at the same level as the one before the brownout happened. On the other hand, it takes a lot of resources from daxpyL, leading to a significant drop in throughput. The BrownMap methodology does not take any resources away from Rubis. However, since the overall system utilization is now higher, it leads to a marginal drop in the throughput of Rubis. Hence, BrownMap carefully uses the utility function to assign more resources to applications with higher utility per unit resource consumed (Fig. 10(a)). This allows BrownMap to meet the power budget with minimal drop in utility.

We next consider a 4-server cluster to investigate how BrownMap deals with a larger number of servers. We create 12 LPARs with 4 instances of daxpyH and daxpyL each. On the remaining 4 LPARs, we install 2 instances of Rubis, i.e., 2 LPARs with Rubis-DB and 2 LPARs with Rubis-Web. We again observe that BrownMap adapts to the reduced power budget quickly without a significant drop in utility (Fig. 11(a)). Proportional SwitchOff again has to sacrifice a significant amount (close to 50%) of utility to achieve the power budget. The brownout is handled by carefully taking away resources from the low priority application, thus meeting the power budget with no more than a 10% drop in utility. We also evaluated BrownMap with a 6-node cluster, which conformed to the above observations. For lack of space, we do not report those numbers in this paper.

5.3.3 Drop in Utility with Change in Power Budget

In real data centers, the reduction in available power due to a brownout varies widely. Brownouts that happen because of increased demand may reduce the available power by 10 or 20%, whereas a major outage may reduce the available power by as much as 75%. Hence, we next vary the power budget and study the ability of BrownMap to deal with brownouts of differing magnitude in Fig. 12(a).

We observe that BrownMap is able to meet the budget for up to a 50% drop in power with less than a 10% drop in utility. This is a direct consequence of the fact that BrownMap first takes resources away from the lowest priority applications. These applications do not contribute much to the overall data center utility and hence, we are able to reduce the power without a proportional drop in utility. It is clear in Fig. 12(a) that a proportional throttling mechanism will not be able to meet the power budget without sacrificing significantly on utility.

Figure 10: Consolidating 2 blades with a power budget of 250 W: (a) Comparative Power Consumption and Utility Earned. The power drawn by Proportional SwitchOff is omitted as it closely follows BrownMap. (b) Throughput achieved by BrownMap

Figure 11: Consolidating 4 blades with a power budget of 500 W: (a) Comparative Power Consumption and Utility Earned. The power drawn by Proportional SwitchOff is omitted as it closely follows BrownMap. (b) Throughput achieved by BrownMap

Figure 12: (a) Utility earned by BrownMap with change in Available Power. Both Power and Utility are normalized. (b) Fairness Scenario: Normalized Throughput Achieved by all applications

5.3.4 Using Utility for Fairness

The BrownMap power manager is able to deal with reduced power budgets by taking away resources from the lowest priority applications. This may lead to an unfair situation, which may not be acceptable to all data center administrators. We next show that utility maximization is a flexible tool that can capture diverse requirements including fairness.

In order to investigate if BrownMap can meet a power budget while ensuring fairness, we change the original utility values to more fair utility functions, namely (i) daxpyH: Util(ρ) = 5ρ/33600, (ii) daxpyL: Util(ρ) = 5ρ/33600, and (iii) Rubis: Util(ρ) = 5ρ/129. This ensures that for the same resource used, all applications get the same utility. We study the throughput achieved by each application in Fig. 12(b) and observe that all applications achieve the same normalized throughput in such a scenario after the reconfiguration. This study clearly establishes the flexibility of BrownMap to deal with diverse optimization scenarios, from strict priority to fairness.

5.3.5 Reconfiguration Overheads

The BrownMap power manager performs reconfiguration actions to meet a power budget. We next investigate the impact on application performance due to reconfiguration in Fig. 13(a). We observed that both our applications, daxpy and Rubis, have minimal throughput drop during migration. It has been observed earlier that there is a drop in throughput during migration due to hardware cache flushes [32]. We conjectured that this minimal drop in throughput during migration may be because our daxpy application executes small jobs, whereas Rubis has a very large memory footprint and does not use the cache much. Hence, we used a medium memory footprint application, which showed a throughput drop of 20% during migration.

We also observed during our experimental study that migration leads to an increase in CPU utilization for the VIO (Figure 13(b)). This was true for all the applications studied. The increase in VIO CPU is because the VIO has to maintain the list of dirty memory pages used by an LPAR during live migration. This increase in CPU utilization can potentially lead to an impact on I/O performance for the LPARs during migration. However, it does not lead to a significant drop in throughput for I/O intensive applications like Rubis in our setup. This is because Rubis uses about 25% of the resources in our experimental setup and leads to an overhead of no more than 6% of one core in the VIO. However, for data centers where I/O applications dominate, a drop in performance may be seen during live migration. We also observed that the reconfiguration always completed in an average of 4 minutes. Further, no reconfiguration took longer than 7 minutes. The scalability of the reconfiguration is a direct consequence of the fact that we parallelize the reconfiguration with one thread per server in the data center. This distributed reconfiguration leads to a small reconfiguration period, ensuring minimal impact on applications as well as enabling BrownMap to scale to large shared data centers.

6 Related Work

The related work in the field of brownout management includes power minimization, virtual machine placement and power budgeting.

Power Minimization: There is a large body of work in the area of energy management for server clusters [19, 2, 7, 6]. Chen et al. [7] combine CPU scaling with application provisioning to come up with a power-aware resource allocation on servers. Chase et al. pose a very general resource allocation problem in [6] that incorporates energy in the optimization framework. However, most of the power minimization work is in a non-virtualized setting, where short-term decisions in response to workload variations or power shortages cannot be made. Other approaches that have been pursued to minimize energy consumption include energy-aware request redistribution in web servers and the usage of independent or cooperative DVFS [5, 22, 28, 16, 23, 12, 11, 17].

Virtual Machine Placement: The placement of virtual machines on a server cluster has been studied in [3, 30]. Bobroff et al. [3] describe a runtime application placement and migration algorithm in a virtualized environment. The focus is mainly on dynamic consolidation utilizing the variability in workload. Similarly, the focus in [30] is on load balancing as opposed to power budgeting. In [20], the authors advocate presenting guest virtual machines with a set of soft power states such that application-specific power requirements can be integrated as inputs to the system policies, without application specificity at the virtualization level.

Power Budgeting: There are other efforts to reduce peak power requirements at the server and rack level by doing dynamic budget allocation among sub-systems [14] or blades, leveraging usage trends across collections of systems rather than a single isolated system [24]. However, the goal of this work is not to operate within a power budget but to find a peak aggregate power. In cases where the operating power exceeds the predicted peak, each server is throttled, leading to a performance drop. Moreover, these techniques do not leverage virtual machine migration, which allows a server to be freed up and put into a standby state. Finally, the presence of multiple virtual machines with possibly different SLAs on a single server makes resource actions at the server level impractical.

Figure 13: Impact on (a) application throughput and (b) VIO CPU Utilization during migration

In an earlier work [32, 33], we have proposed power minimization mechanisms that use virtual machine migration to minimize power. However, that work deals with power minimization as opposed to a power budgeting problem. Also, the loss in utility of virtual machines is not considered. One may note that the power budgeting problem involves power minimization as a component and is a much more general problem. In this work, we address the problem of handling a brownout with minimum loss in utility and present the design and implementation of BrownMap, a power manager that quickly adapts to a brownout scenario by reducing the power consumption to within the budget.

7 Conclusion

Brownout is a scenario where there is a temporary reduction in power, as opposed to a complete blackout. Brownouts affect data center operations by forcing them to bring down services and applications, which may seriously impact their revenue. In this paper, we have presented the design and implementation of BrownMap, a power manager that can deal with brownouts with minimal loss in utility (or revenue). We present both theoretical and experimental evidence to establish the efficacy of BrownMap in dealing with brownouts. Our evaluation on a real cluster of IBM Power6 JS-22 blades using real production traces indicates that BrownMap meets the power budget with a drop of 10% in overall utility for a power reduction of 50% under realistic utility models. The reconfiguration operation in BrownMap is completely distributed, allowing it to scale with increases in the size of the data center.

References

[1] Relocation Risk Management AFCOM's Data Center Institute Issues Five Bold Predictions for the Future of the Data Center Industry; New Survey Identifies Labor Shortage, Power Failures and Virtualization as Major Issues. In Business Wire, March 23, 2006.

[2] Ricardo Bianchini and Ram Rajamoni. Power and energy management for server systems. Computer, pages 68–76, Nov 2004.

[3] Norman Bobroff, Andrzej Kochut, and Kirk Beaty. Dynamic placement of virtual machines for managing SLA violations. In IEEE Conf. Integrated Network Management, 2007.

[4] B. A. Carreras, D. E. Newman, I. Dobson, and A. B. Poole. Initial evidence for self-organized criticality in electric power system blackouts. In Hawaii International Conference on System Sciences, 2000.

[5] J. Chase and R. Doyle. Balance of Power: Energy Management for Server Clusters. In Proc. HotOS, 2001.

[6] Jeffrey S. Chase, Darrell C. Anderson, Prachi N. Thakar, Amin M. Vahdat, and Ronald P. Doyle. Managing energy and server resources in hosting centers. In Proc. SOSP, 2001.

[7] Y. Chen, A. Das, W. Qin, A. Sivasubramaniam, Q. Wang, and N. Gautam. Managing server energy and operational costs in hosting centers. In Sigmetrics, 2005.

[8] L. Cherkasova and R. Gardner. Measuring CPU overhead for I/O processing in the Xen virtual machine monitor. In Usenix Annual Technical Conference, 2005.

[9] DAXPY. http://www.netlib.org/blas/daxpy.f.

[10] D. Economou, S. Rivoire, C. Kozyrakis, and P. Ranganathan. Full system power analysis and modeling for server environments. In WMBS, 2006.

[11] E. Elnozahy, M. Kistler, and R. Rajamony. Energy-efficient server clusters. In Proc. of Workshop on Power-Aware Computing Systems, 2002.

[12] Mootaz Elnozahy, Michael Kistler, and Ramakrishnan Rajamony. Energy conservation policies for web servers. In Proc. of USENIX Symposium on Internet Technologies and Systems, 2003.

[13] H.-Y. McCreary et al. EnergyScale for IBM Power6 microprocessor-based systems. In IBM Journal of Research and Development, 2007.

[14] Wes Felter, Karthick Rajamani, Tom Keller, and Cosmin Rusu. A performance-conserving approach for reducing peak power consumption in server systems. In Proc. of SC, 2005.

[15] G. Jenkins, G. Reinsel, and G. Box. Time series analysis: Forecasting and control. Prentice Hall, 1994.

[16] T. Heath, B. Diniz, E. V. Carrera, W. Meira Jr., and R. Bianchini. Energy conservation in heterogeneous server clusters. In Proc. PPoPP, 2005.

[17] Tibor Horvath, Tarek Abdelzaher, Kevin Skadron, and Xue Liu. Dynamic voltage scaling in multitier web servers with end-to-end delay control. IEEE Trans. Comput., 56(4):444–458, 2007.

[18] R. Koller, A. Verma, and A. Neogi. WattApp: An application-aware power meter for shared clouds. In review, 2009.

[19] Charles Lefurgy, Karthick Rajamani, Freeman Rawson, Wes Felter, Michael Kistler, and Tom W. Keller. Energy management for commercial servers. Computer, 36(12):39–48, 2003.

[20] Ripal Nathuji and Karsten Schwan. VirtualPower: coordinated power management in virtualized enterprise systems. In Proc. SOSP, 2007.

[21] North American Energy Reliability Corporation. 2007 Disturbance Index Public. http://www.nerc.com/, last accessed April 2009.

[22] K. Rajamani and C. Lefurgy. On evaluating request-distribution schemes for saving energy in server clusters, 2003.

[23] Karthick Rajamani, Heather Hanson, Juan Rubio, Soraya Ghiasi, and Freeman L. Rawson III. Application-aware power management. In IISWC, pages 39–48, 2006.

[24] P. Ranganathan, P. Leech, D. Irwin, and J. Chase. Ensemble-level power management for dense blade servers. In Proc. ISCA, 2006.

[25] Rice University Bidding System. http://rubis.ow2.org/.

[26] S. Rivoire, P. Ranganathan, and C. Kozyrakis. A comparison of high-level full-system power models. In HotPower, 2008.

[27] About IDEAS Relative Performance Estimate 2 (RPE2). http://www.ideasinternational.com/performance/.

[28] Cosmin Rusu, Alexandre Ferreira, Claudio Scordino, and Aaron Watson. Energy-efficient real-time heterogeneous server clusters. In Proc. IEEE RTAS, 2006.

[29] Stratfor Global Intelligence. Global Market Brief: Emerging Markets' Power Shortages. http://www.stratfor.com/analysis/globalmarketbrief emergingmarketspowershortages/, 2008.

[30] A. Singh, M. Korupolu, and D. Mohapatra. Server-storage virtualization: integration and load balancing in data centers. In SC, 2008.

[31] KSEBoard. Availability Based Tariff. http://www.kseboard.com/availabilitybasedtariff.pdf, 2008.

[32] A. Verma, P. Ahuja, and A. Neogi. pMapper: Power and migration cost aware application placement in virtualized systems. In Middleware, 2008.

[33] A. Verma, P. Ahuja, and A. Neogi. Power-aware dynamic placement of HPC applications. In ICS, 2008.

[34] A. Verma, G. Dasgupta, T. Nayak, P. De, and R. Kothari. Server workload analysis for power minimization using consolidation. In Usenix ATC, 2009.