Prediction-based real-time resource provisioning for massively multiplayer online games

Future Generation Computer Systems 25 (2009) 785–793


Radu Prodan ∗, Vlad Nae
Institute of Computer Science, University of Innsbruck, Technikerstraße 21a, A-6020 Innsbruck, Austria

Article info

Article history:
Received 1 May 2008
Received in revised form 14 October 2008
Accepted 5 November 2008
Available online 24 November 2008

Keywords:
Gaming
Real time
Modelling and prediction
Distributed applications
Parallelism and concurrency
Neural nets

Abstract

Massively Multiplayer Online Games (MMOGs) are a class of computationally intensive client–server applications with severe real-time Quality of Service (QoS) requirements, such as the number of updates per second each client needs to receive from the servers for a fluent and realistic experience. To guarantee the QoS requirements, game providers currently over-provision a large amount of their resources, which makes the overall efficiency of provisioning and utilization of resources rather low and prohibits any but the largest providers from joining the market.

To address this deficiency, we propose a new prediction-based method for dynamic resource provisioning and scaling of MMOGs in distributed Grid environments. Firstly, a load prediction service anticipates the future game world entity distribution from historical trace data using a fast and flexible neural network-based method. On top of it, we developed generic analytical game load models used to foresee future hot-spots that congest the game servers and make the overall environment fragmented and unplayable. Finally, a resource allocation service performs dynamic load distribution, balancing, and migration of entities that keep the game servers reasonably loaded such that the real-time QoS requirements are maintained.

Experimental results based on a realistic simulation environment demonstrate the advantages of our prediction service compared to other conventional methods, especially due to its ability to adapt to different user load patterns, and a reduction of the average over-allocation from 250% (in the case of static over-provisioning) to around 25% using our dynamic provisioning method.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Online entertainment including gaming is a huge growth sector worldwide. Massively Multiplayer Online Games (MMOGs) grew from 10 thousand subscribers in 1997 to 6.7 million in 2003, and the rate is accelerating, with an estimated 60 million people by 2011. The release of World of Warcraft in 2005 saw a single game break the barrier of 4 million subscribers worldwide. The market size shows equally impressive numbers, estimated by the Entertainment Software Association (ESA) at 7 billion US Dollars (USD) with a growth of over 300% in the last 10 years. In comparison, the Motion Picture Association of America (MPAA) reports a size of 8.99 billion USD and the Recording Industry Association of America (RIAA) a size of 12.3 billion USD, which has stagnated (and even decreased by 2%) in the last 10 years. It is therefore expected that the game industry will soon grow larger than both the movie and music markets.

∗ Corresponding author. Tel.: +43 512 507 6445; fax: +43 512 507 2758.
E-mail address: [email protected] (R. Prodan).

0167-739X/$ – see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.future.2008.11.002

MMOGs are large-scale simulations of persistent game worlds comprising various objects or entities that can be classified in four categories: (i) avatars are in-game representations of the players; (ii) bots or non-player characters (NPCs) are mobile entities that have the ability to act independently; (iii) movable objects (e.g. boxes, guns) are passive entities which can be manipulated but do not initiate interactions; and (iv) immutable entities or decor.

Today’s MMOGs operate as client–server architectures [1] with game servers implementing an infinite loop and each loop iteration (also called tick) performing certain steps such as: (i) processing events coming from the connected clients (e.g. shootings, collection of items, chat); (ii) computing the new state of the active entities; (iii) processing state updates received from other servers; and (iv) broadcasting state updates to the connected clients. All entities within a specific avatar’s area of interest (usually a surrounding zone) are considered to be interacting with the respective avatar and have an impact on its state. There are four main factors that affect the load of a game session: (i) the size of the game world; (ii) the total number of entities; (iii) the density of entities within the area of interest; and (iv) the level of interaction. Obviously, the more populated the entities’ areas of interest are, and the more interactions that exist, the higher the load of the underlying game server will be.

The game servers must respond with new game state information to the distributed clients promptly within a given real-time interval to ensure a smooth, responsive, and fair experience for all players. Depending on the game type, typical response times must be lower than 100 ms (10 Hz) to ensure fluent play in online First Person Shooter (FPS) action games. Failing to deliver timely simulation updates leads to a degraded game experience and frustrates players, who may then cancel their accounts. An overloaded game server delivers state updates to clients (i.e. movements and actions of teammates and opponents) at a lower frequency than required, which makes the overall environment fragmented and unplayable.

To support thousands of concurrent players and many more other game entities simultaneously, MMOG operators (also called Coordinators; see Fig. 1) typically install and operate a large static infrastructure consisting of hundreds to thousands of computers onto which they distribute the load of a game in order to provide the required QoS. For example, the operating infrastructure of the MMOG World of Warcraft [2] has over ten thousand computers. However, it has been shown that the demand of MMOGs is highly dynamic [3] and thus, even for the large providers that operate several titles in parallel, a large portion of their resources is unnecessary, which leads to a very inefficient resource utilization. In addition, this enterprise limitation has negative economic impacts by preventing any but the largest hosting centres from joining the market, which dramatically increases prices, because those centres must be capable of handling peaks in demand, even if the resources are not needed for much of the time.

To alleviate this problem, we propose in this paper to use the potential of Grid computing for inexpensive provisioning of resources to MMOGs based on a novel dynamic and on-demand allocation strategy (see Fig. 1). We consider the Grid as an aggregation of inexpensive (free) resources that are virtualized and can be accessed through the provisioning of several Web services.

Fig. 1. The overall architecture.

First of all, a load prediction service, sketched in Section 3, is in charge of projecting the future distribution of entities in the game world that will drive the resource allocation. For example, by foreseeing critical hot-spots in time (i.e. excessively populated areas of interest generating a large number of interactions), one can dynamically provision additional servers on some new resources and take timely load balancing actions that transparently redistribute the game load before the servers become overloaded. Based on the entity distribution prediction and possible interactions, a load modelling service, described in detail in Section 4, uses analytical methods for estimating the game server load. Finally, a resource allocation service, presented in Section 5, uses the load information to provision the necessary resources that accommodate the player load while guaranteeing the real-time QoS constraints. We present experimental results that validate our methods in Section 6 and conclude in Section 8.

Fig. 2. Zoning and mirroring.

2. Background

Spatial scaling of a game session is achieved through a conventional parallelization technique called zoning [4], based on similar data locality concepts as in scientific parallel processing. Zoning partitions the game world into geographical areas to be handled independently by separate machines (see Fig. 2). Zones are not necessarily of the same shape and size, but should have an even load distribution that satisfies the QoS requirements. Today, zoning is successfully employed in slower-paced (compared to fast-paced FPS action games) adventure games, widely known as Massively Multiplayer Online Role Playing Games (MMORPGs) [5], where the transition between zones can only happen through certain portals (e.g. special doors, teleportation, accompanied on the screen by a load clock or some standard animation video) and requires a significant amount of time. Typically, zones are started manually by the game operators based on the current load, player demand, or new game world and scenario developments.

The second technique, called mirroring [6], targets parallelization of game sessions with a large density of players located and interacting within each other’s area of interest (see Fig. 2). Such situations are typical of fast-paced FPS action games in which players typically gather in certain hot-spot action areas that overload the game servers, which are no longer capable of delivering state updates at the required rate. To address this problem, mirroring defines a novel method of distributing the load by replicating the same game zone on several CPUs. Each replicated server computes the state for a subset of entities called active entities, while the remaining ones, called shadow entities (which are active in the other participating servers), are synchronized across servers. It was shown in previous research that the overhead of synchronizing shadow entities is much lower than the overhead of computing the load produced by active entities [6].

The third technique, called instancing, is a simplification of mirroring which distributes the session load by starting multiple parallel instances of the highly populated zones. The instances are completely independent of each other, which means that two avatars from different instances will not see each other, even if they are located at coordinates within their area of interest. This technique is relatively easy to implement based on the zoning technique and is mostly employed for MMORPGs.

Work at the University of Münster is developing the Real-Time Framework (RTF) [7], which offers game developers a portable API and optimized protocols that facilitate parallelization of game sessions using the zoning, mirroring, and instancing techniques. RTF can be seen as the gaming platform on top of which we are developing our research methods and software.
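The split between active and shadow entities used by mirroring can be illustrated with a minimal sketch. The round-robin assignment and the function name `mirror_zone` are assumptions made for illustration only; they are not taken from RTF or from [6].

```python
def mirror_zone(entity_ids, n_servers):
    """Split the entities of one zone into active/shadow sets per replica.

    Assumption: entities are assigned round-robin; a real system would
    assign them according to load.
    """
    all_entities = set(entity_ids)
    replicas = []
    for server in range(n_servers):
        # Entities whose state this replica computes itself.
        active = {e for i, e in enumerate(entity_ids) if i % n_servers == server}
        # Entities computed elsewhere and only synchronized here.
        shadow = all_entities - active
        replicas.append({"active": active, "shadow": shadow})
    return replicas


replicas = mirror_zone(["a1", "a2", "a3", "bot1", "bot2"], n_servers=2)
```

Every entity is active on exactly one replica and a shadow on all others, so each replica still holds the full zone state while computing only its own share.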


3. Load prediction

Dynamically deciding and establishing a new parallelization strategy may, under certain circumstances, be an expensive operation taking several seconds, which causes unacceptable delays in the users’ experience if not hidden properly through proactive prediction and resource allocation. The load of MMOGs is highly dynamic not only because of the high number of players connected to the same game session, but also due to their often unpredictable interactions. The interaction between players depends on their position in the game world and on whether they find themselves in each other’s area of interest. Ultimately, the load of a game session therefore depends on the position of players in the game world, and predicting this position is the task of the load prediction service.

Two options are available for quantitative predictions in MMOGs: explanatory models and time series prediction. While explanatory models can deliver good accuracy with little computation, they are difficult to obtain for complex applications such as MMOGs, and are tightly coupled to the application instance and sometimes to the platform for which they have been constructed. With MMOGs relying on frequent and large updates, the explanatory models quickly become unmaintainable. Thus, we base our work on prediction algorithms that discover patterns in historical data series and extrapolate them into the future.

Many such prediction algorithms have already been proposed [8,9]. Simple prediction algorithms like exponential smoothing and variants thereof are computationally inexpensive and can be applied in parallel on several data sets, but their predictive power is limited. More elaborate prediction algorithms like autoregressive (AR) models, integrated (I) models, moving average (MA) models, and combinations thereof like ARMA or ARIMA try to find the best prediction model for the given data set. Although their predictive power is higher, such methods are also more time consuming and resource intensive, and are thus ill suited for highly dynamic MMOGs.

We therefore decided on a solution based on neural networks [10] for a number of reasons that make them suitable for predicting the load of online game sessions in real-time, as we will experimentally demonstrate in Section 6.1: they adapt to a wide range of time series, they offer better prediction results than other simple methods, and they are sufficiently fast compared to other more sophisticated statistical analysis.

Our neural network-based prediction strategy, presented in more detail in [10], is to partition the game world into subareas, where the size of a subarea needs to be small enough such that its load can be characterized by the entity count. The overall entity distribution in the entire game world consists of a map of entity counts for each subarea. The predictor uses one separate neural network for each subarea, which receives as input the entity count at equidistant past time intervals and delivers as output the entity count at the next time step. The predicted entity count for the entire game world is the sum of all the subarea predictions, which will be used afterwards by the load modelling service for estimating the game server load.

Two offline phases are required for utilizing the neural network-based prediction. First, the data set collection phase is a long process in which the game is observed by gathering entity count samples for all subareas at equidistant time steps. The second, training phase uses most of the previously collected samples as training sets, and the remainder as test sets. The training phase runs for a number of eras, until a convergence criterion is fulfilled. A training era consists of three steps: (i) presenting all the training sets in sequence to the network; (ii) adjusting the network’s weights to better fit the expected output (the real entity count for the next time step); and (iii) testing the network’s prediction capability with the different test sets. Separating the training from the test sets is essential in avoiding memorization and ensures that the network has enough generalization potential for delivering good results on new data sets.
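The data preparation described in this section can be sketched as follows. This is only an illustration under stated assumptions: the window length of 6 inputs, the 80% training share, and the `last_value` stand-in used in place of a trained network are hypothetical choices, not the authors' implementation.

```python
def make_samples(counts, n_inputs):
    """Turn an entity-count series into (input window, next value) pairs."""
    return [
        (counts[i:i + n_inputs], counts[i + n_inputs])
        for i in range(len(counts) - n_inputs)
    ]


def split_samples(samples, train_fraction=0.8):
    """Use the first part for training and the remainder for testing."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]


def predict_world(history_by_subarea, predictors, n_inputs):
    """Sum the per-subarea predictions into a world-wide entity count."""
    return sum(
        predictors[name](history[-n_inputs:])
        for name, history in history_by_subarea.items()
    )


def last_value(window):
    """Hypothetical stand-in for a trained per-subarea network."""
    return window[-1]


series = [3, 4, 6, 5, 7, 9, 8, 10, 12, 11, 13, 12]
samples = make_samples(series, n_inputs=6)
train, test = split_samples(samples)
total = predict_world(
    {"a": series, "b": [1, 1, 2, 2, 3, 3]},
    {"a": last_value, "b": last_value},
    n_inputs=6,
)
```

Each subarea keeps its own predictor, so the per-subarea calls are independent and could be evaluated in parallel.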

4. Load modelling

Given the future entity distribution produced by the prediction service, the goal of load modelling is to map it to resource load information. We propose in this section a generic and customizable analytical model for expressing the CPU load of a game server, while in future work we plan to model the memory and network load, which are the other two types of resource that are important for online games.

Let us consider $N$ clients connected to a distributed game session aggregating a total of $H$ (parallel, cluster) machines from different hosters. Let us further consider that $BE$ bots or NPCs roam inside the game world (see Section 1). On each machine, there are only $AE$ active entities and $C$ clients connected. For modelling the load of one machine in a distributed game session, we distinguish three basic time consuming activities within one tick: (i) the computation of an interaction between two entities, $t_i$; (ii) the reception of an event message from one client, $t_m$; and (iii) the update of one entity’s state received/sent from/to another machine, $t_u$. In order to keep the complexity of this model acceptable, we assume that NPC entities do not interact among themselves, which is true in the majority of cases.

We model the CPU time $t_M$ spent for sending and receiving messages from a server to each client (active avatars) as follows:

$$t_M = C \cdot t_m. \tag{1}$$

The CPU time $t_U$ spent by the server for processing state updates from the other machines can be calculated as

$$t_U = (N - C) \cdot t_u + (BE - AE) \cdot t_u + AE \cdot t_u, \tag{2}$$

and the time $t_I$ spent by the server for computing the interactions between the active entities is

$$t_I = I \cdot t_i, \tag{3}$$

where $I$ is the total number of interactions involving the active entities. Obviously, the computation of interactions that do not involve active entities is allotted to other machines.

Assuming the number of entities is $n$, the interaction complexity may range from $O(n)$ for games in which players are mostly solitary or the game does not need to make many state changes or compute complex environment reactions, to $O(n^2)$ for games in which many players acting individually are interacting, or to $O(n^3)$ for games in which groups of many players each are interacting. To reduce the computational load, most MMOGs simulate and send updates only for game world regions representing the area of interest of each avatar. When using such techniques, the interaction complexity may become $O(n \cdot \log n)$ from $O(n^2)$, and $O(n^2 \cdot \log n)$ from $O(n^3)$.

For quantifying the interactions between entities, we therefore use a generic function $f(e_1, e_2)$ which has to be instantiated for each interaction type:

$$
f(e_1, e_2) =
\begin{cases}
e_1 + e_2, & \text{for } O(n) \text{ interaction;} \\
e_1 \cdot \log(e_2), & \text{for } O(n \cdot \log n) \text{ interaction;} \\
e_1 \cdot e_2, & \text{for } O(n^2) \text{ interaction;} \\
e_1^2 \cdot \log(e_2), & \text{for } O(n^2 \cdot \log n) \text{ interaction;} \\
e_1^2 \cdot e_2, & \text{for } O(n^3) \text{ interaction,}
\end{cases}
\tag{4}
$$

where $e_1$ and $e_2$ are two classes of interacting entities.

Let $IC$ denote the number of avatars interacting with any other entities (either avatars or NPCs). Furthermore, we define $p_{ci}$ as the average number of interactions involving active avatar entities expressed as a percentage of $IC$. Analogously, we define $p_{ei}$ as the average number of interactions involving active NPCs expressed as a percentage of $BE$. The total number of interactions is composed of the number of interactions between active avatars and the number of interactions between active avatars and NPC entities:

$$I = p_{ci} \cdot f(IC, IC) + p_{ei} \cdot f(IC, BE). \tag{5}$$

Consequently, the CPU time $t_I$ for processing the interactions involving all active entities can be calculated as follows:

$$t_I = \left(p_{ci} \cdot f(IC, IC) + p_{ei} \cdot f(IC, BE)\right) \cdot t_i. \tag{6}$$

Approximating the time for sending/receiving an event message as equal to the time needed to update the state of one entity ($t_m = t_u$), the total CPU time consumed in one tick becomes

$$t_C = (N + BE) \cdot t_u + \left(p_{ci} \cdot f(IC, IC) + p_{ei} \cdot f(IC, BE)\right) \cdot t_i. \tag{7}$$

Furthermore, quantifying $t_i$ with regard to $t_u$ as $t_i = p_{ui} \cdot t_u$, the CPU time consumed in one tick becomes

$$t_C = \left(N + BE + p_{ui} \cdot p_{ci} \cdot f(IC, IC) + p_{ui} \cdot p_{ei} \cdot f(IC, BE)\right) \cdot t_u. \tag{8}$$
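The intermediate steps between Eqs. (1)–(3) and the per-tick CPU time can be written out explicitly. The derivation below assumes that the per-tick CPU time is the sum of the three activities, $t_C = t_M + t_U + t_I$, which the text implies but does not state:

```latex
\begin{align*}
t_C &= t_M + t_U + t_I \\
    &= C \cdot t_m + (N - C) \cdot t_u + (BE - AE) \cdot t_u + AE \cdot t_u + I \cdot t_i \\
    &= (N + BE) \cdot t_u + I \cdot t_i
       && \text{(using } t_m = t_u \text{)} \\
    &= \bigl(N + BE + p_{ui} \cdot I\bigr) \cdot t_u
       && \text{(using } t_i = p_{ui} \cdot t_u \text{)},
\end{align*}
```

with $I = p_{ci} \cdot f(IC, IC) + p_{ei} \cdot f(IC, BE)$. The terms in $C$ and $AE$ cancel, so the per-tick time depends only on the global counts $N$ and $BE$ and on the interaction term, and the final expression is measured in units of $t_u$.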

Finally, considering $t_{SAT}$ as the tick saturation threshold, we can define the CPU load function:

$$L_{CPU} = \frac{t_C}{t_{SAT}} = \frac{N + BE + p_{ui} \cdot p_{ci} \cdot f(IC, IC) + p_{ui} \cdot p_{ei} \cdot f(IC, BE)}{v}, \tag{9}$$

where $v$ is the CPU speed expressed as an integer representing the number of $t_u$-long tasks the CPU is able to perform in a $t_{SAT}$-long time interval.
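A minimal sketch of the load model, evaluating the interaction function of Eq. (4) and the CPU load of Eq. (9), may look as follows. The function and parameter names are hypothetical, the logarithm is assumed to be natural (the text does not give the base), the percentages $p_{ci}$, $p_{ei}$ are assumed to be passed as fractions, and all numeric values are illustrative rather than measured.

```python
import math

# Interaction-count function f(e1, e2) of Eq. (4), keyed by complexity class.
# Assumption: natural logarithm; the logarithmic variants need e2 > 0.
INTERACTION_MODELS = {
    "n": lambda e1, e2: e1 + e2,
    "n log n": lambda e1, e2: e1 * math.log(e2),
    "n^2": lambda e1, e2: e1 * e2,
    "n^2 log n": lambda e1, e2: e1 ** 2 * math.log(e2),
    "n^3": lambda e1, e2: e1 ** 2 * e2,
}


def cpu_load(n_clients, n_bots, n_interacting, p_ci, p_ei, p_ui, v, model="n^2"):
    """CPU load L_CPU of Eq. (9); values above 1 exceed the tick threshold."""
    f = INTERACTION_MODELS[model]
    # Total number of interactions, Eq. (5).
    interactions = (
        p_ci * f(n_interacting, n_interacting)
        + p_ei * f(n_interacting, n_bots)
    )
    return (n_clients + n_bots + p_ui * interactions) / v


# Illustrative values only: 100 clients, 50 bots, 40 interacting avatars.
load = cpu_load(
    n_clients=100, n_bots=50, n_interacting=40,
    p_ci=0.1, p_ei=0.05, p_ui=2.0, v=1000,
)
```

With these inputs the quadratic model yields 260 interactions and a load of about 0.67, i.e. the server would use roughly two thirds of its tick budget.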

5. Resource allocation

Based on the predicted resource load within the next time interval, the resource allocation service, presented in more detail in [3], arranges for the provisioning of the resources required for a proper execution that guarantees a good experience to all players. A typical action performed by the resource allocation service is to extend game sessions with new zones or replication servers to accommodate an increased number of players during peak hours. Conversely, the resource allocation service deallocates and merges multiple under-utilized game servers to improve the resource utilization. Currently, our resource allocation service provisions CPUs as specified by the load modelling service, while other important resources such as memory and network will be considered in future work.

An important aspect of the resource allocation is that resource providers in general use different policies describing one time bulk and one resource bulk as the minimum allocation units for each type of resource. The measurement unit for the policy resources is a generic ‘‘unit’’ which represents the requirement for the respective resource from a fully loaded game server (e.g. one CPU unit represents the CPU demand for a fully loaded game zone).

The game operators (or coordinators) make requests based on the predicted load of the games they operate, and the hosting centres respond with offers based on their local time-space renting policy. The resource allocation is realized by a request–offer matchmaking mechanism based on three criteria that favour the game operator. First, the number and the type of resources requested must match the offer, and when they do not match, an offer that includes at least the requested amounts is returned. Second, depending on the game latency tolerance, the resources closest to the request are preferred. Third, to deal with data centre hosting policies, the finer grained resources with the shorter period of reservation time are preferred.
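The three matchmaking criteria can be sketched as a filter followed by a lexicographic ranking. The `Offer` fields, the `match` function, the use of a latency value as the distance measure, and the strict priority order among the criteria are all assumptions made for illustration; the actual mechanism is described in [3].

```python
from dataclasses import dataclass


@dataclass
class Offer:
    hoster: str
    cpu_units: float      # offered amount, a multiple of the resource bulk
    distance_ms: float    # hypothetical latency between hoster and players
    resource_bulk: float  # minimum allocatable amount of CPU units
    time_bulk_h: float    # minimum reservation period in hours


def match(requested_cpu, offers, max_distance_ms):
    """Return the preferred offer, or None if no offer is feasible."""
    # Criterion 1: the offer must include at least the requested amount.
    # Criterion 2 (tolerance part): respect the game's latency tolerance.
    feasible = [
        o for o in offers
        if o.cpu_units >= requested_cpu and o.distance_ms <= max_distance_ms
    ]
    # Rank by least over-allocation, then proximity (criterion 2),
    # then finer resource and time bulks (criterion 3).
    return min(
        feasible,
        key=lambda o: (
            o.cpu_units - requested_cpu,
            o.distance_ms,
            o.resource_bulk,
            o.time_bulk_h,
        ),
        default=None,
    )


offers = [
    Offer("A", 4.0, 20.0, 2.0, 24.0),
    Offer("B", 3.0, 35.0, 1.0, 12.0),
    Offer("C", 3.0, 35.0, 0.5, 1.0),
    Offer("D", 2.0, 5.0, 0.5, 1.0),
]
best = match(2.5, offers, max_distance_ms=50.0)
```

Here offer D is closest but too small, A over-allocates the most, and C wins the tie with B because of its finer resource bulk and shorter reservation period.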

Fig. 3. Game simulator snapshot.

After allocating the required resources, the resource allocation service instructs the game servers through the RTF API [7] which parallelization strategy to apply and which entities to migrate to new servers (see Section 2). Since allocating resources and establishing a new game session load distribution scheme is a latency-prone task (several seconds), an important aspect is to trigger it early enough using the load prediction and modelling services such that the users do not experience any lags during their play.

6. Experiments

We present in this section experimental results that validateour load prediction and resource allocation approaches.

6.1. Load prediction

To validate the neural network prediction, we developed a distributed FPS game simulator on top of the RTF library [7] supporting the zoning technique and the inter-zone migration of entities (see Fig. 3). The motivation for using a simulator is twofold: (i) we do not have available the exact coordinates of entities in existing proprietary real-world games [11]; and (ii) through simulation we are able to give further evidence that the player interaction determines the server load. We use our simulator for generating realistic load patterns such as entity interaction hot-spots or simply large numbers of entities managed by one game server.

The entities in the simulation are driven by several Artificial Intelligence (AI) profiles which determine their behaviour during a simulation: aggressive makes the entity seek and interact with opponents; team player causes the entity to act in a group together with its teammates; scout leads the entity to discover uncharted zones of the game world (not guaranteeing any interaction); and camper simulates a well-known tactic in FPS games to hide and wait for the opponent. The four profiles have been selected to match the four behavioural profiles most encountered in MMOGs [1]: the achiever, the explorer, the socializer, and the killer, respectively. To also account for the mixed behaviour encountered in deployed MMOGs [1], each entity has its own preferred profile, but can change the profiles dynamically during the simulation. We further tried to get as close as possible to real games by importing maps from a very popular FPS game (Counter Strike 1.6 [12]).

We evaluated the prediction service using eight different data traces generated with our simulator for a duration of 17 h with a sampling interval of 2 min (see Table 1). The first four data traces simulate different scenarios of a highly dynamic FPS game, while the other four are characteristic of different MMORPG sessions. We use this trace data for training the neural network as presented in Section 3 until the process converges to a global minimum (see Fig. 4(a)).

Table 1
Simulation trace data sets.

Data set   Peak hours modelling   Peak load   Overall dynamics (17 h)   Instantaneous dynamics (2 min)
Set 1      No
Set 2      No
Set 3      No
Set 4      No
Set 5      Yes
Set 6      Yes
Set 7      Yes
Set 8      Yes

(a) Prediction error during training. (b) Jordan–Elman network.

Fig. 4. Neural network training and tuning.

We compared the error of the neural network prediction against other fast prediction methods such as moving average, last value, and exponential smoothing, which have been proven to be effective in large dynamic environments such as the Grid [13]. Each prediction algorithm receives as input each trace data set, and outputs for each input set sample a prediction valid for the next two minutes. For each prediction algorithm and trace data set combination, we define the prediction error as the ratio between the sum of all sample prediction errors and the sum of all $N$ samples in the trace data set (expressed in percentage):

$$PE = \frac{\sum_{i=1}^{N} \left| n_i^{real} - n_i^{pred} \right|}{\sum_{i=1}^{N} n_i^{real}} \cdot 100, \tag{10}$$

where $n_i^{real}$ and $n_i^{pred}$ denote the real and predicted entity counts, respectively, at time step $i$.
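The prediction error of Eq. (10) and the conventional baselines can be sketched in a few lines. The window length of the moving average and the interpretation of the exponential smoothing percentage as the weight `alpha` of the newest sample are assumptions, as the text does not specify them; the `evaluate` helper also skips the first sample, for which no history exists.

```python
def last_value(history):
    """Predict that the next count equals the most recent one."""
    return history[-1]


def average(history):
    """Predict the mean of all samples seen so far."""
    return sum(history) / len(history)


def moving_average(history, window=5):
    """Predict the mean of the most recent `window` samples."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def exponential_smoothing(history, alpha=0.5):
    """Predict an exponentially weighted mean; alpha weighs the newest sample."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def prediction_error(real, predicted):
    """Prediction error PE of Eq. (10), in percent."""
    return 100 * sum(abs(r - p) for r, p in zip(real, predicted)) / sum(real)


def evaluate(series, predictor):
    """Score one-step-ahead predictions for series[1:] with Eq. (10)."""
    predicted = [predictor(series[:i]) for i in range(1, len(series))]
    return prediction_error(series[1:], predicted)


series = [10, 12, 11, 15, 14]
pe_last = evaluate(series, last_value)
```

Any predictor that maps a history to the next entity count, including a trained neural network, can be passed to `evaluate` in place of the baselines.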

6.1.1. Network type

The two main neural network types we experimented with are the classical feed-forward networks and the recurrent networks. In feed-forward networks, the stimuli propagate only from the input layer towards the output layer, while in the recurrent networks there are also feedback loops towards a separate type of neurons called context neurons. From the feed-forward category, we experimented with a simple multilayer perceptron (MLP) and a modified three-layer perceptron, which has a different input layer consisting of fuzzy neurons intended to provide a different type of signal expansion. The recurrent network we use is a Jordan–Elman network [14], depicted in Fig. 4(b).

A summary of the average prediction error for all network types is shown in Fig. 5. The simple MLP and the Jordan–Elman network performed well, providing almost identical results, while the modified MLP struggled, especially during the experiments with more fuzzy domains. Our conclusion is that the neural networks that best fit our problem are the MLP and Jordan–Elman and, therefore, our next experiments will only consider these two types.

(a) Data set 1. (b) Data set 2.

Fig. 5. Network type experiments.

6.1.2. Network structure

In finding the network structure, we focus on tuning the appropriate number of neurons on each layer. We fixed the number of layers to three since the performance of these types of networks hardly improves with more layers. In principle, any number of neurons can be added to each layer, but every additional neuron increases the delay in making the prediction.

The computational complexity for training a symmetrical Hopfield neural network is in the order of $O(n^2 \cdot \max w_{ij})$, where $n$ is the total number of neurons, and $\max w_{ij}$ is the maximum weight of all neurons [14]. The computational complexity of using a Jordan–Elman network is in the same order of magnitude, as it only differs through the back-propagation links connected to the context neurons instead of the input layer neurons (see Fig. 4(b)). By limiting the weights to non-exponential values, we are left with an $O(n^2)$ complexity which will not introduce relevant limitations to the network size for the training phase, since it is executed offline and only once for a game type and game world combination. The prediction time equivalent to one network activation, however, may be affected and could require some size limitations since it is done in real-time.

To establish an upper bound on the number of neurons, let us consider 100 Jordan–Elman-based predictors for 100 different game sub-zones, each of them built with a hyperbolic tangent as transfer function. The computational demand $C_{Act}$ for one activation of all networks is

$$C_{Act} = N \cdot (k + C_{tanh}) + N_C \cdot \left(\frac{N}{2} + C_{tanh}\right), \tag{11}$$

where $N$ is the number of neurons in the network (excluding the context neurons), $k$ is the number of weights, $C_{tanh}$ is the number of floating point operations (FLOPs) necessary to compute the tanh function, and $N_C$ is the number of context neurons. For simplicity, let us further consider a fully interconnected network with the neurons evenly distributed between the first two layers, ignoring the single output layer neuron. The number of weights and the number of context neurons are $k = 2 \cdot \left(\frac{N}{2}\right)^2$ and $N_C = \frac{N}{2}$, respectively. Using a common value of 100 FLOPs for $C_{tanh}$, the computational demand for one activation becomes

$$C_{Act} = \frac{N}{4} \cdot \left(2 \cdot N^2 + N + 600\right). \tag{12}$$

Let us further assume that we are running the predictor on a commodity 2.13 GHz Intel Core Duo processor with four floating point operations per cycle, offering a theoretical peak performance of $8.52 \times 10^9$ floating point operations per second. Taking into account a realistic 80% efficiency and considering a maximum prediction time of 5 s (required by the dynamic nature of FPS games), this sets an upper limit for the network size at around 238 neurons which, as we will see in the following experiment, is more than sufficient for our application.

We therefore carried out a series of experiments using different three-layered network structures while keeping all other relevant parameters fixed [14]. Table 2 shows that the prediction does not improve beyond network sizes one order of magnitude lower than the upper limit (good results are already obtained with 15–20 neurons). The minimum prediction error is obtained for around 2–4 neurons in the second layer and twice as many neurons on the first. Taking these two observations into account, we conclude that the optimal network structure is [6, 3, 1] (representing the number of neurons on each layer). Obviously, similar structures would also produce comparably good results (e.g. [9, 3, 1], [8, 4, 1]).
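The step from Eq. (11) to Eq. (12) can be checked numerically. The sketch below assumes that the second bracket of Eq. (11) reads $N/2 + C_{tanh}$, which is the reading that reproduces Eq. (12), and uses exact rational arithmetic so that the two forms can be compared without rounding; the function names are hypothetical.

```python
from fractions import Fraction


def activation_flops_eq11(n, c_tanh=100):
    """Eq. (11) with k = 2 * (N/2)^2 weights and N_C = N/2 context neurons."""
    n = Fraction(n)
    k = 2 * (n / 2) ** 2
    n_c = n / 2
    return n * (k + c_tanh) + n_c * (n / 2 + c_tanh)


def activation_flops_eq12(n):
    """Closed form of Eq. (12), valid for C_tanh = 100."""
    n = Fraction(n)
    return n / 4 * (2 * n ** 2 + n + 600)


flops_small = activation_flops_eq12(10)
```

Expanding Eq. (11) gives $N^3/2 + N^2/4 + 150 \cdot N$, which is the expression factored in Eq. (12); the cubic term from the weights dominates for all but very small networks.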

6.1.3. Prediction results

The results shown in Table 3 demonstrate that, apart from producing better or at least equal predictions, the important quality of our system is its ability to adapt to various types of input signal. More precisely, we had in our data sets three major types of signal: (i) signals with high instantaneous dynamic and medium overall dynamic (sets 2, 3, and 4) were best approximated by average; (ii) signals with low instantaneous dynamic (sets 6, 7 and 8) were best approximated by last value; and (iii) signals with medium instantaneous and medium overall dynamics (sets 1 and 5) were best fitted by moving average and 50% exponential smoothing. The drawback of these conventional methods is that it is not universally clear during a game session which of them should be applied as the real-time prediction method for the next time step. Moreover, as the dynamics of the game may change, for example during peak hours, the best prediction method may change too. Our neural network-based prediction successfully manages to adapt to all these types of signal and always delivers good prediction results.

Table 2
Experimental results for different network structures.

Experiment   Network structure     Prediction error
             [in, hidden, out]     Data set 1 (%)   Data set 3 (%)
1            3, 1, 1               34.70            35.84
2            5, 1, 1               34.92            35.84
3            15, 1, 1              34.92            35.98
4            20, 1, 1              34.92            35.98
5            30, 1, 1              34.49            35.98
6            3, 2, 1               33.40            35.27
7            5, 2, 1               32.92            34.56
8            15, 2, 1              33.83            34.99
9            20, 2, 1              33.83            35.41
10           30, 2, 1              34.49            35.70
11           3, 5, 1               34.27            35.27
12           5, 5, 1               33.18            36.13
13           15, 5, 1              32.97            35.56
14           20, 5, 1              34.05            34.85
15           30, 5, 1              34.27            35.56
16           3, 10, 1              33.62            36.84
17           5, 10, 1              37.09            36.27
18           15, 10, 1             34.27            35.41
19           20, 10, 1             33.83            35.70
20           30, 10, 1             34.05            35.56

To compare the results, we calculate the gain of our prediction as follows:

\[
\mathit{Gain} = \frac{\min PE_{\mathrm{other}} - PE_{\mathrm{NN}}}{\min PE_{\mathrm{other}}}, \tag{13}
\]
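As a worked example of Eq. (13), the sketch below computes the gain for data sets 1 and 6 from the prediction errors listed in Table 3. The function name `gain` and the list layout are illustrative choices; the result is returned as a fraction, so multiplying by 100 gives a percentage.

```python
def gain(pe_nn, pe_others):
    """Relative gain of the neural network over the best competing method, Eq. (13)."""
    best_other = min(pe_others)
    return (best_other - pe_nn) / best_other

# Prediction errors (%) from Table 3: average, moving average, last value,
# and exponential smoothing with 25%, 50% and 75%.
set1_others = [39.69, 39.25, 44.51, 40.04, 37.83, 39.60]
set6_others = [23.55, 8.13, 5.08, 10.89, 7.03, 5.66]

gain_set1 = gain(32.23, set1_others)  # best other method: 50% exponential smoothing
gain_set6 = gain(4.94, set6_others)   # best other method: last value
```

For set 1 the gain is roughly 0.15, while for set 6 it is below 0.03, which matches the observation that last value leaves little margin for improvement on slowly changing signals.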

where $PE_{\mathrm{NN}}$ denotes the prediction error of the neural network and $\min PE_{\mathrm{other}}$ represents the minimum prediction error from the set of other prediction methods (average, moving average, last value, and the three exponential smoothing methods). We obtained the highest gains for signals with high instantaneous dynamics (sets 1 through 5), which represent the main characteristic of FPS games (see Fig. 6). For data sets 6 through 8, characteristic of slower-paced MMORPGs, the best method other than our service was last value, which performed well, leaving little margin for improvement. Nevertheless, our service performed better than the other methods, showing some gain in all cases.

The average prediction time for one sub-zone using our service is extremely fast: around 0.8 µs on a 2.66 GHz Intel Core Duo processor. Even with a few hundred sub-zones, the prediction time remains below 1 s, which, considering a realistic prediction time step of about 30–300 s, leaves at least 97% of this time to the middleware for capacity management and load balancing decisions (i.e. creation of a new zone/instance, migration of players).
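The time budget argument can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and the figure of 500 sub-zones are assumptions standing in for "a few hundred", while the 0.8 µs per sub-zone, the 1 s bound and the 30 s time step are taken from the text.

```python
def remaining_share(num_subzones, per_zone_seconds, step_seconds):
    """Fraction of one prediction time step left after predicting all sub-zones."""
    prediction_time = num_subzones * per_zone_seconds
    if prediction_time > step_seconds:
        raise ValueError("prediction does not fit into the time step")
    return 1.0 - prediction_time / step_seconds

# Hypothetical 500 sub-zones at 0.8 microseconds each, shortest time step of 30 s.
typical = remaining_share(500, 0.8e-6, 30.0)

# Pessimistic bound from the text: a full 1 s of prediction in a 30 s time step.
worst_case = 1.0 - 1.0 / 30.0
```

Even the pessimistic bound leaves roughly 96.7% of the shortest time step, in line with the approximately 97% quoted above, and longer time steps leave proportionally more.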

6.2. Resource allocation

Table 3
Comparison with other fast prediction methods (prediction error in %).

Input data  Neural network  Avg.   Moving avg.  Last val.  Exp. smoothing 25%  Exp. smoothing 50%  Exp. smoothing 75%
Set 1       32.23           39.69  39.25        44.51      40.04               37.83               39.60
Set 2       28.61           30.79  34.60        40.31      34.19               35.78               36.74
Set 3       33.00           37.98  39.40        47.36      38.34               39.06               41.92
Set 4       32.18           34.48  39.78        48.28      42.90               41.55               43.97
Set 5       16.58           25.06  17.98        23.31      19.85               19.59               20.27
Set 6       4.94            23.55  8.13         5.08       10.89               7.03                5.66
Set 7       11.17           48.26  18.88        11.84      20.50               16.06               12.53
Set 8       5.51            15.26  8.45         5.66       12.96               8.78                5.90

Fig. 6. Neural network gain against next fastest prediction method.

In evaluating the efficiency of the resource allocation, we used traces collected from the official Web page of an MMORPG called RuneScape [11]. RuneScape is not a traditional MMORPG; rather, it consists of several mini-games combining elements of RPG and FPS. Thus, various levels of player interactivity coexist in the same game, and the game load cannot be trivially computed with the linear models employed in [15] but requires more sophisticated methods such as those we presented in Section 4. The traces contain the number of players over time for each server group used by the RuneScape game operators.¹

We analysed over six months of data until March 2008, with the metrics being evaluated every 2 min, giving over ten thousand samples for each simulation, which ensures statistical soundness. The testbed comprises 17 data centres located on four continents and in seven countries, summarized in Table 4, with a total of 166 machines, where each machine is capable of handling at least one game server at full load (e.g. 2000 simultaneous clients for RuneScape). We used in our experiments a minimum resource bulk of 0.25 and a time bulk of 6 h (i.e. deallocation cannot be done earlier, as explained in Section 5).

To quantify the effectiveness of resource allocation, we measured the over-allocation as the percentage allocated from the total amount of resources necessary for the seamless execution of the MMOG that maintains the real-time QoS requirements. We define the total resource over-allocation Ω(t) at time instance t as the cumulated over-allocation of all machines participating in the game session, where M is the number of machines in the session, α_m(t) represents the allocated resource on machine m, and λ_m(t) represents the resource usage (the generated load) on machine m:

\[
\Omega(t) = \frac{\sum_{m=1}^{M} \alpha_m(t)}{\sum_{m=1}^{M} \lambda_m(t)} \cdot 100. \tag{14}
\]
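A direct transcription of Eq. (14) into code may help when reproducing the metric. The sketch below is illustrative: the function name, the list-based inputs and the sample values are assumptions, and resources are expressed as fractions of a machine. Note that, as written, the formula evaluates to 100 when the allocation exactly matches the load.

```python
def over_allocation(allocated, used):
    """Total over-allocation at one time instance, following Eq. (14).

    allocated[m] is the resource allocated on machine m, used[m] the load
    generated on machine m, both in the same unit.
    """
    if len(allocated) != len(used):
        raise ValueError("one allocation and one load value per machine expected")
    total_used = sum(used)
    if total_used == 0:
        raise ValueError("total load is zero; Eq. (14) is undefined")
    return sum(allocated) / total_used * 100.0

# Hypothetical session with three machines.
allocated = [1.0, 0.5, 0.25]
used = [0.8, 0.4, 0.2]
omega = over_allocation(allocated, used)
```

With these sample values the allocation exceeds the load by a quarter, so the function returns approximately 125.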

¹ We could not use these traces for the load prediction validation since the zones on one server group are too large for an accurate prediction and contain no entity position information.

Table 4
RuneScape execution testbed.

Continent      Country        Data centres  Machines (total)
Europe         Finland        2             8
Europe         Sweden         2             8
Europe         UK             2             20
Europe         Netherlands    2             15
North America  US (West)      2             35
North America  Canada (West)  1             15
North America  US (Central)   1             15
North America  US (East)      2             32
North America  Canada (East)  1             10
Australia      Australia      2             8

Fig. 7 compares the static and dynamic resource allocation for the same workload. As expected, the average over-allocation is drastically reduced from 250% in the case of static over-provisioning to around 25% (mostly due to the 6 h time bulk) for the dynamic allocation strategy.

The computational overhead of the matchmaking process between the game operators and the hosting centres is negligible (of the order of milliseconds). The overhead of the effective allocation was not available to our monitoring tool and depends on the data centre hosting policy (either best effort or advance reservation). In computational science Grids, this is typically of the order of a few seconds, which is effectively hidden by the 2 min time step of our prediction service.
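The effect of the resource bulk and the time bulk on over-allocation can be illustrated with a simplified simulation. This is a sketch under stated assumptions and not the allocation algorithm of Section 5: it assumes that load is expressed in machine fractions, that missing bulks are acquired immediately, that each bulk must be held for at least `hold_steps` time steps after it was acquired (180 steps of 2 min for the 6 h time bulk), and that the oldest eligible bulks are released first.

```python
import math

def simulate_allocation(loads, resource_bulk=0.25, hold_steps=180):
    """Return the allocated amount per time step for a list of per-step loads."""
    held = []        # acquisition step of every bulk currently held
    allocation = []
    for step, load in enumerate(loads):
        # Round the load up to a whole number of bulks (small tolerance for floats).
        needed = max(0, math.ceil(load / resource_bulk - 1e-9))
        # Acquire missing bulks immediately.
        while len(held) < needed:
            held.append(step)
        # Release surplus bulks, but only those held long enough.
        surplus = len(held) - needed
        if surplus > 0:
            releasable = [s for s in held if step - s >= hold_steps]
            for acquired_at in releasable[:surplus]:
                held.remove(acquired_at)
        allocation.append(len(held) * resource_bulk)
    return allocation

# Hypothetical short trace with a load spike; hold time shortened to 3 steps.
loads = [0.3, 0.9, 0.2, 0.2, 0.2, 0.2]
allocation = simulate_allocation(loads, resource_bulk=0.25, hold_steps=3)
```

In this toy trace the allocation stays at the spike level for several steps after the load has dropped, which is the kind of residual over-allocation the text attributes mostly to the time bulk.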

7. Related work

In the field of prediction in distributed environments, the Network Weather Service [16] uses a variety of statistical methods in parallel that characterize the load parameters at the resource level. The system adapts to different input signals by choosing every time the method with the lowest mean square error in the previous prediction steps. Our service, in contrast, relies on the neural network’s capability to adapt to changes in signal trends through a thorough training phase.

At the application level, Iverson [17], Gibbons [18], and Smith [19] use historical information to predict the parallel job runtime in a distributed environment through methods like analytical benchmarking, linear regression, genetic algorithms, and greedy search. In our work, we target a novel class of applications which, in contrast to scientific applications that usually belong to one single user, are characterized by a large number of online users that interact (usually through competition) within the same application instance. Our prediction method is also based on historical information, but uses a neural network-based prediction algorithm.

Neural network-based predictions have been successfully applied to many applications in different fields of research: Taylor [20] and Hsu [21] employ neural networks for predicting loads on electrical systems; Maqsood [22] uses a set of neural networks for weather forecasting; Huisken [23] studies the use of neural networks for short-term prediction of traffic flow; and Litke [24] presents a neural network-based module for predicting the computational workload of jobs in commercial Grid infrastructures.

Fig. 7. Static versus dynamic resource allocation.

In the gaming area, prediction is used at the client side in fast-paced games for hiding synchronization lags generated by network latency and bandwidth limitations [25–27]. Complementary to these approaches, our prediction service runs at the server side, preventing saturation of game servers.

The problem of dynamically allocating geographically distributed resources to applications has been a popular topic in Grid computing research [28–30]. Recent work has investigated resource allocation mechanisms across single- and multi-cluster Grids [31,32] for typical Grid workloads comprising batches of scientific and engineering jobs [33]. Unlike MMOGs, these Grid applications do not change their resource requirements at runtime. Moreover, the Grid resource allocation policies only allow for whole resources to be allocated at a time, while our work also considers the sub-unitary allocation sizes specific to business data centres.

Closest to our work, the benefit of provisioning resources from single data centres has been evaluated for databases and Web services [34,35]. Our work differs from these approaches in two significant aspects. First, MMOGs have a different load model, and in particular their load also depends on the interaction between users. Second, we consider multiple data centres to handle the different load patterns in different geographical locations specific to MMOGs.

Alongside client–server architectures, peer-to-peer architectures have also been employed in the design of MMOGs; however, so far most of the efforts have been academic studies [36–38]. Peer-to-peer architectures require that each computer participating in a game session can act both as client and server. Peer-to-peer MMOGs may potentially be more scalable and cheaper to build, but other notable barriers, such as security and consistency control, which can be difficult to address given that clients are easily hacked, hinder them from being employed in production mode. Outback Online may have been the first commercial attempt to develop a peer-to-peer MMOG. The project was discontinued in 2007 because of difficulties in raising funding.

8. Conclusions

We have proposed a new prediction-based method for dynamic resource provisioning and scaling of real-time MMOGs in distributed Grid environments.

Firstly, we developed a load prediction service that accurately estimates the future game world entity distribution from historical information using a fast and flexible neural network-based approach. Our approach is based on partitioning the game world into subareas of reasonable size, whose entity count can quickly and accurately be approximated through a well-trained neural network using historical information.

We developed a game simulator which uses several AI entity modelling patterns for generating a range of realistic load traces. We showed a series of experiments for tuning the network parameters (e.g. structure, type) that were crucial for obtaining good prediction results. We presented experiments which demonstrate the capability of our predictor to adapt to input signals with different characteristics modelling various load patterns, which other conventional prediction methods fail to achieve. In addition, our method is also extremely fast, which makes it suitable for applications with real-time requirements like online games.

On top of the prediction service, we developed a generic analytical game load model used to foresee future hot-spots that overload the game servers and make the overall environment fragmented and unplayable. Based on the load prediction information, a resource allocation service performs dynamic provisioning, proactive load balancing, and migration of entities that keep the game servers reasonably loaded to maintain the real-time QoS requirements. Using our allocation method, we demonstrated a 10-fold improvement in resource provisioning for a real-world MMORPG.

In the future we plan to validate our prediction and allocation methods on the real Quake 3 [39] FPS game, which is currently being ported onto RTF at the University of Münster. Moreover, we plan to model and integrate additional resource types relevant for MMOG provisioning such as memory and network.

References

[1] Richard Bartle, Designing virtual worlds, New Riders Games, 2003.

[2] Inc. Blizzard Entertainment, World of warcraft. http://www.worldofwarcraft.com/.

[3] Vlad Nae, Alexandru Iosup, Radu Prodan, Dick Epema, Thomas Fahringer, Efficient management of data center resources for massively multiplayer online games, in: International Conference on High Performance Computing, Networking, Storage and Analysis (Supercomputing), IEEE Computer Society Press, 2008.

[4] Wentong Cai, Percival Xavier, Stephen J. Turner, Bu-Sung Lee, A scalable architecture for supporting interactive games on the internet, in: PADS ’02: Proceedings of the Sixteenth Workshop on Parallel and Distributed Simulation, IEEE Computer Society, Washington, DC, USA, 2002, pp. 60–67.

[5] MMORPG COM, Your headquarters for massive multiplayer online role-playing games. http://www.mmorpg.com/.

[6] Jens Müller-Iden, Sergei Gorlatch, Rokkatan: Scaling an RTS game design to the massively multiplayer realm, Computers in Entertainment 4 (3) (2006) 11.






[7] Frank Glinka, Alexander Ploss, Jens Müller-Iden, Sergei Gorlatch, RTF: A real-time framework for developing scalable multiplayer online games, in: NetGames, ACM Press, 2007.

[8] G.E.P. Box, G.M. Jenkins, G.C. Reinsel, Time Series Analysis, Forecasting and Control, Prentice Hall, 1994.

[9] S. Makridakis, S.C. Wheelwright, R.J. Hyndman, Forecasting: Methods and Applications, Wiley, 1998.

[10] Vlad Nae, Radu Prodan, Thomas Fahringer, Neural network-based load prediction for highly dynamic distributed online games, in: Euro-Par, Springer Verlag, 2008.

[11] Ltd Jagex. Runescape. http://www.runescape.com, Nov 2007.

[12] Inc. GameData. Counter strike. http://www.counter-strike.com.

[13] Rich Wolski, Experiences with predicting resource performance on-line in computational grid settings, ACM SIGMETRICS Performance Evaluation Review 30 (4) (2003) 41–49.

[14] Simon Haykin, Neural Networks: A Comprehensive Foundation, 1st ed., Prentice Hall, PTR, 1994.

[15] L. Ye, M. Cheng, System-performance modeling for massively multiplayer online role-playing games, IBM Systems Journal 45 (1) (2006).

[16] Rich Wolski, Neil T. Spring, Jim Hayes, The network weather service: A distributed resource performance forecasting service for metacomputing, Future Generation Computer Systems 15 (5–6) (1999) 757–768.

[17] Michael A. Iverson, Fusun Ozguner, Lee Potter, Statistical prediction of task execution times through analytic benchmarking for scheduling in a heterogeneous environment, IEEE Transactions on Computers 48 (12) (1999) 1374–1379.

[18] Richard Gibbons, A historical application profiler for use by parallel schedulers, in: Dror G. Feitelson, Larry Rudolph (Eds.), Job Scheduling Strategies for Parallel Processing, Springer Verlag, 1997, pp. 58–77.

[19] Warren Smith, Ian Foster, Valerie Taylor, Predicting application run times using historical information, in: Lecture Notes in Computer Science, vol. 1459, 1998, pp. 122–144.

[20] R. Taylor, J.W. Buizza, Neural network load forecasting with weather ensemble predictions, IEEE Transactions on Power Systems 17 (3) (2002) 626–632.

[21] C.C. Hsu, C.Y. Chen, Regional load forecasting in Taiwan – Applications of artificial neural networks, Energy Conversion and Management 44 (12) (2003) 1941–1949.

[22] I. Maqsood, M.R. Khan, A. Abraham, An ensemble of neural networks for weather forecasting, Neural Computing & Applications 13 (2) (2004) 112–122.

[23] Giovanni Huisken, Soft-computing techniques applied to short-term traffic flow forecasting, Systems Analysis Modelling Simulation 43 (2) (2003) 165–173.

[24] A. Litke, K. Tserpes, T. Varvarigou, Computational workload prediction for grid oriented industrial applications: The case of 3d-image rendering, in: International Symposium on Cluster Computing and the Grid, vol. 2, IEEE Computer Society Press, 2005, pp. 962–969.

[25] Y. Bernier, Latency compensating methods in client/server in-game protocol design and optimization, in: Proceedings of the Game Developers Conference, 2001.

[26] S. Bonham, D. Grossman, W. Portnoy, K. Tam, Quake: An example multi-user network application – problems and solutions in distributed interactive simulations, CSE 561 term project report, University of Washington, 2000.

[27] J. Färber, Network game traffic modelling, in: Proceedings of the 1st workshop on Network and system support for games, 2002, pp. 53–57.

[28] E. Elmroth, J. Tordsson, Grid resource brokering algorithms enabling advance reservations and resource selection based on performance predictions, Future Generation Computer Systems 24 (6) (2008) 585–593.

[29] Eduardo Huedo, Rubén S. Montero, Ignacio M. Llorente, A recursive architecture for hierarchical grid resource management, Future Generation Computer Systems (2008).

[30] BangYu Wu, Chi-Hung Chi, Zhe Chen, Ming Gu, JiaGuang Sun, Workflow-based resource allocation to optimize overall performance of composite services, Future Generation Computer Systems (2008) (accepted manuscript, in press).

[31] Alexandru Iosup, Dick Epema, Todd Tannenbaum, Matthew Farrellee, Miron Livny, Inter-operating grids through delegated matchmaking, in: Supercomputing Conference, ACM Press, 2007.

[32] Mumtaz Siddiqui, Alex Villazón, Thomas Fahringer, Grid allocation and reservation – Grid capacity planning with negotiation-based advance reservation for optimized QoS, in: Supercomputing Conference, 2006, p. 103.

[33] Alexandru Iosup, Catalin Dumitrescu, Dick H.J. Epema, Hui Li, Lex Wolters, How are real grids used? The analysis of four grid traces and its implications, in: International Conference on Grid Computing, IEEE Computer Society, 2006, pp. 262–270.

[34] Jeffrey S. Chase, Darrell C. Anderson, Prachi N. Thakar, Amin Vahdat, Ronald P. Doyle, Managing energy and server resources in hosting centres, in: SOSP, 2001, pp. 103–116.

[35] Bhuvan Urgaonkar, Prashant J. Shenoy, Abhishek Chandra, Pawan Goyal, Dynamic provisioning of multi-tier internet applications, in: Second International Conference on Automatic Computing, IEEE Computer Society, 2005, pp. 217–228.

[36] Luther Chan, James Yong, Jiaqiang Bai, Ben Leong, Raymond Tan, Hydra – a massively-multiplayer peer-to-peer architecture for the game developer, in: 6th Annual Workshop on Network and Systems Support for Games (Netgames 2007), Melbourne, Australia, September 2007.

[37] B. Knutsson, H. Lu, W. Xu, B. Hopkins, Peer-to-peer support for massively multiplayer games, in: Infocom, IEEE Computer Society Press, 2004.

[38] Abdennour El Rhalibi, Madjid Merabti, Yuanyuan Shen, Aoim in peer-to-peer multiplayer online games, in: SIGCHI International Conference on Advances in Computer Entertainment Technology, ACM Press, New York, NY, USA, 2006, p. 71.

[39] id Software. Quake. http://www.idsoftware.com/games/quake/quake/.

Radu Prodan received his Master’s degree in Computer Science from the Technical University of Cluj-Napoca, Romania, in 1997. Between 1998 and 2001 he served as Research Assistant in Switzerland at ETH Zurich, University of Basel, and the Swiss Centre for Scientific Computing. In 2001 he joined the Institute for Software Science, University of Vienna, where he earned his Ph.D. in 2004 from the Vienna University of Technology. Prodan is currently an assistant professor at the Institute of Computer Science, University of Innsbruck. He is interested in distributed software architectures, compiler technology, performance analysis, and scheduling for parallel and Grid computing. Prodan participated in several national and European projects and is currently workpackage leader in the IST-034601 edutain@grid project. He is the author of over 40 papers, including one book, seven journal articles, and one IEEE best paper award.

Vlad Nae received his Diploma Engineer degree in Computer Science from the Politehnica University of Bucharest, Romania, in 2006. Since 2006 he has been employed as a Research Assistant at the Institute of Computer Science, University of Innsbruck. His research interests are centred around distributed systems directed towards providing support for highly dynamic and resource hogging applications such as massively multiplayer online games. Nae participated in several national and European projects and is currently involved in the IST-034601 edutain@grid project. He is the author of eight papers, including one journal article.
