Methodology for Predicting Performance of Distributed and Parallel Systems



Performance Evaluation 18 (1993) 189-204, North-Holland

Methodology for predicting performance of distributed and parallel systems

Rakesh Kushwaha Department of Computer and Information Science, New Jersey Institute of Technology, Newark, NJ 07102, USA

Received 14 October 1991 Revised 8 September 1992

Abstract

Kushwaha, R., Methodology for predicting performance of distributed and parallel systems, Performance Evaluation 18 (1993) 189-204.

This paper describes an accurate and efficient method to model and predict the performance of distributed/parallel systems. Various performance measures, such as the expected user response time, the system throughput and the average server utilization, can be easily estimated using this method. The methodology is based on known product form queueing network methods, with some additional approximations. The method is illustrated by evaluating performance of a multi-client multi-server distributed system. A system model is constructed and mapped to a probabilistic queueing network model which is used to predict its behavior. The effects of user think time and various design parameters on the performance of the system are investigated by both the analytical method and computer simulation. The accuracy of the former is verified. The methodology is applied to identify the bottleneck server and to establish proper balance between clients and servers in distributed/parallel systems.

Keywords: Distributed and parallel systems; performance modeling; performance evaluation; queueing networks; simulation.

1. Introduction

Several considerations, such as reliability, performance, the ability to incorporate a wide class of machine architectures, functioning in a range of network environments, and the image presented by the system to an end user, govern the design of distributed and parallel systems [14]. One major concern with any new system architecture is its performance. For such a system to be useful, it must present as small a performance degradation as possible. Those users who do not require the distributed/parallel facility should experience no change in performance for their applications. Those users who do require this facility should see a seemingly minimal impact on their applications. Performance degradation may be due to

Correspondence to: R. Kushwaha, Dept. of Computer and Information Science, New Jersey Institute of Technology, Newark, NJ 07102, USA.

network delays or synchronization between different parallel facilities.

Distributed and parallel systems may be represented by the interactions among the following entities: clients, servers, the network and information units. Clients are any processing units in the system which invoke operations and request access to the information units. Servers are the nodes which service the requests made by the clients and manipulate the information units in parallel. Information units could be any data unit (e.g. a file), code, messages, user requests or executing processes. This paper describes an accurate and efficient method for predicting the performance of such a class of distributed and parallel systems. Design parameters, such as the information access mechanism, which could be either remote access [2] or information transfer [11], cache size (local memory at a client), client-server balance and data location, can be analyzed using the method we describe.



In this method, information units (messages, files, user-requests, etc.) are represented mathematically as customers of different classes in a queueing network model, while servers, clients and the network are modeled as service centers. This probabilistic queueing model is used to evaluate performance measures, such as the average response time of a request made by a client, the average utilization of the service centers and their throughput. The response time is defined as the time interval which starts when a client requests some information and ends when the request is serviced by delivery of the information. It is the average time a request spends in the system. Short response time is a characteristic of good performance. The response time of a system not only depends on the transmission properties of the interprocess communication primitives, their implementation, and the supporting protocol, but also on the manner in which the information units are located, managed and accessed. Cache size and client/server ratio also affect the average response time of the system.

The system we refer to operates as follows: when a user at a client makes a request, the request "enters" the system and proceeds to receive service at different service centers. During this time the user waits for a response. After some time interval, when the request is satisfied, the user enters a "think time" and then generates a new request.

The prediction method is illustrated by applying it to a multi-server, multi-client distributed system. A detailed analysis of the system is presented along with the results. Both the analytical model and computer simulations are used to investigate the effect of user think times on the performance of the system. The accuracy of the former is validated by the latter.

Section 2 describes the multi-server, multi-client distributed system and maps it to a queueing network model. A complete analysis of the distributed system is given in Section 3, illustrating how the prediction method may be used to evaluate the performance of distributed/parallel systems. A numerical example is considered in Section 4, in which the results obtained from the analytical solution are validated against the simulated ones. In Section 5, an approximate method is used to calculate cache size. The results of the method are applied, in Section 6, to identify and isolate the bottleneck server and establish a proper client/server ratio in distributed/parallel systems. Section 7 concludes the paper by summarizing the method described and emphasizing its uses and advantages.

2. The system being modeled

2.1. System description

We consider a distributed system with n processors, which we call clients (c_i, 1 ≤ i ≤ n), and m servers (s_j, 1 ≤ j ≤ m), connected via an arbitrary network (t). The clients request access to and usage of a set of files; these files reside at a set of servers. A file represents any information unit in the system. File requests initiated by a client are serviced by the server that hosts the file being requested. A file is in one of two states: free or busy. A file is busy if a copy of it has been allocated to some client; otherwise it is free.

A server receives a file request from a client, processes the request, and transfers a file copy to the client if the file is free. This action marks the file as "busy". Each client has a temporary local memory, the cache, which holds the copies

Rakesh Kushwaha received the B.E. degree in mechanical engineering from the University of Delhi, New Delhi, India, in 1986, and the M.S. degree in computer science from the New Jersey Institute of Technology, Newark, NJ, in 1989, where he is currently a Ph.D. degree candidate in the Department of Computer and Information Science.

Since 1990 he has been on the faculty of the Department of Computer and Information Science, as a Special Lecturer, at the New Jersey Institute of Technology, Newark, NJ. His active research interests include performance modeling and analysis of distributed and parallel systems, load balancing, and issues in distributed operating systems. He is a member of SIGMETRICS, the Association for Computing Machinery's Special Interest Group concerned with computer system performance.


of the files obtained by the client. A file in the client's memory is in one of two states: active or non-active. A file is active if it is currently used by the client and non-active if it was used by the client in the past but still resides in the memory.

Servers and clients communicate via messages. Messages can be of the following three types: file_request_message, file_active_message and confirm_message, denoted by M1, M2 and M3, respectively. The messages circulate among different nodes depending upon the availability of the file requested. Formally, any information unit in the system at any instant of time can be defined as r_{x,y,z} (x = M1, M2, M3, F; y, z = c_i, s_j, t). In this expression F denotes a file, the subscript x denotes the class of the information unit (file or any message type), y is the source (service center) from which the information unit is coming, and z is the destination (service center).
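The triple notation maps naturally onto a small record type. The following sketch is ours, not the paper's; the class and field names are illustrative assumptions, shown only to make the r_{x,y,z} notation concrete.

```python
from dataclasses import dataclass
from enum import Enum

class UnitClass(Enum):
    M1 = "file_request_message"
    M2 = "file_active_message"
    M3 = "confirm_message"
    F = "file"

@dataclass(frozen=True)
class InformationUnit:
    """r_{x,y,z}: a unit of class x travelling from node y to node z.
    Nodes are named 'c1'..'cn' (clients), 's1'..'sm' (servers), 't' (network)."""
    cls: UnitClass
    src: str
    dst: str

# The first hop of sub-chain 1 below: r_{M1,c1,t}
request = InformationUnit(UnitClass.M1, "c1", "t")
```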

Depending upon the state of the file at a server and a client, messages and files circulate among different nodes. Three possible cases are discussed below.

(1) File free at the server: For a cache miss (i.e. when a user requests a file that is not located in the client's memory), the client (c_i) sends a file_request_message (r_{M1,c_i,t}) to the network (t). The network (t) in response forwards the file_request_message (r_{M1,t,s_j}) to the appropriate server (s_j). If the file is free, the server (s_j) sends a copy of the file to the client (r_{F,s_j,t}, r_{F,t,c_i}). After receiving a copy of the file, the client sends the confirm_message (r_{M3,c_i,t}) to the network, which passes the message (r_{M3,t,s_j}) to the server. The sequence of messages described above can be expressed as the following sub-chain:

r_{M1,c_i,t} → r_{M1,t,s_j} → r_{F,s_j,t} → r_{F,t,c_i} → r_{M3,c_i,t} → r_{M3,t,s_j}

Once the file is allocated to any client, the file is marked "busy" at the server.

(2) File busy at the server: If the requested file is busy at the server (s_j), then the copy of the file resides at some other client (c_k, k ≠ i). The server sends a file_request_message (r_{M1,s_j,t}, r_{M1,t,c_k}) to the client (c_k). If the requested file is in a non-active state at the client c_k, the client transfers the copy of the file to the server (r_{F,c_k,t}, r_{F,t,s_j}). The server (s_j) makes a copy of the file and then transfers the new copy of the file to the client (c_i). On receiving the file the latter responds to the server with a confirm_message (messages r_{M3,c_i,t} and r_{M3,t,s_j}). This sequence of messages can be described by the following sub-chain:

r_{M1,c_i,t} → r_{M1,t,s_j} → r_{M1,s_j,t} → r_{M1,t,c_k} → r_{F,c_k,t} → r_{F,t,s_j} → r_{F,s_j,t} → r_{F,t,c_i} → r_{M3,c_i,t} → r_{M3,t,s_j}

(3) File active at the client: If the file is in an active state at the client (c_k), the client issues a file_active_message (r_{M2,c_k,t}) for the server, which in turn passes the message to the client that initially requested the file. The sub-chain of messages is then:

r_{M1,c_i,t} → r_{M1,t,s_j} → r_{M1,s_j,t} → r_{M1,t,c_k} → r_{M2,c_k,t} → r_{M2,t,s_j} → r_{M2,s_j,t} → r_{M2,t,c_i}

The client in this case does not have to reply with a confirm_message since the server is aware of the file being active.

The cache at each client is local memory implemented as a Least Recently Used (LRU) stack. The least recently used file is at the bottom of the stack, and the most recently used is at the top. If the local memory is full, space for the new incoming file is made by removing one or more files from the memory. The least recently used files are transferred to the respective servers. If the client sends one or more files (r_{F,c_i,t}), the following sub-chain should be added to the two sub-chains in cases 1 and 2, in the event of memory being full:

r_{F,c_i,t} → r_{F,t,s_j}
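An LRU stack of this kind is commonly realized with an ordered map. Below is a minimal sketch of ours, not the paper's implementation; the eviction callback standing in for the r_{F,c_i,t} → r_{F,t,s_j} transfer is an assumption for illustration.

```python
from collections import OrderedDict

class LRUCache:
    """Client cache: most recently used file at the 'top' (end) of the map.
    When the cache is full, least recently used files are evicted and handed
    to `send_to_server`, mimicking the r_{F,ci,t} -> r_{F,t,sj} sub-chain."""
    def __init__(self, capacity_bytes, send_to_server):
        self.capacity = capacity_bytes
        self.used = 0
        self.files = OrderedDict()          # file name -> size in bytes
        self.send_to_server = send_to_server

    def touch(self, name):
        """Mark a cached file as most recently used (a cache hit)."""
        self.files.move_to_end(name)

    def insert(self, name, size):
        """Add an incoming file copy, evicting LRU files if necessary."""
        while self.used + size > self.capacity and self.files:
            old_name, old_size = self.files.popitem(last=False)  # LRU end
            self.used -= old_size
            self.send_to_server(old_name)   # file travels back to its server
        self.files[name] = size
        self.used += size
```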

All nodes of the system (servers, clients and the network) handle files (r_F) and messages (r_M)¹. The network serves r_M before r_F, on a First Come First Serve (FCFS) basis. A server handles the file_request_message (r_M1) on a FCFS basis for different files and in time-stamp (TS) order for the same file. The file_request_message (r_M1) for a particular file queues at the server and stays in the queue until either the confirm_message (r_M3) from the client which requested the file or the file_active_message (r_M2) from the client where the file was active is received by the server. A file request must stay pending at the server for the period of time that: (1) the file is busy at the server, and (2) the file is in a non-active state at a client.

¹ r_M denotes messages in general, without class distinction and without a particular source and destination node.

Fig. 1. Flow of information in a distributed system.


During this period of time any other requests for the same file are queued up.

The request may follow any one of the sub-chains described above. The chance of a request following a particular sub-chain depends on three conditions:
(1) the state of the file at the server (free or busy);
(2) the availability of memory space at the client;
(3) the status of the requested file at the client (active or non-active).

Let p be the probability that a file is in the busy state at the server, π be the probability that the cache is full and τ be the probability that a file is in a non-active state. Figure 1 shows the flow of information units in the system.

2.2. The queueing network model

The distributed system as described above is modeled as a network of queues formed by a collection of servers and clients, interconnected by a queueing network (Fig. 2).

The physical nodes in the distributed system (clients, servers, network) are represented as service centers in the queueing network model. Each information unit (user-requests, files, messages, etc.) is modeled as a customer of a particular class. Customers entering the system change classes as they circulate among service centers and follow one of the chains shown in Fig. 1. The level of modeling is analogous to the multi-programming and multi-processing description of computer systems as proposed by Gelenbe and Mitrani [5], in which computer resources are modeled as service centers and computing jobs are modeled as customers. Similar modeling of distributed systems is also found in [4]. Modeling of distributed and parallel systems at this level is appropriate in the early design stage because performance issues can be studied without worrying too much about the details of the system.

The queueing network model is composed of the following three types of service centers [1,5]:

(1) Single processor-shared server (network). For the network, customers of all classes (messages and files) are made up of an arbitrary number of packets. All packets are served on a FCFS basis. Thus all customers are being served in parallel and share one common processor.

(2) Multiple-server (servers). This service center has more than one server to service jobs waiting in a common queue. If no more than n jobs request service from an n-server service center, all jobs will receive service immediately without queueing. If there are more than n jobs, n of them can be serviced at one time, and the others have to wait in the queue.

(3) Infinite-server (clients). This service center models the clients, where the number of clients is always greater than or equal to the number of jobs. Thus, no job visiting this service center will have any queueing delay.

The queueing discipline and service demands are such that they satisfy product form requirements [1,5,6,8]. The numerical values for specific measures of system performance, such as node utilizations, throughputs, average response times, etc., can be extracted from the stationary state distribution of the queueing network model [5,13].

Fig. 2. Queueing network model.



For a simple illustration of the prediction method, avoiding lengthy equations, two simple but realistic assumptions are made. First, to locate a file in the distributed system, the simple but efficient approach of static maps [15] is used. In static mapping, part of the file name is used to identify the server. The simplest approach is to have a number as a suffix (or a prefix) of each file name. The number maps to a particular server and is stored in a table in the client's local memory. This assumption violates location transparency; however, other sophisticated addressing mechanisms that provide location transparency can easily be incorporated in the model. Second, files are not replicated in the model presented here; however, the prediction method does allow one to evaluate the performance of systems which provide file replication strategies.
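Static mapping of this sort amounts to parsing the suffix and indexing a client-local table. A sketch under the paper's suffix convention; the table contents, the modulo step and all names are our illustrative assumptions.

```python
# Client-local table: suffix number -> server identifier (illustrative).
SERVER_TABLE = {0: "s1", 1: "s2", 2: "s3"}

def locate_server(file_name):
    """Static map: the numeric suffix of a file name identifies its server,
    e.g. 'report7' maps to SERVER_TABLE[7 % len(SERVER_TABLE)]."""
    digits = ""
    for ch in reversed(file_name):
        if not ch.isdigit():
            break
        digits = ch + digits
    if not digits:
        raise ValueError("file name carries no server suffix")
    return SERVER_TABLE[int(digits) % len(SERVER_TABLE)]

print(locate_server("report7"))   # -> 's2'
```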

3. Analysis

Let S be the state of the system and n be the total number of customers in the system. If n = n1 + n2 + n3, where n1, n2 and n3 are the numbers of customers at the network, the clients and the servers, respectively, then the steady state distribution can be obtained by summing p(S) over all states S which yield n [5]. Let 1/μ_{ir} be the average service time of a class r customer at the network and client nodes, and 1/μ(n_i) be the mean service time of the exponential distribution at the server node when n_i customers are present. Assuming that the system is closed with respect to all customer classes, and that the server, the client and the network nodes are type 1, type 3 and type 2 service centers, respectively [1,5,6,8], the equilibrium probabilities at steady state are given by [1,8]:

p(n) = (1/G) p1(n1) p2(n2) p3(n3)

In this equation the marginal probabilities p_i(n_i) are defined as follows: for the network node

p1(n1) = (Σ_r e_{1r}/μ_{1r})^{n1}    (1)

for the client node

p2(n2) = (1/n2!) (Σ_r e_{2r}/μ_{2r})^{n2}    (2)

and for the server node

p3(n3) = (Σ_r e_{3r})^{n3} / Π_{k=1}^{n3} μ(k)    (3)

The service time distributions are arbitrary Coxian for the client and network nodes, and are assumed to be exponential for the server node. The quantity e_{ir} is proportional to the total arrival rate of class r jobs into node i and is interpreted as the relative arrival rate of class r customers to service center i [1,5,6,8]. The existence of the steady state distribution, for a closed network, depends on the solution of the following set of flow equations:

e_{ir} = Σ_{j,s} e_{js} p_{js,ir}    (4)

1/G is a normalizing constant which must be calculated over all possible states [10,12,16]:

G = Σ_{n1+n2+n3=n} p1(n1) p2(n2) p3(n3)

The flow equations (4) obtained from Fig. 1 balance, for every node and every customer class, the relative flow into the node against the relative flow out of it; for instance, e_{t,rM3} = e_{c,rM3} and e_{s,rM3} = e_{t,rM3} for the confirm_message, and e_{s,rF} = (1 − p) e_{t,rM1} for the file sent by a server in sub-chain 1. One solution can be obtained by setting e_{c,rM1} = 1, which gives, for the client and network nodes:

e_{c,rM1} = 1                      e_{t,rM1} = 1 + p
e_{c,rM2} = (1 − τ)p               e_{t,rM2} = 2(1 − τ)p
e_{c,rM3} = (1 − p) + τp           e_{t,rM3} = (1 − p) + τp
e_{c,rF} = τp + π(1 − p + τp)      e_{t,rF} = (1 − p) + 2τp + π(1 − p + τp)



and for the server node:

e_{s,rM1} = 1 + p
e_{s,rF} = (1 − p) + τp + π(1 − p + τp)

(Only the file_request_messages require service at a server; the confirm and file_active messages merely release pending requests.)

The marginal probabilities of the network, client and server nodes can be computed as follows:

(1) For the network node, from (1),

P1(n1) = (Σ e_{t,rM}/μ_a + Σ e_{t,rF}/μ_b)^{n1}

where 1/μ_a is the average time to transmit one message (1 packet), and 1/μ_b is the average time to transmit one file (multiple packets). All jobs, messages (r_M) and files (r_F), are treated as packets. The number of packets depends on the file size in the case of a file (r_F). The file size is uniformly distributed. The summations Σ e_{t,rM} and Σ e_{t,rF} are the relative arrival rates of messages and files, respectively, at the network:

Σ e_{t,rM} = 2 + 2p − τp
Σ e_{t,rF} = (1 − p) + π(1 − p + τp) + 2τp

(2) For the client node, from (2),

P2(n2) = (1/n2!) (1/α + Σ e_{c,rM}/γ + Σ e_{c,rF}/β1)^{n2}

where 1/α is the average think time, and 1/β1 is the average disk transfer time (to process a file). These times include the rotational and/or seek delays and the network access time. 1/γ is the average time taken to service a message (r_M) by both a client and a server; this time includes the network access time for the message.

Σ e_{c,rM} = (1 − p) + (1 − τ)p + τp
Σ e_{c,rF} = π(1 − p) + τp + πτp

(3) For the server node, from (3),

P3(n3) = (Σ e_{s,rM} + Σ e_{s,rF})^{n3} / Π_{k=1}^{n3} μ(k)

where 1/μ(n3) is the average time needed to service a customer and depends on the number of customers (n3).

Σ e_{s,rM} = 1 + p
Σ e_{s,rF} = (1 + π)(1 − p + τp)

The normalizing constant can be computed as

G = G(n) = Σ_{n1+n2+n3=n} P1(n1) P2(n2) P3(n3)

and, since n = n1 + n2 + n3 (closed system),

G = Σ_{n2=0}^{n} Σ_{n1=0}^{n−n2} P1(n1) P2(n2) P3(n − n1 − n2)

The throughputs, utilization factors and average response times for the different nodes are calculated in terms of G and the e_{ir} [5].
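As a concrete illustration, the double sum for G can be evaluated directly. The sketch below is ours, not the paper's code; the function names, the example parameter values, and the choice of a constant queue-dependent server rate are illustrative assumptions.

```python
import math

def marginal_factors(n, p, pi, tau, mu_a, mu_b, alpha, gamma, beta1, mu_s):
    """Return the lists P1(k), P2(k), P3(k) for k = 0..n, following
    eqs. (1)-(3) and the relative arrival rates of Section 3."""
    e_tm = 2 + 2*p - tau*p                         # messages at the network
    e_tf = (1 - p) + pi*(1 - p + tau*p) + 2*tau*p  # files at the network
    e_cm = (1 - p) + (1 - tau)*p + tau*p           # messages serviced at a client
    e_cf = pi*(1 - p) + tau*p + pi*tau*p           # files serviced at a client
    e_sm = 1 + p                                   # file requests at a server
    e_sf = (1 + pi)*(1 - p + tau*p)                # files at a server

    x1 = e_tm/mu_a + e_tf/mu_b                     # network (processor sharing)
    x2 = 1/alpha + e_cm/gamma + e_cf/beta1         # clients (infinite server)
    P1 = [x1**k for k in range(n + 1)]
    P2 = [x2**k / math.factorial(k) for k in range(n + 1)]
    # Server node: queue-dependent exponential rate; held constant here.
    P3 = [1.0]
    for k in range(1, n + 1):
        P3.append(P3[-1] * (e_sm + e_sf) / mu_s)
    return P1, P2, P3

def normalizing_constant(n, P1, P2, P3):
    """G = sum over n1 + n2 + n3 = n of P1(n1) * P2(n2) * P3(n3)."""
    return sum(P1[n1] * P2[n2] * P3[n - n1 - n2]
               for n2 in range(n + 1) for n1 in range(n - n2 + 1))

# Illustrative call for a population of 6 customers.
P1, P2, P3 = marginal_factors(6, 0.17, 0.5, 0.97, 50.0, 20.0, 0.5, 100.0, 3.0, 2.0)
print(normalizing_constant(6, P1, P2, P3))
```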

To simplify the model analysis, without loss of generality, all the servers are considered identical in their behavior, with equal probability of servicing a request. Thus, if q is the probability of locating a file at any server and there are three servers, the probability of finding a file at one particular server is q/3. Moreover, if T is the total number of files in the system, there are T/3 files associated with each server. We assume that files are not replicated and that they are located at servers, with file copies at clients (no files in transmission).

To calculate the steady state probabilities p, π and τ, we assume that only one file is active at each client. For N clients, each one of them having F files in its cache, the probabilities can be approximated as follows:

(1) the probability that a file is busy at a server:

p = (N·F)/T

where the average number of files in a client's cache is F = cache_size/average_file_size;

(2) the probability that the cache is full:

π = 1 − Σ_{k=N·F−F}^{N·F−1} C(T,k) p^k (1 − p)^{T−k}

(3) the probability that a file is not active at a client:

τ = 1 − (probability the file is active at the client) = 1 − 1/F

These steady state probabilities and the e's yield the marginal probabilities and the normalizing constant G. The performance measures (average response time, throughputs, etc.) are obtained from G and the e's.
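These approximations are mechanical to evaluate. The sketch below is ours; in particular, the binomial form of π is our reading of the garbled original, so treat it as a labeled assumption.

```python
from math import comb

def steady_state_probs(n_clients, cache_size, avg_file_size, total_files):
    """Approximate p, pi and tau as in Section 3. The binomial sum for pi
    is a reconstruction of the original formula (an assumption)."""
    F = cache_size // avg_file_size          # average files per client cache
    p = (n_clients * F) / total_files        # file busy at its server
    pi = 1.0 - sum(comb(total_files, k) * p**k * (1 - p)**(total_files - k)
                   for k in range(n_clients * F - F, n_clients * F))
    tau = 1.0 - 1.0 / F                      # file non-active at a client
    return p, pi, tau

# Example with the Section 4 parameters: 3 clients, 600 files in total.
print(steady_state_probs(3, 100_000, 3000, 600))
```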


4. Results and validation

4.1. Numerical example

The prediction method described above is general; it can be used to estimate the performance of a system with any number of servers, clients and files. The estimated numerical results can be easily validated against the ones obtained from simulations. For validation purposes, a numerical example of a distributed system with three clients and three servers was considered. We considered 600 files, 200 assigned to each server. A small number of files was chosen to introduce more conflict among different file requests. The file size was uniformly distributed from 500 bytes to 3000 bytes. The network transmits packets of 1000 octets (bytes) at a rate of 10 Mbps. The disk transfer rate at a server and the cache transfer rate at a client are 1 Mb/s and 40 Mb/s, respectively.

We are interested in the behavior of the system under heavy loads. The performance of the servers, of the network and of the complete system is measured by increasing the rate at which the user submits file requests, i.e., by decreasing the think time.

As the clients are loaded more heavily by the users' file requests, the utilizations of a server and of the network increase at a constant rate, as shown in Fig. 3. From the curves we make specific, although approximate, predictions about the network and server performance.

Fig. 3. Network/server utilization plotted against 1/think time: network, server.

Fig. 4. Network/server performance (response time plotted against 1/think time): network, server.

With a think time of 2 s, we have network utilization of about 15%. Decreasing users' think time by 60% raises the network utilization to 55%. In other words, 60% more file requests will raise the throughput by nearly 200%, with only 20% degradation in response time (Fig. 4).

However, decreasing think time by a further 60% increases the network utilization to 80%, thus raising the throughput by only another 45%, but the response time increases by nearly 83%. Hence, for three servers and three clients, to achieve good performance, the network utiliza- tion should be restricted to 60%. As the client or server population increases, so does the utiliza- tion and response time for the network.

A similar analysis can be done for the server node and individual servers. Further observation of Figs. 3 and 4 shows that even though the network is much faster than an individual server (the response time of the network is much less than that of an individual server), the utilization of the two is almost the same. This is due to the inherent parallelism in the serving of file requests by the servers.

The average response time, which is defined as the interval between the file request arriving at a client and returning back to it, increases as the rate of file requests arriving at the client increases. The results of the model with two and three servers are shown in Fig. 5. Decreasing the user think time by 60% for the model with two servers increases the response time by 2.5%. A further 60% reduction in think time results in


Fig. 5. System performance (response time plotted against 1/think time): server = 2, server = 3.

46% degradation in the system average response time. Any further decrease in think time will increase the response time asymptotically.

Increasing the number of servers improves the response time of the system. The system performance increases by nearly 14% when the number of servers increases from two to three. Each additional server speeds up the request processing. However, the network capacity limits the speed-up achieved by adding more servers.

4.2. Validation

The performance prediction method described in the last section has been used to estimate the behavior of a distributed system. This section presents the results of the evaluation of the prediction method in terms of the accuracy of its predictions. The accuracy of the method is established by comparing the estimates of the prediction method to the statistics collected from detailed simulations.

An event-driven stochastic simulator for the distributed system was written in C on a Sun workstation. During each simulation run, queue lengths, marginal probabilities (the probability that n customers are at a node), utilizations and throughputs of the system nodes are collected. The differences between the queueing network model and the simulated model are as follows:

(1) In the queueing network model, the service time distributions at the server node to service a file request are assumed to be exponential; realistically, however, the service times depend on the file size. For simulation purposes, the service times are uniformly distributed on the interval 2000 to 4000, the average file size being 3000 bytes.

(2) The queueing discipline at the server node is considered to be FCFS for the file requests. In the simulations, some modifications have to be made in order to make the system work correctly. Whenever a requested file is busy at the server, the server forwards the request message to some client. Any other requests for the same file must wait at the server until the file copy reaches the requesting client. In the meantime, the server services all the other file requests on a FCFS basis. Hence, the server deviates from the FCFS discipline for file requests requesting the same file.

(3) To obtain the numerical values for the marginal probabilities in the analytical model, the steady state probabilities (p, π and τ) have to be approximated as discussed in Section 3. However, the simulations do not use steady state probabilities to direct the flow of customers. Initially all the files are free at their respective "home" servers and the caches at the clients are empty. As the simulation proceeds, clients request files from the server, and the server services the request depending on whether the file is busy or free. The simulation statistics were obtained from 6000 file requests from each client.

(4) In the simulations, the network node treats each file and message as multiple packets. Each file or message is made up of an arbitrary number of packets, depending upon the file size and message type. File sizes are assumed to be uniformly distributed. The service time distribution for one packet is deterministic.

Table 1
Marginal probabilities for each server compared with simulation results

n^a   Analytical   Simulation
                   Server 1   Server 2   Server 3
0     0.507        0.507      0.503      0.506
1     0.194        0.202      0.191      0.200
2     0.109        0.108      0.099      0.101
3     0.050        0.054      0.052      0.053
4     0.027        0.031      0.029      0.033
5     0.011        0.012      0.010      0.009

^a n is the number of customers at the server node.


Fig. 6. Degree of fit (marginal probabilities plotted against jobs of all classes, n): analytical, server = 1, server = 2, server = 3.

Fig. 7. Degree of fit (marginal probabilities plotted against jobs of all classes, n): analytical, simulation.

The network node in the queueing model treats each file or message as a single customer, and the service time distributions are arbitrary Coxian. Analytically, the marginal probabilities are calculated using (1), (2) and (3). To obtain these marginal probabilities, the approximated steady state probabilities, as discussed at the end of Section 3, are used. However, in the simulation we do not use the steady state probabilities to obtain the marginal probabilities.

(5) In the queueing network model, routing information is not required. In the simulation, routing information is required in order to simulate the transition of customers from one node to another.

The marginal probabilities of messages (r_M) and files (r_F), for each node, are considered for comparing the results of the simulations with the analytical model. Analytically, the marginal probabilities are calculated using (1), (2) and (3). Queue-length statistics yield these probabilities for the simulations.

Table 2
Marginal probabilities for the network compared with simulation results

n^a   Analytical   Simulation
0     0.089        0.080
1     0.019        0.019
2     0.007        0.004
3     0.002        0.001
4     0.001        0.000
5     0.000        0.000

^a n is the number of customers at the network node.

The degree of fit between the mathematical results and the simulation model output is the key for validating the results.

(1) Server: In both models, simulation and analytical, all the servers are identical in terms of operating speed, number of files associated and other operating behavior. Table 1 shows the analytical and simulation marginal probabilities, P(n), where n is the number of customers at the server. Figure 6 depicts the degree of fit among the numerical values.

(2) Network: The network treats each file and message as multiple packets. If packets are considered as customers in the simulation model, instead of files and messages, the network resembles, in its behavior, a node of an open-network system [1,8,12] as opposed to a closed one. The marginal probabilities from the simulation model are compared against the ones from the analytical open model. Table 2 presents the marginal probabilities, P(n), for the network, where n corresponds to the number of packets. Figure 7 shows the degree of fit between the numerical values.

5. Approximate model for cache size analysis

Now we turn toward the more difficult problems of calculating the optimum cache size and determining how cache size affects the performance of the entire system. Due to the interdependence of the queues at the network and the servers, it is difficult to obtain an exact optimum cache size.


An approximate method is used for analyzing the cache size and its effect on the performance of the system.

First, we analyze how cache size affects the performance. Each client of the "client node" generates a file request which enters the system and moves around from station to station (clients are service stations also) according to the transition probabilities, changes classes and eventually returns to the client as a file. The group of clients is no longer modeled as a "server-per-job" node in the approximate model; instead, each client is treated as a single server with an M/G/1 queue, as shown in Fig. 8. Decreasing the think time, and hence increasing the number of user requests per unit time, will NOT assign a separate client (the server-per-job strategy); instead a user request will queue up at a client. The total time taken to service a file request (response time) is approximately equal to the total service provided by the network, the servers and the other clients as the job circulates among the different nodes. However, the number of visits to each node and the path taken by the request depend on the steady state probabilities p, π and τ.

Let Δ_s be the average time to transfer a file from a server to a requesting client, Δ_c be the average time to transfer a file from some client to the requesting client, and Δ_b be the average time to transfer a file from a client back to a server. The average service time to serve a file request, assuming there is no wait at the network and the server, is

E[S] = (1 − p)Δ_s + pΔ_c + πΔ_b    (4a)

where

Δ_s = 1/μ_a + 1/β1 + 1/μ_b
Δ_c = 2/μ_a + 1/γ + 1/β2 + 1/β1 + 2/μ_b
Δ_b = 1/μ_b + 1/β1

To see why the above equations are correct, consider a client requesting file A, which is located at a server, and file B, which resides at some other client. In the first case, the network passes the message in 1/μ_a time to the server and the server transfers the file in 1/β1 time to the network. The network takes 1/μ_b time to pass the file to the requesting client. The total time taken is represented by the equation Δ_s = 1/μ_a + 1/β1 + 1/μ_b.

In the second case, the file request is directed to the server (1/μ_a). Since the file is located at some other client, the server passes the message to that client (1/γ + 1/μ_a). The client transfers the file (1/β2) to the server (1/μ_b), and the server makes a copy and sends it to the requesting client (1/β1 + 1/μ_b). Hence the total time is Δ_c = 2/μ_a + 1/γ + 1/β2 + 2/μ_b + 1/β1.

Since all the waiting in the approximate model is at a client, it is fair to represent each client by a single server (M/G/1). Without considering the customer class distinction, the performance of this model can be measured in terms of the average number of customers in the queue at the client. Assuming the scheduling discipline for jobs of all classes is FIFO, the average number of jobs waiting at the client can be approximated by the

Fig. 8. Approximate model.

200 R. Kushwaha / Performance of distributed and parallel systems

Pollaczek-Khintchine's formula [7] as follows:

q = ρ + θ²E[S²] / (2(1 − ρ))    (4b)

where θ (the throughput of each client) = (throughput of the client node)/(number of clients); the throughput of the client node is calculated in terms of G and e_c as in Section 3; ρ (the utilization factor) = θE[S]; and E[S²] is the second moment of the service time (4a).

The queue length, and hence the performance, depends on the average service time E[S], which is calculated in terms of p and π (4a). These steady state probabilities are obtained in terms of the cache size (Section 3). Thus, keeping all the other parameters the same, varying the cache size will affect the queue length.
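The queue-length approximation (4b) is easy to evaluate once E[S] and E[S²] are known. The sketch below is ours; supplying E[S²] from an assumed exponential service time (for which E[S²] = 2E[S]²) and the numeric inputs are illustrative assumptions.

```python
def pk_mean_queue_length(theta, es, es2):
    """Pollaczek-Khintchine mean number of jobs at an M/G/1 client (eq. 4b):
    q = rho + theta^2 E[S^2] / (2 (1 - rho)), with rho = theta * E[S]."""
    rho = theta * es
    if rho >= 1.0:
        raise ValueError("client is saturated (rho >= 1)")
    return rho + theta**2 * es2 / (2.0 * (1.0 - rho))

# Illustrative numbers: 0.5 requests/s per client, E[S] = 1.2 s, and an
# exponential service time assumed, so E[S^2] = 2 E[S]^2.
es = 1.2
print(pk_mean_queue_length(0.5, es, 2 * es * es))
```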

The effect of cache size was determined by varying the cache size from 3000 bytes (representing an average of one file in the cache) to 1,000,000 bytes (representing a large number of files). A cache is typically used to reduce the average access time for data storage and retrieval. Besides increasing the cache-hit probability (hit ratio), the cache size affects other system probabilities. Simulation results in Fig. 9 show that as the size of each cache increases, so does the probability (p) of finding the file busy at the server. This is because, at steady state, if cache sizes are large, more files are located in the clients' caches and are marked as busy at the server.

Fig. 9. Effect of cache size on probabilities: probability that the file is busy, probability that the cache is full.

Fig. 10. Effect of cache size on performance (average queue length plotted against cache size): simulation, analytical.


If the cache sizes are considerably large, clients behave like servers (the data location changes from servers to clients), and the servers turn out to be merely communicating nodes of the system. This increases communication delays and hence deteriorates performance. The analytical measure, queue length, validates this assertion. The queue length increases as the cache size increases. Figure 10 shows both the analytical results, from (4b), and the simulation results.

The queue length does not increase indefinitely; for a particular cache size the curve flattens (Fig. 10). This is because the probability π that a cache is full approaches zero (Fig. 9). As the caches are never full, no service is performed to remove data from the cache, which compensates for the extra work done in getting the data from other clients. Figure 11 shows the effect of the cache size on the average system response time.

Next, from our simulation results we obtain the optimum cache size for the given system parameters. A cache of approximately 100,000 bytes is optimum (Fig. 11), as cache sizes smaller than 100,000 bytes give a high response time and larger cache sizes do not improve the performance. The cache-size analysis in distributed and parallel systems, however, is much more complex and also depends on a number of other factors, like read, write and update policies and disk scheduling [3].

Fig. 11. Effect of cache size on response time (simulation).

6. Application

In multiple server systems serving requests from a set of clients (terminals), the identification of the bottleneck server and the computation of the optimum client/server ratio are known problems. We apply the methodology described in the paper to obtain solutions to these problems.

Our approach is to study the system from the terminal's point of view, as shown in Fig. 12. We have a multiple server system serving M terminals (clients). We have a total of K = M customers, each of which generates a job from the terminal at a rate of λ jobs/s. Each such generated job enters the "rest of the multiple service station network" (the terminals are service stations also), and moves around from station to station according to the transition probabilities, eventually returning to the terminal, at which time the user generates a new job. Each service center (resource) has an arbitrary service time distribution.

Let T be the average response time to pass through the rest of the network and 1/λ be the average time in the terminal node. The average cycle time is then T + 1/λ and the system throughput is λ' = M/(T + 1/λ) customers/s.

Let N = E[number of jobs in the rest of the system] and M̄ = E[number of jobs in the terminal node]. By Little's result T = N/λ'; since M = N + M̄ and, for the client node (terminals), each customer spends 1/λ there per cycle, so that M̄/λ' = 1/λ, we obtain

T = M/λ' − 1/λ    (5)
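The cycle-time relation (5) is straightforward to apply numerically. A small sketch of ours; the numbers are illustrative, not taken from the paper.

```python
def response_time(M, lam, throughput):
    """T = M/lambda' - 1/lambda (eq. 5): response time of the rest of the
    network for M terminals, think rate lambda, measured throughput lambda'."""
    return M / throughput - 1.0 / lam

# 20 terminals, one job per 2 s of think time, observed 8 jobs/s overall.
print(response_time(20, 0.5, 8.0))   # -> 0.5 s
```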

Let s be the bottleneck or saturated server in the rest of the network. Then e_s = Σ_r e_{sr}, the relative number of visits of all job classes at the sth node, is maximal in the system. If, for the client (terminal) node, e_c = Σ_r e_{cr}, then the average number of times the bottleneck is visited for each visit to the terminal node is e_s/e_c.

Assume M ≫ M* (M* is the number of clients or terminals which makes the server s saturated); when the server node is beyond saturation, the output rate of server s is approximately μ_s. Thus the output rate of jobs from the

Fig. 12. Multi-client multi-server model (s: saturated server).


rest of the network can be represented as μ_s/(e_s/e_c), i.e.,

λ' = μ_s e_c / e_s

Substituting this value of λ' in (5), we get

T = M e_s/(μ_s e_c) − 1/λ,   M ≫ M*    (6)

This is the asymptotic behavior of T when M ≫ M*; it is linear in M with slope e_s/(μ_s e_c).

To find the number M*, we argue that it must be equal to the maximum number of perfectly scheduled jobs that cause no mutual interference. If all service times are assumed to be deterministic, then the maximum number of jobs at the bottleneck node equals the total service required by a job in a cycle divided by the service time spent by a job in the saturated node per cycle. The total service required by a job per cycle is Σ_i e_i(1/μ_i)/e_c. The service time spent by a job in the saturated node per cycle is [e_s(1/μ_s)]/e_c. Hence

M* = [Σ_i e_i(1/μ_i)/e_c] / [e_s(1/μ_s)/e_c]    (7)

M* = Σ_i e_i(1/μ_i) / [e_s(1/μ_s)]    (8)
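A quick sketch of ours for eqs. (6) and (8); the visit ratios, rates and which nodes are included in the sum are illustrative assumptions.

```python
def bottleneck(e, mu):
    """Index of the saturated server: the node with the largest demand e_i/mu_i."""
    return max(range(len(e)), key=lambda i: e[i] / mu[i])

def saturation_point(e, mu):
    """M* = (sum_i e_i/mu_i) / (e_s/mu_s)  (eq. 8)."""
    s = bottleneck(e, mu)
    return sum(ei / mui for ei, mui in zip(e, mu)) / (e[s] / mu[s])

def asymptotic_response_time(M, e, mu, e_c, lam):
    """T = M e_s/(mu_s e_c) - 1/lambda for M >> M*  (eq. 6)."""
    s = bottleneck(e, mu)
    return M * e[s] / (mu[s] * e_c) - 1.0 / lam

# Illustrative visit ratios and service rates for the server and network
# nodes, with e_c = 1 visit to the terminal node per cycle, 1 s think time.
e, mu = [2.2, 1.4], [4.0, 20.0]
print(saturation_point(e, mu))                       # ~ 1.13 jobs
print(asymptotic_response_time(50, e, mu, 1.0, 1.0)) # ~ 26.5 s
```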

For M = 1, the total service time required by a job in a cycle is T + 1/λ, so from (7)

M* = (T + 1/λ) / [e_s(1/μ_s)/e_c]

and hence

T = M* e_s/(e_c μ_s) − 1/λ    (9)

From (6) and (9) the asymptotic behavior of the multi-client system can be predicted as shown in Fig. 13.

If we remove the bottleneck by increasing the service rate of the server s or by adding another server to the system, some other node, denoted by s', will become the new bottleneck, and the asymptotic behavior will again be similar to that in (6) and (9), with a new slope e_{s'}/(μ_{s'} e_c) and a new saturation number M*'. In fact, if we continue this procedure of removing bottlenecks, we will always expose a new one, with slope e_{i'}/(μ_{i'} e_c), as sketched in Fig. 13, where M* must be recalculated with the new rates.

For the numerical results, we considered a distributed system with 50 clients and 20 servers. There was a total of 10,000 files, distributed equally among the servers. The average file size was 2000 bytes and a cache size of 100,000 bytes was considered for each client. A disk transfer rate of 10 Mb/s was assumed for each server.

Fig. 13. Asymptotic behavior of a multiple resource system (T plotted against M, with the slope changing from e_s/(μ_s e_c) to e_{s'}/(μ_{s'} e_c) at the saturation numbers M* and M*').


Fig. 14. Performance of a multiple resource system (response time plotted against the number of clients): server = 5, server = 10, server = 15, server = 20.

The performance of the system, for a think time of 1 s, is shown in Fig. 14. The response time decreases considerably when the number of servers increases from 5 to 10. However, for the given system parameters, the results suggest that it is not worthwhile to invest in more than 15 servers, because increasing the number of servers from 15 to 20 does not show a significant improvement in the performance.

The behavior shown in Figs. 13 and 14 is very important for distributed and parallel system design, since we can predict the number of servers required for a particular number of clients and a desired response time. The analysis also predicts the improvement in system performance if we decide to invest in more powerful resources.

7. Conclusion

This article has described a method for evaluating various performance measures and analyzing design parameters for a class of distributed and parallel systems. The main advantage of this method is its applicability to very general models. Using this approach, distributed and parallel systems with multiple classes, multi-tasking and job spawning can easily be modeled. The success of the method rests on the following key properties:
(1) The queueing model is of product form type or can be approximated as such.
(2) The results are easily verifiable by using other modeling techniques, like simulation.
(3) Design parameters, such as cache size, client-server ratio and data location, can be analyzed.

The numerical results have demonstrated the efficacy of the method and have indicated its potential use. The results presented in this study provide a comprehensive set of guidelines to model systems with multiple nodes behaving as clients and servers. This can be used to compare different systems and different resource allocation policies (e.g., how many servers should be installed in the system, or which server's load should be shared). The impact of load on job delays can be predicted, and, more generally, this method allows capacity planning (e.g., how fast the network should be in order to handle the workload).

The analysis provided here is only for identical servers and identical clients. For a larger number of interacting, non-identical nodes, the mathematics becomes more complex, and the most fruitful approach to understanding the interaction among asynchronous processes would be approximate solutions [9]. The method presented here can easily be extended to account for several modifications of the basic model. Furthermore, the use of multiple customer classes can also be applied to model processes which may suspend execution at one site and resume execution at another site (process migration).

Acknowledgement

The author is indebted to Prof. Erol Gelenbe, sponsored chairman at the New Jersey Institute of Technology, for his guidance and constructive discussions throughout this research. He also reviewed the technical report, which greatly helped to improve the presentation of this paper. Thanks to Dr. Robert Lynch, Professor in the Humanities Department at NJIT, who read the final version of the paper and made valuable comments and corrections.

References

[1] F. Baskett, K.M. Chandy, R.R. Muntz and F.G. Palacios, Open, closed and mixed networks of queues with different classes of customers, J. ACM 22 (2) (1975) 248-260.


[2] D.R. Brownbridge, L.F. Marshall and B. Randell, The Newcastle Connection, or UNIXes of the world unite!, Software - Practice and Experience 12 (1982) 1147-1162.

[3] S.D. Carson and S. Setia, Analysis of the periodic update write policy for disk cache, IEEE Trans. Software Eng. 18 (1) (1992).

[4] E. De Souza and M. Gerla, Queueing network models for load balancing in distributed systems, J. Parallel Distrib. Comput. 12 (1991) 24-38.

[5] E. Gelenbe and I. Mitrani, Analysis and Synthesis of Computer Systems (Academic Press, New York, 1980).

[6] E. Gelenbe and G. Pujolle, Introduction to Queueing Networks (Wiley, New York, 1987).

[7] L. Kleinrock, Queueing Systems, Vol. 1: Theory (Wiley, New York, 1975).

[8] L. Kleinrock, Queueing Systems, Vol. 2: Computer Appli- cations (Wiley, New York, 1976).

[9] L. Kleinrock, On distributed systems performance, Comput. Networks ISDN Systems 20 (1990) 209-215.

[10] R.R. Muntz and J. Wong, Efficient computational procedures for closed queueing network models, Proc. 7th Hawaii International Conference on System Sciences, Honolulu, HI, 8-10 January 1974, pp. 33-36.

[11] G.J. Popek and B.J. Walker, The LOCUS Distributed System Architecture (MIT Press, Cambridge, MA, 1985).

[12] M. Reiser, Numerical methods in separable queueing networks, IBM Research Report, RC 4145, IBM Thomas J. Watson Research Center, Yorktown Heights, NY, 1976.

[13] M. Reiser and H. Kobayashi, Queueing networks with multiple closed chains: theory and computational algorithms, IBM J. Res. Develop. 19 (1975) 283-294.

[14] M. Satyanarayanan et al., The ITC distributed file system: principles and design, Proc. 9th ACM Symposium on Operating Systems Principles, 1983, pp. 35-47.

[15] B. Welch and J. Ousterhout, Prefix tables: a simple mechanism for locating files in a distributed system, IEEE CH2293, September 1986.

[16] J. Wong, Queueing network models for computer systems, Ph.D. Thesis, University of California at Los Angeles, 1975.