IJCA Azadmanesh, Zhou, & Peak 2003

14
VOLUME 25 / NUMBER 4 12003

Transcript of IJCA Azadmanesh, Zhou, & Peak 2003

VOLUME 25 / NUMBER 4 12003

CO-EDITOR-IN-CHIEF

Prof. B. Furht Florida Atlanclc University, Dept. of Computer Sciences & Eng. Boca Raton, FL 33431, U.S.A. Multimedra Sysrems, Real-Tme System, Computer Architecture

CO-EDITOR-IN-CHIEF

. Dr. Leone C. Monticone MITRE Corp. 1820 Dollv Madison Blvd ~ a i l S t o ~ ' ~ 3 8 7 McLean, \!A 22102, U.S.A. Network design. Nrtwmk rellabiliry, Mathematical programming, Neural nets

EDITORIAL BOARD

Prof. R. Bisiani

Prof. I. Maghoub Depc. of Computer Science and Engineering Florida Atlantic University Boca Raton, R 3343 1, U.S.A. Multiprocessors, Parallel Computers, Disnibuted Systems

Prof. M. Savic Dept. of Electrical, Computer and Syst~ Engineering Rensselaer Polytechnic Institute JEC 7002 Troy, NY 12180-3590, U.S.A. Digital Sipnnl Processing, Neural Networks -

Dr. N. Marovac Dept. of Mathematical Sciences San Diego State University San Diego, C A 92182, U.S.A. Disnibuted Systems and LANs. Graphicr , Educnrion

Prof. J.D. Mrng Lawrence Berkeley Laboratory University of California Berkeley, C A 94720. U.S.A. Multiprocessing Systems cmd their Architectures, Interactive Processors and Dev~ces

Prof. N. Sharda Department of Computer & Mathematical Sciences Victoria University of Technology PO Box 14428, MCMC Melbourne, Victoria 8001, Australia

Prof. W.V. Subbarao Dept. of Electrical Engineering Florida International University University Park Campus Miami. R 33186. U.S.A. Micsoprocessor Architectures

Department of Computer Science Camegie-Mellon University Pittsburgh, PA 15213, U.S.A. Task Onented Architectures , C u s m VLSI, Speech Recogninon

Dr. M. De Santo DIllE - University of Salerno 84084 Fisc~ano (SA) - 139 Italy

Prof. E. Fernandez Dept. of Electrical & Computer Eng. Florida Atlantic University Boca Raton, Florida 33430, U.S.A. Fault-Tolerant System, Software-Oriented Computer Architectures

Dr. D.P. Gluch Sofnvarp F n o i n r ~ r i n o lnsricuce

Prof. V. Milutinovic School of Electrical Engineering Un~versity of Belgrade 11000 Belgrade. Yugoslavia VLSI Architecncses, Advanced Microprocessors

Prof. D. Moldovan Head, Dept. of Computer Science and Engineering Southern Methodist University Rrn SIC 300 Dallas, TX 75275-0122, U.S.A. Multiminocomputers

Dr. G. Papadopoulos Dept. of Computer Science Universitv of C v ~ r u s

Prof. M. Tapia Dept. of Electrical and Computer Eng University of Miami P.O. Eox 248194 Coral Gables, FL 33121, U.S.A. Fault-Tolerant Computer Architectures , Multiprocessing Systems

Dr. B. Veeravalli Dept. of Electrical and Computer Eng. The National University of Singapore Singapore 119260 Web-cachmg, Computer Archiucture , Dismbuted Compunng

Prof. L. Welch Computer Science & Engineering Dept. The Universiw of Texas at

~arne~ie- illo on University 74 Kalllpoleos Str. Box 19015 Pittsburgh, PA 15213, U.S.4. P.O. Box 537, CY-1678 Arlington, T X 76019.0015, U.S.A. Real-7ime Computer Architectures, Nicosia, Cyprus Red-Time Sysrems , Concuwent Fault-Tolrront System, Artificial Neural ParaUel and Disrributed Programming Programming, Parallel Computing Network Computing Models and Languages, Multimedia

Systems, Declarnnve (Logic and Dr. I. W u Prof. 1.-L. Houle Functional) Programming Dept. of Electrical & Computer Eng. Ecole Polytechnique Florida Atlantic University Universite de Montreal Dr. D. Petkovic Boca Raton, FL 33430, U.S.A. C.P. 6079, Succ. "A" IBM Research Laboratory Computer Architecture,

1. Guebec. Canada H3C 3.A7 K54-282, Distributed Computing . - Computer Structures and Architectures, 5600 Cottle Road Applications of Mini and Microcomputers San Jose, C A 95193, U.S.A. to Telecommunicatiar and Conrrol Image Processing und Pattern Recognition Engineering Applications, Software Engineering,

Multimicroprocessor S y s m Prof. W. Kinser Dept. of Electrical and Computer Eng. Dr. G. Racherla

1 7 I-: ..-..=.-, nf M2nirnh~ General Atomics

CONTROL An?) mTELLIGENT SYSTEhlS (201) - ISSN: 1480-1752 Subject area: This journal covers all aspects of control theory and its applications, intelligent systems, fuzzy logic, neural net- works and genetic algorithms.

, . .3 &;2)" >+ ' c % - . ,

J O ~ & S 7 EDITORS , . . -

Editor-in-Cltie$ Pro$ Clarence 1;1< de Silva Department of Mechanical Engineering, University of British Columbia, 2324 Main Mall; \?ancouver, BC, Canada V6T 124

E-mail: [email protected]

-4 letter must accompany all submissions, cle&ly indicating the ' . follouring: the corresponding author's complete name, afliliation and address with phone, fax, and e-mail numbers, if available. One copy of your submission (double-spaced) plus a . electronic copy should be sent directly to the ACTA Press office (NOT to

INTERNATIONAL JOURNAL OF COMPUTERS AND APPLICATIONS (202) - ISSN: 1206-212X Subject area: This journal covers all aspects of contemporary computers and their applications including technology, hard- ware and software systems, networking and conlmunications, multimedia systems, and the Internet as well as microcornput- '

er applications such as engineering, sclence, business, management robotics, medicine, manufacturing, and the humanities.

ACTA Press # 80, 4500 - 16th Ave. NMr Calgary, AB T3B OM6 Canada E-mail: calgag)@actapress. corn Web site: www. actapress. corn

Co-Editor-in-ClrieJ Dr. B. Furht Department of Computer Sclence and Engineering, Florida Atlantic University, Boca Raton, FL 3343 1. USA

E-mail: [email protected] Co-Editor-in-Chiefi Dr. L. C. Monricone MTTRE Corp., 1820 'Dolly Madison Blvd., Mail Stop W387, McLean, VA 22102, USA

E-ma.il: [email protected]

an editor) at:

INTERnTATIONAL JOURNAL OF PO\VER AND ENERGY SYSTEMS (203) - 1SSN: 1078-3466 Subject area: This journal covers transmission and distribution, electric power quality, education, energy developn~ent, com- petition and regulation, power generation and electronics, communications. electric machinery, power e.~~gineering systems, control, reliability, quality control, economics, modelling and simulation. alternative energy sources, policy and plamming.

Editor-in-Chiefi Pro$ A. Domijan, Jr. Florida Power Affiliates, Power Quality & Power Electronics Lab.. Department of Electrical and Computer Engineering, 3 14 Benton Hall, P.O. Box 116200, Universmty of Florida, Gainesville, FL 3261 1: USA

E-mail: alexdaece. ufl.edu

INTERNATIONAL JOURNAL OF MODELLING AN?) SIMULATION (205) - ISSN: 0228-6203 Subject area: This journal covers all aspects of modelling. simulation, languages, software, hardarare: methodology, numeri- cal and grapl~ical methods, virtual reahty: statistical techniques, tutorials, surveys, applications.

Editor-in-Clticfi Prot A. ~oushyar Dept. of Industrial 6: h4anufacturing Engineering, Western h4icl~igan University: Kala~nazoo, MI 49008-5061: USA

E-nzail: hou,shj~ar@.~)mich.edu

I?r'TERNATTONAL JOURXAL OF ROBOTJCS AND ALTTOMATXOIi (206) - ISSN: 0826-81 85 Subject area: This journal covers all aspects of robol~cs and auton~at~on. including modelling, sin~ulations, dynamics, design. control, sensors; vision, speech and object recognition. Image processing: safety: reliability: econon~ics, social implicatmons 11ld applicatio~~s.

Editor-in-Cizie$ Prof: M. Kun~el Department of Systems Design Engineering, Unm~~ersjty of Waterloo, M'aterloo, ON: Canada N2L 3G]

E-mail: [email protected] .

International Journal of Computers and Applications, Vol. 25, No. 4, 2003

EGOPHOBIC VOTING ALGORITHMS A. Azadmanesh,* L. Zhou,* and D. Peak**

Abstract

Approximate agreement is one form of distributed agreement,. in which the processes in executing a voting algorithm are required to agree on values that are very close to each other. Recent studies have partitioned the voting algorithms into three broad categories: anonymous, egocentric, and egophobic. Among these categories, Egophobic algorithms have not yet been studied. This article con- siders one family of Egophobic algorithms. They are studied under a hybrid-fault model consisting of asymmetric, symmetric, and benign faults. We obtain their convergence rate and fault-tolerance expressions, show that these algorithms cannot discriminate between asymmetric and symmetric faults, and compare their performance against that of algorithms in other categories. The study of. Egc- phobic algorithms will further research into forming a unified model capable of addressing any algorithms in these categories.

Key Words

Approximate agreement, distributed agreement, Byzantine agree- ment, clock synchronization, convergent voting algorithms, hybrid- fault models

1. Introduction

Many applications require the nodes1 in a distributed system to reach agreement on a particular event or value, such as the Commit protocol in a distributed database system, or electing a leader in a network. In contrast to this form of agreement, where the input and output values held by the nodes are discrete, the nodes in a system may start with nondiscrete values and are supposed to eventually decide on output values within a small positive real value of each other. Some application areas of such agreement are the management of redundant sensor data and fault-tolerant clock synchronization [I-51. This form of agreement, called approzimate agreement, must satisfy the following two conditions [6,7]:

Agreement: The voted values of any pair of nonfaulty processes are within E of each other.

' Department of Computer Science, University of Nebraska at Omaha, Omaha, NE 68182-0500, USA; e-mail: azadQahvaz.ist. unomaha.edu, linjieQhotmail.com

*. Department of Business Computer Information Systems, Uni- versity of North Texas, Denton, T X 76203-5249, USA; e-mail: peakQcobaf .coba.unt .edu

lHenceforth, the words node and process are used interchange- ably.

(paper no. 202-1 224)

Validity: The voted value for each nonfaulty process is strictly within the range of the initial values of the nonfaulty processes.

Traditionally, studies in approximate agreement assume faults behave as worst-case Byzantine faults. How- ever, in real-world systems, the vast majority of faults do not behave in Byzantine manner ('7-121. Hence, the results tended to be overly conservative, leading to unnecessarily complex system designs. Recently, to remedy this difficulty, some studies have used hybrid-fault models. These studies include the work done by Fekete [13] and by Azadmanesh and Kieckhafer [7, 8. 14).

Fekete [13] analysed the behaviour of omission and Byzantine faults. However, this study is not based on a true hybrid-fault model. Specifically, it considers three separate single-mode fault models, each dealing with a different failure mode (crash-failure, failure-by-omission, or Byzantine).

In a different study, Kieckhafer and Azadmanesh (71 used a hybrid-fault model that allows for simultaneous existence of three failure modes (benign, symmetric, and asymmetric). They presented a methodology to measure the performance of any voting algorithm within a family of voting algorithms called Mean-Subsequence-Reduced (MSR). This work was later extended to include the class of omission failure [8,14]. In a related work, Azadmanesh and Krings [15] used the three failure-mode model for a diEerent family of voting algorithms that belongs to the class of Egocentric algorithms [2,16,17]. In this family of algorithms, the processes place more trust on their own values than the other received values. Consequently, if an error is detected or a value is not received, a process tends to correct the error by replacing it with its own value. This article investigates the counterpart form of these algorithms, called Egophobzc algorithms, where the processes place more trust on values other than their own. These algorithms are often used for hardware clock syn- chronization [1&20]. In general, Egophobic algorithms, as well as Egocentric algorithms, can be used in applications where the processes may not all produce exactly the same value. Thus, a certain amount of discrepancy among p r e cesses must be tolerated. The amount of this discrepancy depends on the application. The discrepancy might be, for instance, due to sensors reading the same input, clocks that must stay within a predefined known bound of each other, or multiple software or hardware versions where decision algorithms are employed to determine the final vote from similar but not identical results [2,4,5,21,22].

236 I

As no general analysis of Egophobic algorithms exists in the literature, this study will derive the performance and fault tolerance for such algorithms. In addition, the analysis will be under the same hybrid-fault model used for E g e

ltric algorithms. The treatment of Egophobic algorithms dnder the same fault model will allow for easier comparison of such algorithms against Egocentric and MSR algorithms.

We analyse the extreme case of Egophobic algorithms, that is, each node replaces the erroneous values by a value furthest from its own. As a result, regardless of the value used to replace erroneous values, we will be able t o present the range of performance for any egoestic (Egocentric or Egophobic) algorithm.

Section 2 provides some background and definitions used throughout the article. Section 3 provides the cat- egories of algorithms and describes their differences. Section 4 explains the effect of different failure modes on Egophobic algorithms. Section 5 provides some prelimi- nary lemmas. The results of Sections 4 and 5 are then used in Section 6, which obtains the worst and optimal con- vergence rate and the fault tolerance for these algorithms. Section 7 compares these algorithms against other voting algorithms. Section 8 concludes the article with a summary and'some directions for future research.

2. Background and Definitions

2.1 System Model

We consider completely connected networks consisting N nodes; and assume the timing of communication is

synchronous. In a synchronous communication system, the processing and communication delay of nonfaulty p r e cesses are bounded. There is thus a point in time when every process will have received all data from nonfaulty processes before it executes the voting algorithm. Any data arriving after this time are considered to be from a faulty process.

2.2 Fault Model

We use the hybrid-fault model of Thambidurai and Park [Ill. This model partitions the total number of faults t in the system into b benign, s symmetric, and a asymmet- ric faults. Thus, t = a + s + b. Benign faults, also called manifest faults, are self-incriminating or self-evident t o every nonfaulty process. Examples of benign faults include out-of-bound data, data received outside of the bounded communication delay, and crash faults. A symmetric fault is defined as a fault whose erroneous value is perceived iden- tically by all receiving nonfaulty processes. An asymmetric fault occurs when the nonfaulty processes receive codic t - ing messages, due to either faults in the communication network or the sending process maliciously transmitting conflicting values to nonfaulty processes.

; Real-Valued Mult isets -- - .

.4pproximate agreement requires the manipulation of mul- tisets of real values. A multiset is a collection of objects

similar in concept to a set. However, it differs from a set in that all elements of a multiset are not necessarily dis- tinct. A multiset of real numbers can be represented as a monotonically increasing sequence of the real values of its elements, that is, V = (vl, . . . , vv) is ordered such that v, 5 v,+~,Vi E (1 ,..., V - 1) [23,24]. The size of V is V = IVI. The following operations are useful on multisets:

p(V) = [min(V), max(V)] = [vl, vv]; the real interval spanned by V. p(V) is called the range of V.

b(V) = max(V) - min(V) = vv - vl; the difference between the maximum and minimum values of V. d(V) is called the diameter of V.

mean(V) = The arithmetic mean of the real values of all elements of V.

Subsequences: Consider two nonempty multisets U and V , where U S V. U is a subsequence of V if there is an order-preserving one-to-one mapping k, from the indices of U to the indices of V , that is, u, = vk(j), V j € {I,. . . , U} and k(j) < k ( j + I ) , V j E (1,. . . , U - 1).

Dominance: Consider two equal-sized nonempty multisets W = (wl,. . . , vu;) and W' = .(w',, . . . , w k ) . W dominates W! if and only if w, 2 wi, Vi E (1,. . . , W). Dominance has the property that it is preserved under the subsequence operation, that is, given two equal-sized multisets V and V', let U be a subsequence of V , and U' be the same subsequence of V'. If V dominates V', then U dominates U'.

2.4 Single-Step Convergence

During the course of the agreement process, each nonfaulty process i executes the same voting algorithm in rounds as follows. Process i broadcasts its initial value to all prG cesses, including itself. Then it collects all the values it has received into a multiset N i . If process i does not receive a value from some process, or if it detects that a received value is erroneous, process i simply picks some default value to assign to that process in the multiset. The final multiset is called the voting multiset V,. Process i then applies a function F to Vi to attain its latest estimate (voted value) for the round, which is used as the value of the process t o be broadcast in the next round. The objective of approximate agreement is achieved if it is guaranteed that the diameter of the voted values among the cor- rect processes becomes smaller after each round of voting. This property, called single-step convergence, ensures that the diameter of the correct values will be eventually less than E.

To formally define this property, define U,u as the multiset of all correct values received by nonfaulty p r e cesses. Then, an algorithm is single-step convergent if both of the following conditions are true following every round of voting:

Validity: For each nonfaulty process i, the voted value is within the range of correct values, that is, F(V,) E

~ ( U a l l ) .

237

Convergence: For each pair of nonfaulty processes i and j, the difference between their voted values is strictly less than the diameter of the multiset of correct values received, that is, IF(Vi) - F(Vj) 1 < b(Ua10.

The effectiveness of a convergent voting algorithm is measured by its convergence rate C. Assuming that b(Uau) > 0, C is the maximum possible value of the ratio:

If C < 1 in each round, i t is then guaranteed that after enough rounds, the system will achieve the agreement condition, that is, the diameter of correct values held by the nonfaulty processes i n the system will be within E .

3. Categories of Voting Algorithms

Known voting algorithms can be partitioned into three broad categories: anonymous, egocentric, and egophobic. Fig. 1 shows three families of these algorithms, which are called MSR, Mean-Subsequence-Egocentric (MSE), and Mean-Subsequence-Egopho bic (MSEP). There might be other categories of algorithms, which are shown in the figure as Others.

For MSR algorithms, each process uses the following function [7]:

The Reduction function RedT removes the T largest and the T smallest elements from multiset Vi. T is the number of

- faults to be tolerated. The Selection function Sel, selects a submultiset of a elements from the reduced multiset, that is, from the multiset generated by RedT(Vi). F(Vi) is the mean of these selected elements.

MSR algorithms can use any value as a default. Therefore, the default value is inherently erroneous. 0.n the other hand, the Mean-Subsequence-Not-Reduced

Approximate agreement - voting algorithms

Anonymous - MSR

Egocenmc - MSE

Egophobic - MSEP

MSR: ~ C a a - s u b ~ ~ u e n c t - ~ e d u ~ t d

MSE: Mean-Subsequent-Egcanmc

- . MSEP: M-Subsequence-Egophobic MSNR: Mean-Subsequence-Not-Reduced

Figure 1. Categories of approximate agreement voting algorithms.

(MSNR) algorithms, which contain the egocentric and --. egophobic categories, attempt to replace a missing value or a detected error with a value within the range of correct values. These algorithms use the following approximation function:

For a process i to detect that a value in N; is erroneous, the value must be apart from the process's own value by more than a predefined tolerance range cp. Otherwise the value is accepted as a correct value, even if it is generated by a faulty process. Therefore, MSNR dgorithms are distinguished from MSR algorithms in two aspects:

. MS.NR algorithms do not use the Reduction function. The range of the initial correct values must be known a priori by all processes.

Because of this second aspect, MSNR algorithms do not need to use the Reduction function, whose sole purpose is to generate a submultiset whose range is always within the range of correct values. Thus, if a and P are the values of two arbitrary nonfaulty processes, cp is the diameter of correct values, and v,,h and v,,~ are the hth and lth values in the voting multisets V; and V,, respectively, then these algorithms have the following properties:

Depending on how one chooses the default value, two fam- ilies of MSNR algorithms have been identified. They are called MSE and MSEP. In MSE algorithms [15], each p r e cess uses its own value as the default value. In other words, a nonfaulty process prefers to use its own value for bad or missing values. This work completes the performance analysis of MSNR algorithms by concentrating on MSEP algorithms. For MSEP algorithms, each process prefers to use any default value other than its own value. Although this may not seem to be a big difference in comparison to MSE algorithms (as the default value is still within the range of the correct values), it will be shown that MSEP algorithms need a higher lower bound on the minimum number of processes required to achieve convergence, and have lower performance rate than egocentric algorithms.

An important parameter to MSNR algorithms is 7. The purpose of y is t o show whether a voting algorithm is convergent. If so, it will determine the convergence rate. (The merits of y will be clear by Theorem 1 and the convergence rate expression.) The following describes how y is obtained.

Definition of y: The selected multiset S = SeZ, (V) = (sly . . . , so) is a subsequence of V = (vl, . . . , Thus, each element of S corresponds to one unique element of V. Now, let g be the index of any element of S: and let k(g) be the index of the corresponding element in V, so that s, = ~ k ( ~ ) , for each g E (1, . . . , a).

Given two indices into S, g and h E {1, . . . , a ) , where g 5 h, define Ak(g, h) = k(h) - k(g) as the number of ele ments in V spanned by elements (s,, . . . , sh) in S. Thus,

Ak(g, h) is the number of elements of V in the submultiset ( ~ k ( ~ ) + l , . . . I ~ k ( h ) b

Finally, for any non-negative integer t, 7, is defined as the minimum value of (h - g) that ensures that z

.merits of V are spanned, independent of g and h, that 6, is the minimum value of (h - g) that ensures that k(h) - k(g) 2 z , for any values g and h, g 5 h 5 a. By this definition, yz exists only if IVI > z.

The following pseudecode shows how yz is obtained:

7 , - - 1 , I t O

WHILE ( I 5 u - 1 AND 7, = -I){

IF (for every 9 E (1,. . . , u - I}, Ak(g,g + I ) 2 z )

7 z I I + I + l

1 By definition, 7, cannot be negative. Thus, y, does not exist if it is negative at the end of the pseudo-code.

4. M S E P Voting Algorithms

During a voting procedure, messages may be lost, delayed, or corrupted. Thus a process needs to pick up a default value to replace these messages. The default value used by Egophobic algorithms is a value other than the process's own value. Thus, a process, in a sense, places more trust on other processes' values [15,17,18]. Among the W t e

mber of acceptable default values that a process i can woose , that is, horn the range (a - cp) to ( a + cp), the minimum and the maximum acceptable default values, ( a - cp) and ( a + cp), are most distinguishable. MSEP algorithms use either the minimum or the maximum values for their default values. It will be shown that no Egophobic algorithm can perform worse than MSEP algorithms. As the discussion of these two cases is symmetric, here we focus on the case that uses the maximum acceptable values as a default, that is, the sum of the process's own value ' and the predefined tolerance value cp.

4.1 Fault-Mode Analysis

4 .1 .I Benign Faults

A large percentage of the faults occurring within dis- tributed systems are benign faults. Because benign faults are self-evident to all nonfaulty processes, the obvious solu- tion is to simply ignore them. Thus the voting algorithms are executed with no benign values in the voting multisets [6,7]. Therefore, it is only necessary to address convergence using the multiset of values not containing benign errors. The size of V is hence n = N - b.

and vj,l are the values sent by a process k to processes .' .

i and j , respectively. Furthermore, assume process k is symmetrically faulty. By definition, the default values for process i are (a + cp) and for process j is ( p + 9).

If vi,h and vj,l are both in the selected multiset of i and j , then there are four cases to consider, in order to find the maxlF(Y) - F(Vj)l:

1. If 1vi,h - a1 5 cp and Ivj,1 -01 5 9 , both v,,h and vj-1 are within the tolerance range, and they are thus accepted by processes i and j. As process k is symmetrically faulty:

2. If (vi,h - a1 < cp and Ivj,l - PI > cp, process i accepts vi,h but process j rejects v,,~, as it is outside of the acceptable range. Thus, vj,l is replaced with (P + cp). B e C a ~ ~ e a ! - c p < ~ , , ~ < ( ~ + c p a n d -cp<(a-p)<cp:

3. If J v , , ~ - a1 > cp and IvjSl - PI < cp, process j accepts v,,r, and process i rejects v,,h. Then:

4. If I~,,~-al > cp and JvjJ -PI > cp, process i replaces v,,h with (a + cp) and process j replaces vj,l with (P + cp). Thus:

As a result, among the four cases, cases 2 and 3 cause the maximum discrepancy between i and j.

4 .1 .3 Asymmetric Faults .

If process k is an asymmetrically faulty process, vi,h and vj,l are not only incorrect but also do not hold the same value. Depending on the behaviour of process k, four different situations are possible:

1. If 121i.h - L Y ~ 5 (P and Ivj,l -PI 5 50, both values v,,h and vj,l will be accepted by i and j . For this case, the maximum and the minimum acceptable values for vi,h and vj,l are (cr + 9) and (p - cp), respectively. Thus:

.2 Symmetric Faults Ivi,h - Vj.11 5 1 ( a f 9) - (P - V) 1 - J

- .

Recall that vi,h and vj,r are the hth and lth values in the = 129 + (a - P)l voting multisets Vi and V,, respectively. Assume that vi,h I 39

239 c. L.

2. If 1vi,h - a( 5 cp and IVj,l - PI > cp, process i accepts the value, but vj,l is replaced with (P + cp). Thus:

3. If Iv,,~ - a( > cp and Ivj,l - PI < cp, process i replaces v,,h with ( a + cp), but process j accepts the value vj,l. Thus:

4. If Ivith - a1 > 9 and Jvj,l -PI > cp, both values vi,,, and vj,l will be replaced with the default values ( a + cp) and ( p + q) , respectively. Thus:

Among the four cases, cases 1, 2, and 3 produce the maJtimum difference.

5. Preiiminary Lemmas

Lemma 1. Let f and n be non-negative integers, such that f < n. Assume an MSEP F(V), such that V = (vl,. . . , v,) and vi 5 T, where T is the upper bound of values allowed in V. Assume V can be partitioned into disjoint multisets U = (ul, . . . , u,,- f ) and X = (xl, . . . , x f ) such that U is a multiset of fixed values and X is a multiset of variable values. Then, F (V) generates the maximum possible value if xk = T, Vxk E XI given U.

Proof. Let V = (U+X) = (211,. . . , u.,,-f,x~,. . . , xf) , satisfying the hypothesis xk = T, Vxk E X. Let V' = (U + X'), where X' is any other variable multiset other than X with at least one element less than T. Let the first such element be x',. Let v, be the lowest numbered element in V for which vg # vi. Then, vi =xi. The insertion of z', into V' shifts the subsequent elements of V' such that:

Therefore, V dominates V'. Because dominance is pre- served under subsequence functions, the dominance of V over V' implies that Sel,(V) dominates Sel,(V1). As a result F(V) 2 - F (V') .

Lemma 2. Let f and n be non-negative integers, such that f 5 n. Assume an MSEP F(V) , such that

240

V = (ul, . . . v,) and v; 2 I , where I is the lower bound of values allowed in V . Assume V can be partitioned into disjoint multisets U = (ti1, . . . , 21,- f) and X = ( x l , . . . , xf) such that U is a multiset of fixed values and X is a multi- set of variable values. Then, F (V) generates the minimum possible value if xk = I , Vxk E X, given U.

Proof. The proof is similar to that of Lemma 1. '

To determine the convergence rate C, we need to find the maximmi difference between two voted values belonging to two arbitrary nonfaulty processes i and j. It is obvious that the maximum difference between F(V') and F(Vj) is reached when F(Vi) and F(Vj) reach their maximum and minimum values, respectively.

Furthermore, to find C, one needs only to find a worst- case scenario that maximizes IF(V,) - F(V,) I . When we look at the worst-case scenarios produced by cases 2 and 3 for the symmetric faults and cases 1, 2, and 3 for the asymmetric faults, we see that case 3 is common to both failure modes. Case 3 indicates that regardless of whether the worst-faulty behaviour originates from a symmetrically or an asymmetrically faulty process, the maximum discrep ancy, 39, is achieved when CY = 0 + cp. For this case, we also observe that process i replaces the erroneous symmetric value by its default value (a + 9). As a result, symmetric values are behaving like asymmetric faults, because pro- cesses i and j ultimately use two different values for a symmetrically faulty process.

According to Lemma 1, all the values of elements in U are fixed, which can be interpreted as the submultiset of correct values common to both i and j. Furthermore, based on the same lemma, X is the submultiset of variable values. These values can be interpreted as the submultiset of erroneous values, because these values may not be common to both processes i and j. Because ( a + cp) is the largest tolerable value for process i , F(Vi) is maximized only when all of its elements in X are equal to (a + cp). The value ( a + cp) corresponds to case 3. Using a similar argument, according to Lemma 2, F(Vj) will be minimized when all the elements in its submultiset X are equal to the lowest tolerable value (P - cp). This value again corresponds to case 3 for both symmetric and asymmetric faults. Based on these observations, the maximum and minimum of Vi and V, are achieved when:

Vi = (Uinj, A', C') = (a-cp,a ,..., a,a+cp ,..., a+cp) . (2)

v, = (4 C, qn3)

= ( a - 2 9 ,..., a-2cp,cr-cp ,..., a-cp) (3)

where:

U,n, = the multiset of correct values received by hots processes z and j 1

.a A' = the multiset of asymmetric errors used by proces4

i but not by process 3. The multiset contains t default value used by process z. that is, (cr + for those values received that are outside of tolerance range. Thus, IA'I = a.

A = the multiset of asymmetric errors received by process j. Thus, IAJ = a.

C = the multiset of symmetric errors received by process j

C' = the multiset of values used by process i as a result , of symmetric faults. This multiset is the same as

C except that the values that are outside of the range are replaced with the defauIt value (a + cp) used by process i.

In (2), the first n - (a + s) elements, ( a - cp, a , . . . , a ) , belong to U,n,, of which the fist two values ( a - cp) and a are held by the processes j and i, respectively. The rest of the elements, (a + cp, . . . , a + cp), are the asymmetric and symmetric errors whose values have been replaced by their default value ( a + cp) (see case 3). Similarly, the first (a + s) elements of (3) having the values (a - 2cp) belong to A and C (see case 3). These are the symmetric and asymmetric values that were transmitted as (p - cp) and thus were replaced by i with (a+cp). Note that in the worst :ase p = a - cp, and thus p - cp = (Y - 2p. The rest of the dements in (3) belong to Usn,. These multisets are referred ;o as virtual multisets because there are elements in V, md Vj that are treated as correct but are not common to 30th processes i and j. For example, the second element )f V,, a, which is the value of process i, is not in V.. This s to ensure that, regardless of whether s , , ~ and/or s,,~ are :orrect or erroneous, (si,, - s , ,~) is maximized.

The following lemma takes advantage of the results in 2) and (3) to find the m&um of 1 F (V, ) - F (V,) I .

remma 3. Let V, and Vj be two voting multisets of an dSEP voting algorithm. Then:

Proof. According to. (2) and (3):

Vi = (U;nj, A', C') ( 5 )

?ow let us define W = A + C + U,nj + A' + C', which as the complete ordering:

W = (A, C, Uinj, A', C')

lapping the indices of W into the indices of the omponents of W yields:

(wl, . . . , w,) = A = (a1 , . . . , a,)

(wa+lt...,wa+s) = C = ( c ~ r . . - r ~ s )

(~a+s+l t . - wn) = Uinf = (211, - . . I 'h-a-s) (w-+~ , . . . , Wn+a) = A' = (a:, . . . , ah)

(wn+a+lr - wn+ats) = C' = (c:r . . r c:)

.)plying these components to (5) produces:

= (wa+s+lr.. . r wn+a+s)

= (wlI . . . ,wn)

Thus, the selected multisets Si and S j can be written as: -

S j = (sj,l , . . . , Sj,a) = Selu(Vj) = Sel, ((~1, . . . , w,)).

(7)

Now consider the elements s , , and sj,h of (6) and (7), respectively:

If Ak(g , h) 2 (a + s), then Sj,h 2 siPg. Thus, based on the definition of y,, by replacing h with g + T(,+~), Ak(g, g + T(,+~)) > (a + s). Hence:

As a result:

0 0-7(0+.)

Using (8), Si and S j can now be expanded into:

S j = (sj,l, . . r Sj,7(.+.) I Sj,7(.,+.)+11 - . . I so) (10)

For the sake of readability, the subscript of 7 will not be shown, but T(,+,) will be implied. Now, without loss of generality assume that F(Vi) 2 F(Vj). Using (9) and (10):

Due to (8), EL7+, sj,h > Cil: sing. Therefore, the max- imum of (11) is obtained when C:,+l sj,h holds its lower bound value, that is, when Czr7+1 sj,h = Ciz: Si.9. Hence, (11) becomes:

In (12), depending on the Selection function, it is not clear what the value of (~ i ,u-~+g - sj,,) will be. However, accordhig to (2) and (3), in the worst case the maximum and minimum of ~,,,-,+1 and s j , are ( a +cp) and ( a - 2p), respectively. Furthermore, by the indices in (12), we can see that s,,,-,+l refers to the last 7 elements of the selected multiset and s , , refers to the first selected elements. This observation is instrumental in obtaining the convergence rate, as is seen below.

To obtain a numerical value for the expression in (12), assign weights to the worst-case values contained in multisets V, and Vj. In (2) and (3), let the elements with values ( a - 2cp), ( a - cp), a, and (a + cp) carry the weights 0,. . . ,3, respectively. As an example, in (12), if

= a - 2q, then the weight of the selected element is 0. Using these weights, V;. and Vj can be converted into the following enumerated lists:

where:

1: k(x) = 1

2: 2 1 k ( s ) < ( n - a - s ) ' ' fW 3: otherwise

and:

Now using the sum-expression in (12), define the weight function w, , t o be:

w.here the value of z for the expression in (12) is (a + ij. As an exaxnple, if g = 1 and k(o-g+ 1) = n, then e+-g+l =3, which implies that s,,,-,+I= ~, ,k( , , -~+l) = vi,n, which in the worst case is ( a + cp) (see Vi in (2)). Similarly, if- g= 1 and k(g) = 1, then e , , =O. In the worst case sjSg = ~ , , k ( ~ ) = /3 - V ) = a - 2cp (see Vj in (3)). As a result:

6. Performance of MSEP Model

6.1 Convergence Rate

The foUowing theorem takes advantage of the discussion above to obtain the convergence rate.

Proof. Lemma 3 showed that:

In the worst case, it was defined that (see (16)):

= cpw, (18)

Using (17) and (18):

6.2 . Fault Tolerance

Recall that, for readability, the subscript of 7(,+,) is not shown. Theorem 1 showed that C < w,/a. This relation implies that a voting algorithm is convergent if a > w,.

Recall that N = n + b. To determine the lower bound of N for which a voting algorithm exists, consider the case when all elements from V are selected, so that u = n. The selection of every element implies k(g) = g, g < n. Thus, 7 = (a + 8 ) . Using (13) and (14), ei,,-,+~ = 3 and ejlg = 0, for 1 I g < (a + s), respectively. Consequently:

Accordingly, n 2 (3a + 3s + 1) ensures that C < 1. By incorporating the impact of benign faults, a convergent voting algorithm exists only if:

Theoremk. 'Given an MSEP voting algorithm F(V): N ~ 3 a + 3 s + b + 1 (20)

The lower bound on N for MSR and MSE models is: N 2 3a + 2s f b f 1. Hence, MSEP algorithms have lower fault

tolerance. This conclusion is due to the fact that symmetric faults in the worst case behave exactly like the worst case for asymmetric faults. Therefore, these algorithms do not benefit from the Thambidurai and Park fault model [11], hat is, partitioning the malicious faults into symmetric

'and asymmetric fault modes. This is consistent with our fault effectiveness analysis, case 3, that both types of faults cause the maximum 39 discrepancy between two nonfaulty processes.

6.3 Optimal Convergence Rate .

Theorem2 . The optimal convergence rate for any MSEP voting algorithm is:

Proof. To obtain the best performance, we need to find the minimum value for w,,Ju. This occurs when w,, and u are minimized and maximized, respectively. In (15), w,, is minimized when 7, = 1 and [e,,o-g+l - ej,g] = 1. TO ensure that [ei,,-,+I - e,,g] = 1, ei,,-g+l and ej,g must be either 1 and 0, or 2 and 1, respectively (see (13) and (14)).

In (13), if ei,,-g+l = 1, then k(x) = k(u - g + 1) = 1, which implies that u - g + 1 = 1. Thus, in the expression:

when g = 1, u - g + 1 = 1 indicates that u = 1. However, for the algorithm to be convergent, a must be at least 2; otherwise w,,/u > 1. Therefore, the only choice remaining is for ~,,-,+l and ejBg t o be 2 and 1, respectively. In (13), when ei,,-,+l= 2, 2 5 k(u - g + 1) 5 n - (a + s) indicates that the selected element is not in the last .(a + s) elements of Vi. Similarly, in (14), when e j , = I, the selected element ,: is not in the first (a + s) elements. As a result, for the expression in (21) to evaluate t o 1, 7, must be 1 and the selected elements must be in (v(,+,)+~, . . . , vn-(,+,)) of V. This submultiset has the length n - 2(a + s).

On the other hand, Kieckhafer and Azadmanesh [7] showed that for any multiset MI the maximum number of selected elements, while retaining 7, = 1, is [ ( M - l)/zJ + 1. Thus:

7. Comparison of Voting Models

Several known voting algorithms have been anal- ysed under both singlemode and hybrid-fault models [2,6,7,12,15,?5]. This section compares the performance of these algorithms to that of MSEP algorithms. To bet- ter understand how to obtain the performance of these algorithms, we explain the process for the Fault-Tolerant ,

Midpoint and Fault-Tolerant Mean. Then a table is given that summarizes the performance of MSR, MSE, and MSEP algorithms.

Fault-Tolerant Midpoint: The Fault-Tolerant .Mid- point selects only the two extreme values of its multiset values, that is, vl and v,. Thus, u = 2 and 7= 1. In addi- tion, the selection of vl and v, implies that k(u) = k(2) = n and k(1) = 1, so that q,, = 3 and ej,l = 0. As a result:

As u < w,, C > 1. Consequently, the Fault-Tolerant Mid- point is not convergent under the MSEP model.

Instead of selecting vl and v,, however, let us remove the largest and the smallest (a + s) elements and then select the extreme values, that is, v(,+,+l) and v,-(,+,I. As k(u) = k(2) = n - (a + s), according to (13), when g = 1, ~ , , - ~ + 1 = ei,2 = 2. Similarly, as k(1) = (a+s+l) , according to (14), when g = 1, ejVg = 1. Therefore,

As a result, C = w,/u = 112. This is the convergence rate of Fault-Tolerant Midpoint previously obtained under Dolev's single-mode and Kieckhafer's MSR fault models [6,7,25].

Fault- Tolerant Mean: This algorithm selects all ele- ments of V. Thus, u = N - b. F'urthermore, as Ak(g, g + 1) = 1, for g 5 ( a - I), 7(,+,) = (a + s). According to (13)

and (14), ~ ~ ~ ) [ G , , - ~ + I - ej,g] = 3, for 1 < g < (a + s). Thus:

When the total number of faults in the system is t = a + s + b :

The single-mode fault model has the convergence rate [6]:

- - 1 We observe that the MSEP fault model performs worse ,V-2(a+s)-b-l 1 ia+sj Is1 than the single-mode fault model if benigns are not the

dominant mode of failure. However, if (a+s) elements from 0 the extreme right and the extreme left are not selected,

243

Table 1 Comparison of Convergence Rates

NC =not convergent; Mad-Point = Fault-Tolerant Midpoint; FT-Man = Fault-Tolerant Mean; S-M Opt. = Single-Mode Optimal; M-M Opt. = Mixed-Mode Optimal [6,7].

Voting Algorithm

Mid-Point

FT-Mean

S-M Opt.

M-M Opt.

C

Table 2 Comparison of Fault Tolerance among Byzantine, MSR, MSE, and MSEP Fault Models

Fault Model

then [ei,,-g+l - ejtg] = 1. Thus:

Byzantine

1 I

t

1

1

2 0

Fault Model

FaultTolerance

- a + s - - t - b - ( N - b ) - 2 ( a + s ) N - 2 t + b (23)

which shows much faster convergence than (22). The Fault- Tolerant Mean under the MSR fault model discards the extreme (a + s) elements from both sides of the voting multiset as well. It has the convergence rate (71:

MSR

1 3

a N-2t+b

1

1 mw* 2% 0

By comparing (23) and (24), we observe that unless s = 0, MSR fault model shows better performance. The reason for worse performance is that in the worst case MSEP cannot distinguish between symmetric and asymmetric faults.

Table 1 shows the convergence rates under different fault models and voting algorithms. The Byzantine fault model is the single-mode fault model where every fault is considered Byzantine. MSR, MSE, and MSEP fault mod- els use the hybrid-failure modes consisting of asymmetric, symmetric, and benign fault modes, so that t = a + s + b. MSEPl stands for the original algorithms, where every ele- ment in the voting multiset, which is of size N- b, is a candi- date for selection. MSEPz stands for those algorithms that do not select any element from the extreme (a+ s) elements in the voting multiset. Thus, the size of the submultiset is N - 2(a-+s) - b . MSEr and MSEz are defined similarly [Is].

MSEP share some characteristics in performance with MSR and MSE algorithms (7,151. But because the effects of symmetric faults do not distinguish themselves from the

Byzantine

N 2 3 t + 1

effects of asymmetric faults, the performance of MSEP algorithms is not better than that of MSR. In comparison to MSE algorithms, it has been shown that IdSE algorithms perform their best when no elements from the right (a + s) and left a elements are selected [IS]. These algorithms perform better than MSEPl and MSEP2 algorithms.

Among the algorithms that place a bound on the diameter of correct values, that is, la - 5 cp, the MSE and MSEP algorithms presented herein use the extreme default values. Specifically, in MSE model, processes use their own values for the default value, whereas MSEP algorithms use the extreme counterpart value, that is, the default value is the process's own value plus cp. ,As a result, among the nonreduced voting algorithms, regardless of what the chosen default value might be, the performance of Egocentric and Egophobic algorithms will always be within the performance range of MSE and MSEP algorithms.

Table 2 summarizes the minimum number of processes required to guarantee the existence of a convergent voting algorithm. We can see that the benefit of MSEP over the Byzantine fault model is only in the benign faults. Thus, the MSEP is actually a two-mode fault model similar to the fault model of Meyer and Pradhan [9], partitioning the faults into malicious versus nonmalicious faults. MSE and MSR have the same fault tolerance, and their minimum requirement on the number of processes is better than that of the Byzantine fault model.

MSEl N - b

NC

30+2s N-6

Wl(-+.) u

MSE

N z 3 a + 2 s + b + l

MSR

N > 3 a + 2 s + b + l

8. Conclusion

MSEP

N z 3 a + 3 s + b + 1

In some applications, a certain amount of discre? ancy among processes must be tolerated. Knowing tu discrepancy amount, by following a number of singlestep3

244

MSE:! N - 2 ( a + s ) - b

1 2-

t-b N-2t+b

* W,(a+s-l

u

MSEPl N - b

NC

N-b

L " - . : . " " i . ; i i N - b - 1

-- I- b

W,(a+sl u

MSEP2 ~ - 2 ( a + ~ ) - b

1 - 2

t-b N-2t+b

N-21+b-1 + 1 * L-b

w l c - + s ~ u

convergent rounds, MSNR algorithms allow for the pro- cesses to further converge their results to within a small positive real value E .

Two extreme-end families of MSNR algorithms are dSE and MSEP. This article has focused on MSEP

algorithms. These algorithms, although used in some applications, were never analysed under a general method- ology to better understand their performance, or to be able to compare their performance against other vot- ing algorithms. Therefore, this work has derived general expressions that easily determine the performance and fault tolerance of any of these algorithms, rather than analysing each algorithm individually.

There are several directions in which to expand upon this work. MSEP and MSE algorithms together complete the study of MSNR algorithms. Therefore, it is of ben- :fit to devise a unified model to encompass MSNR and 'VISR algorithms. Furthermore, MSNR algorithms use the :xtreme default values. It would be interesting to know low the performance is affected if the default value is nei- ;her of the extreme values, to potentially further reduce ;he discrepancies among voted values in each round, for ?xample, if the voting function can dynamically update ;he default value based on the knowledge gained from the ;&far received values.

Finally, knowing the values E , cp, and the worst con- fergence rate, each process can determine the number of .ourids to converge. As the convergence rate in a round can )e better than the worst-case convergence rate, the issue is inding a methodology to determine the convergence rate I each round dynamically. Such a solution will mostlikely

,educe the number of voting rounds to reach convergence.

Phis work was supported by the NASA Space Grant Con- iortium and by the Information Science and Technology IS&T) College of the University of Nebraska at Omaha UNO).

ieferences

[I] R.M. Kieckhafer, C.J. . Walter, A.M. Finn, k P.M. Thambidurai, The MAFT architecture for dhributed fault- tolerance, IEEE lhnsactions on Computers, 37(4), 1988, 398405.

(21 L. Lamport & P.M. Melliar-Smith, Synchronizing clocks in the presence of faults, Journal of the ACM, 32(1), 1985, 52-78.

[3] J. Lundelius & N. Lynch, A new fault-tolerant algorithm for clock synchronization, P m . of the 3rd Symp. on Pnncaples of Dutributed Computing, Vancouver, British Columbia, Canada, Aumst 1984. 75-88.

[4] F.6. ~chneider, Understanding .protocols for Byzantine clock synchronization, Report No. 87-859, Department of Computer Science, Cornell University, Ithica, NY, August 1987.

[5] P.M. Thambidurai, A.M. Finn, R M . Kieckhafer, & C.J. Walter, Clock synchronization in MAFT, Pmc. of the 19th Fault-Tolemnt Computing Symp., Chicago, IL, June 1989, 112-151.

j6J D. Dolev, N.A. Lynch, S.S. Pinter, E.W. Stark, 61 W.E. Weihl, Reaching approximate agreement in the presence of faults, Pmc. of the 3rd Symp. on Reliability in Distributed Software and Database Systems, Clearwater Beach, FL, October 1983, 145-151.

. .

[7] R.M. Kieckhafer & M.H. Azadmanesh, Reaching approximate agreement with mixed mode faults, IEEE Tmnsactions o n Pamllel Distributed Systems, 1 , 1994, 53-63.

181 M.H. Azadmanesh & R.M. Kkkhafer, New hybrid fault models for asynchronous approximate agreement, IEEE Tbansactions on Computers, 45(4), 1996, 439449.

(91 F.J. Meyer & D.K. Pradhan, Consensus with dual failure modes, Ptoc. of the 17th Fault-Tolemnt Computing Symp., Pittsburgh, PA, July 1987, 48-54.

(101 N. Suri, M. Hugue, & C. Walter, Fault classification and distribution effects on the reliability modeling of large fault- tolerant systems, Pmc. of the 22nd Fault-Tolemnt Computing Symp., Boston, MA, July 1992, 212-220.

[ll] P.M. Thambidurai & Y.K. Park, Interactive consistency with , multiple failure modes, Proc. of the 7th Reliable Distributed

. Systems Symp., Columbus, OH, October 1988, 93-100. 1121 N. Vasanthavada & P. Thambidurai, Design of fault-tolerant

docks with realistic failure assumptions, Proc. of the 18th Fault-Tolerant Computing Symp., Chicago, IL, June 1989, 128-133.

1131 A.D. Fekete, Asymptotically optimal algorithms for approxi- mate agreement, Distributed Computing, 4, 1990, 9-29. M.H. ~zadmanesh & R.M. ~ i & k h a f e r , - ~ x p l b i t i n ~ omis sive faults in synchronous appradmate agreement, IEEE Tmnsactions on Computers, 49(10), 2000, 1031-1042. M.H. Azadmanesh & A.W. Krings. Egocentric voting algo- rithms; IEEE lhnsact ions on Reliability, 46(4), 1997, 494-502. C.M. Krishna, Fault-tolerant .synchronization using phase- locked clocks, Microelectronics and Reliability, 30(2), 1990, 275-287. P. Ramanathan, D.D. Kandlur, & K.G. Shin, Hardware- assisted software dock synchronization for homogeneous distributed systems, IEEE Tmnsactiom on Computers, C-39(4), 1990, 514-524. J.L.W. Kessels; Two designs of a faulttolerant docking system, IEEE hnsac t ions on Computets, C-33(10), 1984, 912-919. C.M. Krishna, K.G. Shin, & R W . Butler, Ensuring fault toler- ance of phase-locked clocks, IEEE %nsactaons on Computers, C-34 (a), 1985, 752-756. P. Ramanathan, K.G. Shin, k R W . Butler, , Fault-tolerant dock synchronization in distributed systems, IEEE Computer, 23, 1990, 3342. A. Avizienis, The N-version approach to fault-tolerant soft- ware, IEEE lhnsact ions on Soflware Engineering, SE-I 1 (12), 1985, 1491-1501. L. Chen & A. Avizienis, N-version, programming: A fault tolerant approach to reliability of software operation, Proe. of the International Symp. on Fault Tolerant Computing, Toulouse, France, 1978, 3-9. K.W. Anderson & D.W. Hall, Sets, sequences, and mappings: The basrc concepts of analysis (New York: Wiley, 1963). C.L. Liu,,Elements of discnte mathematics, 2nd ed. (New York: McGraw-Hill, 1985). D. Dolev, N.A. Lynch. S.S. Pinter, E.W. Stark, & W.E. Weihl; Reaching approximate agreement in the presePce of faults, Journal of the ACM, 33(3), 1986, 499-516.

Biographies

Azad Azadmanesh received his B.Sc. degree in cost accounting from the Institute of Advanced Accounting, and his M.Sc. (1982) and Ph.D. (1993) degrees in computer science from Iowa State Univer- sity and University of Nebraska- Lincoln, respectively. He is currently an Associate Professor in the Department of Computer

Science at the University of Nebraska at mah ha, His

research interests include fault-tolerant and survivable sys- tems, distributed agreement, reliability modelling, and network communication.

Lingjie Zhou received her B.Sc. degree in space physics from Peking University, China, and her M.Sc. degree in physics from the University of California, San Diego. She.. received her M..4. degree in computer science in 1996 from the University of Nebraska at Omaha. She is cur-

rently a Software Engineer at Plantronics, Inc., Santa Cruz, California. Her specialities are in Windows programming and computer multimedia.

Daniel Peak received his P ~ . D . degree in business computer infor- mation systems in 1994 from the University of North Texas (UNT), with majors in business com- puter information systems and in finance. He helped in the develop ment of the University of Nebraska at Omaha's College of Informa- tion Science and Technolom. He is -

currently at the UNT College of Business Administration.. His research interests include IT governance, strategic IT planning, media systems, and human factors. Dr. Peak has more than 20 years of IT consulting and planning experience with numerous Fortune 500 companies, and has won had participated in numerous production and research grants.