Trusted Grid Computing with Security Binding and Trust Integration

Journal of Grid Computing (2005) © Springer 2005
DOI: 10.1007/s10723-005-5465-x

Trusted Grid Computing with Security Binding and Trust Integration ∗

Shanshan Song, Kai Hwang and Yu-Kwong Kwok
Internet and Grid Computing Laboratory, University of Southern California, EEB-212, 3740 McClintock Avenue, Los Angeles, CA 90089-2562, USA
E-mail: {shanshan.song, kaihwang, yukwong}@usc.edu

Key words: computational Grids, fuzzy logic, NAS and PSA benchmarks, performance evaluation, resource allocation, scalability analysis, trust models

Abstract

Trusted Grid computing demands robust resource allocation with security assurance at all resource sites. Large-scale Grid applications are being hindered by the lack of security assurance from remote resource sites. We developed a security-binding scheme through site reputation assessment and trust integration across Grid sites. We do not treat the trust factor deterministically. Instead, we apply fuzzy theory to handle the fuzziness or uncertainties behind all trust attributes. The binding is achieved by periodic exchange of site security information and matchmaking to satisfy user job demands.

The PKI-based trust model supports Grids in multi-site authentication and single sign-on operations. However, cross certificates are inadequate to assess local security conditions at Grid sites. We propose a new fuzzy-logic trust model for distributed trust aggregation through fuzzification and integration of security attributes. We introduce the trust index of a Grid site, which is determined by the site reputation from its track record and by the self-defense capability attributed to the risk conditions and the hardware and software defenses deployed at the site.

A Secure Grid Outsourcing (SeGO) system is designed for securely scheduling a large number of autonomous and indivisible jobs to Grid sites. Significant performance gains are observed after trust aggregation, which is evaluated by running scalable NAS and PSA workloads over simulated Grids. Our security-binding scheme scales well with increasing numbers of user jobs and Grid sites. The new scheme can guide the security upgrade of Grid sites and predict the Grid performance of large workloads under risky conditions.

Nomenclature: Si – the ith Grid site among m sites; Jj – the jth job in a stream of n user jobs; m – number of Grid sites; n – number of jobs submitted to the system; tij – trust index of site Si assessed by site Sj; Vi – trust vector maintained at site Si; M – trust matrix for the entire Grid (dimension m × m); � – site reputation or job success rate at a site; � – self-defense capability of a Grid site; SD – security demand of a user job; TI – site trust index for secure job mapping, same as tij.

∗ The research work reported here was supported by NSF ITR Grant 0325409. The paper is significantly extended from preliminary results presented at the IFIP International Conference on Network and Parallel Computing (NPC-2004), the IEEE International Parallel and Distributed Processing Symposium (IPDPS-2005), and the International Workshop on Grid Security and Resource Management (GSRM-2005). The corresponding author is Kai Hwang at the University of Southern California.

1. Introduction

Computational Grids are motivated by the desire to share processing resources among many organizations to solve large-scale problems [3, 9]. Very often, a Grid is used for executing a large number of jobs at dispersed resource sites. Each site executes not only local jobs but also jobs submitted from remote sites. Thus, job outsourcing becomes a major trend in Grid computing. However, job outsourcing faces the problems of inevitable security threats and doubtful trustworthiness of remote resources [13]. Indeed, Grid sites may exhibit unacceptable security conditions and system vulnerabilities [17, 18].

1.1. Matchmaking of Site Trust Index and Job Security Demand

In mapping autonomous and indivisible user jobs onto Grid sites, we need to tackle a completely new dimension of trust-related problems. As illustrated in Figure 1, the parallel job mapping problem is defined as a function, an ‘onto’ mapping for dispatching each job to a unique resource site. Notice that one or more jobs can be mapped to the same site, but no job is mapped to more than one site at the same time. On one hand, a user job issues a security demand (SD) that the resource site must meet to provide security assurance. On the other hand, the site needs to reveal its trustworthiness, called the trust index (TI). These two parameters must satisfy the security-assurance condition TI ≥ SD during the job mapping process. The process of matching TI with SD is similar to the real-life scenario where the Yahoo! portal requires users to specify the security level of the login session [33].
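The security-assurance condition can be illustrated with a minimal Python sketch. The `Site` and `Job` classes, the sample TI and SD values, and the tie-break rule (prefer the eligible site with the highest TI) are assumptions made for illustration only; they are not the scheduler described later in the paper.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    trust_index: float  # TI, normalized to [0, 1]


@dataclass
class Job:
    name: str
    security_demand: float  # SD, normalized to [0, 1]


def eligible_sites(job, sites):
    """Sites that satisfy the security-assurance condition TI >= SD."""
    return [s for s in sites if s.trust_index >= job.security_demand]


def map_jobs(jobs, sites):
    """Map each job to at most one site; several jobs may share a site.

    Assumed tie-break: choose the eligible site with the highest TI.
    A job with no eligible site is mapped to None.
    """
    mapping = {}
    for job in jobs:
        candidates = eligible_sites(job, sites)
        if candidates:
            best = max(candidates, key=lambda s: s.trust_index)
            mapping[job.name] = best.name
        else:
            mapping[job.name] = None
    return mapping


# Hypothetical sample data.
sites = [Site("S1", 0.9), Site("S2", 0.6), Site("S3", 0.4)]
jobs = [Job("J1", 0.8), Job("J2", 0.5), Job("J3", 0.95)]
mapping = map_jobs(jobs, sites)
print(mapping)
```

In this sample, J1 and J2 both land on S1, which shows that the mapping need not be one-to-one, while J3 finds no site that meets its demand.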

Grid computing demands cooperation among the participating resource sites. Such cooperation is induced by collaborative incentives to execute large applications collectively [4]. The concept of “trust” in Grids is fundamentally different from that used in P2P (peer-to-peer) file sharing networks [11, 15, 23, 38, 44]. Specifically, there is a basic level of mutual trust among Grid sites, while different peers in a P2P network may not trust each other at all. That is, the notion of a malicious Grid site is not considered in our study. A Grid site’s dynamic security capabilities vary in a passive manner due to unexpected asynchronous attacks [21].

A resource site must have certain basic capabilities before it can participate in the collective job execution process. These basic capabilities include the sustained computing power, nominal workload level, and average storage capacity of the site. These capabilities are mostly static, but they can also vary over time. On the other hand, the security level of a Grid site is inherently dynamic, because there is no way to predict when and where a Grid will be under attack. Similarly, an application’s security demand also changes with time. In other words, both SD and TI are dynamic quantities. Over time, due to intermittent security problems, a Grid site may become incapable of executing a job that was previously acceptable at the site.

Here, a major challenge is that both the SD and TI are highly loaded concepts. The left callout box in Figure 1 lists typical attributes that the user cares about in determining its security demand. The right callout box lists the trust factors often used by site managers and peer machines to assess the trust index of a resource site. These attributes and their values change dynamically and depend heavily on the trust model, security policy, accumulated reputation, self-defense capability, attack history, site vulnerability, etc. Thus, an exact matching of SD with TI entails an exhaustive one-to-one comparison of all the related parameters. Such a “deterministic” brute-force matching is a daunting task that is unlikely to be feasible in practice [33].

1.2. Motivations for Security Assurance in Grids

Indeed, to completely specify a job security demand, we need to use complex vectors of attributes to fully specify the requirements involving all of the aforementioned parameters. This is obviously an unreasonable burden on Grid users. Unfortunately, up to now, there is no effective methodology to assess the trust index of resource sites. A weighted sum of security parameter values will not work, because it is difficult to determine the weights deterministically, or even the correct set of parameters to include, in real time. We can see that the matching of the job security demand with the site trustworthiness is an important system issue, which was largely ignored by the cyber security community. In this paper, we propose a total solution to this important problem.

In our study, the job SD is supplied by the user programs as a single parameter only. The demand may appear as a request for authentication, data encryption, access control, etc. The TI of a resource site is also lumped into a single parameter, which is aggregated through our novel fuzzy-logic inference process over all related parameters. Specifically, we propose a two-level fuzzy-logic based trust model to enable the aggregation of numerous trust parameters and security attributes into scalar quantities that are easy to use in the job scheduling and resource mapping process [27].

The TI is normalized as a single real number with 0 representing the condition with the highest risk at a site and 1 representing the condition which is totally risk-free or fully trusted. The fuzzy inference is done in four steps: fuzzification, inference, aggregation, and defuzzification [24]. The second salient feature of our trust model is that if a site’s trust index cannot match the job security demand, i.e., SD > TI, our trust model can deduce detailed security features to guide the site security upgrade as a result of tuning the fuzzy system [1]. For instance, a Grid site’s TI may be unknowingly (i.e., not under the Grid site’s will or control) lowered to the level of 0.6 (on a scale of 1.0). However, in view of its static capabilities, the site may be considered by the scheduler as a suitable candidate for executing a job with an SD of 0.8. Thus, the problem is that the site has to upgrade its security capability by 0.2 in order to match the requirements.

Figure 1. Secure mapping of a large number of user jobs onto trusted Grid resource sites, where the job security demand (SD) and the site trust index (TI) are attributed to many security measures and trust parameters as listed in the callout boxes.

The whole “match-making” process is based on just single-valued scalar parameters. This is enabled by the use of a fuzzy inference system. Indeed, the values 0.6 and 0.8 are the results of aggregating numerous detailed security parameters. Similarly, the “deficiency” 0.2 is eventually translated by the fuzzy tuning process into detailed specifications as to how the Grid site can adjust its security features (e.g., firewall settings) in order to ramp up its TI to the required level. This new approach binds site security with job execution for trusted Grid computing [35, 36].
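As a small worked example of this scalar comparison, the deficiency is the positive part of SD minus TI. The function name below is hypothetical, and the sketch does not attempt the fuzzy tuning step that translates the deficiency into concrete security settings.

```python
def security_deficiency(trust_index: float, security_demand: float) -> float:
    """Amount by which a site's TI falls short of a job's SD (0.0 if TI >= SD)."""
    return max(0.0, security_demand - trust_index)


# Values from the example in the text: TI = 0.6, SD = 0.8.
gap = security_deficiency(0.6, 0.8)
print(f"required upgrade: {gap:.1f}")  # prints "required upgrade: 0.2"
```

A site whose TI already meets the demand yields a deficiency of zero, so no upgrade is requested.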

The rest of the paper is organized as follows. Section 2 reviews related work and highlights our contributions. We present the basic concept of security binding in Section 3. Section 4 introduces trust parameters, site reputation, trust management, and the fuzzy-logic based trust model. Our trust model extends from security awareness to security assurance. Section 5 describes the trust aggregation architecture and specifies the trust update, propagation, and integration procedures. Section 6 reports Grid simulation experiments, workloads, and effects of trust integration. Section 7 presents simulation results on the NAS [12] and PSA [8] benchmarks. We also provide a scalability analysis. Finally, we summarize lessons learned and suggest further research directions.

2. Related Work and Our New Approach

Our work is inspired by a number of previous works related to trust management and security enhancement for sustaining Grid performance. These related works are reviewed below. Then, we highlight the unique features of our new trust-integration approach.

2.1. Related Previous Work

A closely related piece of work is by Azzedin and Maheswaran [2], who have developed a security-aware model between resource providers and consumers. Lin et al. [26] presented a trust-enhanced security solution, in which the trust decision could be used for Grid security enhancement. In [32], a decentralized Grid administration system was proposed using trust management. Novotny et al. [30] developed a credential repository system for assessing the Grid resources. The distinctive feature of our work is that we use a fuzzy inference approach to binding security in trusted Grid computing.

The Grid Security Infrastructure (GSI) in the Globus Toolkit [39, 41] uses PKI technologies to handle authentication, single sign-on, and trust delegation. However, it is not capable of assessing the local security condition of a Grid site. The trust model we propose aims to assess local security conditions to match dynamically changing job security demands. We introduce the trust index of a Grid site, which is determined by the site reputation and self-defense capability attributed to the site track record, risk conditions, and hardware and software defenses deployed at the site.

Our security binding for Grid computing is closely related to reputation systems used in peer-to-peer (P2P) networks [29, 44]. Gupta et al. [15] proposed a decentralized P2P reputation system based on simple credit–debit operations in updating a peer’s reputation as perceived by other peers. The major issue they tackled is the security of transmitting and updating the reputation data. In contrast, in our Grid system model, we assume Grid sites are cooperative and possess secure communication channels among them [25].

Marti and Garcia-Molina [29] proposed an effective voting-based reputation system to facilitate judicious selection of P2P resources. The major concern in such a system is to accurately isolate the adverse effects of a malicious peer. The simple voting scheme was found to be quite effective toward achieving this goal. Similarly, Damiani et al. [11] suggested a novel reputation sharing system to enable reliable usage of remote peers’ resources. Their system works by using a distributed polling algorithm. Kamvar et al. [23] reported a reputation management system, called EigenTrust, which can effectively reduce the number of downloads of inauthentic files in a P2P system. The reputation value of each peer is determined by the number of successful downloads and the “opinions” of other peers.

Xiong and Liu [44] recently suggested a P2P reputation system called PeerTrust, which maintains a composite trust value for each peer by integrating five different factors: feedback by other peers, number of transactions, credibility of feedback sources, transaction context factor, and community context factor. These parameters are combined using a simple equation to come up with a single metric. Guha et al. [14] studied an interesting idea about the propagation of “distrust”. That is, in addition to maintaining positive trust values for the peers, the system also allows the proactive dissemination of “bad reputations” of some malicious peers.

In [18], Hwang and Kesselman pointed out that the Grid environment is inherently unreliable. They provided a failure detection service and a flexible failure-handling framework as a fault-tolerant mechanism on the Grid. They tackle the problem with a remedial method: jobs are remedied when a failure is observed. We handle the problem by taking precautions to minimize the potential failures in allocating resources. Our trust model presented here is significantly extended from our previous work [19, 35, 36], where a single-level trust model is presented. Our security infrastructure protects not only metacomputing Grids [9], but also public-resource Grids [2].

Job scheduling has been primarily suggested for supercomputers, real-time and parallel computers, or heterogeneous systems [28]. Recently, we find reports on adaptive [4, 43], policy-based [22], and QoS-driven [16] job scheduling on Grids. Buyya et al. [6] proposed a deadline and budget constrained scheduling model. A resource co-allocation framework was described in [10]. Other Grid resource allocation and scheduling work can be found in [27, 34, 40, 42]. Our work was also influenced by the work on heuristics for PSA job scheduling [8] and on NAS benchmarking experiments [12].

2.2. Our New Approach – the Key Concept

A distinctive feature of our work is that we use a fuzzy inference approach to binding security in trusted Grid computing. Our proposed trust management and secure job mapping schemes are very different from the previous approaches. Our work aims at providing an optimized matching of security/trust requirements and judicious Grid site mapping for user jobs. This goes beyond the maintenance of reputation values to provide feedback to Grid sites. Our scheme guides the resource sites to reconfigure their security facilities to satisfy the job demands and hence execute the jobs more successfully under security assurance. Our previous results in Grid security were reported in [7, 19, 35, 36]. We approach the problem on several fronts as demonstrated in Figure 2.

The site reputation is an aggregation of four major attributes listed in Figure 2. These are behavior attributes accumulated from the historical performance of a resource site. The defense capability of a resource site is attributed to many factors. Four major attributes are listed here, which are related to intrusion detection, firewall, and response capabilities. These can be measured as intrusion detection rate, false alarm rate, and intrusion response results [19]. Both site reputation and defense capability are used to determine the trust index of a resource site. We will study these attributes and their impacts in subsequent sections.

Figure 2. The concept of our fuzzy-logic trust model for security-assured Grid computing with all security parameters and site reputation attributes identified and fuzzy trust aggregation at the intra-site and inter-site levels.

In this paper, we make the following specific contributions toward trusted Grid computing with security assurance, and thus toward guaranteeing the predicted performance.

(1) We propose an effective fuzzy-logic based security-binding methodology by aggregating trust attributes across Grid sites. This scheme systematically combines the status values of different security features to come up with a unified metric convenient for mapping large workloads onto multiple Grid sites operated by different organizations.

(2) Our trust aggregation algorithm induces an effective trust enhancement scheme, as indicated by the feedback path in Figure 2. This scheme pinpoints the critical security features of Grid sites that need to be strengthened in order to match the job requirements.

(3) We propose a set of performance metrics to capture the security upgrade overheads and project the aggregate Grid performance. We evaluated the system under a wide range of system, security, network, and job parameters. The usefulness of these performance metrics is evidenced by numerical results from extensive NAS and PSA benchmark experiments.

(4) The proposed trust binding scheme is effective in mapping a large number of user jobs onto Grid resource sites that can finish them at the earliest times. Our scheme scales well with increasing numbers of jobs and Grid sites. The scheme can also be applied to predict the Grid performance, provided all user demands and site information are known in advance.

3. Security Binding in Computational Grids

In Grid applications, resources and security assurance are two basic requirements [4, 41]. Shared Grid resources, once infected by malicious code planted by intruders, may damage other applications running on the same Grid platform. The security binding problem is introduced below for mapping large-scale applications onto a large number of Grid sites.

3.1. Mapping Large Workloads onto Computational Grids

A typical computational Grid is built with many shared resources located at various Grid sites. A large workload for a Grid comprises a large number of indivisible and independent jobs. These jobs need to rely on distributed parallel processing over a large number of host machines dispersed in the Grid sites. Efficient and secure mapping of a large number of user jobs onto many computers at dispersed Grid sites is crucial to achieving a perfect match between site TI and user job SD.

The security problem must be solved to assure the expected Grid performance. The security functionalities are distributed to local hosts monitored by the LAN gateway at each Grid site. Each gateway acts as a security manager overseeing all resources under its jurisdiction. All site gateways work collectively to launch countermeasures against attacks. In the sequel, we consider the self-defense capability of each site as an important parameter to assess the risk condition of the Grid.

Trusted Grid computing demands secure resource allocation, sharing, and job scheduling. Secure communications are also desired among Grid sites. Intrusion detection and attack pushback are in great demand. Fortification of Grid sites is also useful, with firewalls, packet filters, VPN gateways, traffic monitors, etc. Other security features desired in a Grid system include self-defense toolkits, trust management, risk assessment systems, and automated intrusion response systems. Very often, we need to have a distributed security infrastructure in a Grid environment. This system must enforce a central security policy in a distributed manner.

3.2. Trust Integration over Grid Security Infrastructure

Secure outsourcing of jobs demands user anonymity, confidentiality, data integrity, fine-grain access control, and security policy management [5]. Trust management for protecting Grid sites and support of secure job execution are the major issues. We evaluate the site self-defense capability and offer a fuzzy-logic based approach to establishing mutual trust among the Grid sites. The Grid service providers must assure the users of guaranteed security and dependable accessibility of all Grid-enabling platforms [41].

Figure 3 illustrates that the trust integration scheme can be implemented by using security overlays built with distributed hash tables (DHTs) [37]. An alternative is to use encrypted VPN tunnels for the same purpose. We explain the security assurance steps using an illustrative example Grid consisting of four resource sites {S1, S2, S3, S4} as shown in Figure 3. The user submits a request to the SeGO agent running on the SeGO server at site S4. Two-way authentication is done first between the user application and the SeGO agent. The SeGO agents work collectively to generate the secure resource allocation solution. A DHT-based overlay network is suggested in the drawing for security information exchange and alert distribution.

Each SeGO agent contains a resource manager and a trust manager. The resource manager maintains resource status and monitors job execution. The trust manager assesses the site’s trust index through a fuzzy inference system. In this architecture, each resource site maintains its own trust vector, which is updated periodically. Trust vectors are defined in the next section. The trust information is exchanged between the Grid sites in the security binding process. The DHTs offer a fast hashing protocol to exchange critical information in the trust integration process.

Figure 3. Distributed trust integration over security infrastructure across Grid sites.

3.3. Security Attributes, Trust Vector, and Trust Matrix

Security holes may appear as OS blind spots, software bugs, privacy traps, and hardware weaknesses in resource sites. These vulnerability factors may weaken the self-defense capability of a resource site. We show a few representative security attributes and their evaluation criteria in Table 1. All attributes can be quantified numerically or by Boolean values: true or false.

In our trust model, each site maintains a trust vector. The whole Grid is described by a trust matrix. The trust vector maintained at site $S_j$ is denoted by $V_j = (t_{1j}, t_{2j}, \ldots, t_{mj})^T$. The entry $t_{ij}$, for $1 \le i, j \le m$, is also known as the TI, which represents the aggregate degree of trust of site $S_i$ by site $S_j$. Each $t_{ij}$ is a real number, with 0 representing the most risky condition and 1 representing the highest security assurance. The trust matrix is defined as an $m \times m$ square matrix $M = (V_1, V_2, \ldots, V_m)$. Figure 4 shows an example of trust vectors and the trust matrix maintained at three Grid sites. This example illustrates various concepts of trust integration in subsequent sections.
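The relation between trust vectors and the trust matrix can be sketched in a few lines of Python. The function name and all numerical values below are made up for illustration and are not the values shown in Figure 4; the only property taken from the text is that vector V_j forms column j of M, so that M[i][j] holds t_ij.

```python
def build_trust_matrix(vectors):
    """Assemble M = (V_1, ..., V_m) from column vectors so that M[i][j] == t_ij.

    vectors[j][i] is the trust index of site S_(i+1) as assessed by S_(j+1)
    (Python indices are 0-based).
    """
    m = len(vectors)
    for v in vectors:
        if len(v) != m:
            raise ValueError("each trust vector must have one entry per site")
        if any(not 0.0 <= t <= 1.0 for t in v):
            raise ValueError("trust indices must lie in [0, 1]")
    return [[vectors[j][i] for j in range(m)] for i in range(m)]


# Hypothetical trust vectors for a three-site Grid.
V1 = [0.9, 0.7, 0.4]  # S1's assessment of S1, S2, S3
V2 = [0.8, 0.9, 0.5]  # S2's assessment of S1, S2, S3
V3 = [0.6, 0.7, 0.8]  # S3's assessment of S1, S2, S3

M = build_trust_matrix([V1, V2, V3])
for row in M:
    print(row)
```

Row i of the result collects how every site rates S_i, while column j reproduces the vector kept at S_j.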

Table 1. Sample security attributes and their evaluation criteria.

| Security attributes | Evaluation criteria |
|---|---|
| IDS related capability | Traffic audit data size, signature file size, signature update frequency |
| Anti-virus capability | Memory scan frequency |
| Firewall capability | Number of firewall rules |
| Usage of secure network connections | TLS and/or IPSec |
| Provision of execution sandbox | Isolated JVM |
| Invoking dynamic check-pointing | True or false |

4. Fuzzy-Logic Based Trust Model for Grid Computing

Our goal is to build a stable, reliable, and trustworthy Grid computing environment. We propose a new fuzzy-logic based hierarchical trust model to achieve this goal. Two basic assumptions are made: (a) all resource sites have prior agreements to participate in the Grid operations; and (b) the Grid sites truthfully report their site configuration, computing power, and security conditions to each other. In other words, all sites are assumed cooperative in security upgrade and job scheduling. Selfish Grids [25] are not within the scope of this study. A Grid site can be down due to regular system maintenance or enter a degraded mode in response to threats.

4.1. A Hierarchical Trust Model

We propose a hierarchical trust model, specifically designed for distributed security enforcement in computational Grids. This model applies two levels of trust inference. The lower-level fuzzy inference system collects all input parameters from a single site and is thus called the intra-site level. The output of the intra-site level provides the inputs to the upper level. The upper level collects inputs from all resource sites and is thus called the inter-site level. This two-level trust model is depicted in Figure 5.

There are two fuzzy inference systems applied at the intra-site level. One evaluates the self-defense capability and the other evaluates the site reputation. Each site reports its assessed self-defense capability to all other sites. We assume a truthful reporting policy by which all sites report their self-defense capabilities honestly. There is only one fuzzy inference system at the inter-site level, which collects inputs from the intra-site levels and infers the site trust indices to form the trust vector for each site.

Figure 4. An example Grid with three sites modeled by (a) three trust vectors forming a (b) trust matrix.

Figure 5. Two levels of trust management over the Grid sites, where SAi stands for the ith security attribute, and BAi stands for the ith behavior attribute.

Figure 6. Site S1 infers the trust index of S3 with initial condition from Figure 4.

The mutual trust tji between site Si and site Sj is assessed through four steps: (1) Sj assesses its self-defense capability �j through its intra-site defense capability inference system; (2) Sj reports its self-defense capability �j to Si; (3) Si assesses Sj’s reputation �j through its intra-site reputation inference system; and (4) Si assesses Sj’s trust index �j through its inter-site inference system. Figure 6 shows an example of how site S1 infers the trust index of site S3 in four steps.
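The four-step exchange can be outlined as a short Python sketch. Everything concrete here is an assumption for illustration: the `GridSite` class, the attribute names and values, the plain average standing in for the intra-site defense inference, the job success ratio standing in for the reputation inference, and the `combine` callable (here `min`) standing in for the inter-site fuzzy inference.

```python
class GridSite:
    """Hypothetical model of one Grid site taking part in trust assessment."""

    def __init__(self, name, security_attributes):
        self.name = name
        # Assumed: security attributes already normalized to [0, 1].
        self.security_attributes = security_attributes
        # Track record of remote sites: name -> (jobs succeeded, jobs run).
        self.job_records = {}

    def assess_own_defense(self):
        """Step 1 stand-in: average of attributes instead of fuzzy inference."""
        values = list(self.security_attributes.values())
        return sum(values) / len(values)

    def assess_reputation(self, other_name):
        """Step 3 stand-in: job success rate observed for the other site."""
        succeeded, total = self.job_records.get(other_name, (0, 0))
        return succeeded / total if total else 0.0

    def assess_trust(self, other, combine):
        """Steps 1-4 as seen from this site (S_i) assessing `other` (S_j)."""
        reported_defense = other.assess_own_defense()    # steps 1 and 2
        reputation = self.assess_reputation(other.name)  # step 3
        return combine(reputation, reported_defense)     # step 4


s1 = GridSite("S1", {"ids": 0.75, "firewall": 0.5})
s3 = GridSite("S3", {"ids": 0.5, "firewall": 0.25})
s1.job_records["S3"] = (21, 25)  # hypothetical track record

# `min` is only a placeholder for the inter-site fuzzy inference system.
trust = s1.assess_trust(s3, combine=min)
print(trust)
```

Passing the inter-site step in as a callable keeps the message flow separate from the inference method, so a fuzzy inference routine could replace `min` without changing the protocol.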

4.2. Fuzzy Logic Inference to Establish Inter-Site Trust

Fuzzy logic is capable of reasoning with the imprecise data or uncertainty associated with the trust index of a resource site. In fuzzy theory, the membership function µ(x) for a fuzzy variable x specifies the degree to which an element belongs to a fuzzy set. It maps element x into the range [0, 1], with 1 for full membership and 0 for no membership. Figure 7(a) shows a “high” membership function for modeling the trust index. Figure 7(b) shows five levels of the trust membership functions. Figure 7(c) shows the variation of the trust index, which is the output of the inter-site fuzzy inference system, with respect to the site reputation and the self-defense capability.
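To make the notion of a membership function concrete, the following Python sketch defines five trust levels with triangular shapes. The shapes, the even spacing, and the level name "very low" are assumptions; the actual curves of Figure 7 are not recoverable from the text.

```python
def triangular(x, left, peak, right):
    """Triangular membership function: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Assumed five levels, evenly spaced on [0, 1] with half-width 0.25.
TRUST_LEVELS = {
    "very low": 0.0,
    "low": 0.25,
    "medium": 0.5,
    "high": 0.75,
    "very high": 1.0,
}


def membership_degrees(x):
    """Degree of membership of a trust value x in each of the five levels."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("trust values are normalized to [0, 1]")
    return {
        level: triangular(x, peak - 0.25, peak, peak + 0.25)
        for level, peak in TRUST_LEVELS.items()
    }


degrees = membership_degrees(0.6)
for level, mu in degrees.items():
    print(f"{level:>9}: {mu:.2f}")
```

With these assumed shapes a trust value of 0.6 belongs partly to "medium" and partly to "high", which is the kind of graded membership that a crisp threshold cannot express.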

A standard fuzzy inference process consists of fivesteps. The inter-site fuzzy inference process using fivesteps is summarized in Algorithm 1.

Algorithm 1: Inter-site fuzzy inference procedure
(1) Calculate site reputation �, and obtain the reported self-defense capability �;
(2) Use � and �’s membership functions to generate the membership degrees for � and �;
(3) Apply the fuzzy rule set, map the � − � space to TI space through fuzzy ‘AND’, ‘OR’ and ‘IMPLY’ operations;
(4) Aggregate the fuzzy outputs from all rules;
(5) Derive the trust index’s numerical value through a defuzzification process.

Figure 8 shows the trust index inference process using the membership functions in Figure 7. We consider the initial values 0.84 for the site reputation and 0.26 for the self-defense capability, obtained from the intra-site inference systems. For simplicity, only two simple fuzzy inference rules are applied in Figure 8.

Figure 7. Membership functions of the trust index. The contour surface shows the variation of the trust index with respect to the site reputation and the self-defense capability.

Figure 8. Fuzzy logic inference over the job success rate and the self-defense capability to aggregate the trust index of a resource site.

Rule 1: If the site reputation is very high and the self-defense capability is medium, then TI is high.

Rule 2: If the site reputation is high and the self-defense capability is low, then TI is medium.

All selected rules are inferred in parallel. Initially, the membership is determined by assessing all terms in the premise. The fuzzy operator ‘AND’ is applied to determine the support degree of each rule. The ‘AGGREGATE’ step superimposes the two curves resulting from the AND operations. The final trust index TI = 0.6 is generated by defuzzifying the aggregation. Many other fuzzy inference rules can be designed using various combinations of the fuzzy variables considered.
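The five steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the five evenly spaced triangular membership functions, the min/max operators, and the centroid defuzzification are all assumptions (the actual curves are those of Figure 7), so the result is not expected to reproduce the TI = 0.6 reported for Figure 8. Only the two example rules are encoded, and the names `LEVELS`, `RULES`, `membership` and `infer_trust_index` are hypothetical.

```python
import numpy as np

# Hypothetical membership levels: five triangles centred at these points,
# each with half-width 0.25 (the paper's actual curves are in Figure 7).
LEVELS = {"very low": 0.0, "low": 0.25, "medium": 0.5, "high": 0.75, "very high": 1.0}

# The two example rules: (reputation level, defense level, resulting TI level).
RULES = [
    ("very high", "medium", "high"),
    ("high", "low", "medium"),
]

def membership(level, x):
    """Degree in [0, 1] to which x belongs to the named fuzzy set."""
    c = LEVELS[level]
    return np.interp(x, [c - 0.25, c, c + 0.25], [0.0, 1.0, 0.0])

def infer_trust_index(reputation, defense, rules=RULES):
    """Fuzzify, AND (min), imply (clip), aggregate (max), defuzzify (centroid)."""
    ti_axis = np.linspace(0.0, 1.0, 101)
    aggregated = np.zeros_like(ti_axis)
    for rep_level, def_level, ti_level in rules:
        # Support degree of the rule: fuzzy AND of the two premises.
        strength = min(membership(rep_level, reputation),
                       membership(def_level, defense))
        # Implication: clip the output set at the support degree.
        clipped = np.minimum(strength, membership(ti_level, ti_axis))
        # Aggregation: superimpose the clipped output curves.
        aggregated = np.maximum(aggregated, clipped)
    if aggregated.sum() == 0.0:
        return None  # no rule fired for these inputs
    return float((ti_axis * aggregated).sum() / aggregated.sum())

ti = infer_trust_index(0.84, 0.26)
```

With these assumed membership functions, Rule 2 dominates for the inputs 0.84 and 0.26, so the defuzzified value falls slightly above the centre of the "medium" set. A fuller rule base would cover all level combinations so that every input pair fires at least one rule.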

Given the numerical values of those security and behavior attributes in Figure 2, we need to define the membership functions and develop suitable inference rules at each level. We use the fuzzy rule extraction method developed by Abe and Lan [1] to derive rules from numerical data. We then use those fuzzy rules to build our fuzzy trust system.

4.3. Fuzzy System Calibration and Site Security Upgrade

The purpose of tuning the fuzzy system is to satisfy the security-assurance condition: TI ≥ SD. In our fuzzy trust model, there are two tuning processes:

(1) Fuzzy system calibration. This process occurs only at the initial system-developing phase. A newly set up fuzzy system for a Grid may not be accurate initially due to the lack of accumulated data. An accurate fuzzy system should be able to infer the correct site trust indices from the security and behavior information collected. As the environment changes, the fuzzy system needs to update its configuration setting repeatedly.

(2) Site security attribute tuning. The site security upgrade process is guided through a top-down system tuning of the security attributes to yield the target trust index. This tuning process has two steps: inter-site tuning and intra-site tuning, as illustrated in Figure 9.

Figure 9. Fuzzy system tuning process to upgrade site trust index.

Figure 10. Security upgrade at site S3 with initial condition from Figure 6.

The goal of the inter-site tuning is to upgrade the self-defense capability so as to elevate the site TI to match the job SD, as specified in Algorithm 2. The inter-site tuning sets the target self-defense capability for the intra-site tuning to achieve security upgrades at individual sites.

Algorithm 2: Inter-site fuzzy system tuning process
(1) target output TI* = average user security demand;
(2) observed output TI = current site trust index;
(3) error = TI* − TI;
(4) while (‖error‖ > ε) {
(5)     Adjust the self-defense capability to a new value quantified by the fuzzy system;
(6)     TI = Inter-site inference (site reputation, new self-defense capability);
(7)     error = TI* − TI;
(8) }
(9) Send the new self-defense capability to the intra-site fuzzy system tuning process.
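The loop of Algorithm 2 can be sketched as below. The sketch rests on several assumptions that are not in the paper: the inter-site inference is replaced by a hypothetical monotone stand-in (`stand_in_inference`), the adjustment in step (5) is taken to be a simple proportional step with a hypothetical `gain`, and an iteration cap is added so that the loop ends even when the target cannot be reached with the self-defense capability limited to [0, 1].

```python
def stand_in_inference(reputation, defense):
    """Hypothetical placeholder for the inter-site fuzzy inference system."""
    return 0.5 * reputation + 0.5 * defense

def tune_defense(target_ti, reputation, defense, infer=stand_in_inference,
                 eps=0.01, gain=1.0, max_iter=100):
    """Return (tuned self-defense capability, converged flag)."""
    ti = infer(reputation, defense)
    error = target_ti - ti
    for _ in range(max_iter):
        if abs(error) <= eps:
            return defense, True
        # Move the defense capability in the direction that reduces the
        # error, keeping it inside the valid range [0, 1].
        defense = min(1.0, max(0.0, defense + gain * error))
        ti = infer(reputation, defense)
        error = target_ti - ti
    return defense, abs(error) <= eps

# Target TI 0.75 for a site with reputation 0.84 and defense capability 0.26.
new_defense, converged = tune_defense(0.75, 0.84, 0.26)
```

The returned value plays the role of the new self-defense capability that step (9) hands to the intra-site tuning process. The converged flag makes an unreachable target visible instead of looping forever.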

The security upgrade incurs system overhead that degrades the Grid performance. This overhead corresponds to the extra CPU cycles used to run defense programs, such as intrusion detection systems, traffic datamining engines, and alert correlation monitors. One advantage of the fuzzy tuning process is that it can provide multiple alternative ways to reach the target defense capability. Thus, a site can choose an achievable method with minimum overhead. For example, a site may be required to upgrade its firewall when that is beyond the capability of the site. An alternative is for the site to upgrade its IDS instead, which could be acceptable to the site. Figure 10 provides a simple example to illustrate the security upgrade process with alternative choices.

5. Trust Integration and Job Mapping with Security Assurance

This section presents the trust update and trust integration process. The trust integration algorithms are presented and illustrated by a running example of a 3-site Grid.

5.1. Trust Update and Integration Process

Our trust integration scheme periodically assesses the mutual trust and updates the trust indices at all Grid sites. We use a job counting method to determine the trust update frequency. This method is similar to the TTL (Time to Live) used in Internet traffic management. After counting a sufficient number of executed jobs, a site Si re-assesses the reputation of site Sj and obtains an updated value of the self-defense capability of site Sj. Let sij be the new trust stimulus between sites Si and Sj. This quantity is inferred from the recent site reputation and self-defense capability through the inter-site inference system. Equation (1) calculates the new trust index from the old trust index and the current stimulus:

$$t_{ij}^{new} = \alpha\, t_{ij}^{old} + (1 - \alpha)\, s_{ij}. \qquad (1)$$

The weighting factor α is a fraction determined by design choices. For security-critical applications, the trust index should change quickly to reflect a new situation, so a small α is adopted, such as α < 0.3. For stable applications with relatively low security sensitivity, a large α is used, such as α > 0.9. In general situations, one can set α in the range (0.7, 0.8). Algorithm 3 summarizes this trust update process.

Algorithm 3: Trust Update Procedure
(1) Site Si assesses site Sj’s reputation;
(2) Site Si obtains the current self-defense capability reported from site Sj;
(3) Site Si calculates the new stimulus value sij = Inter-site fuzzy inference (reputation of Sj, self-defense capability of Sj);
(4) Site Si calculates the trust index of site Sj: $t_{ij}^{new} = \alpha\, t_{ij}^{old} + (1 - \alpha)\, s_{ij}$.
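A short Python sketch of the update in Equation (1), combined with the job counting trigger described above, is given here for illustration. The class name `TrustTracker`, the parameter `jobs_per_update`, and the callback `infer_stimulus` are hypothetical. The paper does not state how many jobs count as sufficient, so that threshold is left to the caller.

```python
def update_trust(old_trust, stimulus, alpha):
    """Equation (1): weighted blend of the old trust index and the new stimulus."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * old_trust + (1.0 - alpha) * stimulus

class TrustTracker:
    """Re-assess a remote site's trust index after every `jobs_per_update` jobs."""

    def __init__(self, initial_trust, alpha, jobs_per_update):
        self.trust = initial_trust
        self.alpha = alpha
        self.jobs_per_update = jobs_per_update
        self._jobs_seen = 0

    def job_finished(self, infer_stimulus):
        """Count one job; when the counter expires, infer a stimulus and update."""
        self._jobs_seen += 1
        if self._jobs_seen >= self.jobs_per_update:
            self._jobs_seen = 0
            self.trust = update_trust(self.trust, infer_stimulus(), self.alpha)
        return self.trust

# A small alpha follows the stimulus quickly; a large alpha changes slowly.
fast = update_trust(0.5, 0.8, alpha=0.2)   # 0.74
slow = update_trust(0.5, 0.8, alpha=0.9)   # 0.53
```

The two calls at the end show the effect of the weighting factor: with the same old index 0.5 and stimulus 0.8, α = 0.2 moves most of the way toward the stimulus, while α = 0.9 barely moves.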

Figure 11 shows the four steps taken by site S1 to update the trust index t31 of site S3 in the trust vector V1. To guarantee fairness, all trust vectors are broadcast among the sites periodically. Trust propagation is triggered by changes in the trust indices. This trust update procedure can be fully automated on a periodic basis.

We use a multicast model to propagate the trust vectors, as illustrated in Figure 3. The multicast is secured with encrypted tunnels between the Grid sites. With m sites, the contribution from each site is roughly 1/m. Equation (2) specifies how each resource site Sj, for j = 1, 2, . . . , m, calculates its new trust vector from the trust vector Vi received from site Si:

$$V_{j}^{new} = \frac{m - 1}{m}\, V_{j}^{old} + \frac{1}{m}\, V_{i}. \qquad (2)$$

The trust propagation and integration process is decentralized for efficiency and fairness to all resource sites involved. Peer evaluation enables dynamic security, because the trust update and propagation are performed periodically.
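Equation (2) amounts to a weighted average of two vectors, which the following sketch illustrates. It assumes that each trust vector holds one trust index per site, so that m can be read from the vector length. The function name `integrate_trust_vector` and the numbers in the example are hypothetical.

```python
import numpy as np

def integrate_trust_vector(v_j_old, v_i):
    """Equation (2): site Sj blends the vector received from Si into its own."""
    v_j_old = np.asarray(v_j_old, dtype=float)
    v_i = np.asarray(v_i, dtype=float)
    if v_j_old.shape != v_i.shape:
        raise ValueError("trust vectors must have the same length")
    m = v_j_old.size  # assumed: one trust index per Grid site
    return ((m - 1) / m) * v_j_old + (1 / m) * v_i

# 3-site Grid: the received vector contributes one third of the new vector.
v_new = integrate_trust_vector([0.6, 0.9, 0.3], [0.9, 0.6, 0.6])
# v_new is approximately [0.7, 0.8, 0.4]
```

Because the received vector carries a weight of only 1/m, a single site cannot shift another site's view by much in one round, which is consistent with the fairness goal stated above.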

5.2. Security Upgrade Policies and Job Scheduling Heuristic

Once the TI of a site cannot match the SD of the jobs dispatched to it, a security upgrade is needed at the site. The security upgrade or countermeasure deployed could bring extra workload or overhead to the site. Our study focuses on the computational Grid, where the overhead is observed as the extra CPU cycles consumed to deploy the countermeasures that remove system or network vulnerabilities. Intuitively, the overhead is proportional to the degree of the security upgrade. The degree is larger for a wider gap between the job SD and the site TI.

Figure 11. Site S1 updates the trust index t31 of S3 from the condition in Figure 6.

The upgrade overhead is the extra time consumed to enhance the defense capabilities of a resource site. This upgrade overhead, denoted by b, is expressed as the percentage of extra time the resource site spends on self-defense and deployment of countermeasures, compared with the total average job turnaround time. Three site security upgrade policies are applied in our trusted Grid simulation experiments.

(A) No upgrade policy. No upgrade is a low-budget approach by which no security upgrade or trust integration is performed, and thus the current trust index is tolerated. The upgrade overhead is denoted by b → ∞. This policy is applied when the Grid sites are operating under severely limited budget and resource constraints.

(B) Full upgrade policy. This is the other extreme, when b → 0. This implies that the sites are fully capable of upgrading their security attributes, guided by the fuzzy trust tuning process. This policy reflects the ideal case of no restriction on the operating budget for upgrading. The local resources are sufficient to upgrade the security. The overhead is considered negligible, because the upgrade uses extra resources, which does not affect the CPU power available to regular Grid applications. This ideal case serves as an upper bound for measuring the performance of the integration process.

(C) Partial upgrade policy. In this policy, the site can afford some limited investment to upgrade its trust index. The upgrade is limited by the site computing power and budget constraints. Thus, a partial upgrade offers a compromise solution. In our experiments, we consider three cases, b = 5%, 15%, and 30%, each corresponding to the increase in average job turnaround time.

In our study, we implement the low-overhead Min–Min heuristic for on-line job scheduling. Specifically, for each job, the Grid site that yields the earliest Expected Time to Completion (ETC) is selected. Then the job that has the minimum earliest ETC is dispatched to its selected Grid site [28]. The security constraint is built into the Min–Min heuristic as follows: if the TI of a Grid site Si is less than the SD of a job Jj, then site Si is assumed to take infinite time to finish job Jj. Thus, Si will not be selected to execute job Jj. In our decentralized SeGO model, each Grid site schedules jobs independently. We use the trust vector to guide secure job mapping.
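The security-constrained Min–Min rule described above can be sketched in a few lines of Python. This is a simplified, single-scheduler illustration rather than the SeGO scheduler itself: the job and site tuples, the function name `min_min_schedule`, and the ETC model (site ready time plus workload divided by speed) are assumptions made for the example.

```python
import math

def min_min_schedule(jobs, sites):
    """jobs: list of (workload, security_demand); sites: list of (speed, trust_index).

    Returns ({job index: (site index, completion time)}, [unschedulable job indices]).
    """
    ready = [0.0] * len(sites)          # time at which each site becomes free
    pending = set(range(len(jobs)))
    assignment = {}
    while pending:
        best = None                     # (completion time, job index, site index)
        # Taking the minimum over all (job, site) pairs equals picking each
        # job's best site first and then the job with the smallest such ETC.
        for j in sorted(pending):
            workload, sd = jobs[j]
            for s, (speed, ti) in enumerate(sites):
                # A site whose TI is below the job SD takes "infinite" time.
                finish = ready[s] + workload / speed if ti >= sd else math.inf
                if best is None or finish < best[0]:
                    best = (finish, j, s)
        finish, j, s = best
        if math.isinf(finish):
            break                       # no remaining job fits on any site
        assignment[j] = (s, finish)
        ready[s] = finish
        pending.remove(j)
    return assignment, sorted(pending)

sites = [(2.0, 0.9), (4.0, 0.5)]                # (speed, trust index)
jobs = [(8.0, 0.8), (4.0, 0.4), (8.0, 0.95)]    # (workload, security demand)
assignment, unscheduled = min_min_schedule(jobs, sites)
# assignment == {1: (1, 1.0), 0: (0, 4.0)}, unscheduled == [2]
```

In this example the third job demands SD = 0.95, which exceeds the trust index of both sites, so it stays unscheduled. Under the upgrade policies discussed above, such a job would become schedulable once a site raises its trust index.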

6. NAS/PSA Benchmarks and Effects of Trust Integration

In this section, we first describe the experimental setup and introduce the workloads used. We then report the effects of trust integration under various job-mapping policies.

6.1. Simulation Setup and Performance Metrics

The performance of the SeGO system is studied using a discrete event-driven simulator. A simulated distributed Grid computing system consisting of m sites is constructed. To model the heterogeneity of Grid sites, each site has a different processing speed and initial self-defense capability. Specifically, in our simulations, the initial self-defense capabilities of the Grid sites are modeled by a uniform distribution, with various mean values, as shown in Table 2.

We consider a realistic distributed Grid computing platform where each site has its own set of users. Accordingly, we model a distributed job submission system in which each site maintains its own queue of locally submitted jobs. Once a job is submitted, it could be outsourced to other sites for execution, as governed by the decisions of the scheduler. Job arrivals are modeled by a Poisson distribution. The Min–Min heuristic under three policies (i.e., no upgrade, full upgrade, and partial upgrade) is used for job mapping. If a job fails on a site, it is resubmitted for execution on the same site.
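For illustration, Poisson job arrivals at one site can be generated by drawing exponentially distributed inter-arrival times, as in the sketch below. The function name `poisson_arrivals` and the seed are hypothetical. The rate of 0.0003 jobs per second per site and the 1000 jobs per site are the PSA settings listed in Table 2.

```python
import random

def poisson_arrivals(rate, num_jobs, seed=None):
    """Arrival times (seconds) of a Poisson process with the given rate."""
    rng = random.Random(seed)
    t = 0.0
    times = []
    for _ in range(num_jobs):
        # Inter-arrival times of a Poisson process are exponential
        # with mean 1 / rate.
        t += rng.expovariate(rate)
        times.append(t)
    return times

# PSA setting from Table 2: 0.0003 jobs/second/site, 1000 jobs per site.
arrivals = poisson_arrivals(rate=0.0003, num_jobs=1000, seed=1)
```

At this rate the mean gap between consecutive jobs at a site is 1/0.0003, or roughly 3333 seconds.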

To evaluate the trust integration performance, we use the following metrics [20]:
• Makespan: the total running time of all simulated jobs;
• Grid utilization: defined by the percentage of processing power allocated to user jobs out of the total processing power available over all Grid sites;
• Failure rate: the percentage of jobs failed and resubmitted in the system;
• Average turnaround time: the average time spent by a job in the system;
• Trust index: the trustworthiness of a site, which is assessed by the fuzzy trust model.
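The first four metrics in the list above can be computed from per-job records, as the following sketch shows. The record layout (`JobRecord`) and the function `summarize` are hypothetical. The sketch also makes simplifying assumptions that the paper does not state: makespan is measured from the earliest arrival to the latest completion, and every site contributes one unit of processing power, so heterogeneous site speeds are ignored.

```python
from dataclasses import dataclass

@dataclass
class JobRecord:
    arrival: float      # time the job entered the system
    completion: float   # time the job finally finished
    busy_time: float    # processing time spent on the job, in site-seconds
    failed: bool        # True if the job failed at least once and was resubmitted

def summarize(records, num_sites):
    """Makespan, Grid utilization, failure rate and average turnaround time."""
    start = min(r.arrival for r in records)
    end = max(r.completion for r in records)
    makespan = end - start
    return {
        "makespan": makespan,
        # Fraction of the available site-seconds that went to user jobs.
        "utilization": sum(r.busy_time for r in records) / (num_sites * makespan),
        "failure_rate": sum(r.failed for r in records) / len(records),
        "avg_turnaround": sum(r.completion - r.arrival for r in records) / len(records),
    }

records = [JobRecord(0.0, 10.0, 6.0, False), JobRecord(2.0, 12.0, 8.0, True)]
metrics = summarize(records, num_sites=2)
```

For the two sample records, the makespan is 12, half of the jobs were resubmitted, and the average turnaround time is 10.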

6.2. Workloads and Parameter Settings

In order to gain more practical insights into the effectiveness of the trust integration, we use two types of realistic workloads.

Table 2. Simulation parameters and settings used in experiments.

Parameter                    Value
Number of jobs               NAS: 16000; PSA: 1000 jobs/site
Number of sites              NAS: 12; PSA: 10, 15, 20 (default), 30, 50
Job arrival rate             NAS: given by trace; PSA: 0.0003 jobs/second/site
Job workloads                NAS: given by trace; PSA: 20 levels (0–300000)
Site processing speed        NAS: 8×8 nodes and 4×16 nodes; PSA: 10 levels (0–10)
Mean job security demands    0.45, 0.55, 0.65, 0.75 (default), 0.85
Initial site trust index     0.3–1.0, uniform distribution

NAS trace workload. We use three months of accounting records for the 128-node iPSC/860 in the Numerical Aerodynamic Simulation (NAS) at NASA Ames Research Center. This trace contains 92 days of data, gathered in 1993. To test the performance of the heuristics under a high-throughput Grid environment, the 92-day trace is proportionally squeezed to 46 days. We map the 128 nodes to 12 Grid sites: four of the sites each contain 16 nodes, and the other eight sites each contain 8 nodes. Our simulations are based on the arrival time, job size, and runtime data provided by the trace. Detailed information about the characteristics of the trace can be found in [12].

The PSA workload. The parameter-sweep application (PSA) model has emerged as a “killer application model” for composing high-throughput computing applications on global Grids. The PSA is defined as a set of independent sequential jobs [8]. The jobs operate on different datasets. A range of scenarios and parameters to be explored are applied to the program input values to generate different datasets. The execution model essentially involves processing n jobs (each with the same task specification, but a different dataset) on m distributed sites, where n is typically much larger than m.

Table 2 lists the key simulation parameters. We report our results for various parameters. For each parameter, the default value and range are provided. The default values are used unless otherwise specified.
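Returning to the NAS trace, the proportional squeezing from 92 to 46 days and the node-to-site mapping can be illustrated as follows. The sketch assumes that only the arrival times are rescaled (the paper does not say whether runtimes are changed), and the names `compress_arrivals` and `SITE_NODES` are hypothetical.

```python
def compress_arrivals(arrival_times, original_days=92, target_days=46):
    """Rescale arrival times so that the trace spans target_days instead of original_days."""
    factor = target_days / original_days
    origin = min(arrival_times)
    # Keep the first arrival fixed and shrink every later offset proportionally.
    return [origin + (t - origin) * factor for t in arrival_times]

# Node-to-site mapping described in the text: four 16-node sites, eight 8-node sites.
SITE_NODES = [16] * 4 + [8] * 8

compressed = compress_arrivals([0, 100, 200])
# compressed == [0.0, 50.0, 100.0]
```

Halving the time span doubles the effective job arrival rate while leaving the order of arrivals unchanged, and the twelve sites together account for all 128 nodes of the original machine.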

6.3. Performance Effects of Trust Integration

Two fuzzy tuning processes are described in Section 4.3. Our simulated Grid environment assumes an accurate fuzzy system. Thus, we only study the effects of the tuning process for site security upgrade. In this section, we show the simulated results of trust integration and site security upgrade, guided by the site security attribute tuning. All numerical results are collected from experiments running the NAS workload. Figure 12 shows the variation of the trust index against the trust integration steps.

Figure 12(a) shows the individual site trust index for three typical sites (S2 has a low initial trust value, S9 has a medium initial trust value, and S12 has a high initial trust value). For all three sites, the trust indices increase as integration is performed. The trust index increases rapidly for sites with a low initial trust value, while it increases slowly for those with a high initial value. Figure 12(b) shows the average trust index for three policies.

A flat trust index is observed under the no upgrade policy. The full upgrade policy achieves a moderately higher trust value than the partial upgrade policy. For the partial upgrade policy, a lower trust index is observed when more overhead is involved. Figure 12(c) shows the average trust indices for various SDs. As can be seen, the trust index increases swiftly for the cases with a high security demand.

Figure 13 shows the variation of the self-defense capability and site reputation against the trust integration steps. Figure 13(a) shows the individual self-defense capability for three typical sites. Figure 13(b) shows the self-defense capability for three policies. The trends observed from Figures 13(a) and (b) are similar to those from Figures 12(a) and (b). Figure 13(c) shows the self-defense capability and site reputation for the full and partial upgrade policies. Here, we can see that for both full and partial upgrade, the site reputation increases at a slower pace than the self-defense capability. This can be explained by the fact that once a site upgrades its security attributes, a certain delay is needed before the site reputation is affected.

Figure 12. Variation of trust index at Grid sites during trust integration for the NAS workload: (a) full upgrade over 3 Grid sites; (b) average trust index under 3 policies; (c) effects on trust index under various security demands and full upgradeability.

6.4. Effects of Security Upgrade Overheads

Site security upgrade incurs overhead, which degrades the Grid performance. The overhead manifests as lower processing power for user applications, due to the need to execute extra security-related operations, and as higher memory usage. Figure 14 shows the performance degradation in terms of the average job turnaround time. It shows the results of partial upgrades for three cases, compared with the results of two extreme cases: no upgrade and full upgrade with negligible overhead.

Four observations are made here: (1) the best performance is achieved by the full upgrade policy; (2) the turnaround time increases as the security demand increases for all policies; (3) for the partial upgrade policy, the turnaround time increases as b increases; (4) interestingly, compared with the partial upgrade with b = 30%, the no upgrade policy provides better performance for low job security demands, for example, SD < 0.55.

Thus, to match the user security demand, the overhead of security enhancements must be kept low. The lesson learned here is that when the upgrade overhead is low, the security upgrade should be performed regardless of the requirements of user jobs. However, if the upgrade involves high overhead, the security upgrade should be executed only when user jobs require high security assurance.

Figure 13. Variation of the self-defense capability and site reputation: (a) self-defense capability under full upgrade policy; (b) average self-defense capability under 3 upgrade policies; (c) average site reputation and self-defense capability under 2 upgrade policies.

Figure 14. Performance degradation quantified by the average job turnaround time due to security update overhead under the NAS workload.

7. Experimental Results and Scalability Analysis

In this section, we present the system performance results on the NAS workload and the PSA workload. We also study the scalability effects of the system.

7.1. Performance Results over the NAS Workload

We first test the performance of our trust integration on NAS workloads. Figure 15 shows the results for various job security demands. The first observation is that the full upgrade policy gives the best performance on all metrics. Secondly, as the job SD increases, all performance metrics are degraded. While small degradation is observed for the full upgrade policy and for the partial upgrade policies with low overhead (5% and 15% overheads), large degradation is observed for the partial upgrade policy with high overhead (30% overhead) and for the no upgrade policy. These results reinforce our belief that security upgrade should be performed with low overhead when jobs require high security assurance.

Figure 15. Performance results on m = 12 sites under NAS workload of n = 16,000 jobs: (a) makespan; (b) Grid utilization; (c) failure rate.

7.2. Performance Results over the PSA Workload

Figure 16 shows the results for the PSA workload with various job SDs. The relative performance differences among the three policies for the PSA workload are similar to those for the NAS workload. The full upgrade policy, as the ideal case, still has the best performance. Finally, since the NAS and PSA workloads are quite different in structure, these results show that the fuzzy integration system is robust across different types of workload.

7.3. Scaling Effects and Scalability Analysis

To test the scalability effects, we use only the PSA workload, because the NAS trace has a fixed number of jobs. The scaling effects are studied on two dimensions: the number of jobs (n) and the number of Grid sites (m). Figure 17 shows the results for a scalable number of jobs. Overall, full upgrade demonstrates the best performance, followed by the partial upgrade with low overhead, the partial upgrade with high overhead, and the no upgrade policy. In Figure 17(a), a linearly increasing makespan is observed for all three policies. In Figure 17(b), Grid utilization increases as more jobs are simulated. This is because, as more jobs are submitted to the system, more trust integration steps are applied, and thus the Grid environment eventually enters a fully secure status. Next, in Figure 17(c), the job failure rate drops as more jobs are simulated.

Figure 16. Performance results on m = 20 sites under PSA workload of n = 20,000 jobs: (a) makespan; (b) Grid utilization; (c) failure rate.

Figure 18 shows the results for a scalable number of sites (Grid size). Near-flat trends are observed for all performance metrics. These results demonstrate that our trust integration scheme is highly scalable for an increasing number of Grid sites. Specifically, as more sites are involved, decreasing trends are observed for job failure rates, as shown in Figure 18(c) for both the full upgrade and partial upgrade policies. This is because, as more sites are involved in the trust integration, security upgrade is requested from more sites, in turn speeding up the trust integration mechanism.

8. Conclusions and Final Remarks

In this paper, we proposed a new fuzzy-logic based trust model for trusted Grid computing with security assurance. Our trust model aggregates many reputation attributes and measurable self-defense capabilities into scalar quantities, which can be easily applied to quantify the trust index of a Grid resource site. The fuzzy trust model lays the necessary foundation of trusted Grid computing. The trust model enables security binding in mapping a large number of user jobs onto multiple Grid sites. The fuzzification and trust integration process can be implemented with encrypted VPN tunnels or fast DHT-based security overlays on top of WANs or multiple edge networks used to interconnect the Grid platform.

Figure 17. Performance results for scalable workload in the PSA experiments on 20 sites: (a) makespan; (b) Grid utilization; (c) failure rate.

This security binding scheme is proven effective in mapping large-scale workloads in our comprehensive NAS and PSA benchmark experiments. New performance metrics are developed to assess the effects of trust integration and secure allocation of trusted resources to enormous Grid jobs. Our secure binding scheme scales well with both the number of jobs and the number of sites. Trusted job outsourcing makes it possible to use open Grid resources with confidence and with controllable risks. Network threats are unavoidable, but security reinforcements are always possible. The higher the trust indices raised for resource sites, the better the expectation of sustained Grid performance in real-life applications.

For further research, we suggest extending the work to non-cooperative Grids, in which the problems of Grid sites being selfish and insecure can be solved simultaneously [25]. We will also develop new security-driven heuristics and new genetic algorithms for trusted Grid computing [36]. Other exciting extensions will be in developing software toolkits for automated trust evaluation and automated intrusion detection and response systems for protecting Grid resource sites [19]. The GridSec project at USC continues making progress in Internet worm control and suppression of DDoS flood attacks, all of which are meant to upgrade Grid performance through security enhancement.

Figure 18. Scaling effects of the Grid size from 10 to 50 for PSA workload with 1,000 jobs/site: (a) makespan; (b) Grid utilization; (c) failure rate.

Acknowledgments

The funding support of this work by the NSF ITR Grant ACI-0325409 is appreciated. We are also indebted to the USC GridSec team members, Hua Liu, Min Cai, Ying Chen, and Yu Chen, for inspiring discussions during the course of this research.

References

1. S. Abe and M. Lan, “Fuzzy Rules Extraction Directly from Numerical Data for Function Approximation”, IEEE Trans. on SMC, Vol. 25, pp. 119–129, 1995.

2. F. Azzedin and M. Maheswaran, “A Trust Brokering System and Its Application to Resource Management in Public-Resource Grids”, in Proceedings of IPDPS 2004.

3. F. Berman, G. Fox and T. Hey (eds.), Grid Computing: Making the Global Infrastructure a Reality. Wiley, 2003.

4. F. Berman, R. Wolski, H. Casanova, W. Cirne, H. Dail, M. Faerman, S. Figueira, J. Hayes, G. Obertelli, J. Schopf, G. Shao, S. Smallen, N. Spring, A. Su and D. Zagorodnov, “Adaptive Computing on the Grid Using AppLeS”, IEEE Trans. on Parallel and Distributed Systems, Vol. 14, April 2003.

5. A. Butt, S. Adabala, N. Kapadia, R. Figueiredo and J. Fortes, “Fine-Grain Access Control for Securing Shared Resources in Computational Grids”, in Proceedings of IPDPS 2002, April 2002.

6. R. Buyya, M. Murshed and D. Abramson, “A Deadline and Budget Constrained Cost-Time Optimization Algorithm for Scheduling Task Farming Applications on Global Grids”, in The Internat. Conf. on Parallel and Distributed Processing Techniques and Applications, 2002.

7. M. Cai, Y. Chen, Y.K. Kwok and K. Hwang, “Fast Containment of Internet Worm Outbreaks and Flood Attacks with Distributed-Hashing Security Overlays”, IEEE Security and Privacy, submitted July 2004 and revised February 2005.

8. H. Casanova, A. Legrand, D. Zagorodnov and F. Berman, “Heuristics for Scheduling Parameter Sweep Applications in Grid Environments”, in Proceedings of HCW 2000.

9. M. Cosnard and A. Merzky, “Meta- and Grid-Computing”, in Proceedings of the 8th International Euro-Par Conference, August 2002, pp. 861–862.

10. K. Czajkowski, I. Foster and C. Kesselman, “Resource Co-Allocation in Computational Grids”, in Proceedings of the 8th IEEE Int’l Symposium on High Performance Distributed Computing (HPDC-8), 1999.

11. E. Damiani, S. De Capitani di Vimercati, S. Paraboschi, P. Samarati and F. Violante, “A Reputation-Based Approach for Choosing Reliable Resources in Peer-to-Peer Networks”, in Proceedings of ACM CCS 2002.

12. D.G. Feitelson and B. Nitzberg, “Job Characteristics of a Production Parallel Scientific Workload on the NASA Ames iPSC/860”, Research report RC 19790 (87657), IBM T.J. Watson Research Center, October 1994.

13. I. Foster, C. Kesselman and G. Tsudik, “The Security Architecture for Open Grid Services”, in The 5th ACM Conference on Computer and Communications Security, 1998, pp. 83–92.

14. R. Guha, R. Kumar, P. Raghavan and A. Tomkins, “Propagation of Trust and Distrust”, in Proceedings of ACM WWW 2004.

15. M. Gupta, P. Judge and M. Ammar, “A Reputation System for Peer-to-Peer Networks”, in Proceedings of ACM NOSSDAV 2003.

16. X. He, X.H. Sun and G. Laszewski, “A QoS Guided Scheduling Algorithm for the Computational Grid”, in GCC02, Hainan, China, December 2002.

17. M. Humphrey and M.R. Thompson, “Security Implications of Typical Grid Computing Usage Scenarios”, in Proceedings of HPDC, August 2001.

18. S. Hwang and C. Kesselman, “A Flexible Framework for Fault Tolerance in the Grid”, J. Grid Computing, Vol. 1, No. 3, pp. 251–272, 2003.

19. K. Hwang, Y. Kwok, S. Song, M. Cai, R. Zhou, Yu Chen, Ying Chen and X. Lou, “GridSec: Trusted Grid Computing with Security Binding and Self-Defense against Network Worms and DDoS Attacks”, in International Workshop on Grid Computing Security and Resource Management (GSRM’05), in conjunction with ICCS 2005, Atlanta, May 22–25, 2005.

20. K. Hwang and Z. Xu, Scalable Parallel Computing. McGraw-Hill: San Francisco, 1998.

21. M. Humphrey, M. Thompson and K. Jackson, “Security for Grids”, Proceedings of the IEEE, Vol. 93, No. 3, pp. 644–652, 2005.

22. J. In, P. Avery, R. Cavanaugh and S. Ranka, “Policy-Based Scheduling for Simple Quality of Service in Grid Computing”, in Proceedings of IPDPS 2004, April 2004.

23. S.D. Kamvar, M.T. Schlosser and H. Garcia-Molina, “The Eigentrust Algorithm for Reputation Management in P2P Networks”, in Proceedings of ACM WWW 2003.

24. B. Kosko, Fuzzy Engineering. Prentice Hall, 1997.

25. Y.-K. Kwok, S. Song and K. Hwang, “Selfish Grid Computing: Game-Theoretic Modeling and NAS Performance Results”, in Proceedings of CCGrid 2005, Cardiff, UK, May 2005.

26. C. Lin, V. Varadharajan, Y. Wang and V. Pruthi, “Enhancing Grid Security with Trust Management”, in Proceedings of Services Computing 2004 (SCC 2004).

27. C. Liu, L. Yang, I. Foster and D. Angulo, “Design and Evaluation of a Resource Selection Framework for Grid Applications”, in Proceedings of HPDC-11, 2002.

28. M. Maheswaran, S. Ali and H.J. Siegel, “Dynamic Mapping and Scheduling of Independent Tasks onto Heterogeneous Computing Systems”, JPDC, pp. 107–131, 1999.

29. S. Marti and H. Garcia-Molina, “Limited Reputation Sharing in P2P Systems”, in Proceedings of ACM EC 2004.

30. J. Novotny, S. Tuecke and V. Welch, “An Online Credential Repository for the Grid: MyProxy”, in The 10th IEEE International Symposium on High Performance Distributed Computing (HPDC-10’01), San Francisco, CA, August 07–09, 2001.

31. R. Perlman, “An Overview of PKI Trust Models”, IEEE Network, December 1999, pp. 38–43.

32. T.B. Quillinan, B.C. Clayton and S.N. Foley, “GridAdmin: Decentralising Grid administration Using Trust Management”, in Proceedings of the ISPDC/HeteroPar’04, pp. 184–192.

33. R. Raman, M. Livny and M. Solomon, “Matchmaking: Distributed Resource Management for High Throughput Computing”, in Proceedings of the 7th IEEE International Symposium on High Performance Distributed Computing, Chicago, IL, July 28–31, 1998.

34. J.M. Schopf, “A General Architecture for Scheduling on the Grid”, Special Issue on Grid Computing, J. Parallel and Distributed Computing, April 2002.

35. S. Song, K. Hwang and M. Macwan, “Fuzzy Trust Integration for Security Enforcement in Grid Computing”, in Proceedings of IFIP International Conf. on Network and Parallel Computing (NPC-2004), Wuhan, China, October 18–20, 2004, pp. 9–21.

36. S. Song, Y.-K. Kwok and K. Hwang, “Security-Driven Heuristics and a Fast Genetic Algorithm for Trusted Grid Computing”, in Proceedings of IPDPS 2005, Denver, Colorado, April 4–8, 2005.

37. I. Stoica, R. Morris, D. Liben-Nowell, D.R. Karger, M.F. Kaashoek, F. Dabek and H. Balakrishnan, “A Scalable Peer-to-Peer Lookup Protocol for Internet Applications”, IEEE/ACM Trans. on Networking, Vol. 11, No. 1, pp. 17–32, 2003.

38. M. Surridge and C. Upstill, “Grid Security: Lessons for Peer-to-Peer Systems”, in Proceedings of the 3rd International Conference on Peer-to-Peer Computing (P2P 2003), September 1–3, 2003.

39. S. Tuecke, “Grid Security Infrastructure (GSI) Roadmap”, Internet Draft, October 2000, http://www.gridforum.org/security/ggf1_2001-03/drafts/draft-ggf-gsi-roadmap-02.pdf.

40. S. Vadhiyar and J. Dongarra, “A Metascheduler for the Grid”, in The 11th IEEE International Symposium on High Performance Distributed Computing (HPDC’02), Edinburgh, Scotland, July 24–26, 2002.

41. V. Welch, F. Siebenlist, I. Foster, J. Bresnahan, K. Czajkowski, J. Gawor, C. Kesselman, S. Meder, L. Pearlman and S. Tuecke, “Security for Grid Services”, in Proceedings of the HPDC-12, 2003.

42. R. Wolski, J. Brevik, J. Plank and T. Bryan, “Grid Resource Allocation and Control Using Computational Economies”, Chapter 32 in F. Berman, G. Fox and A. Hey (eds.), Grid Computing: Making the Global Infrastructure a Reality, Wiley, 2003.

43. M. Wu and X. Sun, “A General Self-adaptive Task Scheduling System for Non-dedicated Heterogeneous Computing”, in IEEE Int’l Conf. on Cluster Computing, December 2003.

44. L. Xiong and L. Liu, “PeerTrust: Supporting Reputation-based Trust to P2P E-Communities”, IEEE Trans. Knowledge and Data Engineering, July 2004, pp. 843–857.