A Heuristic Algorithm for Task Scheduling Based on Mean Load

Lina Ni 1,2, Jinquan Zhang 1,2, Chungang Yan 1, Changjun Jiang 1

1 Department of Computer Science, Tongji University, Shanghai, 200092, China

2 College of Information Science & Engineering, Shandong University of Science & Technology, Qingdao, 266510, China

[email protected]

Abstract. Efficient task scheduling is critical to achieving high performance in a Grid computing environment. This paper presents a heuristic task scheduling algorithm that satisfies resource load balancing in a Grid environment. The algorithm schedules tasks by employing the mean load, computed from the predictive execution times of the tasks, as heuristic information to obtain an initial scheduling strategy. An optimal scheduling strategy is then achieved by selecting two machines that satisfy a given condition and changing their loads by reassigning their tasks under the heuristic of their mean load. Methods for selecting machines and tasks are given to increase the throughput of the system and reduce the total waiting time. The performance of the proposed algorithm is evaluated via extensive simulation experiments. The results show that the heuristic algorithm ensures high load balancing and achieves an optimal scheduling strategy almost all the time. Furthermore, the results show that our algorithm is highly efficient in terms of time complexity.

1. Introduction

Efficient task scheduling is critical to achieving high performance in parallel and cluster systems. The purpose of scheduling is to map the tasks onto the processors and order their execution so that the minimum makespan is achieved and load balancing across the entire system is satisfied. Since general DAG scheduling is NP-complete, many algorithms for the task scheduling problem in heterogeneous computing environments have been proposed [1-6].

The Grid enables coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations, while the Semantic Web is creating a new interconnection environment that incorporates the Internet, sensor networks, mobile devices, and interconnection semantics [7].

This work is supported by the National Grand Fundamental Research 973 Program of China under Grant No. 2003CB316902, the National Science Foundation of China under Grant No. 90412013, the National 863 Plan (No. 2002AA4Z343, 2002AA1Z2102A), the Foundation for University Key Teacher by the Ministry of Education, and the Shanghai Science & Technology Research Plan (035115029).

The emergence of the Semantic Grid puts forward new challenges for task scheduling. Since Grid resources are dynamic, autonomous and heterogeneous, Grid resource management, of which task scheduling is an important part, becomes more difficult. When applied to Grid environments, the methods mentioned above often result in poor performance due to the heterogeneity of Grid resources. Grid task scheduling focuses on task execution time because of the open, dynamic Grid environment [8]. Many scheduling policies and algorithms that handle different kinds of tasks in heterogeneous environments have been proposed [9].

O. H. Ibarra et al. [10] present the Min-Min and Max-Min task scheduling algorithms for multiple users in a heterogeneous environment. Both algorithms consider a hypothetical assignment of tasks to machines, projecting when a machine will become idle based on that hypothetical assignment. The Max-Min algorithm selects the task that will take the maximum time to finish, whereas the Min-Min algorithm selects the task that could finish in the minimum time. Once selected, the task is assigned, the projected machine idle time is updated, and the task is removed from the set of unassigned tasks. K. Taura et al. [11] present a graph-theoretic formulation of the task scheduling problem and propose a heuristic algorithm for it. The algorithm takes as input a task graph and a resource graph and outputs a mapping from tasks to processors, taking both parallelism (load balance) and communication volume (locality) into account. However, these algorithms do not take the performance metrics of machines into consideration.
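To make these baselines concrete, the selection rule of Min-Min and Max-Min described above can be sketched in a few lines of Python. This is our own illustrative rendering with machine-independent execution times, not the code of [10]; the function name and data layout are assumptions.

# Illustrative sketch of the Min-Min / Max-Min idea described above.
# Tasks are given by their execution times; machine "ready times" track when
# each machine becomes idle under the hypothetical assignment.

def schedule(exec_times, num_machines, pick_max=False):
    ready = [0.0] * num_machines              # projected idle time of each machine
    assignment = {}                           # task index -> machine index
    unassigned = set(range(len(exec_times)))
    while unassigned:
        # For every unassigned task, find its earliest possible completion time.
        best = {}
        for t in unassigned:
            m = min(range(num_machines), key=lambda j: ready[j] + exec_times[t])
            best[t] = (ready[m] + exec_times[t], m)
        # Min-Min picks the task that finishes earliest; Max-Min the one that
        # finishes latest among those per-task minima.
        chooser = max if pick_max else min
        task = chooser(unassigned, key=lambda t: best[t][0])
        finish, machine = best[task]
        assignment[task] = machine
        ready[machine] = finish               # update projected idle time
        unassigned.remove(task)
    return assignment, ready

if __name__ == "__main__":
    times = [25, 19, 17, 15, 10, 7, 5, 1]
    print(schedule(times, 3))                 # Min-Min
    print(schedule(times, 3, pick_max=True))  # Max-Min

Both variants keep all machines busy, but they ignore per-machine performance metrics, which is the limitation noted above.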

In this paper, we present a heuristic task scheduling algorithm for the Grid computing environment based on the predictive execution times of tasks obtained from the predictive models presented in [12-17]. Our algorithm obtains a scheduling strategy by employing the mean load as heuristic information and then selects the maximum-load and minimum-load machines. We reassign tasks between these two machines to raise the load of the lower-loaded machine and reduce that of the higher-loaded machine under the mean-load heuristic. In the algorithm, a greedy strategy is employed to select for reassignment the tasks that improve the load of the lower-loaded machine as quickly as possible. Therefore, the efficiency of the algorithm is significantly improved and the overall performance of the whole system is optimized. We have developed a simulator to evaluate the performance of the proposed algorithm and performed a number of experiments with it. The results show that our algorithm is superior to other algorithms with respect to the load balancing of machines and can almost always achieve an optimal scheduling strategy. The experiments also show that our algorithm is highly efficient in terms of time complexity.

The paper is organized as follows. In Section 2, the scheduling problem is described. Section 3 describes our heuristic scheduling algorithm in detail. In Section 4, we analyze the efficiency of the algorithm. Simulation experiments are presented in Section 5. Finally, in Section 6, conclusions are drawn and several issues for future work are indicated.

2. Problem Description

The objective of task scheduling is to increase the QoS of the system to meet users' specific requests for executing tasks. One measure of QoS is system throughput. A generic method attempts to maximize system throughput by keeping all machines "busy", that is, by load balancing. We first give an example of task scheduling.

We represent the load of a machine as the sum of the predictive execution times of the unfinished and waiting tasks on that machine. Figure 1 gives two different scheduling strategies in which 8 tasks (t1, t2, t3, t4, t5, t6, t7 and t8) are completed on 3 machines (m1, m2 and m3), where the predictive execution times of the tasks are α(t1) = 25, α(t2) = 19, α(t3) = 17, α(t4) = 15, α(t5) = 10, α(t6) = 7, α(t7) = 5 and α(t8) = 1, respectively. In the first scheduling strategy (in (a)), t1, t6 and t8 are assigned to m1, t2 and t5 are assigned to m2, and t3, t4 and t7 are assigned to m3. In the second scheduling strategy (in (b)), t1, t6 and t8 are assigned to m1, t2 and t4 are assigned to m2, and t3, t5 and t7 are assigned to m3. In the former, the loads of the machines are 33, 29 and 37, respectively; in the latter, they are 33, 34 and 32, respectively. Obviously, the system throughput of the latter is higher than that of the former. In fact, the latter is an optimal scheduling strategy.

Figure 1. Two scheduling strategies for 8 tasks on 3 machines: (a) result of the first scheduling strategy; (b) result of the second scheduling strategy.
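As a quick check of this example, the machine loads of the two strategies can be reproduced with a few lines of Python (our own illustration; the dictionaries below simply encode the assignments of Figure 1):

# Predictive execution times of t1..t8 from the example above.
alpha = {1: 25, 2: 19, 3: 17, 4: 15, 5: 10, 6: 7, 7: 5, 8: 1}

strategy_a = {"m1": [1, 6, 8], "m2": [2, 5], "m3": [3, 4, 7]}
strategy_b = {"m1": [1, 6, 8], "m2": [2, 4], "m3": [3, 5, 7]}

for name, strategy in (("a", strategy_a), ("b", strategy_b)):
    loads = {m: sum(alpha[t] for t in tasks) for m, tasks in strategy.items()}
    print(name, loads)   # a: 33, 29, 37   b: 33, 34, 32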

Now suppose that n tasks t1, t2, ..., tn are to be executed on p machines m1, m2, ..., mp (p < n). Let α(ti) be the execution time of task ti and let

   A_np = (α(t1) + α(t2) + ... + α(tn)) / p

be the mean load of the n tasks completed on the p machines. Let α(mi) be the total execution time of the ki tasks assigned to mi in a scheduling strategy. For convenience of exposition, we use a non-ascending sequence α(mk1), ..., α(mkp) (an arrangement of the sequence α(m1), ..., α(mp)) to denote a scheduling strategy in the following. In other words, using α(m1), ..., α(mp) to denote a scheduling strategy means that the sequence has been arranged in non-ascending order. We now define how two such sequences are compared.

Definition 1. Let X = {x1, x2, ..., xn} and Y = {y1, y2, ..., yn} be two sequences of real numbers. X is said to be smaller than Y, denoted X < Y, if for all i, j, k, l in {1, 2, ..., n} with xi + xj = yk + yl we have max{xi, xj} < max{yk, yl}.

Definition 2. Let n tasks t1, t2, ..., tn execute on p machines m1, m2, ..., mp (p < n). The scheduling α(m1), ..., α(mp) is said to be an optimal scheduling strategy if {α(m1), ..., α(mp)} < {α(m1'), ..., α(mp')} for any scheduling α(m1'), ..., α(mp').

From Definition 2 we can see that an optimal scheduling always expects each machine to be loaded as lightly as possible, which results in each machine undertaking the tasks "evenly". We therefore conclude that an optimal scheduling is one that has achieved load balancing and hence has high throughput.

If a scheduling satisfies α(m1) = ... = α(mp) = A_np, it is obviously an optimal scheduling. Thus, we employ A_np as the heuristic information of our task scheduling. As mentioned above, task scheduling is an NP-complete problem. We can therefore search for a near-optimal scheduling strategy by adding a relaxation factor γ (γ ≥ 1) and employing γ·A_np as the heuristic information. It is easy to see that the scheduling is an optimal scheduling if γ = 1.

Theorem 1. Let n tasks t1, t2, ..., tn, arranged in non-ascending order of execution time, execute on p machines m1, m2, ..., mp (p < n). Suppose that α(t1) > γ·A_np and that t1 alone is executed on one machine (say mp). Then the optimal scheduling strategy obtained by letting the n − 1 tasks t2, ..., tn execute on the remaining p − 1 machines, together with t1 on mp, is an optimal scheduling strategy for the n tasks t1, t2, ..., tn on the p machines.

Proof. The conclusion is obvious.

Corollary. Let n tasks t1, t2, ..., tn, arranged in non-ascending order of execution time, execute on p machines m1, m2, ..., mp (p < n). Suppose that k (obviously, k < p) tasks satisfy α(ti) > (α(ti) + α(ti+1) + ... + α(tn)) / (p − i + 1) for i = 1, ..., k, that is, each of them exceeds the mean load of the remaining tasks over the remaining machines, and that these k tasks are executed on k machines. Then the optimal scheduling strategy obtained by letting the tasks tk+1, ..., tn execute on the remaining p − k machines is an optimal scheduling strategy for the n tasks t1, t2, ..., tn on the p machines.

From Theorem 1 and its corollary we conclude that each task whose execution time is greater than the mean load can be executed alone on one machine, and that we can speed up the iterative search for an optimal scheduling by excluding these tasks from the iterative procedure. In the remainder of the paper we therefore assume that the execution time of each task is smaller than the mean load.
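The corollary suggests a simple preprocessing step: repeatedly give the largest remaining task its own machine whenever its execution time exceeds the mean load of the remaining tasks over the remaining machines. The following Python sketch is our own illustration of this idea (the function name peel_large_tasks and the return format are assumptions, not part of the paper):

def peel_large_tasks(times, p):
    """Give each task whose execution time exceeds the current mean load its
    own machine, as justified by Theorem 1 and its corollary.

    times: execution times; p: number of machines.
    Returns (dedicated, remaining_times, remaining_machines)."""
    times = sorted(times, reverse=True)     # non-ascending order
    dedicated = []
    while p > 1 and times and times[0] > sum(times) / p:
        dedicated.append(times.pop(0))      # this task runs alone on one machine
        p -= 1                              # one fewer machine remains
    return dedicated, times, p

This is exactly what step 3) of Algorithm 1 below does, with the running mean updated as A_np = (p·A_np − α(ti')) / (p − 1).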

3. The Scheduling Algorithm

In this section, we present our heuristic task scheduling algorithm, which depends on the mean load and satisfies resource load balancing in a Grid environment.

3.1 Overall Structure

We believe that it is very important to select a good initial scheduling strategy in order to obtain a highly efficient scheduling algorithm. Therefore, we start our algorithm by generating an initial scheduling. The algorithm proceeds in the following three stages:

(1) Generate an initial scheduling under the heuristic of the mean load, that is, assign tasks to each machine in the order of the non-ascending task sequence t1, t2, ..., tn. If assigning a task to the current machine does not make the load of that machine exceed the mean load A_np, the task is assigned to it. If all tasks can be assigned to machines in this way, an optimal scheduling strategy has been achieved. Otherwise, assign the remaining tasks to the last machine (Algorithm 1) and execute the two optimizing procedures described below (Algorithms 3 and 4).

(2) Choose two machines to exchange tasks under the heuristic of A_np and balance the loads of these two machines (Algorithm 3).

(3) Choose two machines, employ their mean load as the heuristic information, and balance the loads of these two machines by exchanging tasks between them (Algorithm 4).

Algorithm 1 TaskSchedule( ) {
1) According to the execution time, sort the task sequence t1, t2, ..., tn into a non-ascending order t1', t2', ..., tn';
2) Compute the mean load of the p machines: A_np = (α(t1') + ... + α(tn')) / p;
3) i = 1;
   while (α(ti') > A_np) {
      schedule ti' on mp;
      A_np = (p·A_np − α(ti')) / (p − 1);
      i = i + 1;
      p = p − 1;
   }
   /* the initial value of α(mi) (i = 1, 2, ..., p) is 0 */
4) for i = 1, ..., p {
      for each task tk' in the task sequence ti', ..., tn' {
         if α(mi) + α(tk') ≤ A_np, then schedule task tk' on machine mi; let α(mi) = α(mi) + α(tk');
      }
   }
5) if there exists an unscheduled task, then {
      assign the remaining tasks to machine mp;
      IterativeByExchange( );   /* Algorithm 3 */
      IterativeByAverage( );    /* Algorithm 4 */
   }
}
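A compact Python rendering of Algorithm 1 may help. It is a sketch under the paper's assumptions (after step 3, every remaining task is smaller than the updated mean load); the name task_schedule and the list-based data layout are our own choices, not the authors' code:

def task_schedule(times, p):
    """Sketch of Algorithm 1: sort tasks in non-ascending order of execution
    time, give any task larger than the current mean load its own machine,
    then greedily pack the remaining tasks so that no machine exceeds the
    mean load; leftover tasks go to the last remaining machine."""
    order = sorted(range(len(times)), key=lambda i: -times[i])       # step 1)
    mean = sum(times) / p                                            # step 2)
    machines = [[] for _ in range(p)]
    loads = [0.0] * p
    last = p - 1
    while order and last > 0 and times[order[0]] > mean:             # step 3)
        t = order.pop(0)
        machines[last].append(t)
        loads[last] = times[t]
        mean = (mean * (last + 1) - times[t]) / last                 # update A_np
        last -= 1
    for m in range(last + 1):                                        # step 4)
        remaining = []
        for t in order:
            if loads[m] + times[t] <= mean:
                machines[m].append(t)
                loads[m] += times[t]
            else:
                remaining.append(t)
        order = remaining
    for t in order:                                                  # step 5)
        machines[last].append(t)     # unscheduled leftovers go to the last
        loads[last] += times[t]      # remaining machine for Algorithms 3 and 4
    return machines, loads, mean

On the 8-task example of Figure 1, this sketch produces the loads 33, 29 and 37 of strategy (a), which Algorithms 3 and 4 then improve.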

3.2 Iterative Improvement Scheduling Algorithm

Now we have obtained an initial task assignment. However, in Algorithm 1 all tasks that do not satisfy the scheduling condition are assigned to machine mp, which may result in load unbalance. Therefore, it is necessary to reassign tasks to improve the system throughput and reduce the total waiting time. We can change the loads of two machines by reassigning the tasks already assigned to them and thus optimize the scheduling strategy. Algorithm 2 gives the method of reassigning tasks between two machines m1 and m2 depending on a value Aver, where Aver is either the mean load of all machines or the mean load of the two machines, and the load of m1 is lower than that of m2. Furthermore, to improve the efficiency of the algorithm, the values of k1 and k2 are restricted to 1 and 2.

Algorithm 2 ReassignTasks( m1, m2, Aver ) {
   select k1 (k1 = 1, 2, ...) tasks tq1, ..., tqk1 from m1; let τ1 = α(tq1) + ... + α(tqk1);
   select k2 (k2 = 1, 2, ...) tasks ts1, ..., tsk2 from m2; let τ2 = α(ts1) + ... + α(tsk2), satisfying:

      τ2 − τ1 = max{ τ2' − τ1' | τ2' > τ1' and α(m1) − τ1' + τ2' ≤ Aver }      (*)

   where τ1' is the total execution time of any k1 tasks assigned to m1 and τ2' is the total execution time of any k2 tasks assigned to m2.
   if condition (*) is satisfied, then {
      α(m1) = α(m1) − τ1 + τ2, and assign tq1, ..., tqk1 to m2;
      α(m2) = α(m2) − τ2 + τ1, and assign ts1, ..., tsk2 to m1;
   }
}

The strategy for choosing the machines whose tasks are to be reassigned is as follows:
(1) Sort α(m1), ..., α(mp) into a non-ascending order sequence α(m'1), ..., α(m'p), and let i be the index such that α(m'i) > A_np ≥ α(m'i+1).
(2) Choose one machine m'a in the order α(m'1), ..., α(m'i) and choose another machine m'b in the order α(m'p), ..., α(m'i+1). Reassign tasks between m'a and m'b according to Algorithm 2. If the load of m'b still has not been improved, select another machine from α(m'p), ..., α(m'i+1) or re-sort α(m1), ..., α(mp). Repeat this process until there is no machine whose load can be improved.
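Since the number of moved tasks is restricted to k1, k2 in {1, 2}, the reassignment step can be sketched by enumerating all such candidate swaps. The following Python function is our own illustrative sketch of Algorithm 2, not the authors' implementation; task sets are represented as lists of task indices and times maps an index to its execution time:

from itertools import combinations

def reassign_tasks(tasks1, tasks2, times, aver):
    """Sketch of Algorithm 2: move k1 tasks from the lower-loaded machine m1
    to m2 and k2 tasks from m2 to m1 (k1, k2 in {1, 2}), choosing the swap
    that raises the load of m1 as much as possible without exceeding `aver`.
    Returns the new task lists, or None if no admissible swap exists."""
    load1 = sum(times[t] for t in tasks1)
    best = None                             # (gain, tasks leaving m1, tasks leaving m2)
    for k1 in (1, 2):
        for group1 in combinations(tasks1, k1):
            tau1 = sum(times[t] for t in group1)
            for k2 in (1, 2):
                for group2 in combinations(tasks2, k2):
                    tau2 = sum(times[t] for t in group2)
                    gain = tau2 - tau1
                    # condition (*): tau2 > tau1 and the new load of m1 <= aver
                    if gain > 0 and load1 + gain <= aver:
                        if best is None or gain > best[0]:
                            best = (gain, group1, group2)
    if best is None:
        return None
    _, group1, group2 = best
    new1 = [t for t in tasks1 if t not in group1] + list(group2)
    new2 = [t for t in tasks2 if t not in group2] + list(group1)
    return new1, new2

Applied to the two unbalanced machines of strategy (a) in Figure 1 (loads 29 and 37) with Aver = A_np = 33, this sketch finds a swap that balances them to 32 and 34, the optimal loads of strategy (b).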

Algorithm 3 IterativeByExchange( ) {
1) sort α(m1), ..., α(mp) into a non-ascending order sequence α(m'1), ..., α(m'p);
   a = 1;
2) while (α(m'a) > A_np) {
      b = p;
3)    while (α(m'b) < A_np) {
4)       ReassignTasks( m'b, m'a, A_np );
         if the load of m'b has not been improved, then b = b − 1, otherwise goto 1);
      }  /* while in 3) */
      a = a + 1;
   }  /* while in 2) */
}
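Before turning to Algorithm 4, the control flow that Algorithm 3 above and Algorithm 4 below share can be sketched as a single helper parameterized by the reassignment target; with the global mean load A_np as the target it corresponds to Algorithm 3. This is again a hypothetical rendering of ours that reuses the reassign_tasks sketch above:

def rebalance(machines, times, global_mean, target):
    """Shared control flow of Algorithms 3 and 4 (our sketch): repeatedly pair
    an overloaded machine with an underloaded one, run the Algorithm-2 step
    with target(load_light, load_heavy) as the bound on the lighter machine's
    new load, and restart from a fresh sort after every improvement."""
    improved = True
    while improved:
        improved = False
        order = sorted(range(len(machines)),
                       key=lambda m: -sum(times[t] for t in machines[m]))
        a, b = 0, len(order) - 1
        while a < b:
            heavy, light = order[a], order[b]
            load_heavy = sum(times[t] for t in machines[heavy])
            load_light = sum(times[t] for t in machines[light])
            if load_heavy <= global_mean:       # no overloaded machine left
                break
            if load_light >= global_mean:       # no underloaded partner for m'_a
                a += 1
                b = len(order) - 1
                continue
            result = reassign_tasks(machines[light], machines[heavy], times,
                                    target(load_light, load_heavy))
            if result is None:
                b -= 1                          # try the next underloaded machine
            else:
                machines[light], machines[heavy] = result
                improved = True                 # corresponds to "goto 1)"
                break
    return machines

def iterative_by_exchange(machines, times, mean):       # Algorithm 3
    return rebalance(machines, times, mean, target=lambda lo, hi: mean)

On the Figure 1 example, a single successful exchange in this sketch already brings the loads to 33, 34 and 32, i.e. the optimal strategy (b).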

However, because the values of k1 and k2 in Algorithm 2 are restricted to 1 and 2, there may still exist machines whose loads are greater than A_np (and, of course, there may exist other machines whose loads are smaller than A_np at the same time). Thus, similarly to Algorithm 3, we take two machines and employ their mean load as the heuristic information to exchange tasks and balance their loads.

Algorithm 4 IterativeByAverage( ) {
1) sort α(m1), ..., α(mp) into a non-ascending order sequence α(m'1), ..., α(m'p);
   a = 1;
2) while (α(m'a) > A_np) {
      b = p;
3)    while (α(m'b) < A_np) {
         A_ab = (α(m'a) + α(m'b)) / 2;
4)       ReassignTasks( m'b, m'a, A_ab );
         if the load of m'b has not been improved, then b = b − 1, otherwise goto 1);
      }  /* while in 3) */
      a = a + 1;
   }  /* while in 2) */
}
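In the same sketch, Algorithm 4 only changes the target passed to the reassignment step to the pairwise mean A_ab of the two chosen machines:

def iterative_by_average(machines, times, mean):         # Algorithm 4
    # The bound on the lighter machine's new load is the pairwise mean
    # A_ab = (alpha(m'_a) + alpha(m'_b)) / 2 instead of the global A_np.
    return rebalance(machines, times, mean,
                     target=lambda lo, hi: (lo + hi) / 2.0)

Because the pairwise mean can exceed A_np, this relaxed target admits exchanges that Algorithm 3 rejects, which is how the remaining imbalance is removed.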

4. Analysis of the Efficiency

We assume that n tasks are executed on p machines. In Algorithm 1, the time of step 1) is O(n^2), that of step 2) is O(1), that of step 3) is O(n), that of step 4) is O(np), and that of step 5) is the sum of the times of Algorithms 3 and 4. The only difference between Algorithms 3 and 4 is the value of the parameter Aver: in Algorithm 3, Aver is the mean load of all machines, while in Algorithm 4 it is the mean load of the two chosen machines. Therefore, the time of Algorithm 3 is the same as that of Algorithm 4.

In Algorithm 2, the time is O(C(n1, k1)·C(n2, k2)), where n1 and n2 are the numbers of tasks on machines m1 and m2, respectively. Since k1 and k2 take values of at most 2, the time of Algorithm 2 is O((n1·n2)^2).

In Algorithm 3, if there is no machine whose load can be improved, the algorithm terminates when the condition of step 2) no longer holds. Otherwise, step 1) is executed whenever there exists a machine whose load can be improved. Suppose that there are q machines whose loads can be improved; then the time is O(q·n3·n4·(n1·n2)^2), where n3 and n4 are the numbers of iterations of steps 2) and 3), respectively. Since n1, n2, n3, n4 ≤ n, the time of Algorithm 3 is lower than O(q·n^4).

Therefore, the total time of Algorithms 1, 2, 3 and 4 is lower than O(q·n^4). In addition, the load of a machine may be improved many times, but it must remain lower than Aver. The following experiments demonstrate that, although it is non-polynomial, the algorithm is efficient in practice.
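As a rough illustration of the combinatorial factor in the analysis above, the number of candidate swaps examined by one call of Algorithm 2 with k1, k2 in {1, 2} can be counted directly (our own check; 8 and 17 are, for example, the task counts of two machines in the initial scheduling of Table 1 below):

from math import comb

n1, n2 = 8, 17   # numbers of tasks on the two machines
candidates = sum(comb(n1, k1) * comb(n2, k2) for k1 in (1, 2) for k2 in (1, 2))
print(candidates)   # 5508 candidate swaps, dominated by the O((n1*n2)**2) term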

5. Experiments

We have developed a simulator to evaluate the performance of the proposed scheduling algorithm in a Grid computing environment composed of a DAWNING 3000 high-performance computer, a cluster of 64 nodes, and Globus 3.0.

We now demonstrate our algorithm on a scheduling instance of 40 tasks completed on 5 machines. The predictive execution times of the tasks are 40, 58, 17, 51, 24, 59, 32, 91, 15, 41, 8, 12, 40, 92, 6, 39, 94, 52, 39, 93, 8, 27, 88, 21, 3, 18, 6, 8, 55, 52, 91, 63, 96, 97, 30, 12, 55, 83, 68, 54, respectively. It follows that the mean load of every machine is 367.6. The initial scheduling, the scheduling optimized by Algorithm 3, and the scheduling further optimized by Algorithm 4 are given in Table 1, Table 2 and Table 3, respectively.
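The quoted mean load can be verified directly (a two-line check of our own):

times = [40, 58, 17, 51, 24, 59, 32, 91, 15, 41, 8, 12, 40, 92, 6, 39, 94, 52,
         39, 93, 8, 27, 88, 21, 3, 18, 6, 8, 55, 52, 91, 63, 96, 97, 30, 12,
         55, 83, 68, 54]
print(sum(times) / 5)   # 367.6, the mean load A_np of the 5 machines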

Table 1. Initial scheduling result

Machine   Load   Assigned Tasks
m1        367    97, 96, 94, 68, 12
m2        367    93, 92, 91, 91
m3        366    88, 83, 63, 59, 58, 15
m4        366    55, 55, 54, 52, 52, 51, 41, 6
m5        372    40, 40, 39, 39, 32, 30, 27, 24, 21, 18, 17, 12, 8, 8, 8, 6, 3

Table 2. First optimized scheduling result

Machine   Load   Assigned Tasks
m1        367    97, 96, 94, 68, 12
m2        367    93, 92, 91, 91
m3        367    88, 83, 59, 58, 40, 24, 15
m4        367    55, 55, 54, 52, 52, 41, 40, 12, 6
m5        370    63, 51, 39, 39, 32, 30, 27, 21, 18, 17, 8, 8, 8, 6, 3

The error of the load of a machine is defined as the difference between the current load of the machine and the mean load. Figure 2 compares the errors of our three scheduling strategies on every machine. It can be seen from Figure 2 that the error of the initial scheduling is much larger and that the error is reduced after each optimization. It is significant that the error of every machine is no more than 1 in the second optimized scheduling. In fact, we have obtained an optimal scheduling strategy here.

Table 3. Second optimized scheduling result

Machine   Load   Assigned Tasks
m1        367    97, 96, 94, 68, 12
m2        367    93, 92, 91, 91
m3        368    88, 83, 58, 40, 39, 24, 21, 15
m4        368    55, 54, 52, 52, 41, 40, 39, 17, 12, 6
m5        368    63, 59, 55, 51, 32, 30, 27, 18, 8, 8, 8, 6, 3

Figure 3 compares the loads of the machines under our algorithm, the Min-Min algorithm [10], the Max-Min algorithm [10] and Kenjiro Taura's algorithm [11] on every machine. It can be seen from the figure that the loads of the machines are extremely unbalanced under Min-Min and Kenjiro Taura's algorithm. The main reason is that in Min-Min the minimum-execution-time tasks are assigned to the machine with the minimum load, which leads to tasks with longer execution times being assigned to machines with higher loads. Since the load of a machine in Kenjiro Taura's algorithm depends on the initial sequence of tasks, different initial sequences of tasks lead to different scheduling strategies. Although the result of the Max-Min algorithm is very close to ours, the loads of three of its machines are still very high compared to ours; therefore, Max-Min does not achieve an optimal scheduling strategy. The load of the machines under our algorithm is balanced and we obtain an optimal scheduling strategy. Thus our algorithm is superior to the others in this respect.

Figure 2. Comparison of the errors of the three scheduling strategies on every machine.

Figure 3. Comparison of the loads of the machines under the four scheduling algorithms.

We call the procedure of exchanging tasks between two machines to obtain a new scheduling strategy an iteration. We have performed simulations measuring the number of iterations of Algorithms 3 and 4 when a group of tasks is completed on different numbers of machines and when different numbers of tasks are completed on the same number of machines. Figure 4 presents the total number of iterations of Algorithms 3 and 4 when 1100 tasks are completed on different numbers of machines. Figure 5 gives the number of iterations of Algorithms 3 and 4 when different numbers of tasks are completed on 15 machines. The two figures show that there is no functional relationship between the number of iterations and the number of machines or the number of tasks, and that the maximum number of iterations in our experiments is 17. Therefore, our algorithm is highly efficient in terms of time complexity.

Figure 4. Number of iterations of our algorithm when 1100 tasks are scheduled on different numbers of machines.

Figure 5. Number of iterations of our algorithm when different numbers of tasks are completed on 15 machines.

6. Conclusions and Future Work

We have proposed a heuristic task scheduling algorithm for the Grid computing environment. The contribution of this paper is that we employ the mean load, based on the predictive execution times of tasks, as heuristic information to obtain an initial scheduling strategy and then reduce the load of the higher-loaded machine as quickly as possible according to the proposed method. To improve the efficiency of the algorithm, we perform only a partial search to balance the load and then balance the load further by relaxing the constraints. The experimental results show that our algorithm is superior to other algorithms with respect to the load balancing of machines and can almost always achieve an optimal scheduling strategy. The experiments also show that our algorithm is highly efficient in terms of time complexity.

However, there is still room for further investigation of our task scheduling algorithm. First, the proposed algorithm is suited to tasks submitted by different independent users; in practice, however, these tasks may have logical relations. We will study scheduling strategies for tasks with logical relations under more complex conditions in the future. Secondly, since the execution time of the same task generally varies across different machines, we plan to study scheduling algorithms that achieve load balancing under such circumstances.

References

1. Y. Kwok, I. Ahmad: Dynamic Critical-Path Scheduling: An Effective Technique for Allocating Task Graphs to Multiprocessors. IEEE Transactions on Parallel and Distributed Systems, Vol. 7. (1996) 506-521.

2. E. S. H. Hou, N. Ansari, H. Ren: A Genetic Algorithm for Multiprocessor Scheduling. IEEE Transactions on Parallel and Distributed Systems, Vol. 5. (1994) 113-120.

3. G. C. Sih, E. A. Lee: A Compile-Time Scheduling Heuristic for Interconnection-Constrained Heterogeneous Processor Architectures. IEEE Transactions on Parallel and Distributed Systems, Vol. 4. (1993) 175-186.

4. H. Singh, A. Youssef: Mapping and Scheduling Heterogeneous Task Graphs Using Genetic Algorithms. Proc. of Heterogeneous Computing Workshop, (1996).

5. I. Ahmad, Y. Kwok: A New Approach to Scheduling Parallel Programs Using Task Duplication. Proc. of Int Conference on Parallel Processing, Vol. II. (1994) 47-51.

6. M. A. Palis, J. Liou, D. S. L. Wei: Task Clustering and Scheduling for Distributed Memory Parallel Architectures. IEEE Transactions on Parallel and Distributed Systems, Vol. 7. (1996) 46-55.

7. H. Zhuge, Semantic Grid: Scientific Issues, Infrastructure, and Methodology, Communications of the ACM. 48 (4) (2005)117-119.

8. D. Thain, T. Tannenbaum, M. Livny: Condor and the Grid. in: A.J.G. Hey, F. Berman, G.C. Fox (Eds.), Grid Computing: Making the Global Infrastructure a Reality, Wiley, West Sussex, England (2003) 299-335.

9. K. Krauter, R. Buyya, M. Maheswaran: A taxonomy and survey of Grid resource management systems for distributed computing. Software Pract. Exp. 2 (2002) 135-164.

10. O. H. Ibarra, C. E. Kim: Heuristic algorithms for scheduling independent tasks on non-identical processors. Journal of the Association for Computing Machinery, 24(2) (1977) 280-289.

11. K. Taura, A. Chien: A Heuristic Algorithm for Mapping Communicating Tasks on Heterogeneous Resources. 9th Heterogeneous Computing Workshop, Cancun, Mexico (May, 2000).

12. X.H. Sun, M. Wu: Grid Harvest Service: A System for Long-term, Application-level Task Scheduling. Proc. Of 2003 IEEE International Parallel and Distributed Processing Symposium, Nice, France (April, 2003).

13. A. Downey: Predicting Queue Times on Space-Sharing Parallel Computers. International Parallel Processing Symposium, (1997).

14. R. Gibbons: A Historical Application Profiler for Use by Parallel Schedulers. Lecture Notes in Computer Science, (1997) 58-75.

15. W. Smith, I. Foster, V. Taylor: Predicting Application Run Times Using Historical Information. Lecture Notes in Computer Science, (1998).

16. B. P. Miller, A. Tamches: Fine-grained dynamic instrumentation of commodity operating system kernels. Third Symposium on Operating Systems Design and Implementation (OSDI'99), New Orleans, (February 1999) 117-130.

17. P. Dinda, D. O'Hallaron: An Extensible Toolkit for Resource Prediction In Distributed Systems. Technical Report CMU-CS-99-138, School of Computer Science, Carnegie Mellon University, (July, 1999).