Hard real-time tasks' scheduling considering voltage scaling, precedence and exclusion relations



Information Processing Letters 108 (2008) 50–59


Hard real-time tasks’ scheduling considering voltage scaling, precedence and exclusion relations

Eduardo Tavares a,∗, Paulo Maciel a, Bruno Silva a, Meuse N. Oliveira Jr. b

a Universidade Federal de Pernambuco, Centro de Informática, Brazil
b Centro de Educação Tecnológica de Pernambuco, CELN, Brazil

Article info

Article history: Received 26 February 2007; received in revised form 23 February 2008; available online 3 April 2008. Communicated by A.A. Bertossi.

Keywords: Real-time systems; Scheduling; Formal methods; Operating systems

Abstract

Several scheduling approaches have been developed to address DVS in time-critical systems; however, overheads, precedence and exclusion relations have been neglected. This paper presents a pre-runtime scheduling method for hard real-time systems considering DVS, overheads as well as inter-task relations. The proposed method adopts a formal model based on time Petri nets in order to find a feasible schedule that satisfies timing and energy constraints.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

Over the last years, DVS (Dynamic Voltage Scaling) has been adopted as one of the most effective techniques for reducing energy consumption in embedded systems. Adjusting the CPU supply voltage has a great impact on energy consumption, since consumption is proportional to the square of the supply voltage in CMOS microprocessors. However, lowering the supply voltage linearly reduces the maximum CPU operating frequency, which may degrade system performance. Thus, DVS needs to be adopted with caution in hard real-time systems, since catastrophic failures may occur due to timing constraint violations.

Several scheduling approaches, mainly based on runtime techniques, have been developed in order to cope with DVS in hard real-time systems. However, simplified system specifications are generally considered, for instance, neglecting overheads (e.g., voltage/frequency switching and preemption) as well as inter-task relations. In several systems, precedence and exclusion relations (see Section 3) are present, and ignoring such constraints may lead to schedules that do not satisfy system requirements properly [1].

* Corresponding author: E. Tavares.

doi:10.1016/j.ipl.2008.03.020

This paper presents a time Petri net based approach for hard real-time system scheduling, considering: (i) DVS, (ii) overheads, and (iii) inter-task relations (e.g., precedence and mutual exclusion). Pre-runtime methods may provide more predictability than their runtime counterparts when inter-task relations are considered, since, if a feasible schedule exists, pre-runtime methods are able to find it, whereas runtime methods may fail [1]. In addition, a technique for dealing with dynamic slack times is presented in order to take advantage of new opportunities to further reduce energy consumption during system execution.

2. Related works

Yao et al. [3] propose an optimal off-line voltage allocation algorithm considering continuously variable voltage, which is unrealistic for many real-life cases. In [4], the authors developed an optimal off-line voltage allocation approach considering a discrete set of voltages and assuming a single-task problem. In spite of their valuable contributions, none of these works deals with inter-task relations or overheads related to voltage/frequency switching. The work described in [2] extends Yao's approach in order to consider overheads related to voltage/frequency switching. Nevertheless, dispatcher and preemption overheads are disregarded. Works such as [5,6] are based on runtime scheduling policies, which can greatly reduce energy consumption, as shown by their experimental results. However, those works do not properly tackle overheads related to voltage/frequency switching, and neglect precedence and exclusion relations.

Fig. 1. Methodology.

In the literature, few works cope with inter-task relations. In [7], the authors consider non-preemptable tasks in order to deal with shared resources. However, such an assumption restricts the domain of applications that may adopt the approach. In [8], the authors describe a DVS scheduling method for a distributed environment considering precedence relations, but they ignore mutual exclusions and account for overheads within the tasks' worst-case execution time (WCET). The work in [9] proposes a scheduling method that takes precedence relations into account and assumes that all tasks are non-preemptable. Besides, its task model assumes tasks with mandatory and optional parts, in the sense that optional parts can be left incomplete in order not to violate timing constraints.

3. Preliminaries

This section presents fundamental concepts as well as a motivational example in order to show the feasibility of the proposed approach.

Methodology. The proposed scheduling method is part of a methodology (see Fig. 1) for automatic code generation of customized embedded software, taking into account stringent timing as well as energy constraints. Firstly, tasks' timing information and hardware energy consumption are obtained. The results acquired in the measurement phase are adopted in the system specification to provide a basis for generating a time Petri net model. After generating the internal model (TPN), the scheduling phase is started in order to find a feasible schedule that satisfies timing and energy constraints. Next, the feasible schedule is adopted as an input to the automatic code generation mechanism, such that customized code is automatically obtained with only the required services, in order to minimize overheads. Finally, the application is validated on a DVS-capable platform to check system behavior. Once the behavior is correct, the system can be deployed to the real environment.

Specification model. Let $\mathcal{T}$ be the set of tasks in a system. A periodic task is defined by $\tau_i = (ph_i, r_i, c_i, d_i, p_i)$, where $ph_i$ is the initial phase; $r_i$ is the release time; $c_i$ is the worst-case execution cycles (WCEC) required for the execution of task $\tau_i$; $d_i$ is the deadline; and $p_i$ is the period. In this work, sporadic tasks are also considered by translating them into equivalent periodic tasks [1]. Tasks may have precedence and exclusion relations between them. A task $\tau_i$ precedes task $\tau_j$ if $\tau_j$ can only start executing after $\tau_i$ has finished. Precedence relations may exist between tasks when a task requires information produced by another task. A task $\tau_i$ excludes task $\tau_j$ if no execution of $\tau_j$ can start while task $\tau_i$ is executing. In other words, task $\tau_i$ cannot be preempted by task $\tau_j$. Exclusion relations may occur between tasks when tasks must avoid concurrent access to shared resources, such as data and I/O devices.

Let $\mathcal{V}$ and $\mathcal{F}$ be two sets of discrete CPU supply voltage levels and frequencies, respectively, where $|\mathcal{V}| = |\mathcal{F}|$; and let $vff : \mathcal{V} \to \mathcal{F}$ (voltage-frequency function) be a bijective function that maps each voltage level to one, and only one, processor execution frequency, which is the maximum operating frequency at that supply voltage. In this work, voltage/frequency levels that do not provide energy savings due to leakage current are not considered in the scheduling process. In addition to the specification above, the system energy constraint ($e_{max}$) needs to be defined, which sets an upper bound on the energy consumption that a schedule must not surpass.

Motivational example. Assume the following task set: $\mathcal{T} = \{\tau_1 = (0, 0, 150 \times 10^6, 6, 13),\ \tau_2 = (0, 2, 50 \times 10^6, 3, 13),\ \tau_3 = (0, 0, 100 \times 10^6, 13, 13),\ \tau_4 = (0, 7, 60 \times 10^6, 9, 13)\}$. In addition to timing constraints, the specification contains the following relations: $\tau_1$ excludes $\tau_2$, $\tau_1$ precedes $\tau_3$, $\tau_2$ excludes $\tau_1$, and $\tau_2$ precedes $\tau_4$. For this example, the DVS platform described in [10] is adopted, which utilizes a Philips LPC2106 processor, a 32-bit microcontroller with an ARM7 core. More specifically, the CPU supply voltage/frequency levels adopted for this example are $vff = \{(1.04\,\mathrm{V}, 20\,\mathrm{MHz}), (1.07\,\mathrm{V}, 30\,\mathrm{MHz}), (1.26\,\mathrm{V}, 50\,\mathrm{MHz})\}$. Moreover, considering an average switching capacitance of 0.28 nF per clock cycle [11], the energy consumption is 0.45 nJ/cycle at 50 MHz, 0.34 nJ/cycle at 30 MHz, and 0.31 nJ/cycle at 20 MHz. These values were obtained using Eq. (1) from [5].
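As an illustration of this specification, the sketch below encodes the task set, the relations and the voltage/frequency levels, and lists for each task the frequencies at which it could finish inside its release-to-deadline window if it ran alone. This is only a necessary condition and not the paper's scheduling method. The names `Task`, `TASKS`, `VFF` and `levels_that_fit` are illustrative, and two assumptions are made that the text does not state explicitly: time is measured in seconds, and release times and deadlines are both counted from the start of the period.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Task:
    phase: int      # ph_i
    release: int    # r_i (assumed: seconds from period start)
    wcec: int       # c_i, worst-case execution cycles
    deadline: int   # d_i (assumed: seconds from period start)
    period: int     # p_i

TASKS = {
    "t1": Task(0, 0, 150_000_000, 6, 13),
    "t2": Task(0, 2, 50_000_000, 3, 13),
    "t3": Task(0, 0, 100_000_000, 13, 13),
    "t4": Task(0, 7, 60_000_000, 9, 13),
}
# vff: supply voltage (V) -> maximum frequency (Hz)
VFF = {1.04: 20_000_000, 1.07: 30_000_000, 1.26: 50_000_000}
# Inter-task relations from the text, as (a, b) pairs.
PRECEDES = {("t1", "t3"), ("t2", "t4")}
EXCLUDES = {("t1", "t2"), ("t2", "t1")}

def levels_that_fit(task):
    """Frequencies (Hz) at which the task alone fits between release and deadline."""
    window = task.deadline - task.release
    # Integer comparison avoids floating-point rounding: cycles <= f * window.
    return sorted(f for f in VFF.values() if task.wcec <= f * window)

for name, task in TASKS.items():
    print(name, [f // 10**6 for f in levels_that_fit(task)], "MHz")
```

Under these assumptions, $\tau_2$ fits only at 50 MHz and $\tau_4$ needs at least 30 MHz, which already limits how far the voltage can be lowered before precedence and exclusion are even considered.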

Fig. 2(a) shows a schedule obtained by adopting the optimal off-line DVS algorithm defined by Yao et al. [3], which relies on the optimal runtime scheduling method Earliest-Deadline First (EDF). The schedule is invalid, since $\tau_1$ cannot be preempted by $\tau_2$. As an alternative, Fig. 2(b) shows a schedule obtained by blocking the execution of $\tau_2$ while $\tau_1$ is executing. Again, the schedule is infeasible, since $\tau_2$ misses its deadline due to the earlier release of $\tau_1$. Even discarding DVS, in other words, running every task at the maximum processor voltage/frequency, a schedule cannot be found if EDF (or another runtime scheduling algorithm) is adopted (see Fig. 2(c)). On the other hand, Fig. 2(d) depicts a schedule found using the proposed pre-runtime method. All tasks meet their deadlines, and additional energy savings are obtained compared to a pre-runtime schedule without DVS (Fig. 2(e)): the schedule in Fig. 2(d) consumes 0.1414 J, while the schedule depicted in Fig. 2(e) consumes 0.162 J. One should note that the processor needed to be left idle in order to find a valid schedule [1]. In this example, the CPU is assumed to have a halt instruction that avoids energy consumption in the idle state.

Fig. 2. Schedules generated according to different scheduling methods.
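The two energy figures can be checked from the per-cycle values given earlier. The sketch below sums cycles times energy per cycle for each task. The no-DVS case (every task at 50 MHz) follows directly from the text. The per-task levels of Fig. 2(d) are not listed in the text, so the assignment `WITH_DVS` is an assumption: it was chosen because it reproduces the reported total and each task still fits its release-to-deadline window. All names are illustrative.

```python
# Energy per cycle reported in the text (nJ/cycle), keyed by frequency in MHz.
NJ_PER_CYCLE = {50: 0.45, 30: 0.34, 20: 0.31}

# Worst-case execution cycles of the four tasks of the motivational example.
WCEC = {"t1": 150_000_000, "t2": 50_000_000, "t3": 100_000_000, "t4": 60_000_000}

def schedule_energy_joules(level_mhz):
    """Total energy of one schedule period; level_mhz maps task -> frequency in MHz."""
    nanojoules = sum(WCEC[t] * NJ_PER_CYCLE[f] for t, f in level_mhz.items())
    return nanojoules * 1e-9

# Without DVS: every task runs at the maximum level (Fig. 2(e)).
NO_DVS = {t: 50 for t in WCEC}

# ASSUMPTION: one assignment consistent with the 0.1414 J reported for Fig. 2(d).
WITH_DVS = {"t1": 50, "t2": 50, "t3": 20, "t4": 30}

print(round(schedule_energy_joules(NO_DVS), 4))    # 0.162
print(round(schedule_energy_joules(WITH_DVS), 4))  # 0.1414
```

With these numbers the DVS schedule saves roughly 13% of the energy of the schedule that runs everything at 50 MHz.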

4. Computational model

The syntax of the computational model is given by a time Petri net [12], and its semantics by a timed labeled transition system.

Time Petri net. A time Petri net (TPN) is a bipartite directed graph represented by a tuple $\mathcal{P} = (P, T, F, W, m_0, I)$, where $P$ (set of places) and $T$ (set of transitions) are non-empty disjoint sets of nodes. The edges are represented by $F$, where $F \subseteq A = (P \times T) \cup (T \times P)$. $W : A \to \mathbb{N}$ represents the weight of the edges, such that $W(f) = x \in \mathbb{N}$ if $f \in F$, and $W(f) = 0$ if $f \notin F$. A TPN marking $m_i$ is a vector ($m_i \in \mathbb{N}^{|P|}$), and $m_0$ is the initial marking. $I : T \to \mathbb{N} \times \mathbb{N}$ represents the timing constraints, where $I(t) = [EFT(t), LFT(t)]$ for all $t \in T$, with $EFT(t) \le LFT(t)$. $EFT(t)$ is the Earliest Firing Time, and $LFT(t)$ is the Latest Firing Time.

Considering the previous definition, places ($P$) represent local states and transitions ($T$) denote local actions. The set of arcs $F$ represents the relationships between places and transitions, in such a way that arcs connect places to transitions and vice versa. Function $W$ assigns to each arc a natural number, which may be interpreted as the number of parallel arcs. A marking vector $m_i$ associates with each place a natural number, which represents the number of tokens in the respective place. Graphically, places are represented by circles, transitions are depicted as bars or rectangles, arcs are represented by directed arrows labeled with their weight, and tokens (the marking) are generally represented by small filled circles. Fig. 3(a) depicts a Petri net model.

Fig. 3. Petri net example.

Time Petri net with energy consumption (TPN $\mathcal{P}_E$). An extended time Petri net with energy consumption values is represented by $\mathcal{P}_E = (\mathcal{P}, E)$. $\mathcal{P}$ is the underlying time Petri net, and $E : T \to \mathbb{R}^+ \cup \{0\}$ is a function that assigns energy consumption values to transitions.

Enabled transitions. The set of enabled transitions at marking $m_i$ is denoted by $ET(m_i) = \{t \in T \mid m_i(p_j) \ge W(p_j, t),\ \forall p_j \in P\}$.

A transition $t \in T$ is enabled if each input place $p \in P$ contains at least $W(p, t)$ tokens. The time elapsed since the respective transition was enabled is denoted by a clock vector $c \in (\mathbb{N} \cup \{\#\})^{|T|}$, where $\#$ represents the undefined value for transitions that are not enabled. As an example, the clock vector for the net in Fig. 3(a) contains one element: $c(t_1) = 0$. At this point, the difference between static and dynamic firing intervals associated with transitions is required. The dynamic firing interval of transition $t$, $I_D(t) = [DLB(t), DUB(t)]$, is dynamically modified whenever the respective clock variable $c(t)$ is incremented and $t$ does not fire. $DLB(t)$ is the Dynamic Lower Bound, and $DUB(t)$ is the Dynamic Upper Bound. The dynamic firing interval is computed as $DLB(t) = \max(0, EFT(t) - c(t))$ and $DUB(t) = LFT(t) - c(t)$. Whenever $DLB(t) = 0$, $t$ can fire, and, when $DUB(t) = 0$, $t$ must fire, since the strong firing mode is adopted.

States. Let $\mathcal{P}_E$ be a time Petri net extended with energy consumption values, $M \subseteq \mathbb{N}^{|P|}$ be the set of reachable markings (i.e., all possible markings) of $\mathcal{P}_E$, $C \subseteq (\mathbb{N} \cup \{\#\})^{|T|}$ be the set of clock vectors, and $E \subseteq \mathbb{R}^+ \cup \{0\}$ be the set of accumulated energy consumptions. The set of states $S$ of $\mathcal{P}_E$ is given by $S \subseteq (M \times C \times E)$, that is, a state is defined by a marking, the respective clock vector, and the accumulated energy consumption from the initial state up to this state.

Considering the Petri net model in Fig. 3(a), the initial state is $s_0 = (m_0 = [1, 0],\ c_0 = [0],\ e_0 = 0)$.

Firable transitions. The set of firable transitions at state $s \in S$ is defined by $FT(s, e_{max}) = \{t_i \in ET(m) \mid (e + E(t_i) \le e_{max}) \wedge (DLB(t_i) \le \min(DUB(t_k))),\ \forall t_k \in ET(m)\}$, where $e_{max}$ is the system energy constraint (Section 3) and $e$ is the accumulated energy consumption from the initial state up to state $s$.

Firing domain. The firing domain for a transition $t$ at state $s$ is defined by the interval $FD_s(t) = [DLB(t), \min(DUB(t_k))]$, $\forall t_k \in ET(m)$.

Enabled transitions depend only on the marking, whereas firable transitions take into account the marking, their respective clock values (the time elapsed since each transition was enabled), and the system energy constraint. Considering the firing domain, a firable transition $t$ at state $s$ can only fire in the interval denoted by $FD_s(t)$. In Fig. 3(a), at the initial state $s_0 = (m_0 = [1, 0],\ c_0 = [0],\ e_0 = 0)$, $t_1$ is firable when $c_0(t_1) = 1$ and must fire when $c_0(t_1) = 3$ ($FD_{s_0}(t_1) = [1, 3]$), if it has neither been fired nor disabled.
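The definitions of enabled transitions, dynamic bounds, firable transitions and firing domain can be sketched as follows. The net is the one-transition example of Fig. 3(a); its structure (one arc from $p_0$ to $t_1$ with weight 1, static interval $[1, 3]$) is inferred from the values quoted in the text rather than read from the figure. The dictionary layout, the function names and the bound `e_max=10.0` are illustrative assumptions.

```python
def enabled(marking, pre):
    """ET(m): transitions whose input places hold at least W(p, t) tokens."""
    return [t for t, inputs in pre.items()
            if all(marking[p] >= w for p, w in inputs.items())]

def dlb(t, clock, interval):
    """Dynamic lower bound: max(0, EFT(t) - c(t))."""
    return max(0, interval[t][0] - clock[t])

def dub(t, clock, interval):
    """Dynamic upper bound: LFT(t) - c(t)."""
    return interval[t][1] - clock[t]

def firable(marking, clock, energy, pre, interval, cost, e_max):
    """FT(s, e_max): enabled, within the energy budget, and not past the tightest DUB."""
    et = enabled(marking, pre)
    if not et:
        return []
    min_dub = min(dub(tk, clock, interval) for tk in et)
    return [t for t in et
            if energy + cost[t] <= e_max and dlb(t, clock, interval) <= min_dub]

def firing_domain(t, marking, clock, pre, interval):
    """FD_s(t) = [DLB(t), min DUB over all enabled transitions]."""
    et = enabled(marking, pre)
    return (dlb(t, clock, interval), min(dub(tk, clock, interval) for tk in et))

# Net inferred from the running example: p0 --1--> t1 [1, 3], E(t1) = 2.5.
PRE = {"t1": {"p0": 1}}          # pre[t][p] = W(p, t)
INTERVAL = {"t1": (1, 3)}        # (EFT, LFT)
COST = {"t1": 2.5}               # E(t)
m0 = {"p0": 1, "p1": 0}
c0 = {"t1": 0}

print(firable(m0, c0, 0.0, PRE, INTERVAL, COST, e_max=10.0))  # ['t1']
print(firing_domain("t1", m0, c0, PRE, INTERVAL))             # (1, 3)
```

Lowering the hypothetical budget below 2.5 makes the same call return an empty list, which is how the energy constraint removes a transition from the firable set.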

TLTS. A timed labeled transition system (TLTS) is a quadruple $\mathcal{L} = (S, \Sigma, \to, s_0)$, where $S$ is a finite set of discrete states, $\Sigma$ is an alphabet of labels representing actions, $\to \subseteq S \times \Sigma \times S$ is the transition relation, and $s_0 \in S$ is the initial state.

The TPN semantics is defined by associating a TLTS $\mathcal{L}_{\mathcal{P}_E} = (S, \Sigma, \to, s_0)$, where

(i) $S$ is the set of states of a TPN $\mathcal{P}_E$;
(ii) $\Sigma \subseteq (T \times \mathbb{N})$ is a set of labels $(t, \theta)$ corresponding to the transition $t$ firing at time $\theta$ in the firing interval $FD_s(t)$, $\forall s \in S$;
(iii) $\to \subseteq S \times \Sigma \times S$ is the state transition relation; and
(iv) $s_0$ is the initial state of $\mathcal{P}_E$.

Reachable states. Let $\mathcal{L}_{\mathcal{P}_E}$ be a TLTS derived from a TPN $\mathcal{P}_E$, and $s_i = (m_i, c_i, e_i)$ a reachable state. $s_{i+1} = fire(s_i, (t, \theta))$ denotes that, by firing a transition $t$ at time $\theta$ from the state $s_i$, the reached state $s_{i+1} = (m_{i+1}, c_{i+1}, e_{i+1})$ is obtained from:

(1) $\forall p \in P$, $m_{i+1}(p) = m_i(p) - W(p, t) + W(t, p)$;
(2) $e_{i+1} = e_i + E(t)$;
(3) $\forall t_j \notin ET(m_{i+1})$, $c_{i+1}(t_j) = \#$;
(4) $\forall t_k \in ET(m_{i+1})$:
    (i) $c_{i+1}(t_k) = 0$, if $(t_k = t) \vee (t_k \in ET(m_{i+1}) - ET(m_i))$, or
    (ii) $c_{i+1}(t_k) = c_i(t_k) + \theta$, otherwise.

For a better understanding, in Fig. 3(a), assume the firing of transition $t_1$ at $s_0 = (m_0 = [1, 0],\ c_0 = [0],\ e_0 = 0)$, with $\theta = 1$ and $E(t_1) = 2.5$. State $s_1 = (m_1 = [0, 1],\ c_1 = [\#],\ e_1 = 2.5)$ (see Fig. 3(b)) is reached in the following way:

(i) $m_1(p_0) = 1 - 1 + 0$, $m_1(p_1) = 0 - 0 + 1$;
(ii) $e_1 = 0 + 2.5$; and
(iii) $c_1(t_1) = \#$.

The respective TLTS is $s_0 \xrightarrow{(t_1, 1)} s_1$.
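The firing rule and this worked example can be reproduced with a short sketch. The net structure is again inferred from the quoted values ($W(p_0, t_1) = 1$, $W(t_1, p_1) = 1$); the dictionary-based state layout and the names `fire`, `PRE`, `POST` and `COST` are illustrative assumptions, not the authors' implementation.

```python
UNDEF = "#"  # undefined clock value for transitions that are not enabled

def enabled(marking, pre):
    """ET(m) as a set of transition names."""
    return {t for t, inputs in pre.items()
            if all(marking[p] >= w for p, w in inputs.items())}

def fire(state, t, theta, pre, post, cost):
    """Apply rules (1)-(4): returns the state reached by firing t at time theta."""
    marking, clock, energy = state
    before = enabled(marking, pre)
    # Rule (1): remove input tokens, add output tokens.
    new_marking = {p: marking[p] - pre[t].get(p, 0) + post[t].get(p, 0)
                   for p in marking}
    after = enabled(new_marking, pre)
    new_clock = {}
    for tk in pre:
        if tk not in after:
            new_clock[tk] = UNDEF                  # rule (3)
        elif tk == t or tk not in before:
            new_clock[tk] = 0                      # rule (4)(i): fired or newly enabled
        else:
            new_clock[tk] = clock[tk] + theta      # rule (4)(ii): keeps counting
    # Rule (2): accumulate the energy of the fired transition.
    return new_marking, new_clock, energy + cost[t]

PRE = {"t1": {"p0": 1}}    # W(p, t)
POST = {"t1": {"p1": 1}}   # W(t, p)
COST = {"t1": 2.5}         # E(t)

s0 = ({"p0": 1, "p1": 0}, {"t1": 0}, 0.0)
s1 = fire(s0, "t1", 1, PRE, POST, COST)
print(s1)  # ({'p0': 0, 'p1': 1}, {'t1': '#'}, 2.5)
```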

Feasible firing schedule. Let $\mathcal{L}_{\mathcal{P}_E}$ be a timed labeled transition system of a time Petri net $\mathcal{P}_E$, $s_0$ its initial state, $s_n = (m_n, c_n, e_n)$ a final state, and $m_n = M^F$ the desired final marking. The sequence

$$s_0 \xrightarrow{(t_0, \theta_0)} s_1 \xrightarrow{(t_1, \theta_1)} s_2 \to \cdots \to s_{n-1} \xrightarrow{(t_k, \theta_{n-1})} s_n$$

is defined as a feasible firing schedule, where $s_{i+1} = fire(s_i, (t_k, \theta_i))$, $i \ge 0$, $t_k \in FT(s_i, e_{max})$, and $\theta_i \in FD_{s_i}(t_k)$.

The system modeling of the proposed methodology guarantees that the final marking $M^F$ (see Section 5) is well known, since it is explicitly specified.

5. Modeling real-time systems

The proposed modeling method adopts a bottom-up approach, in which a set of composition rules is considered for combining basic building block models. The set of basic models has been conceived for automatic pre-runtime schedule generation, where the schedule period ($P_S$) corresponds to the least common multiple (LCM) of all tasks' periods. Within this period, several instances of the same task might be carried out, where $N(\tau_i) = P_S / p_i$ gives the number of instances of each task $\tau_i$. Once a feasible schedule is generated (Section 6), the same schedule is repeated indefinitely during system execution.
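A minimal sketch of the schedule period and instance counts follows. The task names and periods in the usage example are hypothetical and chosen only to give more than one instance per task; in both examples of this paper all periods are equal, so each task has a single instance.

```python
from functools import reduce
from math import gcd

def schedule_period(periods):
    """P_S: least common multiple of all task periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)

def instances(periods_by_task):
    """N(tau_i) = P_S / p_i for every task."""
    ps = schedule_period(periods_by_task.values())
    return {name: ps // p for name, p in periods_by_task.items()}

# Hypothetical periods, not taken from the paper.
periods = {"A": 4, "B": 6, "C": 10}
print(schedule_period(periods.values()))  # 60
print(instances(periods))                 # {'A': 15, 'B': 10, 'C': 6}

# The motivational example: all four periods are 13, so one instance each.
print(instances({"t1": 13, "t2": 13, "t3": 13, "t4": 13}))
```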

Fig. 4. Example's model.

In order to present each building block, consider the model depicted in Fig. 4, which represents the following specification: $T_1 = (0, 0, 240 \times 10^6, 20, 20)$ and $T_2 = (0, 5, 60 \times 10^6, 15, 20)$. For this specification, the preemptive scheduling method is assumed, and the following voltage/frequency levels are considered: $vff = \{(1\,\mathrm{V}, 10\,\mathrm{MHz}), (2\,\mathrm{V}, 20\,\mathrm{MHz})\}$. Moreover, an unavailable voltage/frequency level of 1.5 V/15 MHz is also taken into account. In this case, the unavailable voltage can be "simulated" using the two immediately neighboring CPU voltages [4]. The building blocks are explained as follows:

(a) Fork block. Supposing that the system has $n$ tasks, the fork block is responsible for starting all tasks in the system. This block models the creation of $n$ concurrent tasks and also represents the initial marking.

(b) Periodic task arrival block. This block models the periodic invocation of all task instances in the schedule period ($P_S$). A transition $t_{ph_i}$ models the initial phase of the first task instance. Similarly, transition $t_{a_i}$ models the periodic arrival (after the initial phase) of the remaining instances, and transition $t_{r_i}$ represents a task instance release. Note the weight of the arc $(t_{ph_i}, p_{wa_i})$, which models the invocation of all remaining instances after the first task instance. The timing intervals of transitions $t_{ph_i}$ and $t_{a_i}$ are the timing constraints given in the specification, in this case, $ph_i$ (phase) and $p_i$ (period). For transition $t_{r_i}$, the timing interval is $[r_i, d_i - C_{min}]$, where $r_i$ is the release time, $d_i$ is the deadline constraint, and $C_{min}$ is the computation time of task $\tau_i$ at the highest voltage/frequency level.

(c) Voltage selection block. For each available voltage, this block represents every possible voltage selection for executing a task instance. In this block, a voltage level is represented by a transition $t_{v_{in}}$.

(d) Non-preemptive task structure block. This block (Fig. 5(a)) models a non-preemptive task computation adopting a specific voltage. In this block, processor granting and task computation are represented by transitions $t_{g_{in}}$ and $t_{c_{in}}$, respectively. Only after the entire task computation is the processor released, by transition $t_{c_{in}}$. Assuming a voltage $V \in \mathcal{V}$ and the respective maximum frequency $f = vff(V)$, the task computation time ($C$) can be obtained by $C = \lceil c_i / f \rceil$, where $c_i$ is the WCEC of task $\tau_i$. Furthermore, computation transitions have energy consumption values greater than zero, which are calculated using Eq. (1) from [5].

(e) Preemptive task structure block. In this particular scheduling method, tasks are implicitly split into subtasks, in which the computation time of each subtask is exactly equal to one task time unit (TTU). This is modeled by the time interval of computation transitions ($I(t_{c_{in}}) = [1, 1]$), and the entire computation is modeled through the arc weights. Considering $C$ the task computation time at a specific voltage, $C$ tokens are stored in place $p_{wg_{in}}$, and the same number of tokens in place $p_{wf_{in}}$ is needed for firing transition $t_{fv_{in}}$.

(f) Non-preemptive task structure with 2 voltages block and
(g) Preemptive task structure with 2 voltages block. If the CPU provides a small number of discrete voltage levels and an ideal voltage is not available, the two immediate neighbor voltages of the ideal one can be adopted for reducing energy consumption [4]. The proposed method allows the modeling of a task instance executing at two different voltages, considering non-preemptive (Fig. 5(b)) and preemptive executions. $C_1$ represents the computation time of the first part of the task, executing at $V_{idealH}$, and $C_2$ represents the computation time of the second part of the task, executing at $V_{idealL}$. These blocks resemble the task structure blocks presented previously.

Fig. 5. Non-preemptive task structure blocks.

Fig. 6. Overhead blocks.

(h) Deadline checking block. This block checks the occurrence of a deadline miss through transition $t_{d_i}$, which is enabled at the moment a task instance is ready for execution. For each place $p_{wc_{in}}$ and $p_{wc1_{in}}$ in each task structure block related to task $\tau_i$, a transition $t_{pc_{in}}$ is connected as a postcondition. Whenever a task instance is executing and a deadline miss occurs (i.e., $t_{d_i}$ fires), the token is removed from place $p_{wc_{in}}$ (or $p_{wc1_{in}}$), such that it is not possible to fire any other computation transition (e.g., $t_{c_{in}}$) in the model. The timing interval for transition $t_{d_i}$ is the deadline constraint $d_i$ ($[d_i, d_i]$) of task $\tau_i$.

(i) Join block. The join block states that all tasks in the system have concluded their execution in the schedule period. In this block, transition $t_{f_{in}}$ represents the conclusion of a task instance, and, thus, it removes any token enabling transition $t_{d_i}$ (deadline miss). After firing each transition of every task model, transition $t_{end}$ is fired, and, hence, a token is stored in place $p_{end}$ (the desired final marking $M^F$).

(j) Processor block. The processor model consists of a single place $p_{proc}$, where a token indicates processor availability.

In addition to the building blocks already described, precedence and exclusion relations between tasks are represented using the model structures presented in [13].
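The timing quantities used by blocks (b), (d) and (f)/(g) can be worked out for the specification of Fig. 4 as sketched below. The sketch assumes that one task time unit corresponds to one second and that frequencies are given in Hz; neither is stated in the text. The function `two_level_split` uses the usual time-preserving split between two neighboring frequencies, which is an assumption about how $C_1$ and $C_2$ are obtained, not a formula quoted from the paper. All function names are illustrative.

```python
def computation_time(wcec, freq_hz):
    """C = ceil(c_i / f), computed with integer arithmetic."""
    return -(-wcec // freq_hz)

def release_interval(release, deadline, wcec, f_max_hz):
    """Timing interval [r_i, d_i - C_min] of the release transition in block (b)."""
    return (release, deadline - computation_time(wcec, f_max_hz))

def two_level_split(wcec, f_ideal_hz, f_high_hz, f_low_hz):
    """ASSUMED split: run C1 at f_high, then C2 at f_low, so that
    C1 + C2 = wcec / f_ideal and C1*f_high + C2*f_low = wcec."""
    total = wcec / f_ideal_hz
    c1 = (wcec - f_low_hz * total) / (f_high_hz - f_low_hz)
    return c1, total - c1

MHZ = 1_000_000
T1_WCEC, T2_WCEC = 240 * MHZ, 60 * MHZ

print(computation_time(T1_WCEC, 10 * MHZ))             # 24, exceeds T1's deadline of 20
print(computation_time(T1_WCEC, 20 * MHZ))             # 12
print(release_interval(0, 20, T1_WCEC, 20 * MHZ))      # (0, 8)
print(release_interval(5, 15, T2_WCEC, 20 * MHZ))      # (5, 12)
print(two_level_split(T1_WCEC, 15 * MHZ, 20 * MHZ, 10 * MHZ))  # (8.0, 8.0)
```

Under these assumptions, $T_1$ cannot run entirely at 10 MHz, while emulating 15 MHz with 8 time units at 20 MHz followed by 8 at 10 MHz completes the same 240 × 10^6 cycles in 16 time units.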

Overhead modeling. Two additional blocks have to be introduced in order to deal with runtime overheads. The first model (Fig. 6(a)) is a preemptive task structure block with overhead considering a single voltage. Assuming $k$ is the number of tasks, places $p_{proc_h T_j}$, $1 \le j \le k$, represent flags that indicate the current task $T_j$ executing on processor $p_{proc_h}$. In a similar manner, place $p_{proc_h\_idle}$ represents the idle state of processor $p_{proc_h}$. Overheads are represented by transition $t_{o_{in}}$, whose timing interval is equal to $[a, a]$ and to which an associated energy consumption value is assigned. In this work, the overhead transition covers:

(i) dispatcher overhead, including context switching;
(ii) voltage/frequency related to the dispatcher execution; and
(iii) voltage/frequency switching for executing the respective task.

The proposed approach assumes that the dispatcher executes at a fixed supply voltage, which is up to the designer to select. Transitions $t_{g_{in}j}$, $0 \le j \le k$, $i \ne j$, represent processor granting and take into account overheads that may occur in task start-up, context saving or context restoring. The overhead is not considered when the same task is executing without interruption, which is represented by transition $t_{g_{in}i}$. As in the preemptive task structure block without overhead, the computation time is modeled using arc weights. Transition $t_{c_{in}}$ represents the execution of one task time unit of the computation time ($C$) and signals the execution of task $T_i$ through place $p_{proc_h T_i}$. After the last computation time unit ($t_{lc_{in}}$), there is no need to indicate the task execution, and the processor then goes to the idle state ($p_{proc_h\_idle}$). The processor place $p_{proc_h}$ is not shown for the sake of readability. However, each processor-granting transition has an incoming arc from $p_{proc_h}$, and each computation transition has an outgoing arc to $p_{proc_h}$.

Fig. 6(b) shows a preemptive task structure considering two voltage levels. This block may be interpreted as two concatenated instances of the previous block, with a slight difference. The difference lies in the overhead after executing the first part of the task at the immediately higher voltage. Since only a voltage/frequency switch is required to execute the second part of the task, an additional overhead transition ($t_{ov_{in}}$) and place ($p_{proc_h T_i\_2volt}$) are considered. The timing interval $[a_v, a_v]$ and the energy consumption value associated with this overhead transition are smaller than those assigned to transition $t_{o_{in}}$, since the other services are not needed. Additionally, as a new flag ($p_{proc_h T_i\_2volt}$) is added, other tasks have to consider, in each preemptive structure block, a new processor-granting transition that receives an incoming arc from $p_{proc_h T_i\_2volt}$. Again, $p_{proc_h}$ is not shown for readability.

6. Pre-runtime schedule synthesis

This section describes the proposed pre-runtime schedule synthesis, detailing the state space minimization and the scheduling algorithm.

Fig. 7. Schedule synthesis algorithm.

Minimizing state space size. In the proposed method, the analysis based on the interleaving of actions is a fundamental point to be considered when facing the state space explosion problem. The analysis of $n$ concurrent actions has to tackle all $n!$ possible interleavings, unless dependencies between these actions are considered. This work proposes four means for minimizing the state space size:

Preprocessing. Before applying the proposed scheduling algorithm, the specification is preprocessed using an extension of Yao's method in which a set of discrete voltages is considered [14]. Yao's method is adopted as a basis for approximating the CPU's unavailable voltages by the nearest available voltage levels, as well as a guide for selecting an initial voltage/frequency level for each task instance. The complexity of Yao's algorithm is $O(N \log^2 N)$, in which $N$ is the number of task instances, whereas a scheduling problem with inter-task relations is NP-hard. Experiments have shown that the preprocessing greatly reduces the schedule generation time as well as the state space size by avoiding inappropriate voltages.

Modeling. The proposed method explicitly models dependencies between actions, for instance, resource granting and releasing, and precedence and exclusion relations between tasks. Furthermore, the modeling itself may help in minimizing the state space size.

Partial-order. If actions can be executed in any order, such that the model always reaches the same state, these actions are independent. In other words, it does not matter in which order these actions are executed [15]. Independent actions are related to transitions that do not disable other actions, such as arrival, release, precedence, and processor releasing. Thus, this method gives the highest choice-priority levels to independent activities, and the lowest levels to dependent activities (e.g., processor granting). More specifically, when changing from one state to another, the highest choice-priority class of transitions is analyzed, whereas the other classes are pruned. As a consequence, this technique decreases the state space size and still allows the absence of a feasible schedule to be detected.

Removing undesirable states. Section 5 presents a building block able to detect deadline misses, which are undesirable reachable states. During the TLTS generation, transitions leading to undesirable states are discarded by the scheduling algorithm.
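To make the effect of the partial-order rule concrete, the following sketch enumerates every interleaving of a few independent actions and checks that all of them reach the same state, so exploring a single order would suffice. It is only an illustration: the action names and the dictionary-based state are hypothetical and are not part of the Petri net model used in this work.

```python
from itertools import permutations
from math import factorial

# Hypothetical independent actions: each one updates a distinct part of the
# state, so firing one of them can never disable another.
ACTIONS = {
    "arrival_t1": lambda s: {**s, "t1_arrived": True},
    "arrival_t2": lambda s: {**s, "t2_arrived": True},
    "release_t3": lambda s: {**s, "t3_released": True},
}

def run(order):
    """Apply the actions in the given order and return the final state."""
    state = {}
    for name in order:
        state = ACTIONS[name](state)
    return state

def final_states(names):
    """Collect the distinct final states over all n! interleavings."""
    return {tuple(sorted(run(order).items())) for order in permutations(names)}

names = list(ACTIONS)
interleavings = factorial(len(names))  # orders a naive search would explore
distinct = final_states(names)         # independent actions: a single state
```

With three independent actions a naive search examines six orders, yet only one final state exists; this is the redundancy that the choice-priority classes are meant to avoid.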

Pre-runtime scheduling algorithm. The proposed algorithm (Fig. 7) is a depth-first search method for TLTS generation that aims at achieving the stop criterion (reachability of the final marking M^F, Section 5) without generating the whole state space. Whenever the stop criterion is achieved, a feasible schedule is generated. Considering that (i) the Petri net model is bounded, and (ii) the timing constraints are enclosed by finite intervals, the TLTS is finite and thus the proposed algorithm always terminates, providing as result either a feasible schedule or none, in case none exists.

Fig. 8. Runtime scheduler example.

The only way the algorithm returns TRUE is when it reaches a desired final marking (M^F, the stop criterion), implying that a feasible schedule was found (line 3). The state space generation algorithm incorporates state space pruning (line 5): for the set of firable transitions (function firable), function pruning is executed according to the rules described in Section 6. PT is a set of ordered pairs 〈t, θ〉 representing, for each firable transition (post-pruning), all possible firing times in the firing domain. The tagging scheme (lines 4 and 9) ensures that no state is visited more than once. The function fire (line 8) returns a newly generated state (S′) resulting from firing transition t at time θ. The feasible schedule is represented by a timed labeled transition system that is generated by the function add-in-trans-system (line 11). The whole state space is analyzed only when the system does not have a feasible schedule.
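Since Fig. 7 is not reproduced in this text, the following sketch shows one way the described depth-first search could be organized. It is not the authors' algorithm: the two-task toy model, the state encoding (clock, set of finished tasks) and every function body are assumptions made for illustration; only the roles of firable, pruning, fire, the pairs 〈t, θ〉 and the tagging of visited states follow the description above.

```python
# Hypothetical non-preemptive tasks on one CPU: name -> (release, wcet, deadline)
TASKS = {"A": (0, 2, 6), "B": (0, 3, 3)}

def is_final(state):
    """Stop criterion: every task instance has been executed."""
    _, done = state
    return done == frozenset(TASKS)

def start_time(state, t):
    clock, _ = state
    return max(clock, TASKS[t][0])

def firable(state):
    """Tasks that can start next; starts that would miss a deadline are
    discarded here (the 'undesirable states' are never generated)."""
    _, done = state
    return [t for t in TASKS if t not in done
            and start_time(state, t) + TASKS[t][1] <= TASKS[t][2]]

def pruning(state, transitions):
    """Placeholder for the choice-priority pruning; keeps everything."""
    return transitions

def firing_times(state, t):
    """Toy firing domain: only the earliest possible start time."""
    return [start_time(state, t)]

def fire(state, t, theta):
    _, done = state
    return (theta + TASKS[t][1], done | {t})

def synthesize(state, visited, schedule):
    """Depth-first search; fills `schedule` with (t, theta) pairs on success."""
    if is_final(state):
        return True
    visited.add(state)  # tag the state so it is expanded at most once
    pt = [(t, th) for t in pruning(state, firable(state))
          for th in firing_times(state, t)]
    for t, theta in pt:
        nxt = fire(state, t, theta)
        if nxt not in visited and synthesize(nxt, visited, schedule):
            schedule.insert(0, (t, theta))  # record the successful path only
            return True
    return False

schedule = []
found = synthesize((0, frozenset()), set(), schedule)
```

In this toy instance the search first tries A, reaches a state in which B can no longer meet its deadline, backtracks, and then finds the schedule B at time 0 followed by A at time 3 without visiting any further states.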

7. Handling dynamic slack times

During system runtime, slack times (CPU idle times) may appear due to tasks’ early completion. In order to take advantage of such slack to further reduce energy consumption, a small runtime scheduler is proposed for adjusting the start times as well as the voltage/frequency levels associated with each task instance.

Before presenting the runtime scheduler algorithm, some concepts are required. The pre-runtime schedule is partitioned into several time slices of the same size, where each slice corresponds to one task time unit, and the total number of slices is equal to the LCM. These slices can be grouped into segments in such a way that they represent task executions. Such segments are called task segments, and each one is represented by an interval ([start, end]). When a task is not completely executed within a segment, the task is preempted; in other words, it is carried out over more than one segment. Moreover, a global clock (clock) is adopted for tracking the current time (e.g., the accumulated number of time slices).

The runtime scheduler algorithm is depicted in Fig. 9 using a C syntax notation. In order to check the early completion of a task instance, the runtime scheduler is executed at the end of each segment, in such a way that its execution does not conflict with the dispatcher execution. Firstly, the scheduler determines the next segment (line 2) in the pre-runtime schedule, since it is the candidate for adjusting the respective voltage/frequency level as well as the start time. If there is no segment to be executed (the remaining segments are returns from preemption of finished instances, or the last segment was already executed), the original pre-runtime schedule is kept (line 4). Also, the pre-runtime schedule is not changed if the start time of the next segment is equal to the release time assigned to the respective task instance. Considering that there is an available segment, the respective start time is set so that the release time is not violated (lines 6 and 8). If the next segment can be promptly started, the start time is tuned to take into account the scheduler WCET (worst-case execution time). It is worth noting that the adjustment is only performed when the improvements compensate for the scheduling overhead (line 9).

Fig. 9. Runtime scheduler algorithm.
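Because Fig. 9 is not reproduced in this text, the sketch below gives only one plausible reading of the steps just described. The Segment fields, the function name adjust, the rule for choosing a level (the slowest one that still finishes by the original end of the segment) and the form of the overhead test are all assumptions; the frequency and energy values are those quoted for the processor model of case study 3.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Energy per cycle (nJ) by frequency (MHz), as quoted for case study 3.
NJ_PER_CYCLE = {20: 6.4, 30: 14.4, 50: 40.0}

@dataclass(frozen=True)
class Segment:
    task: str
    start: int    # microseconds
    end: int      # microseconds
    release: int  # release time of the task instance, microseconds
    cycles: int   # cycles to execute in this segment
    mhz: int      # frequency assigned by the pre-runtime schedule

def adjust(nxt: Optional[Segment], clock: int, sched_wcet: int,
           overhead_nj: float) -> Optional[Segment]:
    """Return the next segment, adjusted if the slack makes it worthwhile."""
    if nxt is None or nxt.start == nxt.release:
        return nxt  # keep the pre-runtime schedule
    # Start as soon as possible without violating the release time,
    # leaving room for the scheduler's own worst-case execution time.
    new_start = max(nxt.release, clock + sched_wcet)
    if new_start >= nxt.start:
        return nxt  # no usable slack
    window = nxt.end - new_start
    current_nj = nxt.cycles * NJ_PER_CYCLE[nxt.mhz]
    for mhz in sorted(NJ_PER_CYCLE):    # slowest level first
        if nxt.cycles / mhz <= window:  # cycles / MHz gives microseconds
            saving = current_nj - nxt.cycles * NJ_PER_CYCLE[mhz]
            if saving > overhead_nj:    # adjust only if it pays off
                return replace(nxt, start=new_start, mhz=mhz)
            break
    return nxt

seg = Segment("t4", start=90_000, end=110_000, release=60_000,
              cycles=1_000_000, mhz=50)
adjusted = adjust(seg, clock=70_000, sched_wcet=100, overhead_nj=1_000.0)
```

In this hypothetical case the previous segment finishes at 70 000 µs instead of 90 000 µs, so the next segment can start at 70 100 µs and run at 30 MHz instead of 50 MHz while still finishing before its original end.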

For a better understanding, consider the schedule depicted in Fig. 8(a), which is composed of the following segments:

(i) τ1 = [1,2]; (ii) τ2^1 = [2,4]; (iii) τ3 = [4,6]; (iv) τ2^2 = [6,9]; and (v) τ4 = [9,10].

The DVS platform described in the motivational example is adopted, considering an additional voltage/frequency level of 1.38 V/60 MHz and an energy consumption per clock cycle of 0.54 nJ. In this example, if task τ2 completes its execution earlier, at time 7, the proposed scheduler attempts to adjust the voltage/frequency level as well as the start time of the next segment (τ4). Considering that the release time of τ4 is equal to 6, τ4 can start its execution earlier and utilize a lower voltage/frequency level (Fig. 8(b)). Assuming that the WCECs of the tasks are c1 = 50 × 10^6, c2 = 150 × 10^6, c3 = 100 × 10^6, and c4 = 60 × 10^6, the energy consumption is reduced from 0.1305 J (early completion of task τ2) to 0.1167 J (Fig. 8(b)).

Table 1
Experimental results summary

Case  Inst.  Size       Sch.    Found    w/DVS (J)  o/DVS (J)  Time (s)
1     4      7 × 10^7   48      141      0.1414     0.1620     0.001
2     6      7 × 10^35  4567    543162   0.0003     0.0008     35.726
3     12     2 × 10^32  551     9906     267.8400   360.0000   0.282
4     289    9 × 10^70  235852  1884381  0.1190     0.3450     291.221
5     178    3 × 10^77  1448    1448     0.0050     0.0210     0.039
6     3604   3 × 10^68  381313  381313   3.8620     4.7660     9.606

8. Experimental results

Table 1 summarizes some experiments in which the proposed pre-runtime scheduling algorithm has been applied. In that table, Inst. represents the number of task instances; Size is an estimate of the state space size [13]; Sch. is the number of states of the feasible schedule; Found is the number of states actually visited for finding a feasible schedule; w/DVS is the energy consumed (in joules) by the found feasible schedule using DVS; o/DVS is the energy consumed (in joules) by an alternative schedule that does not use DVS; and Time is the algorithm execution time (in seconds). For all case studies, the energy model described in [5] (Eq. (1)) has been adopted. All experiments were performed on a 3 GHz Pentium D with 4 GB of RAM, running Linux, with the GCC 3.3.2 compiler. The following paragraphs give an overview of each case study.

Case study 1 (motivational example) is based on Example 2 presented in [1], which demonstrates conditions in which pre-runtime approaches can find feasible schedules while runtime methods may fail.

Case study 2 takes into account overheads related to voltage/frequency switching as well as the dispatcher execution. The task set is composed of the following tasks: τ1 = (0,0,147 × 10^3,6,26), τ2 = (0,2,47 × 10^3,3,26), τ3 = (0,0,976 × 10^2,13,26), τ4 = (0,7,582 × 10^2,9,26), τ5 = (0,13,2982 × 10^2,26,26), and τ6 = (0,14,97 × 10^3,16,26). In addition to timing constraints, the specification contains the following inter-task relations: τ1 excludes τ2, τ1 precedes τ3, τ2 excludes τ1, τ2 precedes τ4, τ5 excludes τ6, and τ6 excludes τ5. This experiment is based on case study 1, which demonstrates a situation where runtime approaches may fail, but pre-runtime methods can provide feasible schedules. For this experiment, the DVS platform described in the motivational example is adopted, considering additional voltage/frequency levels: 1.02 V/10 MHz, 1.15 V/40 MHz and 1.38 V/60 MHz. Besides, a dispatcher was developed for managing tasks’ execution during system runtime. The dispatcher worst-case execution time is 60 microseconds at 60 MHz and the time overhead related to voltage/frequency switching is 10 microseconds. Table 1 presents the results.

Case study 3 is based on Fig. 2 of [16], which also depicts a condition in which runtime scheduling methods may not work. To consider this example, this work adjusted the computation times to allow voltage scaling. The task set is as follows: τA0 = (0,0,500 × 10^3,80,120), τA1 = (0,10,1000 × 10^3,100,120), τA2 = (0,30,300 × 10^3,120,120), τB = (0,20,1000 × 10^3,120,240), τC = (0,30,1000 × 10^3,50,120), τD = (0,90,1000 × 10^3,110,240), τE = (0,0,400 × 10^3,240,240), and τF = (0,0,1000 × 10^3,240,240). Regarding inter-task relations, the reader is referred to [16] for detailed information. Additionally, the processor model adopted in this experiment is based on [4]. More specifically, the voltage/frequency levels are vff = {(2 V, 20 MHz), (3 V, 30 MHz), (5 V, 50 MHz)}, and the respective energy consumption is 40 nJ/cycle at 50 MHz, 14.4 nJ/cycle at 30 MHz, and 6.4 nJ/cycle at 20 MHz. Results are shown in Table 1.

Case studies 4 [17], 5 [18] and 6 [19] are real-world applications. For these experiments, the WCEC of each task has been obtained by multiplying the respective WCET by the maximum operating frequency of the processor model adopted in each experiment. Case study 4 is the control software of a CNC machine, which is an automatic machining tool adopted for manufacturing user-designed workpieces. For this case study, the processor model is based on [4], and the respective voltage/frequency levels are vff = {(3 V, 30 MHz), (4 V, 40 MHz), (5 V, 50 MHz), (6 V, 60 MHz), (7 V, 70 MHz)}. The respective energy consumption is 78.4 nJ/cycle at 70 MHz, 57.8 nJ/cycle at 60 MHz, 40 nJ/cycle at 50 MHz, 25.6 nJ/cycle at 40 MHz, and 14.4 nJ/cycle at 30 MHz. Case study 5 is a pulse oximeter, which is an electronic device responsible for measuring the blood oxygen saturation using a non-invasive method. The specification is composed of several non-preemptable tasks with precedence constraints and the respective voltage/frequency levels are vff = {(0.4 V, 4 MHz), (0.8 V, 8 MHz), (1.2 V, 12 MHz)}. For this processor model, the energy consumption at each voltage/frequency level is 2.304 nJ/cycle at 12 MHz, 1.024 nJ/cycle at 8 MHz, and 0.256 nJ/cycle at 4 MHz. Case study 6 is an application composed of an MP3 player and a GSM decoder, in which the respective specification contains several tasks with precedence relations. In this case study, the period of each task has been considered equal to the respective deadline constraint. Besides, the Intel XScale PXA250 is adopted as the processor model, considering the following voltage/frequency levels: vff = {(0.9357 V, 132.7 MHz), (1.1 V, 199.1 MHz), (1.21 V, 298.7 MHz), (1.43 V, 398.2 MHz)}. The respective energy consumption is 0.91 nJ/cycle at 398.2 MHz, 0.74 nJ/cycle at 298.7 MHz, 0.66 nJ/cycle at 199.1 MHz and 0.49 nJ/cycle at 132.7 MHz. Table 1 shows the energy savings obtained using the proposed approach.
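The conversion from WCET to WCEC mentioned above, and the resulting trade-off between execution time and energy at each level, can be sketched as follows. The voltage/frequency levels and energy figures are the ones quoted for case study 6; the 1 ms WCET and the function names are hypothetical and serve only as an example.

```python
# (volts, MHz, nJ per cycle) for the Intel XScale PXA250 model of case study 6
PXA250 = [
    (0.9357, 132.7, 0.49),
    (1.1, 199.1, 0.66),
    (1.21, 298.7, 0.74),
    (1.43, 398.2, 0.91),
]

def wcec(wcet_s, levels):
    """Worst-case execution cycles: WCET times the maximum frequency."""
    f_max_hz = max(mhz for _, mhz, _ in levels) * 1e6
    return wcet_s * f_max_hz

def cost_per_level(cycles, levels):
    """Return (MHz, execution time in s, energy in J) for each level."""
    return [(mhz, cycles / (mhz * 1e6), cycles * nj * 1e-9)
            for _, mhz, nj in levels]

cycles = wcec(1e-3, PXA250)  # hypothetical task with a WCET of 1 ms
table = cost_per_level(cycles, PXA250)
for mhz, seconds, joules in table:
    print(f"{mhz:6.1f} MHz  {seconds * 1e3:6.3f} ms  {joules * 1e6:7.2f} uJ")
```

Running the same number of cycles at a lower level takes longer but costs less energy per cycle, which is the margin that the scheduler exploits when deadlines allow it.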

9. Conclusion

This paper presented an approach based on time Petri nets for hard real-time system scheduling, considering DVS, overheads and inter-task relations. Predictability is an important concern when considering time-critical systems. In order to guarantee that every critical task meets its deadline, a pre-runtime scheduling approach was proposed. As future work, we are planning to extend the proposed scheduling method in order to consider multiple processors.

References

[1] J. Xu, D. Parnas, Priority scheduling versus pre-run-time scheduling, Real-Time Systems 18 (1) (2000) 7–23.

[2] B. Mochocki, X. Hu, Q. Gang, A realistic variable voltage scheduling model for real-time applications, in: ICCAD’02, 2002.

[3] F. Yao, A. Demers, S. Shenker, A scheduling model for reduced CPU energy, in: IEEE Annual Found. of C. Sc., 1995, pp. 374–382.

[4] T. Ishihara, H. Yasuura, Voltage scheduling problem for dynamically variable voltage processors, in: ISLPED’98, 1998.

[5] L. Chandrasena, P. Chandrasena, M. Liebelt, An energy efficient rate selection algorithm for voltage quantized dynamic voltage scaling, in: ISSS’01, 2001.

[6] H. Aydin, R. Melhem, D. Mossé, P. Alvarez-Mejía, Power-aware scheduling for periodic real-time tasks, IEEE Trans. on Comp. 53 (5) (2004) 584–600.

[7] R. Jejurikar, R. Gupta, Energy aware non-preemptive scheduling for hard real-time systems, in: ECRTS’05, 2005, pp. 21–30.

[8] Y. Cai, M. Schmitz, B. Al-Hashimi, S. Reddy, Workload-ahead-driven online energy minimization techniques for battery-powered embedded systems with time-constraints, in: ACM Trans. on DAES, 2006, pp. 1–23.

[9] L. Cortés, P. Eles, Z. Peng, Quasi-static assignment of voltages and optional cycles for maximizing rewards in real-time systems with energy constraints, in: DAC’05, 2005, pp. 13–17.

[10] T. Phatrapornnant, M. Pont, Reducing jitter in embedded systems employing a time-triggered software architecture and dynamic voltage scaling, IEEE Trans. on Comp. 55 (2) (2006) 113–124.

[11] Philips Semiconductors, LPC2104/2105/2106; Single-chip 32-bit microcontrollers, 2003.

[12] P. Merlin, D. Faber, Recoverability of communication protocols, IEEE Trans. on Comm. 24 (9) (1976) 1036–1043.

[13] E. Tavares et al., Dynamic voltage scaling in hard real-time systems considering precedence and exclusion relations, in: SMC’07, 2007.

[14] W. Kwon, T. Kim, Optimal voltage allocation techniques for dynamically variable voltage processors, in: DAC’03, 2003.

[15] P. Godefroid, Partial order methods for the verification of concurrent systems, PhD thesis, Univ. of Liege, 1994.

[16] J. Xu, On inspection and verification of software with timing requirements, IEEE Trans. on Soft. Eng. 29 (8) (2003) 705–720.

[17] N. Kim, M. Ryu, S. Hong, M. Saksena, C. Choi, H. Shin, Visual assessment of a real-time systems design: A case study on a CNC controller, in: RTSS’96, 1996, pp. 300–310.

[18] L. Amorim, P. Maciel, M. Nogueira, R. Barreto, E. Tavares, Mapping live sequence chart to coloured Petri nets for analysis and verification of embedded systems, in: ACM SIGSOFT Sof. Engi. Notes, 2006, pp. 1–25.

[19] R. Prathipati, Energy efficient scheduling techniques for real-time embedded systems, in: MSc Thesis, Texas A&M University, 2004.