Feedback EDF scheduling exploiting hardware-assisted asynchronous dynamic voltage scaling

Feedback EDF Scheduling Exploiting

Hardware-Assisted Asynchronous Dynamic Voltage Scaling ∗

Yifan Zhu and Frank MuellerDepartment of Computer Science/Center for Embedded Systems Research

North Carolina State University,Raleigh, NC [email protected], phone: +1.919.515.7889, fax: +1.919.515.7925

Abstract

Power consumption has been a major concern, bothfor processor design with high clock rates and embed-ded systems driven by batteries. Recent support for dy-namic frequency and voltage scaling (DVS) in contem-porary processor architectures allows software to affectpower consumption by varying execution frequency andsupply voltage on the fly. However, processors generallyenter a sleep state while transitioning between frequen-cies/voltages. In this paper, we examine the merits ofhardware/software co-design for a feedback DVS algo-rithm and a novel processor capable of executing instruc-tions during frequency/voltage transitions. We studyseveral power-aware feedback schemes based on earliest-deadline-first (EDF) scheduling that adjust the systembehavior dynamically for different workload characteris-tics. An infrastructure for investigating several hard real-time DVS schemes, including our feedback DVS algo-rithm, is implemented on an IBM PowerPC 405LP em-bedded board. Architecture and algorithm overhead is as-sessed for different DVS schemes. Measurements on theexperimentation board provide a quantitative assessmentof the potential of energy savings for DVS algorithmsas opposed to our prior simulation work that could onlyprovide trends. Energy consumption, measured througha data acquisition board, indicates a considerable poten-tial for real-time DVS scheduling algorithms to lower en-ergy up to 64% over the naıve DVS scheme. Our feedbackDVS algorithm saves at least as much and often consid-erably more energy than previous DVS algorithms withpeak savings of an additional 24% energy reduction.

1. IntroductionEnergy consumption has become a vital design con-

straint in embedded systems for a long time. The de-mand for efficient energy management is increasing inhand-held and embedded devices, where battery ser-vice life is usually critical to system performance. For

∗ Thiswork was supported in part by NSF grants CCR-0208581,CCR-0310860 and CCR-0312695.

many non-battery powered systems, energy consump-tion is also an important cost factor due to environ-ment issues. The processor is one of the most power-consuming devices of a computer. In order to reducethe CPU energy consumption, Dynamic Voltage Scal-ing (DVS) technology has been widely supported in re-cent years for extending battery life. DVS dynamicallyscales the processor core voltage up or down dependingon the computational demand of the system. Reducingthe supply voltage results in a lower transistor switch-ing speed, which also allows a lower clock frequency.Assuming that voltage and frequency are linearly re-lated, scaling down both voltage and frequency resultsin cubic reduction of power consumption (P ∝ V 2 ×f)[5]. While useful for simulation, this formula cannot re-flect architectural details that this study focuses on.

DVS algorithms have been intensively studied forboth non real-time and real-time systems [21, 1, 14, 7,9, 20, 24]. In the case of real-time systems, the DVS al-gorithm calculates a safe frequency that provides justenough processing resources to finish a given task be-fore its deadline. The goal is to save the maximum pos-sible amount of energy and still guarantee the schedu-lability of hard real-time systems where all tasks arerequired to meet their deadlines.

In this work, we develop several power-aware feed-back schemes for our feedback DVS algorithm based onearliest-deadline-first (EDF) scheduling, which adjustsa real-time system dynamically according to differentworkload characteristics. A feedback DVS algorithmhas been presented and evaluated in simulation exper-iments in our previous work [6, 25]. We refine thosealgorithms in this paper and develop several feedbackschemes considering practical design and implementa-tion issues on a real embedded architecture. We are in-terested in studying the performance of the DVS algo-rithm in an embedded environment where the overheadand the actual energy consumption can be measuredquantitatively. The real-time scheduler itself, when in-tegrated with a DVS algorithm, may execute at sev-eral different CPU frequencies, which also requires ac-curate modeling of the system overhead.

We examine all these issues by integrating our feed-back DVS algorithm within a real-time EDF sched-uler. An infrastructure to assess our algorithm as wellas several other DVS algorithms is implemented onan IBM PowerPC 405LP embedded board, which wasspecially modified for power management research. Aunique DVS feature supported by the test board is thatfrequency switching can be done either synchronouslyor asynchronously, both of which we evaluated exper-imentally for different DVS algorithms. We show thestrength of our feedback DVS algorithm by compar-ing the actual energy consumption with other DVS al-gorithms.

The rest of this paper is organized as follows. Sec-tion 2 gives a brief introduction of the DVS schedul-ing framework and task model. Section 3 discusses ourDVS algorithm and two feedback mechanisms proposedfor the practical environment. Detailed experimentalresults are presented in Section 4. Section 5 discussessome of the related work. Conclusions are given in Sec-tion 6.

2. EDF Scheduling with DVS SupportIn order to assess DVS algorithms for their suit-

ability and energy saving performance in an embed-ded environment, we consider the scheduling problemin hard real-time systems with the earliest deadline first(EDF) policy. The entire scheduler framework consistsof two components: (1) an EDF scheduler and (2) aDVS scheduler. These two components are indepen-dent of each other so that the EDF scheduler is ca-pable of working with different DVS algorithms. EDFis especially attractive to DVS algorithms because ofits dynamic property, which allows the DVS schedulerto exploit slack. Our DVS scheduler is based on feed-back control that incrementally adjusts system behav-ior in order to reduce energy consumption.

A periodic, fully preemptive and independent taskmodel is used in the framework. Each task Ti is definedby a triple (Pi, Ci, ci), where Pi is the period of Ti, Ci

is the measured worst-case execution time of Ti, andci is the actual execution time of Ti. Each task’s rel-ative deadline, di, is equal to its period, and all tasksare released at time zero. The periodically released in-stances of a task are called jobs. Tij is used to denotethe jth job of task Ti. Its release time is Pi ∗ (j − 1)and its deadline is Pi ∗j. cij is used to represent the ac-tual execution time of job Tij . The hyperperiod H ofthe task set is defined as the least common multiplier(LCM) among all the tasks’ periods. The schedule re-peats at the end of each hyperperiod.

In the following, we describe in detail the feedbackDVS scheduler and several feedback schemes used inthe framework.

3. Feedback DVS AlgorithmOur feedback DVS algorithm anticipates an actual

execution time of each task invocation (a job) basedon the feedback from the execution time in previousinvocations. It then splits the execution budget of atask into two parts, as depicted in Figure 1. The an-ticipated actual time CA is scaled at the lowest pos-sible frequency. Conversely, the remaining executiontime CB is scaled at the maximum frequency such thatCA + CB = WCET .

TBTA

CA/a CB

t

fm

Figure 1. Task Splitting

All future tasks are deferred as long as possible us-ing a maximal (worst-case) schedule, which is relatedto the actual schedule to derive the currently availableslack sk for task k. Thus,

α =CA

CA + sk

indicates the scaling factor and the corresponding low-est possible frequency. The algorithm is capable of cap-turing changes in actual execution times using a feed-back scheme. Preemption of the current task is antic-ipated via future slot allocations in the schedule. Itis implemented in a backward sweep to fill idle andearly completion slots from a task’s deadline back-wards (for algorithmic details, see [25]). Due to theeven more greedy approach than any of the previousschemes, the algorithm was reported to exhibit addi-tional energy savings in simulation experiments, par-ticularly for medium utilization systems, which arequite common [6]. Even more substantial savings havebeen observed for fluctuating execution times wherePID-feedback provides new opportunities for aggres-sive scaling [25]. During the implementation of the al-gorithm for the 405LP embedded board, we refined thefeedback scheme proposed in [25] and developed the fol-lowing two feedback mechanisms.

3.1. Simple Feedback

Some of the periodic real-time workloads have a rel-atively stable behavior during certain intervals. Theactual execution time of their different jobs remainsnearly constant or varies only within a very small rangein that interval. For such workloads we use a very sim-ple feedback mechanism by computing the moving av-erage of previous jobs’ actual execution times and feedit back to the DVS scheduler. We try to avoid theoverhead of a more complicated feedback mechanism,

such as the PID-feedback controller described in thenext section, because a simple feedback usually pro-vides good enough performance in this case. The quan-titative comparison of the overhead between our PID-feedback DVS algorithm and several other DVS algo-rithms (detailed in Section 4) also makes us believethat a complicated feedback DVS scheme degrades itsenergy saving potential to some extend.

The simple feedback mechanism chooses the value ofCA as the controlled variable. Each job Tij ’s actual ex-ecution time cij is chosen as the set point. CA is as-signed to be 50% WCET for the first job of each task,which means half of the job’s execution is budgeted ata low frequency, and half of it is reserved at the maxi-mum frequency. The maximum frequency portion guar-antees the deadline requirements, even if the worst-caseexecution time is used in full. Each time a job com-pletes, its actual execution time is fed back and aggre-gated to anticipate the next job’s CA. Let CAij denotethe CA value for Tij . The (j +1)th job of the task is as-signed a CA value according to:

CAi(j+1) = (CAij + ci(j+1) − ci(j−N+1))/N (1)

where N is a constant representing the number ofitems used in the moving average calculation. Our ex-periments show significant energy savings for such asimple feedback mechanism with very low schedulingoverhead as long as the workload’s actual executiontime exhibits a stable behavior during some interval.When the workload’s behavior keeps changing dynam-ically with highly fluctuating execution times, simplefeedback becomes not enough to yield the best energysavings. In those cases, a more sophisticated feedbackmechanism is required, as detailed in the next section.

3.2. PID Feedback

The original PID-feedback DVS mechanism, as pre-sented in [25], requires the DVS scheduler to cre-ate and maintain multiple independent feedback con-trollers for each of the tasks in the workload. Multi-ple inputs and multiple outputs need to be manip-ulated simultaneously by the DVS scheduler. Such aPID-feedback mechanism, albeit its potential for en-ergy savings shown in our previous simulation exper-iments, results in substantial execution overhead onan embedded architecture. Giving the difficulty of pre-cisely characterizing the behavior of a multiple-inputmultiple-output control system, it also adds complex-ity to the theoretical analysis of the algorithm. There-fore, we refine the original PID-feedback DVS mecha-nism by the following simplified design.

Instead of using CAi(i = 1...n) as the controlled vari-able for each task Ti and creating n different feedbackcontroller for n different tasks, we now define a sin-

+

−

r0 rsMrKp+Ki/s+sKd CA

Figure 2. Control Loop Model

gle variable r as the controlled variable for the entiresystem as:

r =1

n

n∑

i=1

CAij − cij

cij

(2)

where j is the index of the latest job of task Ti be-fore the sampling point. Our objective is to make r ap-proximate 0 ( i.e., the set point). The system error be-comes

ε(t) = r − 0. (3)

ε(t) is fed back to the PID scheduler to regulate thecontrolled variable r. The PID feedback controller isnow defined as:

∆rj = Kpεi(t) + 1Ki

∑IW εi(t) + Kd

εi(t)−εi(t−DW )DW

rj+1 = rj + ∆rj

(4)For each rj , we adjust the CA value for task Ti by

CAi(j+1) = rjcij + cij . The transfer function Gr be-tween r and CA can be derived by taking derivative ofboth sides of the equation 2:

Gr(s) = Mrs (5)

where Mr = 1n

∑n

i=11ci

. The block diagram of themodel is shown in Fig. 2. Its transfer function is :

GP (s)Gr(s)

1 + GP (s)Gr(s)=

MKps + MKi + MKds2

1 + MKps + MKi + MKds2(6)

According to control theory, a system is stable ifand only if all the poles of its transfer function are inthe negative half-plane of the s-domain. From Equa-tion 6, we infer the poles of our system as

−MKp ±√

MK2p − 4MKd(MKi + 1)

2MKd

(7)

Note that −MKp +√

MK2p − 4MKd(MKi + 1) is

still less than 0 when MK2p − 4MKd(MKi + 1) > 0.

Hence, all the poles are in the negative half-plan of thes-domain. Therefore, the stability of the above systemis ensured.

Such a single controller mechanism is easy to imple-ment because just one feedback controller suffices forthe entire system, which reduces the complexity andoverhead of the feedback DVS algorithm. But it alsohas its drawback, i.e., it does not provide direct feed-back information of the CA value for each individualtask. When r equals zero, one cannot imply that everytask’s CA has approximated its actual execution time.

It is an imprecise description of the original schedulingobjective and may take longer to get the system into astable status. Nonetheless, our experiment shows sig-nificant energy savings of this feedback DVS mecha-nism with much reduced overhead compared to otherDVS algorithms. In the next section, we present the de-tails of our experimental results.

4. Experimental EvaluationBy evaluating our feedback DVS algorithm on a

real embedded architecture, we assess the true po-tential of our algorithm for energy savings in an ac-tual system as opposed to a simulation environment.Also, we compare the overhead and energy consump-tion between our algorithm and several other DVS al-gorithms, namely static DVS, cycle-conserving DVS,look-ahead-1/2 DVS (all by Pillai and Shin [21]) aswell as DR-OTE and AGR-2 (by Aydin et al. [1]).Look-ahead-1 and look-ahead-2 are the original and amodified version of the original look-ahead DVS algo-rithm in [21], respectively. Look-ahead-1 updates eachtask’s absolute deadline immediately when a task in-stance completes. Look-ahead-2 delays such update tillthe next task instance is released, which results in addi-tional energy savings. AGR-2 follows the most aggres-sive scheme presented in [1] with an aggressiveness pa-rameter k of 0.9. In these experiments, we also wantedto determine if the lower frequencies and voltages cho-sen by our feedback scheme outweigh the higher com-putational overhead required to make scheduling deci-sions.

4.1. Platform and Methodology

The embedded platform used in our experiment is aPowerPC 495LP embedded board running on a disklessMontaVista Embedded Linux variant, which is basedon the 2.4.21 stock kernel but has been patched to sup-port DVS on the PPC 405LP. This board provides thehardware support required for DVS and allows soft-ware to scale voltage and frequency via user-definedoperation points ranging from a high end of 266 MHzat 1.8V to a low end of 33 MHz at 1V [19, 4, 10].The board has also been modified for 50% reduced ca-pacitance, which allows DVS switches to occur morerapidly, i.e., switches are bounded by at most a 200microseconds duration from 1V to 1.8V. The DVS al-gorithms (static, cycle-conserving, look-ahead [21] andour feedback DVS) were exposed to the DVS capabil-ities of the 405LP board. The scheduling algorithmscan choose any frequency/voltage pair from the set de-picted in Table 1.

This set of pairs was constrained by a need to havea common PLL multiplier of 16 relative to the 33MHzbase clock and a divider of two or any multiple of 4.Changing the multiplier incurs additional overhead for

Setting 0 1 2 3 4CPU freq. (MHz) 33 44 66 133 266bus freq. (MHz) 33 44 66 133 133

CPU voltage (Volts) 1.0 1.0 1.1 1.3 1.7

Table 1. Valid Frequency/Voltage Pairs

switching, which we wanted to eliminate in this study.A dynamic power management (DPM) facility [4] is de-veloped as an enhancement to the Linux kernel to sup-port DVS features. DPM operating point defines stablefrequency/voltage pairs (as well as related system pa-rameters), which we experimentally determined.

In order to assess power consumption, we need tomonitor processor core voltage and current at a highrate. Hence, we used a high-frequency analog data ac-quisition board to gather data for (a) the processorcore voltage and (b) the processor current. The lat-ter was measured as a voltage level over a resistor witha 1V drop per 360mA. Power consumption was com-puted by multiplying the CPU voltage with its current.Data acquisition board allowed us to experiment withlonger-running applications to assess the energy con-sumption of the processor, which is the integration ofpower over time. We also employed an oscilloscope forvisualizing the voltages and currents with high preci-sion in readings.

We implemented an EDF scheduler as a user-levelthread library under Linux on the 405LP board. A user-level library was chosen over a kernel-level solution be-cause of the simplicity of its design and the fact thatthe operating system background activity is minimalon the embedded board infrastructure. Different DVSscheduling schemes were attached into the EDF sched-uler as independent modules.

4.2. Synchronous vs. Asynchronous Switch

We first assessed the overhead of different DVS tech-niques supported by the test board and the dynamicpower management extensions of the operating system.

A unique DVS feature supported by the IBM PPC405LP embedded board is that frequency switchingcan be done either synchronously or asynchronously.Synchronous switching is the traditional approach forprocessor frequency/voltage transitions, where appli-cations have to stop execution during the transi-tional interval. Asynchronous switching, on the con-trary, allows application to continue execution duringthe frequency/voltage transitions. Figure 3 depicts thechanges in current (lower curve) and voltage (uppercurve) of the PPC 405LP processor core during anasynchronous switch.

This unique feature of asynchronous switching isachieved by a system call that, when switching to ahigher voltage/frequency, first reprograms the voltage

Figure 3. Current and Voltage Transition DuringAsynchronous Frequency Switching

to ramp up towards the maximum as fast as possible(the 30 degree voltage ramp on the upper curve of Fig-ure 3). Meanwhile, the time to reach a voltage level atleast as high as required by the new frequency is esti-mated. A high-resolution timer is programmed to inter-rupt when this duration expires, prior to which the ap-plication can still continue execution. Once the timerinterrupt triggers its handler (at the peak after the 30degree ramp on the upper curve), the power manage-ment unit is reprogrammed to settle at the target volt-age level, and the new processor frequency is activatedbefore returning from the handler. The voltage thensettles (in case it overshot) in a controlled manner tothe new operating point. The current also settles ina controlled manner depending on the actual process-ing activity.

Table 2 reports the overhead for synchronous andasynchronous switching in a time range bounded bytwo extremes: (a) Switching between adjacent fre-quency/voltage levels and (b) switching between thelowest and highest frequency/voltage levels. Further-more, the overhead of the subsequent signal handlerassociated with each asynchronous switch is also mea-sured for a range of the highest and the lowest proces-sor frequencies. The results indicate that a synchronousDVS switch has about an order of a magnitude largeroverhead than an asynchronous switch. The timer in-terrupt handler triggered at each asynchronous switchonly increases the overall overhead insignificantly.

activity sync. DVS async. DVS signal handler

overhead 117-162 µsec 8-20 µsec 0.07-0.6 µsec

Table 2. DVS Switching Overhead

4.3. DVS Scheduler Overhead

We compared the overhead of our feedback-DVS al-gorithm with several other dynamic DVS algorithms.We first measured the execution time of these DVS

scheduling algorithms under different frequencies onthe embedded board, as depicted in Table 3. The over-head was obtained by measuring the amount of timewhen a task issues a yield() system call till anothertask was dispatched by the scheduler. The table showsthat static DVS has the lowest overhead among thefour while our PID-feedback DVS has the highest one.This is not surprising since static DVS uses a very sim-ple strategy to select the frequency and voltage fallingshort in finding the best energy saving opportunities.Cycle-conserving DVS, look-ahead DVS and our PID-feedback DVS use more sophisticated and aggressive al-gorithms for lower energy consumption, albeit at higheroverheads. The trade-off between overhead and perfor-mance always needs to be examined carefully.

Next, we assessed if our feedback-DVS algorithm, al-though incurring the largest overhead among the four,gives the best energy saving results in the real em-bedded environment. We measured the actual energyconsumption of these DVS algorithms when executingthree medium utilization task sets depicted in Table 4using both synchronous and asynchronous DVS switch-ings. As a baseline for comparison, we also implementeda naıve DVS scheme where the maximum frequency isalways chosen whenever a task is scheduled, and theminimum frequency is always chosen whenever the sys-tem is idle.

DVS scheduling overhead[µsec]CPU freq. static cc look-ahead PID-feedback

33 MHz 217 487 2296 361244 MHz 170 366 1714 294366 MHz 100 232 1112 1728133 MHz 52 120 546 801266 MHz 36 76 229 472

Table 3. Overhead of DVS-EDF Scheduler

The first task set in Table 4 is harmonic, i.e., all pe-riods are integer multiples of the smallest period, whichfacilitates scheduling. This often allows scheduling al-gorithms to exhibit an extreme behavior, typically out-performing any other choice of periods. The secondand third task sets are non-harmonic with longer andshorter periods, respectively. Actual execution timeswere half that of the WCET for each task for this ex-periment.

Table 5 depicts the energy consumption in a unitof mWatt-hours. The naıve DVS algorithm serves as abase of comparisons for each of the subsequent DVSalgorithms. For task set one, static DVS reduces en-ergy consumption by about 29% over the naıve scheme.Cycle-conserving DVS saves 47% energy. Look-aheadRT-DVS saves over 50%, and our feedback methodsaves about 54% energy compared to naıve DVS. This

Task Set 1 Task Set 2 Task Set 3task Period (Pi) WCET (Ci) Period (Pi) WCET (Ci) Period (Pi) WCET (Ci)1 2,400 400 600 80 90 122 2,400 600 320 120 48 183 1,200 200 400 40 60 6

Table 4. Task Set, times in msec

algorithm naıve static static save cycle-cons. c-c save look-ahead l-a save our feedback fdbk saveTask Set 1

sync. 4.47 3.2 28.41% 2.38 46.61% 2.21 50.56% 2.04 54.21%async. 4.43 3.13 29.35% 2.327 47.51% 2.12 52.07% 2.00 54.70%savings 0.89% 2.19% 2.51% 3.92% 1.95%

Task Set 2sync. 0.544 0.5056 7.06% 0.4713 13.36% 0.424 22.06% 0.4089 24.83%async. 0.5276 0.5025 4.76% 0.4622 12.40% 0.4218 20.05% 0.4064 22.97%savings 3.01% 0.61% 1.93% 0.52% 0.61%

Task Set 3sync. 0.595 0.5616 5.61% 0.4799 19.34% 0.4043 32.05% 0.3708 37.68%async. 0.5802 0.5496 5.27% 0.4547 21.63% 0.3912 32.57% 0.3671 36.73%savings 2.49% 2.14% 5.25% 3.24% 1.00%

Task Set 2 vs. Task Set 3change 9.07% 8.57% -1.65% -7.82% -10.71%

Table 5. Energy [mW − hrs] consumption per RT-DVS algorithm

clearly shows the tremendous potential in energy sav-ings for real-time scheduling.

The savings for each algorithm are lower for task settwo peaking at about 23% for our feedback scheme. Asmentioned before, task set one is harmonic, which typi-cally results in the best scheduling (and energy) resultssince execution is more predictable. Task set three liesin between the other two with peak savings of 37% forour feedback scheme.

The results also demonstrate that the overhead forcalculations inherent to scheduling algorithms is out-weighed by the potential for energy savings. This is un-derlined by the increasing overhead in execution timefor each of the scheduling algorithms (from left to rightin Table 5) accompanied by decreasing energy con-sumption.

Another noteworthy result is the comparison be-tween synchronous and asynchronous DVS switchingdepicted in the last row for each task set in Table 5.For each of the scheduling algorithms, we see addi-tional savings of 1-5% on asynchronous switching dueto the ability to commence with a task’s execution dur-ing frequency and voltage transitions. We also ran ex-periments with task sets that had an order of a magni-tude smaller periods and execution times. Surprisingly,the synchronous vs. asynchronous savings remained ap-proximately the same, even though DVS switches oc-cur ten times as often. We believe that the periods and

execution time settings used in our experimental envi-ronment are still large compared to the execution timeof a synchronous or asynchronous switching. If we onlysave about 100 µsec at each frequency switch (as hasbeen shown in Table 2) but later on spend more then10-100 msec in running a task, the benefit of the asyn-chronous DVS switching becomes insignificant. Theseresults seem to indicate that the benefit of continuousexecution during DVS switching, although not negligi-ble, is secondary to trying to minimize the overhead ofDVS scheduling itself.

We also compared task sets two and three in terms oftheir absolute energy readings, which is valid since theyexecuted for the same amount of time (ten seconds),the same actual to worst-case execution time rationand the same utilization, albeit at seven times morecontext switches. This change is depicted in the lastrow of Table 5 for the asynchronous case. Not surpris-ingly, the energy with naıve DVS is about 9% higherfor task set three than for set two due to the highercontext switch overhead of the latter. Quite interest-ingly, this overhead turns into a reduction in energy asDVS schemes become more aggressive.

4.4. Impact of Different Workloads

We also examined the behavior of our DVS algo-rithm on different workloads in more detail. For thispurpose, we devised a suite of task sets with syntheticCPU workloads. Each task set contains three indepen-

dent periodic tasks whose worst-case execution timevaries from 0.1 to 0.9 with an increment of 0.1. The ac-tual execution time of a task is determined by tim-ing the body of each task plus the scheduler over-head (see Table 3) of the corresponding DVS algo-rithm under the lowest CPU frequency. We dynam-ically changed the number of instructions inside eachtask body among different invocations, i.e., jobs, to ap-proximate the workload fluctuation behavior of actualreal-time applications.

Altogether, four synthesized execution patterns werecreated. For the first pattern, a task’s actual executiontime is always 50% WCET. For the second pattern, theactual execution time of a task drops exponentially be-tween a peak value and 50%WCET among its consec-utive jobs, modeled as ci = 1/2(t−cm). The peak valuecm was chosen to be 20%WCET. This pattern sim-ulates event-triggered activities that result in sudden,yet short-term computational demands due to complexinputs often observed in interrupt-driven systems. Thethird pattern is similar to the second one except that itdrops more gradually, modeled as ci = cmsin(t+π/2).This pattern simulates events resulting in computa-tional demands in a phase of subsequent complex in-puts with a decaying tendency. For the fourth pat-tern, the actual execution time of a task increases anddecreases gradually around 50% WCET, modeled asci = cmsin(t) and ci = −cmsin(t). This pattern repre-sents periodically fluctuating activities with graduallyincreasing and decreasing computational needs aroundpeaks. We used simple feedback on pattern 1 becauseof its nearly constant execution time pattern amongdifferent jobs. The number of items to compute themoving average was set as N = 10. PID-feedback wasused on patterns 2, 3, and 4 to exploit fluctuating exe-cution time characteristics. The PID parameters werechosen by tuning manually with Kp = 0.9, Ki = 0.08,Kd = 0.1. The derivative and integral window size were1 and 10, respectively. Asynchronous switching was al-ways used in this experiment since it has a better per-formance than synchronous switching.

Figure 4 and Figure 5 present the energy consump-tion of our feedback-DVS as well as four other dy-namic DVS algorithms under task execution pattern2. The number of tasks in the task set varies between3 and 30 tasks. All energy values are normalized to thenaıve DVS results. AGR-2 dynamically reclaims un-used slack up to the next arrival time of any task in-stance (NTA), hence saving about 50% extra energythan naıve DVS. AGR-2 is not as good as Look-ahead-1/2 DVS for 3 tasks since it considers slack only upto the next task instance’s deadline, while Look-aheadDVS collects slack up to the largest deadline among

�

��

��

��

��

�

��

��

��

��

��

��

��

��

�

��

Figure 4. PID feedback for 3-task sets with dy-namic exec. time pattern 2

�

��

��

��

��

�

��

��

��

��

��

��

��

��

�

��

Figure 5. PID feedback for 30-task sets with dy-namic exec. time pattern 2

all tasks. But AGR-2 benefits from smaller task gran-ularity in 30-task sets and outperforms Look-ahead-1for extreme utilizations (small and large) except forthe range of 0.5-0.7 utilization. When compared withLook-ahead-2, AGR only outperforms it for hight uti-lizations, otherwise Look-ahead-2 performs better.

Our feedback-DVS shows additional benefits overboth Look-ahead-2 AGR-2. Relative to the twoschemes, we save another 5%-20% energy due to ouralgorithm’s self-adaptation to jobs’ actual executiontimes. In cases of extremely low utilization, feedback-

��

��

��

��

��

��

��

��

��

��

��

��

��

�

��

� ��

Figure 6. Simple feedback for task sets with con-stant exec. time pattern 1

��

��

��

��

��

��

��

��

��

��

��

��

�

��

� ��

Figure7.PID feedback for task setswithdynamicexec. time pattern 2

��

��

��

��

��

��

��

��

��

��

��

��

�

��

� ��

� ��


DVS, Look-ahead DVS and AGR-2 are observed to re-sult in virtually the same energy savings because everytask has enough slack to run at the minimum speed re-sulting in the same frequencies for a schedule irrespec-tive of the DVS algorithm.

Let us now focus on the comparison of our algo-rithm with the look-ahead-2 DVS algorithm for 3-tasksets, but under different dynamic execution time pat-terns. Figure 6 shows that when each task has a nearly

��

��

��

��

��

��

��

��

��

��

��

��

�

��

� ��

� ��


��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

��

Figure 10. Pattern 4 with Different Avg. Exec.Times – Energy Normalized to Naıve DVS

constant execution time among different instances, oursimple feedback DVS saves up to 19% more energy thanlook-ahead-2 DVS. We notice a reduction in energy sav-ings when utilization becomes extremely low or high.Such extreme utilizations force all tasks to run at theminimum or the maximum frequency. On average, thesimple feedback DVS outperforms look-ahead by 10%and outperforms naıve DVS by about 50%.

Figures 7-9 shows the energy consumption with dy-namic execution time patterns 2, 3 and 4. The maximalsavings over look-ahead-2 are 11%, 15% and 20%, re-spectively. The maximal savings over naıve DVS are be-tween 25% and 60%. The largest savings again happenin median utilization cases where there is considerabledynamic slack. PID-feedback mechanism helps capturethe dynamic behavior of task execution times and tosubsequently scale power even more aggressively thanother DVS algorithms. Although look-ahead DVS canalso take advantage of dynamic slack and lower the fre-quency/voltages more aggressively than naıve DVS, itlacks a feedback scheme to adjust its behavior dynam-ically. From time to time, it has to overcome the factthat the frequency was lowered too much in the past byraising the voltage and frequency to a level even higherthan the safe frequency.

To better assess the scalability of our feedback-DVSalgorithm, we further measured the energy consump-tion of the three task sets under pattern 4 with dif-ferent average execution times. Figure 10 shows nor-malized energy consumption relative to naıve DVS for0.75WCET, 0.5WCET, and 0.3WCET. Our algorithm

can scale equally well for loose and tight execution-time patterns. In all three cases, 14% to 24% addi-tional energy is saved than for look-ahead DVS. OurPID-feedback mechanism shows even better strengthfor median and tight execution time cases than theloose execution time case because capturing the dy-namic behavior of a task’s actual execution time in atight execution environment is more critical than in aloose environment.

Figure 11 depicts screen-shots of voltage and currentobtained from the oscilloscope for the phase just aftera simultaneous release of all tasks at the beginning ofa hyperperiod. Static DVS shows two levels of voltages(busy/idle time) whereas cycle-conserving DVS differ-

2V

1V t

0mA

360mA

(a) static RT-DVS EDF

2V

1V t

0mA

360mA

(b) cycle-conserving RT-DVS EDF

2V

1V t

0mA

360mA

(c) look-ahead RT-DVS EDF

2V

1V t

0mA

360mA

(d) our feedback RT-DVS EDF

Figure 11. Voltage/current oscilloscopeshot with loose WCET: WCET =2 × ActualExecT ime, Utilization U = 0.5

entiates three levels on a dynamic base. Even lowervoltage and current readings are given by look-aheadDVS, which not only distinguishes more levels but alsoexhibits much lower power levels during load. The low-est results were obtained by our feedback DVS, whichdefers execution even more aggressively than any of theother methods. However, our feedback scheme can onlyfurther reduce power consumption occasionally as suf-ficient slack exists to be recovered by the algorithms ofthe previous schemes. Dynamic slack is recovered in in-creasing levels by the latter three schemes.

4.5. Comparison with Simulation Results

When we compare the energy saving results ob-tained from the IBM 405LP embedded board withour previous simulation results presented in [25], weclearly see the advantage and disadvantage of simula-tion for power-aware studies. The advantage of simula-tion lies in its ease of implementation and predictabil-ity of performance trends. The energy consumption ofdifferent DVS algorithms show a consistent trend un-der both simulation and the actual embedded platform.But quantitative results differ. Our previous simulationresults reported 5%-10% higher savings on average. Forexample, the best energy saving of our feedback DVSover look-ahead DVS was report as 29% in simulationwhile the best result we measured from the test boardis around 24%. It is also non-trivial to model the actualpower/energy consumption in simulation without con-sidering actual hardware details. This is also the casewhen evaluating the overhead. Since the overhead ofDVS algorithms was not included in our previous sim-ulation experiment, we still observed 7%-10% energysavings over look-ahead DVS even at high utilizationcases. But the actual energy measurement from the testboard show only 3%-6% savings for these cases.

Overall, our experiments on the embedded platformquantitatively show the potential of our feedback DVSalgorithm and its ability to scale power even more ag-gressively than previous DVS algorithms.

5. Related WorkDynamic voltage scaling for real-time systems has

received considerable attention in recent years. Vari-able processor speed opens novel opportunities for real-time scheduling. Pillai and Shin present a suite of DVSalgorithms integrated with hard real-time EDF andRM scheduling [21]. Processor speed for each task isadjusted dynamically while the schedulability of thesystem is still reserved. Look-ahead DVS is the mostaggressive DVS scheme among the suite of algorithmsproposed. Aydin et al. discuss a series of dynamic recla-mation algorithms, which reclaim unused computationtime of real-time tasks to reduce the processor speed[2]. Energy-aware scheduling of hybrid workloads, in-

cluding both periodic and aperiodic tasks, are furtherinvestigated by Aydin and Yang in [3]. Gruian analyzesa dual-speed DVS schedule based on stochastic data de-rived from past task execution traces [8]. Jejurikar andGupta investigate static and dynamic slowdown fac-tors for periodic tasks [12] and combine it with pro-crastination scheduling [13] and preemption thresholdscheduling [11] for DVS.

The potential of feedback control on real-timescheduling was first investigated by Stankovic et al.[22]. Real-time system performance specifications areanalyzed systematically through a control-theoreticalmethodology by Lu et al. [15]. A feedback-control real-time scheduling framework for unpredictable dynamicreal-time systems is further proposed by Lu et al. wheretask execution times diverge from their worst case [16].Dynamic models of real-time systems are developedto identify different categories of real-time applicationswith different feedback control algorithms.

Feedback control was also proposed for energy-awarecomputing in previous work, such as those by Varma[23], Lu [17] and Minerick [18]. Varma et al. present afeedback-control algorithm where the previous work-load execution history is used to predict the futureworkload behavior by a discrete-time PID function.The combination of the proportional, integral andderivative part of the PID function provides good es-timation across different applications insensitive of thechange of their parameters [23]. Lu et al. describe aformal feedback-control algorithm combined with dy-namic voltage/frequency scaling technologies. WhileVarma and Lu’s work targets soft real-time/multimediasystems, our feedback DVS scheme focuses on hardreal-time system where timing constraints must not beviolated. A general energy management scheme withfeedback control is proposed by Minerick et al. [18]. Av-erage energy usage is achieved by continuously adjust-ing the voltage/frequency of a processor to meet theenergy consumption goal. The objective of their workis to obtain low energy consumption for general pur-pose systems while our work targets hard real-time sys-tems with deadline requirements.

6. ConclusionIn this paper, we presented feedback DVS algorithm

considering practical design and implementation issues.We evaluated it as well as several other real-time DVSalgorithms on an IBM 405LP embedded platform. Aunique DVS feature of this platform is asynchronousfrequency switching, which supports continued exe-cution during voltage/frequency transitions. We haveshown up to 5% energy savings of asynchronous switch-ing for fast DVS modulation without entering sleepmodes as opposed to traditional synchronous switch-

ing. We assessed the benefits of our feedback DVS al-gorithm by measuring the energy consumption overthe hyperperiod of real-time tasks. Energy consump-tion as well as scheduling overhead between differentDVS schemes were compared with each other. The ex-perimental results indicate that our aggressive feedbackDVS scheduling algorithm achieves up to 24% addi-tional savings in energy consumption over the look-ahead DVS and AGR-2 algorithms and up to 64% en-ergy savings over the naıve DVS scheme when consid-ering scheduling overheads.

AcknowledgmentsAjay Dudani contributed to early work of the

Feedback-DVS scheme [6]. Aravindh V. Anantara-man, Ali El-Haj Mahmoud, Ravi K. Venkatesan de-signed and implemented an early version of theDVS experimentation environment. Bishop Brock fromIBM provided most valuable technical details for thePPC 405LP board, which was donated by IBM Re-search (Austin). This work was supported in part byNSF grants CCR-0208581, CCR-0310860 and CCR-0312695.

References[1] H. Aydin, R. Melhem, D. Mosse, and P. Mejia-Alvarez.

Dynamic and agressive scheduling techniques for power-aware real-time systems. In IEEE Real-Time SystemsSymposium, Dec. 2001.

[2] H. Aydin, R. Melhem, D. Mosse, and P. Mejia-Alvarez.Power-aware scheduling for periodic real-time tasks.IEEE Trans. Comput., 53(5):584–600, 2004.

[3] H. Aydin and Q. Yang. Energy-responsiveness tradeoffsfor real-time systems with mixed workload. In Proceed-ings of the 11th IEEE Real-Time and Embedded Tech-nology and Applications Symposium, May 2004.

[4] B. Brock and K. Rajamani. Dynamic power manage-ment for embeddedsystems. In IEEE International SOCConference, Sept. 2003.

[5] A. Chandrakasan, S. Sheng, and R. W. Brodersen. Low-power cmos digital design. In IEEE Journal of Solid-State Circuits, Vol. 27, pp. 473-484., April, 1992.

[6] A. Dudani, F. Mueller, and Y. Zhu. Energy-conservingfeedback edf scheduling for embedded systemswith real-time constraints. In ACM SIGPLAN Joint ConferenceLanguages, Compilers, and Tools for Embedded Systems(LCTES’02) and Software and Compilers for EmbeddedSystems (SCOPES’02), pages 213–222, June 2002.

[7] K. Govil, E. Chan, and H. Wasserman. Comparing algo-rithms for dynamic speed-setting of a low-power cpu. In1st Int’l Conference on Mobile Computing and Network-ing, Nov 1995.

[8] F. Gruian. Hard real-time scheduling for low energy us-ing stochastic data and dvs processors. In Proceedingsof the International SymposiumonLow-PowerElectron-ics and Design ISLPED’01, Aug 2001.

[9] D. Grunwald, P. Levis, C. M. III, M. Neufeld, andK. Farkas. Policies for dynamic clock scheduling. InSymp. on Operating Systems Design and Implementa-tion, Oct 2000.

[10] IBM and M. Software. Dynamic power management forembedded systems. white paper.

[11] R. Jejurikar and R. Gupta. Integrating preemptionthreshold schedulinganddynamicvoltage scaling for en-ergy efficient real-time systems. In Proceedings of the10th International Conference on Real-Time and Em-bedded Computing Systems and Applications (RTCSA’04), 25-27 Aug 2004.

[12] R. Jejurikar and R. Gupta. Optimized slowdown in real-time task systems. In Proceedings of the 16th Euromi-cro Conference on Real-Time Systems (ECRTS ’04 ),Jun30-Jul2 2004.

[13] R. Jejurikar and R. Gupta. Procrastination schedulingin fixed priority real-time systems. In Proceedings of theLanguage Compilers and Tools for Embedded Systems,Jun 2004.

[14] D. Kang, S. Crago, and J. Suh. A fast resource synthe-sis technique for energy-efficient real-time systems. InIEEE Real-Time Systems Symposium, Dec. 2002.

[15] C. Lu, J. A. Stankovic, T. F. Abdelzaher, G. Tao, S. H.Son, and M. Marley. Performance specifications andmetrics for adaptive real-time systems. In Proceedingsof the IEEE Real-Time Systems Symposium, December2000. 2000.

[16] C. Lu, J. A. Stankovic, G. Tao, and S. H. Son. Feedbackcontrol real-time scheduling: Framework, modeling, andalgorithms. Real-Time Syst., 23:85–126, 2002.

[17] Z. Lu, J. Hein, M. Humphrey, M. Stan, J. Lach, andK. Skadron. Control-theoretic dynamic frequency andvoltage scaling for multimedia workloads. In Interna-tionalConference onCompilers,Architectures, andSyn-thesis for Embedded Systems, pages 156–63, 2002.

[18] R. Minerick, V. W. Freeh, and P. M. Kogge. Dynamicpower management using feedback. In Proceedings ofWorkshop on Compilers and Operating Systems for LowPower, 2002.

[19] K. Nowka, G. Carpenter, and B. Brock. The design andapplication of the powerpc 405lp energy-efficient systemon chip. IBM Journal of Research and Development,47(5/6), September/November 2003.

[20] T. Pering, T. Burd, and R. Brodersen. The simulationof dynamic voltage scaling algorithms. In Symp. on LowPower Electronics, 1995.

[21] P. Pillai and K. Shin. Real-time dynamic voltage scal-ing for low-power embedded operating systems. In Sym-posium on Operating Systems Principles, 2001.

[22] J.A.Stankovic,C.Lu,S.H.Son,andG.Tao. Thecase forfeedback control real-time scheduling. In Proceedings ofthe EuroMicro Conference on Real-Time Systems, June1999.

[23] A.Varma,B.Ganesh,M. Sen, S.R.Choudhury, L. Srini-vasan, and J. Bruce. A control-theoretic approach to dy-namic voltage scheduling. In Proceedings of the 2003 in-ternational conference on Compilers, architectures and

synthesis for embedded systems, pages 255–266. ACMPress, 2003.

[24] M. Weiser, B. Welch, A. Demers, and S. Shenker.Scheduling for reduced cpu energy. In 1st Symp. on Op-erating Systems Design and Implementation, Nov 1994.

[25] Y. Zhu and F. Mueller. Feedback edf scheduling exploit-ing dynamic voltage scaling. In IEEE Real-Time Em-bedded Technology and Applications Symposium, pages84–93, May 2004.

Feedback EDF scheduling exploiting hardware-assisted asynchronous dynamic voltage scaling

Documents

Transcript of Feedback EDF scheduling exploiting hardware-assisted asynchronous dynamic voltage scaling