System-Wide Energy Minimization for Real-Time Tasks: Lower Bound and Approximation Xiliang Zhong and Cheng-Zhong Xu Dept. of Electrical & Computer Engg. Wayne State University Detroit, Michigan http://www.cic.eng.wayne.edu

Dec 21, 2015


System-Wide Energy Minimization for Real-Time Tasks: Lower Bound and Approximation

Xiliang Zhong and Cheng-Zhong Xu

Dept. of Electrical & Computer Engg., Wayne State University

Detroit, Michigan
http://www.cic.eng.wayne.edu


Outline
• Introduction
  • Processor and system energy model
• Related Work
• System-Wide Energy Optimization for periodic tasks
  • The optimal algorithm
  • A fully polynomial time approximation scheme
  • Performance Evaluation
• System-Wide Energy Optimization for sporadic tasks
  • Solution and evaluation
• Conclusions


Introduction
• Mobile/embedded devices are power critical, with limited battery capacity
• Software-assisted power management
  • Dynamic power management (DPM)
    • Resource shutdown after a timeout
  • Dynamic voltage/frequency scaling (DVS)
    • Processing speed is designed for peak performance
    • Slow down the processor voltage/speed when it is not fully utilized


Dynamic voltage scaling (DVS)
• The dynamic CPU power is P ∝ v²f
• Reducing v also reduces the maximum processor frequency
• Approximately, energy per cycle ∝ f²
• Processor slowdown leads to super-linear energy savings, at the cost of a linear increase in execution time

[Figure: Energy per cycle of PXA processor. x-axis: normalized CPU speed (0.2 to 1); y-axis: energy per cycle (0 to 1); curves: DVS, No-DVS]
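The scaling argument on this slide can be checked numerically. The sketch below assumes, as the slide does, that dynamic power scales as v²f and that frequency scales roughly linearly with voltage. The function names and the normalization to 1 at full speed are illustrative choices, not part of the original slides.

```python
def energy_per_cycle(speed):
    """Normalized dynamic energy per cycle, assuming P ~ v^2 * f and f ~ v."""
    power = speed ** 3       # v^2 * f with v proportional to f
    return power / speed     # energy per cycle = power / frequency

def execution_time(cycles, speed):
    """Time needed to run a fixed number of cycles at a normalized speed."""
    return cycles / speed

# Halving the speed quarters the energy per cycle but doubles the run time.
for s in (1.0, 0.8, 0.5):
    print(s, energy_per_cycle(s), execution_time(1.0, s))
```

Under these assumptions the energy saving is quadratic in the slowdown while the time penalty is only linear, which is the trade-off DVS exploits.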


System-Wide Energy
• The processor also has leakage power
• Applications may use other components such as memory and peripheral devices
  • Each can be in active, standby, sleep, or shutdown state
• System-wide energy consumed in running a task
  • CPU energy, plus resource standby and active energy
• Lowering the CPU frequency can increase overall energy expenditure, because the other components stay in standby longer


System-Wide Energy (cont.)
• Critical speed: the speed with minimum energy per cycle
  • Running at a lower speed is not energy efficient
• Execute a task at a speed no lower than its critical speed, then put the devices into a low-power state
  • A combined use of slowdown and shutdown

[Figure: Energy per cycle of PXA processor with different standby power. x-axis: normalized speed (0.1 to 1); y-axis: energy per cycle (×10⁻⁹, 0 to 8); curves: processor only, standby power 0.2 W, standby power 0.6 W, standby power 1.2 W]
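A minimal sketch of how a critical speed could be found on a processor with discrete levels. The power model is an assumption (cubic CPU power normalized to 1 at full speed, plus a constant standby power expressed in the same normalized units); the slides do not give the PXA power figures, so the standby values used here are hypothetical. The speed list is the one from the example slide.

```python
SPEEDS = [0.15, 0.4, 0.6, 0.8, 1.0]  # discrete normalized speed levels

def system_energy_per_cycle(speed, standby_power):
    """CPU plus standby energy per cycle under the assumed power model."""
    cpu_power = speed ** 3                        # assumed: P_cpu ~ speed^3
    return (cpu_power + standby_power) / speed    # power divided by cycles per second

def critical_speed(speeds, standby_power):
    """Discrete speed level with the minimum system-wide energy per cycle."""
    return min(speeds, key=lambda s: system_energy_per_cycle(s, standby_power))

# Hypothetical normalized standby powers: the critical speed rises with them.
for p in (0.0, 0.2, 0.6):
    print(p, critical_speed(SPEEDS, p))
```

With no standby power the lowest level is always best; as standby power grows, the per-cycle cost of keeping devices awake dominates at low speeds and pushes the critical speed up.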


Related Work
• CPU energy minimization for periodic tasks
  • Heuristics [Mejia-Alvarez’04], approximations [Chen and Kuo’05]
• Few studies on system-wide energy minimization
  • Applications w/o deadlines
    • Subject to a performance loss [Choi et al.’04]
  • Real-time periodic tasks on CPU w/ continuous speed levels
    • Heuristics [Zhuo and Chakrabarti’05]
  • Real-time periodic tasks on CPU w/ discrete speed levels
    • Heuristics [Jejurikar and Gupta’04]
• This work
  • Pseudo-polynomial algorithm for optimal solutions and polynomial approximation schemes
  • Applicable to both offline periodic tasks and online sporadic tasks on processors with practical discrete speed levels


System-wide energy optimization
• Periodic tasks (offline); each task is described by
  • its worst-case execution time under max speed
  • its period, which is also its deadline
  • its normalized speed
• Sporadic tasks (online)
  • Task releases have irregular intervals
  • Online scheduling based on uncompleted tasks, with no assumption about future task releases
• The objective is to minimize overall energy consumption, including the CPU and all other system components, while meeting the deadline constraints of all the tasks


Energy Minimization for Periodic Tasks
• Minimize the energy consumption of n periodic tasks in a hyper-period, subject to
  • a feasibility constraint under EDF
  • a boundary constraint
• Practical processors have discrete speed levels
  • The minimization is an NP-hard Multiple-Choice Knapsack Problem (MCKP)
  • Pseudo-polynomial solutions exist for MCKP with integer coefficients, but they are not applicable to this problem
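One plausible way to write the problem down is sketched below. All symbols are assumptions introduced for illustration, since the formulas did not survive in the transcript: C_i is the worst-case execution time of task i at maximum speed, T_i its period, s_i its normalized speed, E_i(s_i) its energy over a hyper-period, s_i^crit its critical speed, and S the set of discrete speed levels.

```latex
\begin{align*}
\min_{s_1,\dots,s_n} \quad & \sum_{i=1}^{n} E_i(s_i) \\
\text{subject to} \quad & \sum_{i=1}^{n} \frac{C_i}{T_i\, s_i} \le 1
    && \text{(feasibility under EDF)} \\
& s_i \in S, \quad s_i^{\mathrm{crit}} \le s_i \le 1
    && \text{(boundary, discrete levels)}
\end{align*}
```

Choosing exactly one speed per task from a finite set, under a single capacity constraint on total utilization, is what gives the problem its multiple-choice knapsack structure.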


An Example
• Basic idea: first solve subproblems with fewer tasks
• A system with a PXA processor with 5 normalized speeds [0.15 0.4 0.6 0.8 1]
• System with memory, flash, and WNIC
• An example real-time workload w/ 4 periodic tasks

Task  Execution time  Period  Utilization  Required resources  Critical speed
1     6.4             16      0.4          cpu                 0.4
2     1.6             20      0.08         cpu, memory         0.4
3     1.2             12      0.1          cpu, mem, flash     0.6
4     1.08            9       0.12         cpu, mem, WNIC      0.6
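The utilization column can be recomputed from the execution times and periods, and the hyper-period over which energy is minimized is the least common multiple of the periods. The sketch below does both; the helper names are illustrative and not taken from the slides.

```python
from functools import reduce
from math import gcd

# (execution time at max speed, period) for tasks 1-4 of the example workload
TASKS = [(6.4, 16), (1.6, 20), (1.2, 12), (1.08, 9)]

def utilization(exec_time, period):
    """Fraction of the processor a task needs at maximum speed."""
    return exec_time / period

def hyper_period(periods):
    """Least common multiple of integer periods."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)

utils = [round(utilization(c, t), 3) for c, t in TASKS]
print(utils, round(sum(utils), 3), hyper_period([t for _, t in TASKS]))
```

The total utilization at maximum speed is 0.7, so there is slack to slow tasks down, but not enough to run every task at its critical speed.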


Solution to task 1
• Task 1: execution time 6.4; deadline 16; utilization 0.4
• Branch on four normalized speeds [0.4 0.6 0.8 1]

Search tree, states as (utilization, energy); f: pruned by feasibility condition, e: pruned by energy condition

root:    (0, 0)
task 1:  (1, 2.72) f    (0.667, 4.267)    (0.5, 7.2) e    (0.4, 10.24) e

State pruning
• Feasibility condition
  • The 1st node, at speed 0.4, is removed because its utilization is already 1
• Energy condition
  • Task 1 at the smallest remaining speed (2nd, 0.6), tasks 2-4 at the max: total energy = 7.6 (upper bound)
  • Task 1 at the 3rd or 4th speed (0.8 or 1), tasks 2-4 at the min: the required energy exceeds 7.6, so the two states can be removed
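The feasibility pruning can be sketched as follows. The look-ahead rule used here is an assumption rather than something the slides state: a state is dropped when its utilization, plus the utilization of all remaining tasks at maximum speed, exceeds 1. The energy condition is not modeled, because the slides do not give the power model behind the energy values.

```python
SPEEDS = [0.15, 0.4, 0.6, 0.8, 1.0]
# (utilization at max speed, critical speed) for tasks 1-4 of the example
TASKS = [(0.4, 0.4), (0.08, 0.4), (0.1, 0.6), (0.12, 0.6)]

def expand(parent_utils, task_index):
    """Branch one task over its allowed speeds; keep only feasible states."""
    util, critical = TASKS[task_index]
    # Least utilization the later tasks can still add (all at max speed).
    remaining = sum(u for u, _ in TASKS[task_index + 1:])
    children = []
    for parent in parent_utils:
        for speed in SPEEDS:
            if speed < critical:
                continue  # never run below the critical speed
            total = parent + util / speed
            if total + remaining <= 1.0 + 1e-9:  # EDF bound with look-ahead
                children.append(total)
    return children

level1 = expand([0.0], 0)
print([round(u, 3) for u in level1])
```

For task 1 this drops the state at speed 0.4, whose utilization is already 1, and keeps the states at speeds 0.6, 0.8, and 1. In the slide, the last two are then removed by the energy condition, not by feasibility.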


Solution to the first three tasks

Search tree, states as pairs of (utilization, energy); f: pruned by feasibility condition, e: pruned by energy condition, d: pruned by dominance

root:    (0, 0)
task 1:  (1, 2.72) f    (0.667, 4.267)    (0.5, 7.2) e    (0.4, 10.24) e
task 2:  (0.867, 5.75)    (0.767, 6.467)    (0.747, 7.147)    marks in the figure: f f
task 3:  (0.93, 8.47)    (0.87, 9.40) d    (0.867, 9.107)    (0.847, 9.786)    other marks in the figure: f f f

Dominance condition
• Consider the states (0.867, 9.107) and (0.87, 9.4) of task 3
  • The first one leads to a smaller utilization, so any feasible schedule reachable from the second can also be reached from the first
  • The first one also uses less energy, so the second can be removed
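The dominance rule amounts to keeping only states that are not beaten in both utilization and energy by another state. A minimal sketch, using three of the task 3 states from the slide; the function name is illustrative.

```python
def prune_dominated(states):
    """Keep states (utilization, energy) not dominated by any other state.

    A state is dominated if another state has utilization no larger
    and energy no larger.
    """
    kept = []
    best_energy = float("inf")
    # Visit states by increasing utilization; a state survives only if it
    # uses strictly less energy than every state with smaller utilization.
    for util, energy in sorted(states):
        if energy < best_energy:
            kept.append((util, energy))
            best_energy = energy
    return kept

task3_states = [(0.867, 9.107), (0.87, 9.40), (0.847, 9.786)]
print(prune_dominated(task3_states))
```

Here (0.87, 9.40) is removed because (0.867, 9.107) is better on both counts, while (0.847, 9.786) stays: it costs more energy but leaves more utilization for the remaining task.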


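Search tree for all four tasks, states as (utilization, energy); f: pruned by feasibility condition, e: pruned by energy condition, d: pruned by dominance

root:    (0, 0)
task 1:  (1, 2.72)    (0.667, 4.267)    (0.5, 7.2)    (0.4, 10.24)    marks in the figure: f e e
task 2:  (0.867, 5.75)    (0.767, 6.467)    (0.747, 7.147)    marks in the figure: f f
task 3:  (0.93, 8.47)    (0.87, 9.40)    (0.867, 9.107)    (0.847, 9.786)    marks in the figure: f f f d
task 4:  (1.07, 10.37)    (0.987, 11.159)    (0.967, 11.84)    marks in the figure: f f f e e

• The figure highlights the optimal state
• Maximum state number reduced to 6/(4×4×3×3) = 6/144 ≈ 4.2 %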


A fully polynomial time approximation scheme (FPTAS)
• The number of states is pseudo-polynomial in the number of tasks; it can be reduced by providing approximate solutions
• Approximation with a worst-case performance guarantee
  • An algorithm is said to be an approximation scheme if, for a given ε in (0,1), the energy E of its solution satisfies E ≤ (1+ε)E*, where E* is the optimal energy
  • A more desirable approximation scheme (FPTAS) has a running time polynomial in both the number of tasks and 1/ε


A fully polynomial time approximation scheme (cont.)
• Divide the energy values into a number of groups, each of size r
  • Each value is scaled and rounded
  • Energy values in the same group are treated equally
• Find the group size r, subject to a given performance bound ε
  • The energy value of each task introduces an error no larger than the group size r
  • The accumulated error of n tasks is no larger than n·r
  • A lower bound of the optimal energy E* is obtained when all tasks run at their critical speeds (Emin), i.e., E* ≥ Emin
  • Solving n·r ≤ ε·Emin derives the group size r = ε·Emin/n
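The choice of group size can be sketched as follows. Two things are assumptions here: the name epsilon for the performance bound, and rounding down to the group index (the transcript does not say which direction is used). The value of e_min is hypothetical; the four energy values are those of task 1 in the example.

```python
import math

def group_size(epsilon, e_min, n_tasks):
    """Largest r such that n_tasks * r <= epsilon * e_min."""
    return epsilon * e_min / n_tasks

def to_group(energy, r):
    """Index of the group an energy value falls into (rounding down assumed)."""
    return math.floor(energy / r)

r = group_size(0.1, 10.0, 4)            # e_min = 10.0 is a made-up lower bound
energies = [2.72, 4.267, 7.2, 10.24]
groups = [to_group(e, r) for e in energies]
errors = [e - g * r for e, g in zip(energies, groups)]
print(r, groups, [round(err, 3) for err in errors])
```

Each rounding error stays below r, so n tasks accumulate an error below n·r ≤ ε·Emin ≤ ε·E*. A larger ε gives coarser groups and fewer distinct states, which is where the running-time saving comes from.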


Performance Evaluation
• Simulation settings
  • A system with a PXA processor
  • Memory: standby power 0.2 W, standby time 20%~60% of task execution
  • Flash drive: 0.4 W and 10%~25%
  • Wireless interface: 1 W and 5%~20%
• Periodic tasks
  • Randomly generated deadlines w/ utilization from 0.1~1
  • Each task randomly chooses a subset of resources
• Algorithms implemented
  • CPU-DVS: speed control for CPU energy consumption
  • CS-DVS: a heuristic algorithm for system-wide energy savings [Jejurikar and Gupta ISLPED2004]
  • OPT-P: the proposed optimal solution
  • Approximation scheme with perf. bounds 0.01, 0.1, 0.5


Performance Evaluation (Periodic tasks)

• Energy consumption up to 16% more efficient than CS-DVS
• Proposed algorithms use 23% less energy than CPU-only solutions
• Approximation algorithms effectively bound the performance errors

[Figure: Energy consumption (1 to 1.4) vs. processor utilization (0.1 to 1); curves: CS-DVS, 0.5-APPROX, 0.1-APPROX, OPT-P, CPU-DVS; annotated gaps: 16%, 23%, 8%]


Energy Minimization for Sporadic Tasks
• Online energy minimization for all uncompleted tasks, subject to
  • n feasibility constraints under EDF
  • a boundary constraint
• On a processor with discrete speed levels
  • We prove the problem is an instance of the Multi-dimensional MCKP, which is NP-hard in the strong sense: in general, no pseudo-polynomial optimal algorithm exists unless P = NP


Sporadic Tasks (cont.)

[Figure: timeline of jobs J1, J2, J3; time axis marked at 1, 3, 5, 7]

• Consider three tasks released at time 0 with deadlines 3, 5, 7
• The feasibility of a task (e.g., J2) is not affected by tasks that finish later (tasks are in non-decreasing order of deadlines)
• Satisfy one constraint (e.g., J3) at each iteration
• Can be solved by a pseudo-polynomial algorithm for the optimal solution and an approximation scheme (FPTAS)
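The n feasibility constraints can be sketched as one check per job, taken in deadline order: the work of a job and of all jobs with earlier deadlines must fit before its deadline. The slide gives only the deadlines 3, 5, 7; the remaining execution times and speeds below are hypothetical, and the function name is illustrative.

```python
def edf_feasible(jobs):
    """Check EDF feasibility of jobs that are all ready now.

    jobs: list of (remaining_time_at_max_speed, deadline, normalized_speed).
    """
    finish = 0.0
    for work, deadline, speed in sorted(jobs, key=lambda j: j[1]):
        finish += work / speed        # jobs run back to back in deadline order
        if finish > deadline + 1e-9:
            return False              # this job's own constraint is violated
    return True

# J1, J2, J3 with deadlines 3, 5, 7 and made-up execution times 1, 1, 2.
slow = [(1.0, 3, 0.5), (1.0, 5, 0.5), (2.0, 7, 0.5)]
mixed = [(1.0, 3, 0.5), (1.0, 5, 0.5), (2.0, 7, 1.0)]
print(edf_feasible(slow), edf_feasible(mixed))
```

Changing the speed of J3 affects only the last constraint, never those of J1 or J2. This is the property that lets the constraints be satisfied one at a time.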


Performance Evaluation (Sporadic tasks)

• Experimental settings
  • Varied number of tasks
  • Task inter-release times generated by an exponential distribution
• Algorithms implemented
  • TV-DVS: adaptive speed scaling for CPU energy consumption on processors w/ continuous levels [Zhong and Xu RTSS2005]
  • DVSST: CPU energy consumption with only frequency scaling available (continuous levels) [Qadi et al. RTSS2003]
  • OPT-S: the proposed optimal solution
  • 0.1- and 0.5-approximation: approximate solutions with different performance settings


Energy consumption (Sporadic tasks)

• Small task number: energy consumption up to 56% more efficient than TV-DVS and DVSST
• Large task number: 23% more efficient

[Figure: Energy consumption (0.9 to 2.9) vs. number of tasks (10 to 100); curves: TV-DVS, 0.5-APPROX, OPT-S, DVSST; annotated gaps: 56%, 23%]


Conclusion
• System-wide energy minimization for periodic tasks
  • Pseudo-polynomial algorithm for the optimal solution
  • Approximate solution in moderate running time with bounded performance degradation (FPTAS)
• Minimization for online sporadic tasks
  • Pseudo-polynomial algorithm and an FPTAS by exploiting inherent properties of online task scheduling
• On-going work
  • Implementation of the policies in an embedded system with a PXA270 processor
  • Energy/time overhead of voltage and speed switches; overhead of putting a resource into a low-power state


Thank you!

System-Wide Energy Minimization for Real-Time Tasks: Lower Bound and Approximation


Algorithm running time

• Running time measured on a Pentium 4 machine with a 2 GHz processor
• OPT-P has a higher complexity than CS-DVS
• Below 90 ms for systems with up to 50 tasks
• All approximation algorithms require no more than 0.4 s to finish

[Figure: Algorithm running time (s; axis ticks at 0.01, 1, 100, 10000) vs. number of tasks (0 to 100); curves: OPT-P, 0.01-APPROX, 0.1-APPROX, 0.5-APPROX, CS-DVS]

• Algorithm running time for schedules in a 10-minute run
• OPT-S has a higher running time, but it is <1% of task execution time
• Approximation algorithms take time comparable to TV-DVS

[Figure: Complexity in CPU time (s; 0 to 6) vs. number of tasks (10 to 100); curves: OPT-S, TV-DVS, 0.1-APPROX, 0.5-APPROX]