
Apr 11, 2018


Parallelism and VLSI Group
Prof. Dr. Jörg Keller

Faculty of Mathematics and Computer Science

Energy Challenges in Manycore Processors

Jörg Keller


Overview

Motivation

Power and Energy Basics

Optimization Targets

Case Study: Energy-efficient Task Scheduling

Outlook


Motivation I

Energy consumption by computers accounts for more than 1 percent of total electric energy consumption in Germany

Energy is an important cost factor in computing centers
- direct energy cost
- cost for cooling (most energy turned into heat)

Energy limits the operating time of mobile devices: laptops, tablets, smartphones, …


Motivation II

Power consumption influences design and cost of embedded devices (fans, cooling,…)

Processors consume majority of energy in computers

Energy density might limit future processor development

Green IT has become a buzzword

So let's have a look into this topic!


Power and Energy Basics I

Microprocessor = complex mixture of combinational circuits, registers, and memories

Parts typically built from CMOS transistors

Transistors consume energy when they switch (dynamic power)

Additionally, there is static power consumption, e.g. from leakage current


Power and Energy Basics II

Dynamic power consumption per cycle depends linearly on
- number of transistors that switch
- frequency (length of cycle)

Energy for a transistor switch depends quadratically on the voltage level

Static power from leakage depends linearly on voltage

For a given processor: $P_d \sim f \cdot V^2$, $P_s \sim V$



Power and Energy Basics III

Voltage and frequency are not independent!

Minimum voltage depends linearly on frequency [as long as the voltage is sufficiently higher than the threshold]

Minimum voltage level for a given frequency is preferable: frequency defines performance, voltage does not

$P = c_d \cdot f^3 + c_s \cdot f$

Consequence: static power (low order term) often ignored
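The cubic model follows from the earlier proportionalities: with the minimum voltage proportional to frequency, the dynamic term f*V^2 scales like f^3 and the static term scales like f. A minimal Python sketch of this model is given below. The function names and the unit constants c_d = c_s = 1 are illustrative assumptions, not values from the slides.

```python
def total_power(f, c_d=1.0, c_s=1.0):
    """Power model from the slide: P = c_d * f**3 + c_s * f.

    c_d and c_s are hypothetical unit constants chosen for illustration.
    """
    return c_d * f**3 + c_s * f


def static_share(f, c_d=1.0, c_s=1.0):
    """Fraction of total power contributed by the static term c_s * f."""
    return (c_s * f) / total_power(f, c_d, c_s)


if __name__ == "__main__":
    # The static share shrinks quickly as the (normalized) frequency grows.
    for f in (0.5, 1.0, 2.0, 4.0):
        print(f, total_power(f), round(static_share(f), 3))
```

With these assumed constants the static share drops as frequency rises, which is why the low-order term is often ignored; at low frequencies it is no longer negligible.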


Power and Energy Basics IV

Neglected influence factor: temperature

Example: Exynos 4412 quad-core ARM chip
Application: stress benchmark under Linux

thanks to S. Holmbacka, Abo Academy




Power and Energy Basics V

Energy = ∫ power dt = power * time if power is constant

If the (run)time is fixed and power is constant over time, then power and energy optimization go hand in hand

If not, complex optimization problem
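When power varies over time, energy has to be obtained by integrating a power trace. The sketch below approximates the integral with the trapezoidal rule. The function name and the sample traces are hypothetical and serve only as illustration.

```python
def energy_from_trace(times, powers):
    """Approximate E = integral of P dt with the trapezoidal rule.

    times  -- sample instants in seconds, strictly increasing
    powers -- power readings in watts, one per sample instant
    """
    if len(times) != len(powers):
        raise ValueError("times and powers must have the same length")
    energy = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        # Area of the trapezoid between two neighbouring samples.
        energy += 0.5 * (powers[i] + powers[i - 1]) * dt
    return energy


if __name__ == "__main__":
    # Constant power: the integral reduces to power * time.
    print(energy_from_trace([0.0, 1.0, 3.0], [2.0, 2.0, 2.0]))
    # Linearly rising power over two seconds.
    print(energy_from_trace([0.0, 1.0, 2.0], [0.0, 2.0, 4.0]))
```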


Optimization Targets I

Wide range of optimization targets

Single application: minimize runtime for given energy budget

Single application: minimize energy for given max runtime

System wide: maximize battery lifetime by power reduction without hurting performance goals

Both: minimize power for given response times


Optimization Targets II

Deadline vs flowtime optimization

Static vs dynamic optimization

Application often knows about future behaviour

System mostly does not


Optimization Targets III

What can be done:

frequency scaling
core switch off

both?


Optimization Targets IV

Simple insights:

Given a task with c instructions and deadline T: run it best at frequency f = c/T

Simulate f by surrounding discrete frequencies

Check for overhead!

Switch frequency alone: often 20 cycles
Switch frequency + voltage: often as long as 0.2 millisec
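Simulating the ideal frequency f = c/T with the two surrounding discrete frequencies amounts to splitting the deadline between them so that exactly c cycles are executed. The sketch below assumes one instruction per cycle and ignores the switching overhead mentioned above. The function name and the frequency levels are hypothetical.

```python
def split_between_neighbours(c, T, freqs):
    """Return [(frequency, duration), ...] executing c cycles within T.

    Assumes one instruction per cycle and zero switching overhead.
    freqs is the set of available discrete frequencies (cycles per time unit).
    """
    levels = sorted(freqs)
    f = c / T  # ideal continuous frequency
    if f > levels[-1]:
        raise ValueError("deadline cannot be met even at the highest frequency")
    if f <= levels[0]:
        # The lowest level is already fast enough; the task finishes early.
        return [(levels[0], c / levels[0])]
    for lo, hi in zip(levels, levels[1:]):
        if lo < f <= hi:
            # Solve lo * t_lo + hi * t_hi = c with t_lo + t_hi = T.
            t_hi = (c - lo * T) / (hi - lo)
            t_lo = T - t_hi
            return [(fr, t) for fr, t in ((lo, t_lo), (hi, t_hi)) if t > 0]


if __name__ == "__main__":
    print(split_between_neighbours(150, 10, [10, 20, 30]))
```

Whether the split pays off in practice depends on the switching cost: if the time spent at one of the two levels is comparable to the switching overhead, staying at the higher level may be preferable.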


Optimization Targets V


Optimization Targets VI

More simple insights:

Given a task with c instructions, deadline T and perfect parallelism: run it best on p cores (as many as possible) at frequency f = c/(Tp)

Static power consumption (and minimum frequency) limits the useful number of cores:

$E(p) = T \cdot p \cdot (c/(Tp))^3 + T \cdot p \cdot P_s$
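The trade-off in E(p) can be explored numerically: the dynamic term falls with 1/p^2 while the static term grows linearly in p. The sketch below evaluates the formula with the dynamic-power constant set to 1, as on the slide. The function names and the example values for c, T and P_s are hypothetical.

```python
def energy(p, c, T, P_s):
    """E(p) = T*p*(c/(T*p))**3 + T*p*P_s, dynamic constant assumed to be 1."""
    f = c / (T * p)  # per-core frequency needed to meet the deadline
    return T * p * f**3 + T * p * P_s


def best_core_count(c, T, P_s, p_max):
    """Core count in 1..p_max with the lowest energy (exhaustive search)."""
    return min(range(1, p_max + 1), key=lambda p: energy(p, c, T, P_s))


if __name__ == "__main__":
    # With static power, using all 16 cores is not the best choice.
    print(best_core_count(c=8, T=1, P_s=2, p_max=16))
    # Without static power, more cores always help.
    print(best_core_count(c=8, T=1, P_s=0, p_max=16))
```

Setting the derivative of E(p) to zero gives the continuous optimum p = (c/T) * (2/P_s)^(1/3) under the same assumptions, which the search reproduces for the first example.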


Optimization Targets VII

For non-divisible loads: balance load as evenly as possible to minimize the $l_a$-norm

Also here: load balancing is more difficult for discrete frequencies


Optimization Targets VIII


Optimization Targets IX

Not so simple insights:

If static power dominates, i.e. > 50% for typical frequencies, then run fast and shut down cores

Fast = often not the highest frequency, but the one next to it ...

But consider the time scale:
core shut down: 0.2 – 6 millisec
core wake up: 2 – 90 millisec
(Exynos 5, depending on workload)


Case Study I

Given: set of n independent tasks with loads

machine with p cores, power consumption $\sim f^a$ where $a > 1$

Deadline M

Minimize energy


Case Study II

Tasks are distributed over cores

Continuous frequency scaling (Pruhs et al.):
- each core should run at a single frequency
- all cores should finish at makespan time
- no core idle time between tasks

Core with load $L_i$ runs at $f_i$ such that $L_i / f_i = M$, i.e. $f_i = L_i / M$
Energy consumption: $M \cdot f_i^a = M^{1-a} \cdot L_i^a$

Distribute load such that the $l_a$-norm is minimized: known problem with known heuristic
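The slide does not name the heuristic. As a stand-in, the sketch below uses a greedy largest-task-first assignment to the currently least-loaded core, then evaluates the energy M^(1-a) * L_i^a summed over all cores. The function names, the choice of heuristic and the example loads are assumptions made for illustration.

```python
import heapq


def greedy_assign(loads, p):
    """Place each task (largest first) on the currently least-loaded core.

    This greedy rule is an assumed stand-in for the heuristic on the slide.
    Returns the resulting total load per core.
    """
    heap = [(0.0, i) for i in range(p)]  # (current load, core index)
    core_loads = [0.0] * p
    for load in sorted(loads, reverse=True):
        total, i = heapq.heappop(heap)
        total += load
        core_loads[i] = total
        heapq.heappush(heap, (total, i))
    return core_loads


def schedule_energy(core_loads, M, a=3.0):
    """Sum over cores of M**(1-a) * L_i**a, each core running at f_i = L_i / M."""
    return sum(M ** (1 - a) * L ** a for L in core_loads)


if __name__ == "__main__":
    core_loads = greedy_assign([4, 3, 3, 2], p=2)
    print(core_loads, schedule_energy(core_loads, M=2.0))
```

The greedy rule is not guaranteed to find the best distribution: for loads 5, 4, 3, 3, 3 on two cores it yields 8 and 10, although 9 and 9 is possible and has a lower $l_a$-norm.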


Case Study III

Discrete frequency scaling (no overhead):
- solve the problem for continuous frequency scaling
- "simulate" frequencies by surrounding discrete frequencies

Adding frequency switching overhead changes the situation

Counterexample for divisible load (also possible for non-divisible load, see upcoming paper)


Case Study IV

No consideration of frequency scaling overhead


Case Study V

Suitable heuristic considering frequency scaling overhead: lower energy


Case Study VI

Why look at independent tasks at all?

Many applications are expressible as streaming task graphs or Kahn process networks

All tasks active, intermediate results forwarded (as packets) from task to task

If communication is buffered: one scheduling round = task scheduling

Energy savings per round pay off!

Case Study VII

What happens if tasks can be parallel as well?
Must choose a degree of parallelism for each task

Theoretically optimal allocation: Sanders+Speck 2012

Simple: Eitschberger et al. 2013
Parallelize all tasks, or at least the large ones, for p cores: balances load, saves energy despite efficiency loss

More advanced, still practical: Kessler et al. 2013
Crown scheduling


Outlook

Constant low voltage:
- frequency scaling loses importance for dynamic power
- static power gets a larger fraction
- sleep states gain importance for scheduling and algorithmics

Similar tendencies in other components:
- idle power for servers might get low enough to avoid shutdown
- possible reversal of strategy in computing centers


Outlook

Microprocessors and operating systems should provide lower overhead for frequency/voltage scaling and core shut down

Better accessibility of features from applications: helpful for single-application embedded systems

Better understanding of heterogeneity and power


Outlook

Further miniaturization might increase energy density:
- only a fraction of the device can be active, "dark silicon"
- question of special vs general purpose cores might get new fuel

Computer architecture, parallel systems and parallel algorithmics still contain lots of research possibilities

Go ahead for your next paper, PhD degree, next grant,…


Thanks for your kind attention!
