Reducing NoC Energy Consumption Through Compiler-Directed Channel Voltage Scaling
Guangyu Chen, Feihui Li, Mahmut Kandemir, Mary Jane Irwin
Microsystems Design Lab, Department of CSE, The Pennsylvania State University
[email protected]

Dec 20, 2015

PLDI’06 2

Why NoCs?

• Scalability
    • Support for a large number of processing units
• Flexibility
    • Topology and routing policy can be configured according to the needs of a particular application
    • Point-to-point, broadcasting (one-to-multiple), gathering (multiple-to-one)
• Performance
    • Low latency, high bandwidth
• Reliability
    • Multiple routes between a source/target pair
    • Signal strengthening in routers

Mesh-Based NoC Abstraction

[Figure: mesh of nine nodes, each pairing a CPU with a local memory; the nodes are attached to routers, and the routers are linked by communication channels.]

Related Work

• Communication channels can account for a significant portion of the chip energy consumption (between 20% and 45%)
• Prior efforts
    • Simunic and Boyd: NoC power modeling (DATE’02)
    • Benini and De Micheli: Design methodology for energy-efficient reliable SoC networks (ISSS’01)
    • Shang et al: Hardware-directed DVS for communication links (HPCA’03)
    • Kim et al: Communication link shutdown (ISLPED’03)
    • Soteriou and Peh: Design space exploration for link turn on/off (ICCD’04)
    • Soteriou et al: Software-directed power-aware interconnection networks (CASES’05)
    • Li et al: Software-directed DVS for communication links (CASES’05)
    • Li et al: Compiler-directed link turnoff and routing (ICCAD’05, EMSOFT’05, POPL’06)
• Our goal is to save network energy through voltage/frequency scaling

Motivational Example (1)

Node 1:
    for i = 0 to N {
        send(2, A[i][0..1023])
        receive(2, buffer)
    }

Node 2:
    for i = 0 to N {
        send(1, A[i][0..255])
        receive(1, buffer)
    }

[Figure: timeline of the messages exchanged between Node 1 and Node 2 over iterations i = 0 to i = 4.]
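One plausible reading of this example, sketched under the assumption that transfer time is simply message length divided by data rate: the 256-element message sent by Node 2 could travel at a quarter of the full rate and still arrive no later than the 1024-element message sent by Node 1. The function name and the `FULL_RATE` value are illustrative and do not come from the slides.

```python
from fractions import Fraction


def slowest_safe_scaling(short_len, long_len):
    """Fraction of the full data rate at which the short message still
    finishes no later than the long one sent at full rate.
    Hypothetical model: transfer time = message length / data rate."""
    return Fraction(short_len, long_len)


FULL_RATE = 4  # elements per cycle; an assumed value for illustration only

factor = slowest_safe_scaling(256, 1024)
long_time = Fraction(1024, FULL_RATE)                  # Node 1 -> Node 2, full rate
short_time = Fraction(256, 1) / (FULL_RATE * factor)   # Node 2 -> Node 1, scaled

print(factor, long_time, short_time)
```

Exact fractions are used so that the two transfer times can be compared for equality without rounding effects.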

Motivational Example (2)

Node 1:
    for i = 0 to N {
        send(2, A[i][0..255])
        short computation
        receive(2, buffer)
    }

Node 2:
    for i = 0 to N {
        send(1, A[i][0..255])
        long computation
        receive(1, buffer)
    }

[Figure: two timelines of Node 1 and Node 2 over iterations i = 0 to i = 4.]

Overview of Our Approach

Input Parallel Code → Building IPCG → IPCG → Critical Path Analysis → Scaling Factor for Each Connection → Code Modification → Output Parallel Code

Additional inputs:
• Process and Connection Mapping
• NoC Parameters

Assumptions

• Array-based embedded applications
• Message-passing based parallel program
    • For each send(p, m) instruction, the destination node p and the size of message m can be statically determined at compilation time
    • For each receive(p, m) instruction, the source node p can be determined at compilation time
    • A send instruction is blocked if the previous message sent by the same node has not been delivered to the destination node
    • A receive instruction is blocked if the message is not ready in the buffer of the receiver node
• Code is parallelized and process-to-node mapping is performed
• Network is exposed to the compiler

Inter-Process Communication Graph (IPCG)

• IPCG G(P) captures the communication behavior of application P
• G(P) = (V(P), E(P), w_min, w_max)
    • V(P): the set of vertices
    • E(P): the set of edges
    • w_min, w_max: the weights for edges, capturing minimum/maximum execution latencies

Vertices of IPCG

V(P) = X(P) ∪ B(P) ∪ S(P) ∪ D(P) ∪ R(P)

• x ∈ X(P): the entry point of a loop in program P
• b ∈ B(P): the back jump of a loop in program P
• s ∈ S(P): the point in P at which a message is sent
• d ∈ D(P): the point in P at which a message is delivered
• r ∈ R(P): the point in P at which a message is used

[Figure: Node 1 executes send(2,..) at point s; the message is delivered at point d and used by receive(1,..) on Node 2 at point r.]

Edges of IPCG

• Task edges
    • Communication edge (s, d): a message is sent at point s ∈ S(P) and delivered at point d ∈ D(P)
    • Computation edge (u, v): a computation task starts at point u and ends at point v, where u, v ∈ X(P) ∪ S(P) ∪ R(P)
• Control edges
    • Enforce the order in which the points of the given program can be reached
    • Back-jump edge
    • Other control edges

Edge Weight Functions w_min and w_max

• w_min(u,v) and w_max(u,v): the minimum and maximum times required to execute task (u,v)
• For communication edge (s,d)
    • w_min(s,d) = (min. message size) / (max. data rate)
    • w_max(s,d) = (max. message size) / (max. data rate)
• For computation edge (u,v)
    • w_min(u,v) = the minimum time for executing the instructions between u and v
    • w_max(u,v) = the maximum time for executing the instructions between u and v
• For control edge (u,v)
    • w_min(u,v) = w_max(u,v) = 0
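The three weight rules could be encoded roughly as follows. This is only a sketch: the `Edge` class, the helper names, and the message sizes and data rate in the usage lines are assumptions made for illustration, not part of the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    kind: str      # "communication", "computation" or "control"
    w_min: float   # minimum time to execute the task on this edge
    w_max: float   # maximum time to execute the task on this edge


def communication_edge(s, d, min_size, max_size, max_rate):
    # Both bounds divide by the maximum data rate, as on the slide.
    return Edge(s, d, "communication", min_size / max_rate, max_size / max_rate)


def computation_edge(u, v, min_cycles, max_cycles):
    return Edge(u, v, "computation", min_cycles, max_cycles)


def control_edge(u, v):
    # Control edges only enforce ordering, so both weights are zero.
    return Edge(u, v, "control", 0, 0)


# Illustrative values only (sizes in bits, rate in bits per cycle).
edges = [
    communication_edge("s1", "d1", min_size=256, max_size=512, max_rate=16),
    computation_edge("r1", "s2", 20, 25),
    control_edge("x3", "r1"),
]
for e in edges:
    print(e.kind, e.src, e.dst, e.w_min, e.w_max)
```

Keeping the kind on each edge makes it easy to later divide only communication weights by a scaling factor.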

IPCG Example (1)

// Process 1
x3: for(...) {
    r1: receive(2,..)
        20–25 cycles
    s2: send(2,..)
}

// Process 2
x1: for(...) {
    s1: send(1,..);
    x2: for(...) {
            10 cycles
        s3: send(3,..);
            10–15 cycles
        s4: send(3,..);
            80–90 cycles
        r5: receive(3,..)
            20 cycles
    }
    r2: receive(1,..);
}

// Process 3
x4: for(...) {
        10 cycles
    r3: receive(2,..)
        15 cycles
    r4: receive(2,..)
        40–50 cycles
    s5: send(2,..)
}

IPCG Example (2)

[Figure: IPCG for processes p1, p2 and p3 of IPCG Example (1). Vertices: x1 to x4, b1 to b4, s1 to s5, d1 to d5 and r1 to r5. Edges are labeled with min/max latencies in cycles (0/0, 10/10, 10/15, 15/15, 20/20, 20/25, 40/50, 80/90); one label reads 120/ with its maximum missing.]

Parallel Loop Group

• A set of loops that communicate with each other
• Unit of granularity for optimization

[Figure: the weighted IPCG from IPCG Example (2), shown again to illustrate parallel loop groups.]

Representative Iterations

A set of loop iterations that represent the timing behavior of the entire parallel loop group

[Figure: timeline of loops x1 to x4 over iterations j = 0 to j = 8, with t_{i,j} marked for each loop i and iteration j and the interval T marked on the time axis. In the figure, q = 1, Q = 4 and R = 3.]

\[ \forall j \ge q:\quad t_{i,j+R} = t_{i,j} + T \]

Critical Path Analysis

• Determine q and Q such that [q, Q − 1] is the set of representative loop iterations
• Determine t[i,j]: the earliest time at which node v_i can be reached at the jth iteration (j ∈ [q, Q−1]), assuming each task is completed in the shortest time
• Determine \(\bar{t}[i,j]\): the earliest time at which node v_i can be reached at the jth iteration (j ∈ [q, Q−1]), assuming each task takes the longest time
• Determine the scaling factor for each communication channel such that the overall performance degradation due to voltage scaling is within δ (a preset bound)

Determining t[i,j] - Constraints

\[
\begin{aligned}
&\forall i: && t[i,0] \ge 0 \\
&\forall (u,v) \in E_{\text{intra}}: && t[u,j] + w_{\min}(u,v) \le t[v,j] \\
&\forall (u,v) \in E_{\text{inter}}: && t[u,j-1] \le t[v,j]
\end{aligned}
\]

where 0 ≤ j ≤ Q and w_min(u,v) is the minimum latency of edge (u,v)

• E_intra: the set of intra-iteration edges
• E_inter: the set of inter-iteration edges
• (u,v) ∈ E_intra: at each iteration j, u must be reached before v
• (u,v) ∈ E_inter: u at the (j − 1)th iteration must be reached before v at the jth iteration

Examples of Intra- and Inter-Iteration Edges

[Figure: the example IPCG for processes p1, p2 and p3 (vertices x1 to x4, b1 to b4, s1 to s5, d1 to d5, r1 to r5), with each edge marked as an intra-iteration edge or an inter-iteration edge.]

Determining t[i,j] - Example

[Figure: IPCG of three single-loop processes p1, p2 and p3 with vertices x_p, s_p, r_p, b_p and delivery points d1, d2, d3. Edge weights (min/max cycles): the three communication edges are 20/25 each; p1 has 20/25 and 10/10; p2 has 25/30 and 25/30; p3 has 20/20 and 15/15.]

Determining t[i,j] - Example

          x1   s1   d1   r1   b1   x2   s2   d2   r2   b2   x3   s3   d3   r3   b3
t[i,0]     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0

t[s1,0] + w_min(s1,d1) ≤ t[d1,0], with w_min the minimum edge latency: 0 + 20 = 20, so t[d1,0] = 20

Determining t[i,j] - Example

          x1   s1   d1   r1   b1   x2   s2   d2   r2   b2   x3   s3   d3   r3   b3
t[i,0]     0    0   20   20   30    0    0   20   25   50    0    0   20   20   35
t[i,1]    30   20    0    0    0   20   50    0    0    0   35   20    0    0    0

Determining t[i,j] - Example

          x1   s1   d1   r1   b1   x2   s2   d2   r2   b2   x3   s3   d3   r3   b3
t[i,0]     0    0   20   20   30    0    0   20   25   50    0    0   20   20   35
t[i,1]    30   30   50   55   65   50   50   70   75  100   35   35   55   70   85

Determining t[i,j] - Example

          x1   s1   d1   r1   b1   x2   s2   d2   r2   b2   x3   s3   d3   r3   b3
t[i,0]     0    0   20   20   30    0    0   20   25   50    0    0   20   20   35
t[i,1]    30   30   50   55   65   50   50   70   75  100   35   35   55   70   85
t[i,2]    65   65   85  105  115  100  100  120  125  150   85   85  105  120  135
t[i,3]   115  115  135  155  165  150  150  170  175  200  135  135  155  170  185
t[i,4]   165 .... .... .... ....  200 .... .... .... ....  185 .... .... .... ....

q = 2, Q = 4, T = 50
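The table can be reproduced with a few lines of code. The sketch below is not the authors' implementation: it evaluates the constraints in a fixed order that only suits this three-process example, and it assumes that process p sends to `DEST[p]` (so that d1 precedes r2, d2 precedes r3 and d3 precedes r1), a pairing inferred from the connection names used later in the deck. All identifiers are illustrative.

```python
# Minimum latencies in cycles, read off the three-process example.
COMM_MIN = 20                                # every communication edge s_p -> d_p
SEND_TO_RECV_MIN = {1: 20, 2: 25, 3: 20}     # computation edge s_p -> r_p
RECV_TO_BACK_MIN = {1: 10, 2: 25, 3: 15}     # computation edge r_p -> b_p
# Assumption: process p sends its message to process DEST[p].
DEST = {1: 2, 2: 3, 3: 1}
SOURCE = {dst: src for src, dst in DEST.items()}


def earliest_times(num_iterations):
    """Return one dict per iteration mapping vertex name -> earliest time."""
    rows = []
    prev_back = {p: 0 for p in DEST}         # every loop may start at time 0
    for _ in range(num_iterations):
        t = {}
        for p in DEST:
            t[f"x{p}"] = prev_back[p]        # inter-iteration edge b_p -> x_p
            t[f"s{p}"] = t[f"x{p}"]          # zero-weight edge x_p -> s_p
            t[f"d{p}"] = t[f"s{p}"] + COMM_MIN
        for p in DEST:
            # A receive waits for local computation and for message delivery.
            t[f"r{p}"] = max(t[f"s{p}"] + SEND_TO_RECV_MIN[p],
                             t[f"d{SOURCE[p]}"])
            t[f"b{p}"] = t[f"r{p}"] + RECV_TO_BACK_MIN[p]
        prev_back = {p: t[f"b{p}"] for p in DEST}
        rows.append(t)
    return rows


rows = earliest_times(5)
for j, row in enumerate(rows):
    print(j, [row[f"{v}{p}"] for p in (1, 2, 3) for v in "xsdrb"])
```

From iteration 2 onward every vertex advances by 50 cycles per iteration, which is 100 cycles across the two iterations between q = 2 and Q = 4.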

Determining t[i,j] - Constraints (longest task times)

\[
\begin{aligned}
&\forall i: && \bar{t}[i,q] = t[i,q] \\
&\forall (u,v) \in E_{\text{intra}}: && \bar{t}[u,j] + w_{\max}(u,v) \le \bar{t}[v,j] \\
&\forall (u,v) \in E_{\text{inter}}: && \bar{t}[u,j-1] \le \bar{t}[v,j]
\end{aligned}
\]

where 0 ≤ j ≤ Q and w_max(u,v) is the maximum latency of edge (u,v)

• E_intra: the set of intra-iteration edges
• E_inter: the set of inter-iteration edges

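Determining Scaling Factor - Constraints

\[
\begin{aligned}
&\forall i: && \hat{t}[i,q] = t[i,q] \\
&\forall (u,v) \in E_{\text{intra}}: && \hat{t}[u,j] + \frac{w_{\max}(u,v)}{k(\mathrm{node}(u), \mathrm{node}(v))} \le \hat{t}[v,j] \\
&\forall (u,v) \in E_{\text{inter}}: && \hat{t}[u,j-1] \le \hat{t}[v,j] \\
&\forall i: && \hat{t}[i,Q] - \hat{t}[i,q] \le \max\{(1+\delta)\,T,\ \bar{t}[i,Q] - \bar{t}[i,q]\}
\end{aligned}
\]

where 0 ≤ j ≤ Q

• E_intra, E_inter: the sets of intra-iteration and inter-iteration edges
• w_max(u,v): the maximum latency of edge (u,v)
• node(v): the node that executes operation v
• k(n1, n2): the scaling factor for the network connection from node n1 to n2, with 0 < k(n1, n2) ≤ 1
• δ: the maximum performance degradation allowed
• \(\bar{t}\): times with the longest task latencies at full data rate; \(\hat{t}\): the same with scaled connections

We try to scale each connection down as far as possible, that is, to find the smallest feasible k(n1, n2)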

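Determining Scaling Factor - Algorithm

repeat
    select a connection C
    scale down the data rate of C by one grade
    determine \(\hat{t}[i, j]\) using constraints (A)
    if condition (B) holds
        make the data rate of C permanent
    else
        restore the data rate of C
until no more connections can be scaled down

(A)
\[
\begin{aligned}
&\forall i: && \hat{t}[i,q] = t[i,q] \\
&\forall (u,v) \in E_{\text{intra}}: && \hat{t}[u,j] + \frac{w_{\max}(u,v)}{k(\mathrm{node}(u), \mathrm{node}(v))} \le \hat{t}[v,j] \\
&\forall (u,v) \in E_{\text{inter}}: && \hat{t}[u,j-1] \le \hat{t}[v,j]
\end{aligned}
\]

(B)
\[ \forall i:\quad \hat{t}[i,Q] - \hat{t}[i,q] \le \max\{(1+\delta)\,T,\ \bar{t}[i,Q] - \bar{t}[i,q]\} \]

Here w_max(u,v) is the maximum latency of edge (u,v), node(v) is the node that executes operation v, k is the scaling factor of a connection, and \(\bar{t}\) denotes the times with the longest task latencies at full data rate.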

Determining Scaling Factor - Example

                               x1   s1   d1   r1   b1   x2   s2   d2   r2   b2   x3   s3   d3   r3   b3
t[i,q]                         65   65   85  105  115  100  100  120  125  150   85   85  105  120  135
t[i,Q] (shortest task times)  165 .... .... .... ....  200 .... .... .... ....  185 .... .... .... ....
t[i,Q] (longest task times)   170 .... .... .... ....  210 .... .... .... ....  190 .... .... .... ....
tmax[i,Q]                     175 .... .... .... ....  210 .... .... .... ....  195 .... .... .... ....

q = 2, Q = 4, T = 100, δ = 10%, k = 1, 0.8, 0.6, 0.4, 0.2

Determining Scaling Factor - Example (trials)

q = 2, Q = 4, T = 100, δ = 10%, k = 1, 0.8, 0.6, 0.4, 0.2
Limits tmax[i,Q]: 175 (x1), 210 (x2), 195 (x3)

Each row lists the scaling factors being tried and the resulting t[i,Q] at x1, x2 and x3.

Step  k[1,2]  k[2,3]  k[3,1]      x1      x2      x3   Outcome
1     0.8     1       1          170     210     190   within limits, kept
2     0.8     0.8     1          170     210  196.25   x3 exceeds 195, restored
3     0.8     1       0.8     176.25     210     190   x1 exceeds 175, restored
4     0.6     1       1          170     210     190   within limits, kept
5     0.4     1       1          170     210     190   within limits, kept
6     0.2     1       1          170     270     190   x2 exceeds 210, restored

RESULT: k[1, 2] = 0.4, k[2, 3] = 1, k[3, 1] = 1
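The trial sequence can be replayed with a short greedy loop. This is a sketch under several assumptions rather than the authors' code: the maximum latencies are taken from the three-process example, process p is assumed to send to `DEST[p]`, the bound is checked only at the loop entries x1, x2 and x3 (the only columns the slides show for iteration Q), and connections are visited round-robin. All identifiers are illustrative.

```python
from fractions import Fraction

# Maximum latencies in cycles from the three-process example.
COMM_MAX = 25
SEND_TO_RECV_MAX = {1: 25, 2: 30, 3: 20}
RECV_TO_BACK_MAX = {1: 10, 2: 30, 3: 15}
DEST = {1: 2, 2: 3, 3: 1}                  # assumed sender -> receiver pairing
SOURCE = {dst: src for src, dst in DEST.items()}
T_ENTRY_Q = {1: 65, 2: 100, 3: 85}         # t[x_p, q] from the table
T_BACK_Q = {1: 115, 2: 150, 3: 135}        # t[b_p, q] from the table
Q_LOW, Q_HIGH, PERIOD = 2, 4, 100
DELTA = Fraction(1, 10)
GRADES = [Fraction(n, 5) for n in (5, 4, 3, 2, 1)]   # 1, 0.8, 0.6, 0.4, 0.2


def entry_times_at_Q(k):
    """Longest-time arrival at x_p in iteration Q for scaling factors k."""
    back = dict(T_BACK_Q)
    for _ in range(Q_LOW + 1, Q_HIGH):
        entry = dict(back)                 # x_p follows b_p of the previous iteration
        delivered = {p: entry[p] + COMM_MAX / k[(p, DEST[p])] for p in DEST}
        back = {p: max(entry[p] + SEND_TO_RECV_MAX[p], delivered[SOURCE[p]])
                   + RECV_TO_BACK_MAX[p]
                for p in DEST}
    return back                            # t[x_p, Q] equals t[b_p, Q - 1]


def choose_scaling():
    k = {(p, DEST[p]): GRADES[0] for p in DEST}
    baseline = entry_times_at_Q(k)
    limit = {p: T_ENTRY_Q[p] + max((1 + DELTA) * PERIOD,
                                   baseline[p] - T_ENTRY_Q[p]) for p in DEST}
    grade = {c: 0 for c in k}
    done = set()
    while len(done) < len(grade):
        for c in grade:
            if c in done:
                continue
            if grade[c] + 1 == len(GRADES):
                done.add(c)                # already at the lowest grade
                continue
            trial = {**k, c: GRADES[grade[c] + 1]}
            times = entry_times_at_Q(trial)
            if all(times[p] <= limit[p] for p in DEST):
                k, grade[c] = trial, grade[c] + 1   # keep the lower rate
            else:
                done.add(c)                # restore and stop scaling this one
    return k


result = choose_scaling()
print({c: float(v) for c, v in result.items()})
```

Exact fractions avoid rounding trouble where a time lands exactly on its limit, as x2 does at 210 in this example.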

Shared Communication Channels

The voltage level of a channel shared by multiple connections is determined by the connection that requires the highest voltage level

[Figure: two mesh examples with endpoints labeled a, b and c and channels annotated with voltage levels v1, v2 and v3. In one, a channel is shared by connections [[a, a']] and [[b, b']]; in the other, a channel is shared by connections [[a, a']] and [[c, c']].]
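The rule for shared channels amounts to taking a per-channel maximum. A minimal sketch, assuming each connection is described by the list of channels it traverses; the route layout, channel names and voltage values below are made up for illustration.

```python
def channel_voltages(routes, required_voltage):
    """routes: connection -> list of channels it traverses.
    required_voltage: connection -> voltage level that connection needs.
    Returns channel -> voltage, the highest requirement among its users."""
    voltage = {}
    for conn, channels in routes.items():
        for ch in channels:
            voltage[ch] = max(voltage.get(ch, 0.0), required_voltage[conn])
    return voltage


# Hypothetical routes: both connections traverse channel "c2".
routes = {
    ("a", "a'"): ["c1", "c2"],
    ("b", "b'"): ["c2", "c3"],
}
required = {("a", "a'"): 0.9, ("b", "b'"): 1.3}

print(channel_voltages(routes, required))
```

In this made-up case the shared channel runs at 1.3 V even though one of its connections would be satisfied with 0.9 V.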

Experimental Setup

Parameter               Value
NoC topology            5 × 5 mesh
Idle channel power      8.6 pJ/cycle
Voltage switch energy   1020 pJ
Voltage delay           120 cycles
Processor               1 GHz, 2-issue
Node local memory       20 KB
Packet header size      3 flits
Flit size               39 bits

Voltage (V)   Rate (bps)   Energy (pJ/bit)
0.7           200M         4.21
0.9           660M         5.25
1.1           1.33G        6.49
1.3           1.93G        8.31
1.5           2.50G        10.21
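To see how a scaling factor might translate into one of these voltage levels, the sketch below picks the lowest level whose data rate still covers the requested fraction of the maximum rate. This mapping is an assumption made for illustration; the slides do not state how scaling factors are assigned to levels, and the example grades 1, 0.8, 0.6, 0.4, 0.2 do not match the rate ratios of this table exactly.

```python
# (voltage in V, data rate in bits/s, energy in pJ/bit), from the table above.
LEVELS = [
    (0.7, 200e6, 4.21),
    (0.9, 660e6, 5.25),
    (1.1, 1.33e9, 6.49),
    (1.3, 1.93e9, 8.31),
    (1.5, 2.50e9, 10.21),
]


def lowest_level_for(scaling_factor):
    """Lowest-voltage level whose rate is at least scaling_factor times
    the maximum rate (hypothetical selection rule)."""
    needed = scaling_factor * LEVELS[-1][1]
    for level in LEVELS:
        if level[1] >= needed:
            return level
    return LEVELS[-1]


voltage, rate, energy = lowest_level_for(0.4)
saving = 1 - energy / LEVELS[-1][2]
print(voltage, rate, energy, round(saving, 3))
```

Under this rule a connection scaled to 0.4 would need 1.0 Gbps and therefore run at 1.1 V, spending 6.49 pJ/bit instead of 10.21 pJ/bit.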

Impact on Energy Consumption

[Figure: bar chart of normalized energy consumption (0% to 100%) under the Hardware Scheme, the Compiler Scheme and Optimal, for the benchmarks Morph2, Disc, Jpeg, Viterbi, Rasta, 3Step-log, Full-search, Hier, Phods, Epic, Lame and FFT.]

Energy Consumption Breakdown

[Figure: stacked bar chart of the energy breakdown (0% to 100%) into 1.5V, 1.3V, 1.1V, 0.9V, 0.7V and overhead, for the benchmarks Morph2, Disc, Jpeg, Viterbi, Rasta, 3Step-log, Full-search, Hier, Phods, Epic, Lame and FFT.]

Accuracy of Voltage Selection

[Figure: stacked bar chart of the breakdown of accuracy in voltage selection (0% to 100%) into the categories <= -2, -1, 0, +1 and >= +2, for the benchmarks Morph2, Disc, Jpeg, Viterbi, Rasta, 3Step-log, Full-search, Hier, Phods, Epic, Lame and FFT.]

Conclusions and Research Directions

• NoC presents unique opportunities for compilers
    • Expose network layout to compiler for energy reduction through voltage scaling and channel shutdown
• We implemented a compiler-directed voltage scaling algorithm and compared its performance to a hardware scheme
    • Promising results
• Research Directions
    • Evaluating impact of process-to-node mapping
    • Combined voltage/frequency scaling for NoC and CPUs
    • Metrics other than energy (e.g., temperature, reliability, …)

Thank you!

http://www.cse.psu.edu/~mdl

[email protected]

Funded in part by GSRC and NSF