1 Near-Optimal Oblivious Routing for 3D-Mesh Networks ICCD 2008 Rohit Sunkam Ramanujam Bill Lin Electrical and Computer Engineering Department University of California, San Diego

Dec 22, 2015

Transcript
Page 1

Near-Optimal Oblivious Routing for 3D-Mesh Networks

ICCD 2008

Rohit Sunkam Ramanujam, Bill Lin

Electrical and Computer Engineering Department, University of California, San Diego

Page 2

Motivation: Networks-on-Chip
• Chip-multiprocessors (CMPs) increasingly popular
• 2D-mesh networks often used as on-chip fabric

[Figure: chip photos captioned "Tilera Tile64" and "Intel 80-core". Photo labels: I/O Area; single tile, 1.5mm x 2.0mm; overall dimensions 21.72mm x 12.64mm]

Technology: 65nm, 1 poly, 8 metal (Cu)
Transistors: 100 Million (full-chip), 1.2 Million (tile)
Die Area: 275mm² (full-chip), 3mm² (tile)
C4 bumps #: 8390

Page 3

Motivation: 3D Integrated Circuits
• 3D Benefits
  – Reduced wire delays
  – Enormous bandwidth
  – Heterogeneous system integration
• Natural progression
  – 3D-mesh for 3D CMPs

[Figure: 2D to 3D]

Page 4

Routing Algorithm Objectives

• Maximize throughput
  – How much load the network can handle
• Minimize hop count
  – Minimize routing delay between source and destination

Page 5

Challenges

• For the 2D case, a near-optimal throughput routing algorithm with minimal hop count, called O1TURN, is known [Seo’05].

• Surprisingly, the optimality of O1TURN does not extend to the 3D case; actual throughput performance degrades severely.

• The only known optimal-throughput routing algorithm is Valiant (VAL) load-balancing, but VAL performs poorly on hop count (latency): twice that of minimal routing.

Page 6

Main Contribution

• Developed a new oblivious routing algorithm called “Randomized Partially Minimal” (RPM) routing.

• RPM provably guarantees near-optimal worst-case throughput in the 3D case.
  – Optimal for even radix k (e.g. 8 x 8 x 8 mesh).
  – Within a factor of 1/k² of optimal for odd radix (e.g. 7 x 7 x 7 mesh).

• Good latency performance.
  – Only a factor of 1.33 of minimal routing (much better than the 2x cost of VAL, the only known routing algorithm with optimal throughput).
  – In practice, 3D-meshes are asymmetric because the number of device layers is less than the number of tiles per edge.
  – e.g., for a 16 x 16 x 4 mesh (4 layers), RPM’s hop count is just a factor of 1.1 of minimal routing.

Page 7

Outline

• Motivation for our work

• Existing 2D routing algorithms don’t extend well into 3D

• RPM routing algorithm

• Simulation results

• Extensions and future work

Page 8

Existing Routing Algorithms

The 2D case
• Dimension-Ordered Routing (DOR)
  – Route minimal XY
• Valiant load-balancing (VAL)
  – Route source → randomly chosen intermediate node → destination
  – Route minimal XY in both phases
• ROMM
  – Same as VAL, but intermediate node restricted to minimal direction
• Orthogonal 1-TURN (O1TURN)
  – Route minimal XY and YX with equal probability

Extending to the 3D case …
• Dimension-Ordered Routing (DOR)
  – Route minimal XYZ
• Valiant load-balancing (VAL)
  – Route source → randomly chosen intermediate node → destination
  – Route minimal XYZ in both phases
• ROMM
  – Same as VAL, but intermediate node restricted to minimal direction
• Orthogonal 1-TURN (O1TURN)
  – Route along one of 6 minimal orthogonal paths (XYZ, XZY, YXZ, YZX, ZXY, ZYX) with equal probability
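
The four 3D schemes on this slide can be sketched as path generators. This is an illustrative Python sketch, not the authors' implementation. The function names, the coordinate-tuple node representation, and the `dims` parameter are hypothetical. Reading ROMM's "restricted to minimal direction" as "the intermediate node lies inside the bounding box of source and destination" is also an assumption.

```python
import random
from itertools import permutations

def dor_path(src, dst, order=(0, 1, 2)):
    """Minimal dimension-ordered route; order=(0, 1, 2) means X, then Y, then Z."""
    cur = list(src)
    path = [tuple(cur)]
    for dim in order:
        step = 1 if dst[dim] > cur[dim] else -1
        while cur[dim] != dst[dim]:
            cur[dim] += step
            path.append(tuple(cur))
    return path

def val_path(src, dst, dims, rng=random):
    """VAL: XYZ to a uniformly random intermediate node, then XYZ to dst."""
    mid = tuple(rng.randrange(n) for n in dims)
    return dor_path(src, mid) + dor_path(mid, dst)[1:]

def romm_path(src, dst, rng=random):
    """ROMM: like VAL, but the intermediate node stays inside the bounding
    box of src and dst (assumed reading of "minimal direction")."""
    mid = tuple(rng.randint(min(s, d), max(s, d)) for s, d in zip(src, dst))
    return dor_path(src, mid) + dor_path(mid, dst)[1:]

def o1turn_path(src, dst, rng=random):
    """3D O1TURN: one of the 6 dimension orders, chosen with equal probability."""
    order = rng.choice(list(permutations(range(3))))
    return dor_path(src, dst, order)
```

Under these assumptions, DOR, ROMM and O1TURN always return a minimal-length path, while a VAL path can be longer because the intermediate node may lie anywhere in the mesh.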

Page 9

Worst-Case Throughput

• Best theoretical normalized worst-case throughput known to be 50% (well-known result).

• Worst-case throughput analysis can be reduced to a maximal weighted matching problem [Towles’02].

• VAL achieves this optimal throughput, but has poor latency.

• As shown next, DOR, ROMM, and O1TURN are all far from optimal in 3D.

Page 10

Poor Worst-Case Throughput

[Figure: worst-case throughput chart; annotations: "Only 6-15%" and "VAL/Optimal"]

Page 11

How do 2D mesh algorithms fare in 3D?

• Worst-case throughput of DOR, ROMM, O1TURN far from optimal
• Average hop count of VAL far from minimal
• Need a routing algorithm that can trade latency for worst-case throughput

8 x 8 x 8 Network

Metric                                VAL    DOR    ROMM   O1TURN
Hop Count (normalized to minimal)     2      1      1      1
Normalized Worst-Case Throughput      0.5    0.063  0.132  0.15
Normalized Average-Case Throughput    0.5    0.316  0.454  0.513

Page 12

Why does O1TURN perform poorly in 3D?

• O1TURN: worst-case throughput is optimal for 2D but more than 3 times worse than optimal for 3D

• The difference
  – 2D traffic matrix is “admissible” for a 2D mesh
  – In 3D, projected traffic on each 2D plane is no longer admissible!

• Can we transform the 3D routing problem into routing admissible traffic on each 2D plane?
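
The admissibility point can be illustrated numerically. The sketch below is not taken from the slides. It assumes that "admissible" means no 2D node sources or sinks more than one unit of traffic within a plane, and it uses the DOR-WC permutation (x,y,z) → (k-z-1, k-y-1, k-x-1) as the test pattern. It compares doing all XY hops on the source layer (as an XYZ-ordered minimal route does) with splitting every flow evenly over all k layers. All function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def dor_wc(node, k):
    """DOR worst-case permutation: (x, y, z) -> (k-z-1, k-y-1, k-x-1)."""
    x, y, z = node
    return (k - z - 1, k - y - 1, k - x - 1)

def max_plane_load(k, layer_shares):
    """Largest traffic any 2D node sources or sinks within a single layer.

    layer_shares(src, dst) returns {layer: fraction of that flow whose XY
    hops happen on this layer}. A result above 1 means the projected 2D
    traffic on some layer is not admissible (assumed definition).
    """
    sourced = defaultdict(Fraction)
    sunk = defaultdict(Fraction)
    for src in product(range(k), repeat=3):
        dst = dor_wc(src, k)
        for layer, share in layer_shares(src, dst).items():
            sourced[(layer, src[0], src[1])] += share
            sunk[(layer, dst[0], dst[1])] += share
    return max(max(sourced.values()), max(sunk.values()))

def xy_on_source_layer(src, dst):
    """An XYZ-ordered minimal route does all its XY hops on the source layer."""
    return {src[2]: Fraction(1)}

def spread_over_layers(k):
    """Split each flow evenly over all k layers before routing in XY."""
    return lambda src, dst: {z: Fraction(1, k) for z in range(k)}
```

For k = 4, `max_plane_load(4, xy_on_source_layer)` is 4, because four sources in the same layer share one projected destination. `max_plane_load(4, spread_over_layers(4))` is 1, which is admissible under the assumed definition.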

Page 13

Outline

• Motivation for our work

• Existing 2D algorithms don’t extend well into 3D

• RPM routing algorithm

• Simulation results

• Extensions and future work

Page 14

Randomized Partially-Minimal Routing (RPM)

[Figure: 3D mesh with axes X, Y, Z showing a route from Source to Destination through a random intermediate layer]

• Phase-1 Z: source to intermediate layer
• XY or YX routing on the intermediate layer
• Phase-2 Z: intermediate layer to destination

Page 15

Main Idea

• Load-balance uniformly across the vertical layers

• Minimal XY/YX routing used on each layer

• Main Result: RPM has near-optimal worst-case throughput

  – Achieves optimal worst-case throughput when network radix k is even

  – Within a factor of 1/k² of optimal when k is odd.
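
The three phases described on the previous two slides can be sketched as a path generator. This is a minimal Python sketch under stated assumptions, not the authors' code: the function name `rpm_path`, the `num_layers` parameter, and the tuple node representation are hypothetical. The intermediate layer is drawn uniformly from all layers, following the "load-balance uniformly across the vertical layers" bullet.

```python
import random

def rpm_path(src, dst, num_layers, rng=random):
    """One RPM route: Z to a random layer, XY or YX on it, Z to the destination.

    Returns (path, mid_z), where path is the list of visited (x, y, z) nodes.
    """
    mid_z = rng.randrange(num_layers)        # uniform over all vertical layers
    xy_order = rng.choice([(0, 1), (1, 0)])  # XY or YX with equal probability
    # (dimension, target coordinate) for each leg of the route
    legs = [(2, mid_z)] + [(d, dst[d]) for d in xy_order] + [(2, dst[2])]
    cur = list(src)
    path = [tuple(cur)]
    for dim, target in legs:
        step = 1 if target > cur[dim] else -1
        while cur[dim] != target:
            cur[dim] += step
            path.append(tuple(cur))
    return path, mid_z
```

The route is minimal in X and Y and only possibly non-minimal in Z, which is where the "partially minimal" name comes from.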

Page 16

RPM achieves Near-Optimal Worst-Case Throughput (optimal for even radix)

[Figure: worst-case throughput chart; labels: "VAL/Optimal" and "RPM"]

Page 17

Average-Case Throughput

• RPM outperforms VAL, DOR, ROMM and O1TURN in average throughput on randomly generated traffic.

[Figure: chart of average-case throughput (axis 0 to 0.8) for 4x4x4, 8x8x8 and 16x16x4 meshes; series: VAL, DOR, ROMM, O1TURN, RPM]

Page 18

Average Hop Count

• Normalized hop count of RPM
  – Symmetric meshes: 1.33 times minimal, compared to 2x for VAL
  – Asymmetric 16x16x4 mesh: 1.1 times minimal
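
These ratios can be reproduced with a short calculation. The sketch below is not from the slides. It assumes source and destination are drawn uniformly and independently and that the intermediate layer is uniform over all layers. RPM then pays the average Z distance twice, while X and Y stay minimal. The function names are hypothetical.

```python
from fractions import Fraction

def mean_distance(n):
    """Average |i - j| over all ordered pairs of positions on an n-node line."""
    total = sum(abs(i - j) for i in range(n) for j in range(n))
    return Fraction(total, n * n)

def normalized_rpm_hops(kx, ky, kz):
    """Average RPM hop count divided by average minimal hop count."""
    xy = mean_distance(kx) + mean_distance(ky)
    minimal = xy + mean_distance(kz)
    # RPM travels in Z twice: source -> random layer, random layer -> destination.
    rpm = xy + 2 * mean_distance(kz)
    return float(rpm / minimal)
```

Under these assumptions, `normalized_rpm_hops(8, 8, 8)` evaluates to 4/3 and `normalized_rpm_hops(16, 16, 4)` to about 1.105, consistent with the 1.33 and 1.1 figures on the slide.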

Page 19

Outline

• Motivation for our work

• Existing 2D routing algorithms don’t extend well into 3D

• RPM routing algorithm

• Simulation results

• Extensions and future work

Page 20

Flit-Level Simulation

• Ideal throughput evaluation assumes
  – Ideal single-cycle router
  – Infinite buffers
  – No contention in switches, no flow control

• Flit-level simulation
  – PopNet network simulator
  – 4-stage router pipeline: route computation, VC allocation, switch arbitration, link traversal
  – Credit-based flow control
  – 8 virtual channels, each 5 flits deep
  – Multi-flit packets injected into the network (5 flits/packet)

Page 21

Flit-Level Simulation (cont’d)

• Network configurations simulated
  – 4 x 4 x 4 Mesh
  – 8 x 8 x 8 Mesh
  – 16 x 16 x 4 Mesh

• Routing algorithms compared: DOR, VAL, ROMM, O1TURN, DUATO, RPM
  – DUATO is a minimal adaptive routing algorithm implemented for comparison

• Four different traffic traces used
  – Transpose traffic: (x,y,z) → (y,z,x)
  – Complement traffic: (x,y,z) → (k-x-1, k-y-1, k-z-1)
  – Uniform traffic
  – Worst-case traffic pattern for DOR (DOR-WC): (x,y,z) → (k-z-1, k-y-1, k-x-1)
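
The four traffic patterns can be written as destination functions. This is a hedged sketch for a symmetric k x k x k mesh only. The slide does not say how the patterns are defined on the 16 x 16 x 4 mesh, so that case is left out. The function names and the `is_permutation` helper are hypothetical.

```python
import random
from itertools import product

def transpose(node, k):
    """Transpose traffic: (x, y, z) -> (y, z, x). k is unused, kept for a uniform signature."""
    x, y, z = node
    return (y, z, x)

def complement(node, k):
    """Complement traffic: (x, y, z) -> (k-x-1, k-y-1, k-z-1)."""
    x, y, z = node
    return (k - x - 1, k - y - 1, k - z - 1)

def dor_wc(node, k):
    """DOR worst-case traffic: (x, y, z) -> (k-z-1, k-y-1, k-x-1)."""
    x, y, z = node
    return (k - z - 1, k - y - 1, k - x - 1)

def uniform(node, k, rng=random):
    """Uniform traffic: a fresh random destination for every packet."""
    return tuple(rng.randrange(k) for _ in range(3))

def is_permutation(pattern, k):
    """True if every node is the destination of exactly one source."""
    nodes = list(product(range(k), repeat=3))
    return sorted(pattern(n, k) for n in nodes) == nodes
```

Transpose, complement and DOR-WC are permutations, so each node sends to and receives from exactly one other node. Uniform traffic is random per packet and is not a fixed permutation.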

Page 22

Uniform Traffic

[Figure panels: 8x8x8 Mesh, 16x16x4 Mesh]

Page 23

Transpose Traffic

[Figure panels: 8x8x8 Mesh, 16x16x4 Mesh]

Page 24

Complement Traffic

[Figure panels: 8x8x8 Mesh, 16x16x4 Mesh]

Page 25

DOR-WC Traffic

[Figure panels: 8x8x8 Mesh, 16x16x4 Mesh]

Page 26

To sum it up …

• 3D IC technology is emerging.

• Stacking cores in 3 dimensions offers several advantages over 2D placement of cores.

• 2D minimal mesh routing algorithms have poor worst-case throughput in 3D; VAL has a high latency penalty.

• RPM trades off latency (partially minimal) for better worst-case performance (near-optimal).

Page 27

Thank You

Questions?