Scaling Internet Routers Using Optics UW, October 16th, 2003 Nick McKeown Joint work with research groups of: David Miller, Mark Horowitz, Olav Solgaard. Students: Isaac Keslassy, Shang-Tse Chuang, Kyoungsik Yu. Department of Electrical Engineering, Stanford University Paper: http://klamath.stanford.edu/~nickm/papers/sigcomm2003.pdf Web site: http://klamath.stanford.edu/or

Dec 19, 2015

Page 1:

Scaling Internet Routers Using Optics

UW, October 16th, 2003

Nick McKeown

Joint work with research groups of: David Miller, Mark Horowitz, Olav Solgaard. Students: Isaac Keslassy, Shang-Tse Chuang, Kyoungsik Yu.

Department of Electrical Engineering, Stanford University

Paper: http://klamath.stanford.edu/~nickm/papers/sigcomm2003.pdf

Web site: http://klamath.stanford.edu/or

Page 2:

Backbone router capacity

[Figure: backbone router capacity per rack, 1986 to 2004, on a log scale from 1Gb/s to 1Tb/s.]

Router capacity per rack: 2x every 18 months

Page 3:

Backbone router capacity

[Figure: backbone router capacity per rack and traffic, 1986 to 2004, on a log scale from 1Gb/s to 1Tb/s.]

Router capacity per rack: 2x every 18 months

Traffic: 2x every year

Page 4:

Extrapolating

[Figure: router capacity and traffic extrapolated from 2003 to 2015, on a log scale from 1Tb/s to 100Tb/s.]

Router capacity: 2x every 18 months

Traffic: 2x every year

2015: 16x disparity
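The 16x figure follows directly from the two doubling periods quoted on the slide. A minimal sketch of the arithmetic, assuming both trends start from the same point in 2003 (the function name and parameters are illustrative, not from the talk):

```python
def capacity_disparity(years, traffic_doubling_years=1.0, router_doubling_years=1.5):
    """Ratio of traffic growth to router-capacity growth after `years`."""
    traffic_growth = 2 ** (years / traffic_doubling_years)
    router_growth = 2 ** (years / router_doubling_years)
    return traffic_growth / router_growth


# 2003 to 2015: traffic grows 2^12, router capacity grows 2^8.
print(capacity_disparity(2015 - 2003))  # 16.0
```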

Page 5:

Consequence

Unless something changes, operators will need:
• 16 times as many routers,
• consuming 16 times as much space,
• 256 times the power,
• costing 100 times as much.

Actually need more than that…

Page 6:

Stanford 100Tb/s Internet Router

Goal: Study scalability
• Challenging, but not impossible
• Two orders of magnitude faster than deployed routers
• We will build components to show feasibility

[Figure: electronic linecards #1 to #625, each with 40Gb/s ports and performing line termination, IP packet processing and packet buffering, connected to an optical switch at 160Gb/s (160-320Gb/s).]

100Tb/s = 640 * 160Gb/s

Page 7:

Throughput Guarantees

Operators increasingly demand throughput guarantees:
• To maximize use of expensive long-haul links
• For predictability and planning

Despite lots of effort and theory, no commercial router today has a throughput guarantee.

Page 8:

Requirements of our router

• 100Tb/s capacity
• 100% throughput for all traffic
• Must work with any set of linecards present
• Use technology available within 3 years
• Conform to RFC 1812

Page 9:

What limits router capacity?

[Figure: approximate power consumption per rack, in kW (0 to 12), for 1990 to 2003.]

Power density is the limiting factor today

Page 10:

[Figure: a single-rack router (linecards around a crossbar) next to a multi-rack router (a switch rack and linecard racks).]

Trend: Multi-rack routers. Reduces power density.

Page 11:

[Photos: Alcatel 7670 RSP; Juniper TX8/T640; Chiaro; Avici TSR.]

Page 12:

Limits to scaling

Overall power is dominated by linecards:
• Sheer number
• Optical WAN components
• Per-packet processing and buffering

But power density is dominated by switch fabric

Page 13:

Trend: Multi-rack routers. Reduces power density.

[Figure: a switch rack connected to linecard racks.]

Limit today ~2.5Tb/s:
• Electronics
• Scheduler scales <2x every 18 months
• Opto-electronic conversion

Page 14:

Multi-rack routers

[Figure: WAN In/Out ports on each linecard, with linecards connected through a switch fabric.]

Page 15:

Question

Instead, can we use an optical fabric at 100Tb/s with 100% throughput?

Conventional answer: No.
• Need to reconfigure switch too often
• 100% throughput requires complex electronic scheduler.

Page 16:

Outline

• How to guarantee 100% throughput?
• How to eliminate the scheduler?
• How to use an optical switch fabric?
• How to make it scalable and practical?

Page 17:

100% Throughput?

[Figure: N inputs and N outputs, each at line rate R; the links between them are labeled R and marked with question marks.]

Router capacity = NR

Switch capacity = N²R

Page 18:

If traffic is uniform

[Figure: N inputs and N outputs, each at rate R, connected by a full mesh of links at rate R/N.]

Page 19:

Real traffic is not uniform

[Figure: the same mesh of R/N links under non-uniform traffic, marked with a question mark.]

Page 20:

Two-stage load-balancing switch

[Figure: two meshes of R/N links in series between inputs and outputs at rate R. The first is the load-balancing stage, the second is the switching stage.]

100% throughput for weakly mixing, stochastic traffic. [C.-S. Chang, Valiant]
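The intuition behind the load-balancing stage can be shown with a fluid-rate sketch. It is an illustration only, not the proof behind the cited result: it assumes the first stage spreads every flow perfectly evenly over the N intermediate ports, and the function names and traffic matrix are made up for the example.

```python
def second_stage_load(traffic):
    """traffic[i][j] is the arrival rate from input i to output j, in units of R.

    The first stage sends 1/N of every flow to each intermediate port, so
    load[k][j] is the rate intermediate port k must forward to output j.
    """
    n = len(traffic)
    return [[sum(traffic[i][j] for i in range(n)) / n for j in range(n)]
            for _ in range(n)]


def is_admissible(traffic, line_rate=1.0):
    """No input and no output is offered more than the line rate."""
    n = len(traffic)
    rows_ok = all(sum(row) <= line_rate + 1e-12 for row in traffic)
    cols_ok = all(sum(traffic[i][j] for i in range(n)) <= line_rate + 1e-12
                  for j in range(n))
    return rows_ok and cols_ok


# Highly non-uniform but admissible: each input sends all of R to a single output.
N = 4
traffic = [[1.0 if j == (i + 1) % N else 0.0 for j in range(N)] for i in range(N)]
load = second_stage_load(traffic)

# Every second-stage link carries at most R/N, which matches its capacity.
print(is_admissible(traffic), max(max(row) for row in load))
```

Under this assumption, any admissible traffic matrix looks uniform to the switching stage, which is why fixed R/N links suffice there.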

Page 21:

[Figure: animation of the two-stage switch, with packets labeled 1, 2 and 3 passing through the two meshes of R/N links.]

Page 22:

[Figure: animation of the two-stage switch (continued), with packets labeled 1, 2 and 3 passing through the two meshes of R/N links.]

Page 23:

Chang’s load-balanced switch: Good properties

1. 100% throughput for broad class of traffic

2. No scheduler needed

Scalable

Page 24:

Chang’s load-balanced switch: Bad properties

1. Packet mis-sequencing

2. Pathological traffic patterns: throughput 1/N-th of capacity

3. Uses two switch fabrics: hard to package

4. Doesn’t work with some linecards missing: impractical

FOFF: Load-balancing algorithm
• Packet sequence maintained
• No pathological patterns
• 100% throughput - always
• Delay within bound of ideal
(See paper for details)

Page 25:

Single Mesh Switch

[Figure: the two stages folded into a single mesh of links at rate 2R/N between inputs and outputs at rate R; each In/Out pair is one linecard.]

Page 26:

Packaging

[Figure: linecards (In and Out at rate R) connected to each other across a backplane by links at rate 2R/N.]

Page 27:

Many fabric options

Options:
• Space: Full uniform mesh
• Time: Round-robin crossbar
• Wavelength: Static WDM

[Figure: linecards (In/Out) connected through any permutation network, each carrying channels C1, C2, …, CN.]

N channels each at rate 2R/N
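The "Time: Round-robin crossbar" option can be illustrated with a short sketch. It assumes a cyclic-shift pattern (input i connects to output (i + t) mod N in time-slot t), which is one natural choice and not necessarily the schedule used in the talk. The function names are hypothetical.

```python
def round_robin_schedule(n):
    """schedule[t][i] is the output connected to input i in time-slot t.

    Assumption: a cyclic shift, one simple way to visit every output in turn.
    """
    return [[(i + t) % n for i in range(n)] for t in range(n)]


def connects_every_pair_once(schedule):
    """True if each slot is a permutation and each (input, output) pair
    appears exactly once per period of n slots."""
    n = len(schedule)
    pairs = set()
    for perm in schedule:
        if sorted(perm) != list(range(n)):
            return False
        pairs.update(enumerate(perm))
    # n slots x n inputs = n*n connections, so n*n distinct pairs means no repeats.
    return len(pairs) == n * n


schedule = round_robin_schedule(8)
print(connects_every_pair_once(schedule))  # True
```

Because each pair is connected for one slot in every N, a fabric port running at 2R gives each pair an average rate of 2R/N, the same as a dedicated mesh link.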

Page 28:

Static WDM switching

[Figure: four linecards (In/Out) connected through an Array Waveguide Router (AWGR), which is passive and almost zero power. Each input sends channels A, B, C, D; the outputs receive A, A, A, A; B, B, B, B; C, C, C, C; and D, D, D, D respectively.]

4 WDM channels, each at rate 2R/N

Page 29:

Linecard dataflow

[Figure: linecard dataflow in four numbered steps between In (rate R), the WDM channels 1 to N, and Out (rate R), with packets labeled 1, 2 and 3.]

Page 30:

Problems of scale

For N < 64, WDM is a good solution. We want N = 640. Need to decompose.

Page 31:

Decomposing the mesh

[Figure: full mesh between linecards 1 to 8 on each side, with links at rate 2R/8.]

Page 32:

Decomposing the mesh

[Figure: the mesh between linecards 1 to 8 decomposed into groups, with links labeled 2R/4 and 2R/8 and the stages labeled TDM and WDM.]

Page 33:

When N is too large: Decompose into groups (or racks)

[Figure: Group/Rack 1 to Group/Rack G, each holding linecards 1, 2, …, L at rate 2R, connected through an Array Waveguide Router (AWGR) carrying channels 1, 2, …, G.]

Page 34:

When a linecard is missing

Each linecard spreads its data equally over every other linecard.

Problem: If one is missing, or failed, then the spreading no longer works.

Page 35:

When a linecard fails

[Figure: a three-linecard mesh with 2R/3 links, before and after a linecard failure. Labels: 2R/3 + 2R/3 = (4/3)R; 2R/3 + 2R/6 + 2R/3 + 2R/6 = 2R.]

Solution:

1. Move light beams: replace AWGR with MEMS switch. Reconfigure when linecard added, removed or fails.

2. Finer channel granularity: multiple paths.
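The rates on this slide can be checked with exact fractions. The sketch below is one reading of the figure labels: it assumes each linecard spreads 2R evenly over all active linecards (itself included), and the names are illustrative.

```python
from fractions import Fraction


def per_link_rate(active_linecards):
    """Rate of each mesh link, in units of R, when 2R is spread evenly
    over `active_linecards` destinations."""
    return Fraction(2, active_linecards)


N = 3
designed = per_link_rate(N)       # 2R/3 per link with all three linecards
needed = per_link_rate(N - 1)     # 2R/2 = R per link once one linecard fails
extra = needed - designed         # 2R/6 of extra capacity per link
available = (N - 1) * designed    # 2R/3 + 2R/3 = (4/3)R on the fixed links

print(designed, needed, extra, available)  # 2/3 1 1/3 4/3
```

With fixed 2R/3 links, a surviving linecard can carry only (4/3)R of the 2R it needs, which is why the slide proposes moving capacity with MEMS switches or splitting it into finer channels.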

Page 36:

Solution: Use transparent MEMS switches

[Figure: Group/Rack 1 to Group/Rack G=40, each holding linecards 1, 2, …, L at rate 2R, connected through MEMS switches with ports 1 to G.]

Theorems:

1. Require L+G-1 MEMS switches

2. Polynomial time reconfiguration algorithm

MEMS switches reconfigured only when linecard added, removed or fails.

Page 37:

Hybrid Architecture: Logical View

[Figure: three-stage switch. First-Stage: Groups 1 to G, each with Linecards 1 to L feeding an LxM local switch. Middle-Stage: M GxG middle switches. Final-Stage: Groups 1 to G, each with an MxL local switch feeding Linecards 1 to L.]

Page 38:

Hybrid Electro-Optical Architecture

[Figure: the same three-stage structure. Electronic switches: Groups 1 to G, each with Linecards 1 to L feeding an LxM crossbar with fixed lasers. Static MEMS: M GxG MEMS switches. Electronic switches: Groups 1 to G, each with optical receivers and an MxL crossbar feeding Linecards 1 to L.]

Page 39:

Number of MEMS Switches

[Figure: groups of linecards, each behind a crossbar, connected through static MEMS switches; all links are labeled R.]

Page 40:

Number of MEMS Switches

[Figure: groups of linecards (Linecards 1 to 3), each behind a crossbar, connected through static MEMS switches; links are labeled R, R/3, 2R/3 and 4R/3.]

Page 41:

Number of MEMS needed for a schedule

$L_i$: number of linecards in group $i$, $1 \le i \le G$. Group $i$ needs to send to group $j$:

$$(L_i R)\left(\frac{L_j}{N}\right), \quad \text{where } N = \sum_{i=1}^{G} L_i$$

Assume each group can send at most $R$ to each MEMS. Number of MEMS needed between groups $i$ and $j$:

$$A_{ij} = \left\lceil (L_i R)\left(\frac{L_j}{N}\right)\frac{1}{R} \right\rceil = \left\lceil \frac{L_i L_j}{N} \right\rceil$$

Page 42:

Number of MEMS needed for a schedule

The number of MEMS needed for group $i$ to send to group $j$ is $A_{ij}$.

The total number of MEMS needed for group $i$ is the sum of the $A_{ij}$’s:

$$\sum_{j=1}^{G} A_{ij} = \sum_{j=1}^{G} \left\lceil \frac{L_i L_j}{N} \right\rceil < \sum_{j=1}^{G} \left(\frac{L_i L_j}{N} + 1\right) = L_i + G$$

The sum is an integer, so $\sum_{j=1}^{G} A_{ij} \le L + G - 1$, where $L = \max_i(L_i)$.
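The bound of L + G - 1 MEMS switches can be checked numerically. The sketch below assumes A_ij = ceil(L_i L_j / N), as in the example a few slides later, and uses L = 16 and G = 40 from the final slide of the talk. The function name and the uneven group sizes are made up for illustration.

```python
def mems_per_group(group_sizes):
    """Total MEMS switches each group needs: sum over j of ceil(L_i * L_j / N)."""
    n = sum(group_sizes)
    # (a + n - 1) // n is ceil(a / n) for positive integers.
    return [sum((li * lj + n - 1) // n for lj in group_sizes)
            for li in group_sizes]


# Fully populated router: G = 40 racks of L = 16 linecards.
full = [16] * 40
print(max(mems_per_group(full)), max(full) + len(full) - 1)  # 40 55

# Unevenly populated racks still stay within L + G - 1.
uneven = [16, 9, 4, 1, 12]
bound = max(uneven) + len(uneven) - 1
print(all(m <= bound for m in mems_per_group(uneven)))  # True
```

The bound matters mostly for uneven configurations; a fully populated router needs fewer switches than the worst case allows.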

Page 43:

Constraints for the TDM Schedule

1. Latin Square: In any period N, each transmitting linecard is connected to each receiving linecard exactly once.

2. MEMS constraint: In any time-slot, there are at most $A_{ij}$ connections between transmitting group $i$ and receiving group $j$, where:

$$A_{ij} = \left\lceil \frac{L_i L_j}{N} \right\rceil$$

Page 44:

Example

Assume L1=3, L2=2, L3=1

Then

$$A = [A_{ij}] = \left[\left\lceil \frac{L_i L_j}{N} \right\rceil\right] = \begin{pmatrix} 2 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}$$

E.g., at most 2 packets from the first group to the first group at each time-slot

Page 45:

Bad TDM Transmit Schedule

t = 0 t = 1 t = 2 t = 3 t = 4 t = 5

LC 1 1 2 3 4 5 6

LC 2 6 1 2 3 4 5

LC 3 5 6 1 2 3 4

LC 4 4 5 6 1 2 3

LC 5 3 4 5 6 1 2

LC 6 2 3 4 5 6 1

Page 46:

Good TDM Transmit Schedule

t = 0 t = 1 t = 2 t = 3 t = 4 t = 5

LC 1 1 2 3 4 5 6

LC 2 5 1 2 3 6 4

LC 3 6 5 4 1 2 3

LC 4 2 3 1 6 4 5

LC 5 4 6 5 2 3 1

LC 6 3 4 6 5 1 2
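Both schedules can be checked mechanically against the two constraints. The sketch below assumes the grouping from the example (L1=3, L2=2, L3=1, so linecards 1 to 3, 4 to 5, and 6) and A_ij = ceil(L_i L_j / N). The helper names are hypothetical, and the tables are copied from the two slides.

```python
def mems_matrix(group_sizes):
    """A[i][j] = ceil(L_i * L_j / N), the per-slot limit between groups i and j."""
    n = sum(group_sizes)
    return [[(li * lj + n - 1) // n for lj in group_sizes] for li in group_sizes]


def group_of(linecard, group_sizes):
    """0-based group index of a 1-based linecard number."""
    first = 1
    for g, size in enumerate(group_sizes):
        if linecard < first + size:
            return g
        first += size
    raise ValueError("linecard out of range")


def is_latin_square(schedule):
    """Each row and each column is a permutation of 1..n."""
    n = len(schedule)
    full = list(range(1, n + 1))
    rows_ok = all(sorted(row) == full for row in schedule)
    cols_ok = all(sorted(row[t] for row in schedule) == full for t in range(n))
    return rows_ok and cols_ok


def satisfies_mems(schedule, group_sizes):
    """In every time-slot, connections between groups stay within A[i][j]."""
    limit = mems_matrix(group_sizes)
    g = len(group_sizes)
    for t in range(len(schedule)):
        used = [[0] * g for _ in range(g)]
        for tx, row in enumerate(schedule, start=1):
            gi, gj = group_of(tx, group_sizes), group_of(row[t], group_sizes)
            used[gi][gj] += 1
            if used[gi][gj] > limit[gi][gj]:
                return False
    return True


SIZES = [3, 2, 1]
# Row = transmitting linecard, column = time-slot, entry = receiving linecard.
BAD = [[1, 2, 3, 4, 5, 6], [6, 1, 2, 3, 4, 5], [5, 6, 1, 2, 3, 4],
       [4, 5, 6, 1, 2, 3], [3, 4, 5, 6, 1, 2], [2, 3, 4, 5, 6, 1]]
GOOD = [[1, 2, 3, 4, 5, 6], [5, 1, 2, 3, 6, 4], [6, 5, 4, 1, 2, 3],
        [2, 3, 1, 6, 4, 5], [4, 6, 5, 2, 3, 1], [3, 4, 6, 5, 1, 2]]

print(is_latin_square(BAD), satisfies_mems(BAD, SIZES))    # True False
print(is_latin_square(GOOD), satisfies_mems(GOOD, SIZES))  # True True
```

The cyclic-shift schedule is a valid Latin square but breaks the MEMS constraint as early as t = 1, where linecards 4 and 5 both transmit within the second group while only one connection is allowed.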

Page 47:

Configuration Algorithm

1. Assign connections between groups, so MEMS constraint is satisfied.

2. Assign group connections to specific linecards, so there is exactly one connection per linecard pair in the schedule.

Comments: Algorithm is surprisingly complex. Best running time so far: 40 seconds for 640 linecards.

Page 48:

Challenges

[Figure: linecard dataflow with R = 160Gb/s, showing In and Out at rate R, address lookup, a packet switch, and WDM channels 1 to G, with steps numbered 1 to 4.]

How to build a 250ms 160Gb/s buffer?

Low-cost, low-power optoelectronic conversion?
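To give a sense of scale for the buffering challenge, the required capacity is the line rate multiplied by the buffering time. A minimal sketch of that arithmetic (the function name is illustrative):

```python
def buffer_size_bits(rate_bits_per_s, delay_ms):
    """Bits needed to hold `delay_ms` milliseconds of traffic at the given rate."""
    return rate_bits_per_s * delay_ms // 1000


bits = buffer_size_bits(160 * 10**9, 250)
print(bits, bits // 8)  # 40000000000 bits, 5000000000 bytes
```

That is 40 Gbit, or 5 GB, per linecard, which must also be written and read at the full 160Gb/s.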

Page 49:

What we are building

Chip #1: 160Gb/s Packet Buffer

[Figure: two Buffer Manager chips (90nm ASIC) with 250ms DRAM; labels 160Gb/s, 160Gb/s and 320Gb/s.]

Chip #2: 16 x 55 Opto-electronic crossbar

[Figure: CMOS ASIC with 16 x 10Gb/s to linecards and 55 x 10Gb/s to the optical fabric, with a 1500nm optical source, optical modulator and optical detector.]

Page 50:

100Tb/s Load-Balanced Router

[Figure: Linecard Rack 1 to Linecard Rack G = 40, each with L = 16 160Gb/s linecards, connected to a switch rack (< 100W) of 40 x 40 MEMS switches numbered 1, 2, …, 55, 56.]