Some Unsolved Problems in High Speed Packet Switching
Shivendra S. Panwar
Joint work with: Yihan Li, Yanming Shen and H. Jonathan Chao
Polytechnic University, Brooklyn, NY
NY State Center for Advanced Technology in Telecommunications
http://catt.poly.edu/CATT/panwar.html


Jul 27, 2018




Advice to Woodward and Bernstein:

“Follow the money” -- Deep Throat

(aka Mark Felt)


Advice to performance analysts:

“Find the bottleneck”


Packet Switching


Buffering in a Packet Switch

Fixed-size packet switches
  Operate in a time-slotted manner
  The slot duration is equal to the cell transmission time
Contention occurs when multiple inputs have arrivals destined to the same output
Buffering is needed to avoid packet loss
Buffering schemes in a packet switch
  Output queueing (OQ)
  Input queueing (IQ)
  Virtual output queueing (VOQ) / combined input-output queueing (CIOQ)


Output Queuing (OQ)
  100% throughput
  Internal speedup of N
  Impractical for large N

[Figure: 4x4 output-queued switch, Inputs 1-4 and Outputs 1-4; every input holds a cell labeled 3.]


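Input Queuing (IQ)
  Easy to implement
  Head-of-line (HOL) blocking limits throughput to 58.6% (2 - sqrt(2), for uniform i.i.d. traffic and large N)

[Figure: head-of-line blocking in a 4x4 input-queued switch, Inputs 1-4 and Outputs 1-4; the input queues hold cells labeled 1 2, 3 2, 3 4, and 2 4.]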
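
The 58.6% figure can be checked numerically. The following is a minimal simulation sketch, not taken from the talk: it assumes saturated inputs, uniformly random destinations, and random selection among contending head-of-line cells. The function name and parameters are illustrative only.

```python
import random

def hol_saturation_throughput(n_ports, n_slots, seed=0):
    """Estimate per-port throughput of a saturated FIFO input-queued switch."""
    rng = random.Random(seed)
    # Assumption: every input always has a head-of-line (HOL) cell.
    # hol[i] is the output requested by the HOL cell of input i.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(n_slots):
        # Group inputs by the output their HOL cell is destined to.
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        # Each requested output serves one contender; the losers stay blocked,
        # together with every cell queued behind them.
        for inputs in contenders.values():
            winner = rng.choice(inputs)
            hol[winner] = rng.randrange(n_ports)  # next cell reaches the head
            delivered += 1
    return delivered / (n_ports * n_slots)
```

For a 2x2 switch the estimate should settle near 0.75, and it should fall toward 2 - sqrt(2) (about 0.586) as the port count grows.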


Virtual Output Queuing (VOQ)
  Overcomes HOL blocking
  No speedup requirement
  Needs scheduling algorithms to resolve contention
    Complexity
    Performance guarantee

[Figure: each of the four inputs keeps separate queues for outputs 1, 2, 3, and 4.]


Challenges in Switch Design

Stability
  100% throughput
Delay performance
Scalability
  Scale to high number of linecards and to high linecard speeds
  Distributed scheduler is more desirable than a centralized scheduler
  Scheduler complexity
  Pin count


High Speed Packet Switches

VOQ switches and scheduling algorithms
Buffered crossbar switch
Load Balanced switch
Multi-stage switch


VOQ Switch Architecture

[Figure: VOQ switch architecture. Each of Inputs 1-4 passes through an ISM into VOQs 1..N, then through the switch fabric to an ORM at each of Outputs 1-4.]

Input Segmentation Module (ISM): segments packets into fixed-length cells.
Output Reassembly Module (ORM): reassembles cells into packets.


Scheduling for VOQ Switch

Scheduling is needed to avoid output contention
A scheduling problem can be modeled as a matching problem in a bipartite graph
  An input and an output are connected by an edge if the corresponding VOQ is not empty
  Each edge may have a weight, which can be
    The length of the VOQ
    The age of the HOL cell


Maximum Weight Matching (MWM)
  MWM always finds a match with the maximum weight
  Stable under any admissible traffic
  Very high complexity
    O(N^3), impractical

[Figure: weighted bipartite request graph with the maximum weight match highlighted. Weight of the match: 25]

References
  N. McKeown, V. Anantharam, and J. Walrand, “Achieving 100% Throughput in an Input-Queued Switch,” IEEE Transactions on Communications, vol. 47, no. 8, Aug. 1999, pp. 1260-1267.
  J.G. Dai and B. Prabhakar, “The throughput of data switches with and without speedup,” INFOCOM 2000.
  L. Tassiulas, A. Ephremides, “Stability properties of constrained queueing systems and scheduling for maximum throughput in multihop radio networks,” IEEE Transactions on Automatic Control, Vol. 37, No. 12, pp. 1936-1949, December 1992.
  E. Leonardi, M. Mellia, F. Neri, M. Ajmone Marsan, “On the stability of Input-Queued Switches with speed-up,” IEEE/ACM Transactions on Networking, Vol. 9, No. 1, pp. 104-118, February 2001.
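
To make the definition concrete, here is a small sketch that finds the maximum weight match by trying every permutation. This brute-force search is O(N!) and is only a stand-in for illustration; it is not the O(N^3) algorithm the slide refers to. The queue matrix `Q` is a made-up example, and `weights[i][j]` is assumed to be the length of the VOQ from input i to output j.

```python
from itertools import permutations

def max_weight_matching(weights):
    """Return (match, weight) where match[i] is the output given to input i."""
    n = len(weights)
    best_match, best_weight = None, -1
    # Brute force: a full match of an NxN switch is a permutation of outputs.
    for perm in permutations(range(n)):
        w = sum(weights[i][perm[i]] for i in range(n))
        if w > best_weight:
            best_match, best_weight = perm, w
    return best_match, best_weight

# Hypothetical VOQ lengths for a 3x3 switch.
Q = [[7, 0, 4],
     [3, 8, 0],
     [0, 5, 6]]
match, total = max_weight_matching(Q)
```

With this matrix the search selects the diagonal (input i to output i), whose total weight is 7 + 8 + 6 = 21.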


Maximum Weight Matching

The maximum weight matching algorithm is strongly stable under any admissible traffic pattern
  Lyapunov function
  Admissible: $\sum_i \lambda_{ij} < 1$ and $\sum_j \lambda_{ij} < 1$
  Strongly stable: $\limsup_{n \to \infty} E[\|Q_n\|] < \infty$

References
  E. Leonardi, M. Mellia, F. Neri, M. Ajmone Marsan, “On the stability of Input-Queued Switches with speed-up,” IEEE/ACM Transactions on Networking, Vol. 9, No. 1, pp. 104-118, February 2001.
  N. McKeown, V. Anantharam, and J. Walrand, “Achieving 100% Throughput in an Input-Queued Switch,” IEEE Transactions on Communications, vol. 47, no. 8, Aug. 1999, pp. 1260-1267.


Maximum Weight Matching

Fluid model
The maximum weight matching is rate stable if:
  The arrival processes satisfy a strong law of large numbers (SLLN) with probability one:
    $\lim_{n \to \infty} A_{ij}(n)/n = \lambda_{ij}, \quad i, j = 1, 2, \ldots, N$, and
  $\sum_i \lambda_{ij} \le 1$ and $\sum_j \lambda_{ij} \le 1$

References
  J.G. Dai and B. Prabhakar, “The throughput of data switches with and without speedup,” INFOCOM 2000, pp. 556-564.
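
The admissibility conditions are simple row and column sum checks on the rate matrix: no input and no output may be loaded beyond its line rate. A minimal sketch follows; the function name and the `strict` flag are assumptions made for illustration, with `strict=True` matching the strong-stability condition (sums below 1) and `strict=False` matching the rate-stability condition (sums at most 1).

```python
def is_admissible(rates, strict=True):
    """Check the load on every input (row) and output (column) of rates[i][j]."""
    n = len(rates)
    row_sums = [sum(rates[i]) for i in range(n)]                       # load per input
    col_sums = [sum(rates[i][j] for i in range(n)) for j in range(n)]  # load per output
    if strict:
        return all(s < 1 for s in row_sums + col_sums)
    return all(s <= 1 for s in row_sums + col_sums)
```

A fully loaded uniform 2x2 matrix with every entry equal to 0.5 fails the strict test but passes the non-strict one.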


Approximate MWM

1-APRX
  A function f(.) is a sub-linear function if $\lim_{x \to \infty} f(x)/x = 0$
  Let the weight of a schedule obtained by a scheduling algorithm B be $W_B$
  Let the weight of the maximum weight match for the same switch state be $W^*$
  If $W_B \ge W^* - f(W^*)$, B is a 1-APRX to MWM
  B is stable if $W_B(t) \ge W^*(t) - f(W^*(t)), \ \forall t$
Makes it possible to find stable matching algorithms with lower complexity than MWM.

References
  D. Shah, M. Kopikare, “Delay bounds for approximate Maximum weight matching algorithms for input-queued switches,” IEEE INFOCOM, New York, USA, June 2002.


Average Delay Bound

Delay bound for MWM
  Lyapunov function
  [Equation: upper bound on $E[\|Q(t)\|]$ in terms of $N$ and $\rho$]

References
  E. Leonardi, M. Mellia, F. Neri, and M. Ajmone Marsan, “Bounds on average delays and queue size averages and variances in input-queued cell-based switches,” Proceedings of IEEE INFOCOM, 2001.


Average Delay Bound (contd.)

Delay bound for approximate-MWM
  Lyapunov function
  [Equation: upper bound on $\lim_{t \to \infty} E[\|Q(t)\|]$ in terms of $N$, $\tilde{\Lambda}$, $C_b$, and $\delta$]
  where $\delta = 1 - \max_i \sum_j \lambda_{ij}$ and $\tilde{\Lambda} = \sum_{ij} (\lambda_{ij} - \lambda_{ij}^2)$
  $C_b$: weight difference to the MWM matching
Under uniform traffic, they have the same result
  [Equation: upper bound on $E[\|Q(t)\|]$ in terms of $N$ and $\rho$]

References
  D. Shah, M. Kopikare, “Delay bounds for approximate Maximum weight matching algorithms for input-queued switches,” IEEE INFOCOM, New York, USA, June 2002.


Open Issues

In simulations, MWM has the best delay performance (cell delay)
  Average delay: if the weight of a queue is chosen as $Q^a$, the delay increases with a for a > 0
Is MWM the optimal scheduling scheme for achieving the minimum average cell delay?
What is the optimal scheduling scheme to achieve the minimum average packet delay (including reassembly delay)?


Maximal Matching
  Add connections incrementally, without removing connections made earlier
  No more matches can be made trivially by the end of the operation
  Solution may not be unique
  Complexity O(N log N)

[Figure: the weighted bipartite request graph from the MWM example with a maximal match highlighted. Weight of the match: 23]
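
A maximal match can be built greedily: take edges one at a time and keep each one whose input and output are both still free. The sketch below visits edges in order of decreasing VOQ length, which is only one possible order chosen for illustration; any order yields a maximal (but not necessarily maximum weight) match. The function name and the example matrix are hypothetical.

```python
def greedy_maximal_matching(queues):
    """Return {input: output}; connections are only added, never removed."""
    n = len(queues)
    # Edges exist only for non-empty VOQs; visit the longest queues first.
    edges = sorted(
        ((queues[i][j], i, j) for i in range(n) for j in range(n) if queues[i][j] > 0),
        reverse=True,
    )
    used_inputs, used_outputs, match = set(), set(), {}
    for _, i, j in edges:
        if i not in used_inputs and j not in used_outputs:
            match[i] = j
            used_inputs.add(i)
            used_outputs.add(j)
    return match

# Hypothetical 2x2 state: input 0 holds 5 cells for output 0 and 6 for output 1,
# input 1 holds 5 cells for output 1 only.
Q = [[5, 6],
     [0, 5]]
match = greedy_maximal_matching(Q)
```

Here the greedy pass connects input 0 to output 1 (weight 6) and can add nothing more, while the maximum weight match for the same state connects both inputs for a weight of 10. This is the kind of gap between a maximal match and MWM that the two figures illustrate.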


Maximal Matching

A maximal matching achieves 100% throughput with speed-up S≥2 under any admissible traffic pattern [Leonardi, ToN 2001]
  100% throughput if, with probability 1,
    $\lim_{n \to \infty} Q_n / n = \lim_{n \to \infty} (1/n) \sum_{i=0}^{n} (A_i - D_i) = 0$
A maximal matching algorithm is rate stable with speed-up S≥2 [Dai, Infocom 2000]

References
  E. Leonardi, M. Mellia, F. Neri, M. Ajmone Marsan, “On the stability of Input-Queued Switches with speed-up,” IEEE/ACM Transactions on Networking, Vol. 9, No. 1, pp. 104-118, February 2001.
  J.G. Dai and B. Prabhakar, “The throughput of data switches with and without speedup,” INFOCOM 2000, pp. 556-564.


Multiple Iterative Matching

Use multiple iterations to converge on a maximal matching
  Parallel Iterative Matching (PIM)
  iSLIP and DRRM
    Complexity of each iteration is O(log N)
    O(log N) iterations are needed to converge on a maximal matching (iSLIP)
    100% throughput only under uniform traffic


iSLIP

Step 1: Request
  Each input sends a request to every output for which it has a queued cell.
Step 2: Grant
  If an output receives multiple requests, it chooses the one that appears next in a fixed round-robin schedule. The output arbiter pointer is incremented by one location beyond the granted input if, and only if, the grant is accepted in step 3.
Step 3: Accept
  If an input receives multiple grants, it accepts the one that appears next in a fixed round-robin schedule. The input arbiter pointer is incremented by one location beyond the accepted output.

[Figure: Request, Grant, and Accept phases between the inputs and the outputs.]
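
The three steps can be sketched as a single request-grant-accept iteration. This is an illustrative software model under stated assumptions, not the hardware arbiter design: `requests[i][j]` is assumed to be true when input i has a queued cell for output j, and the pointer lists `grant_ptr` and `accept_ptr` are hypothetical names that the function updates in place.

```python
def islip_iteration(requests, grant_ptr, accept_ptr):
    """Run one iSLIP iteration and return the match as {input: output}."""
    n = len(requests)
    # Step 1 (Request) is implicit in the requests matrix.
    # Step 2 (Grant): each output grants the first requesting input found
    # at or after its round-robin pointer.
    grants = {}  # input -> list of outputs that granted it
    for j in range(n):
        for k in range(n):
            i = (grant_ptr[j] + k) % n
            if requests[i][j]:
                grants.setdefault(i, []).append(j)
                break
    # Step 3 (Accept): each input accepts the first granting output found
    # at or after its round-robin pointer.
    match = {}
    for i, outputs in grants.items():
        for k in range(n):
            j = (accept_ptr[i] + k) % n
            if j in outputs:
                match[i] = j
                accept_ptr[i] = (j + 1) % n  # one beyond the accepted output
                grant_ptr[j] = (i + 1) % n   # moves only because the grant was accepted
                break
    return match

# Hypothetical 2x2 switch in which every VOQ is backlogged.
requests = [[True, True], [True, True]]
grant_ptr, accept_ptr = [0, 0], [0, 0]
first = islip_iteration(requests, grant_ptr, accept_ptr)
second = islip_iteration(requests, grant_ptr, accept_ptr)
```

In the first slot both outputs grant input 0, so only one connection is made. Because only the accepted grant advances its pointer, the output pointers differ in the second slot and both inputs are matched.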


Achieving 100% Throughput without Speedup

Matching algorithms using memory
Polling system based matching


Low Complexity Algorithms with 100% Throughput

Algorithms with memory
  Use the previous schedule as a candidate
  References
    L. Tassiulas, “Linear complexity algorithms for maximum throughput in radio networks and input queued switches,” IEEE INFOCOM 1998, vol. 2, New York, 1998, pp. 533-539.
    P. Giaccone, B. Prabhakar, D. Shah, “Toward simple, high-performance schedulers for high-aggregate bandwidth switches,” IEEE INFOCOM 2002, New York, 2002.
Polling system based matching algorithms
  Improve the efficiency by using exhaustive service
  References
    Y. Li, S. Panwar, H. J. Chao, “Exhaustive service matching algorithms for input queued switches,” 2004 Workshop on High Performance Switching and Routing (HPSR 2004), April 2004.
    Y. Li, S. Panwar, H. J. Chao, “Performance Analysis of a Dual Round Robin Matching Switch with Exhaustive Service,” IEEE GLOBECOM 2002.


Matching Algorithms with Memory

The queue length of each VOQ does not change much during successive time slots
  In each time slot,
    At most one cell arrives at each input
    At most one cell departs from each input
It is likely that a busy connection will continue to be busy over a few time slots, if the queue length is used as the weight of a connection
Use the match in the previous time slot as a candidate for the new match
Important results:
  Randomized algorithm with memory [Tassiulas 98]
  Derandomized algorithm with memory [Giaccone 02]
  With higher complexity: APSARA, LAURA, SERENA [Giaccone 02]


Notations

For an N×N switch, there are N! possible matches
$Q(t) = [q_{ij}]_{N \times N}$, where $q_{ij}$ is the queue length of $VOQ_{ij}$
M(t): a match at time t
The weight of M(t): $W(t) = \langle M(t), Q(t) \rangle$
  the sum of the lengths of all matched VOQs
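
The notation maps directly onto a few lines of code. In this sketch a match is represented as a tuple in which `match[i]` is the output connected to input i; that representation is an assumption chosen for compactness, equivalent to a 0/1 permutation matrix M(t).

```python
from math import factorial

def match_weight(match, queues):
    """W = <M, Q>: the sum of the lengths of all matched VOQs."""
    return sum(queues[i][j] for i, j in enumerate(match))

# Hypothetical 2x2 queue state Q(t).
Q = [[3, 1],
     [0, 4]]
straight = match_weight((0, 1), Q)  # input 0 -> output 0, input 1 -> output 1
crossed = match_weight((1, 0), Q)   # input 0 -> output 1, input 1 -> output 0
num_matches = factorial(len(Q))     # N! possible full matches
```

For this state the two possible matches have weights 7 and 1, and the count of full matches is 2! = 2.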


Randomized algorithm with memory
  Let S(t) be the schedule used at time t
  At time t+1, uniformly select a match R(t+1) at random from the set of all N! possible matches
  Let $S(t+1) = \arg\max_{S \in \{S(t), R(t+1)\}} \langle S, Q(t+1) \rangle$
Stable under any Bernoulli i.i.d. admissible arrival traffic
Very simple to implement, complexity O(log N)
Delay performance is very poor
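
One step of this update rule can be sketched as follows: draw a uniformly random permutation, then keep whichever of the previous schedule and the random one is heavier under the current queue lengths. The helper names are hypothetical, and the sketch models only the selection rule, not the O(log N) hardware implementation.

```python
import random

def weight(match, queues):
    """<S, Q>: total length of the VOQs served by match (match[i] = output of input i)."""
    return sum(queues[i][j] for i, j in enumerate(match))

def next_schedule(previous, queues, rng):
    """Return S(t+1), the heavier of S(t) and a uniformly random match R(t+1)."""
    outputs = list(range(len(queues)))
    rng.shuffle(outputs)  # uniform over all N! permutations
    candidate = tuple(outputs)
    # On a tie, max() keeps the first argument, i.e. the previous schedule.
    return max((previous, candidate), key=lambda m: weight(m, queues))

# Hypothetical state: start from the identity match on a 3x3 switch.
rng = random.Random(0)
Q = [[0, 9, 0],
     [9, 0, 0],
     [0, 0, 1]]
schedule = (0, 1, 2)
for _ in range(20):
    schedule = next_schedule(schedule, Q, rng)
```

Because the previous schedule is always one of the two candidates, the weight of the schedule never decreases while the queue state is held fixed. This is the "memory" that the stability result relies on.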


Derandomized Algorithm with Memory

Hamiltonian walk
  A walk which visits every vertex of a graph exactly once.
  In an N×N switch,
    N! vertices (possible schedules); a Hamiltonian walk visits each vertex once every N! time slots
    H(t): the value of the vertex which is visited at time t
    The complexity of generating H(t+1) when H(t) is known is O(1)
Derandomized algorithm with memory
  Use the match generated by the Hamiltonian walk instead of the random match
  Similar performance to the randomized algorithm
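
The derandomized rule can be sketched by replacing the random match with a deterministic enumeration of all N! matches. The enumeration below uses `itertools.permutations` wrapped in `cycle` as a stand-in for the Hamiltonian walk: it visits every match once per N! slots, but it stores all N! matches and is not the O(1)-per-step generator the slide refers to, so it is suitable only for very small N. All names are hypothetical.

```python
from itertools import cycle, permutations

def weight(match, queues):
    """<S, Q>: total length of the VOQs served by match (match[i] = output of input i)."""
    return sum(queues[i][j] for i, j in enumerate(match))

def derandomized_schedules(n, queue_states):
    """Yield S(t+1) for each successive queue matrix Q(t+1) in queue_states."""
    walk = cycle(permutations(range(n)))  # stand-in for the Hamiltonian walk H(t)
    current = next(walk)                  # S(0): the first vertex of the walk
    for queues in queue_states:
        candidate = next(walk)            # H(t+1)
        # Keep the heavier of S(t) and H(t+1); ties favor the previous schedule.
        current = max((current, candidate), key=lambda m: weight(m, queues))
        yield current

# Hypothetical 2x2 state held fixed for three slots.
Q = [[0, 5],
     [5, 0]]
schedules = list(derandomized_schedules(2, [Q, Q, Q]))
```

Starting from the identity match, the walk offers the crossed match in the first slot; it is heavier, so it is adopted and then retained in the following slots.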


Compared to MWM …

Simple matching algorithms can achieve stability, as MWM does
Not necessary to find “the best match” in each time slot to achieve 100% throughput
MWM has much better delay performance than randomized and derandomized matching
  “Better” matches lead to better delay performance


With Higher Complexity and Lower Delay

Introduce higher complexity for much lower delay than the randomized and derandomized algorithms
APSARA
  Include the neighbors of the latest match as candidates
LAURA
  Merge the latest match with a random match to remember the heavy edges
SERENA
  Merge the latest match with the arrival figure
    Figure: generated from the current arrival pattern
  Complexity O(N)


Polling System Based Matching

Exhaustive Service Matching
  Inspired by exhaustive service polling systems
  All the cells in the corresponding VOQ are served after an input and an output are matched
  Slot times wasted to achieve an input-output match are amortized over all the cells waiting in the VOQ instead of only one
  Cells within the same packet are transferred continuously
Hamiltonian walk is used to guarantee stability


Exhaustive Service Matching with Hamiltonian Walk (EMHW)

EMHW
  Let S(t) be the match at time t.
  At time t+1, generate match Z(t+1) by the Exhaustive Service Matching algorithm based on S(t), and H(t+1) by Hamiltonian walk
  Let $S(t+1) = \arg\max_{S \in \{Z(t+1), H(t+1)\}} \langle S, Q(t+1) \rangle$,
  where <S,Q(t+1)> is the weight of S at time t+1.
Stable under any admissible traffic
Analyzed by an exhaustive service polling system
Implementation complexity
  HE-iSLIP: O(log N)


E-iSLIP Average Delay Analysis

Exhaustive random polling system model
  Symmetric system -- only consider one input
  N VOQs per input, exhaustive service policy -- an exhaustive service polling system with N stations
  The service order of the VOQs is not fixed -- random polling system; assume all VOQs have the same probability of selection for service after a VOQ is served
Switch-over time S
  [Equations: $E(S)$ and $E(S^2)$ expressed through a quantity $Q$, which is a sum over $m$ involving $\omega_m$, $N$, and $\rho$]
Average delay T [Levy and Kleinrock]
  [Equation: $E(T)$ in terms of $N$, $r$, $\delta^2$, $\mu$, and $\sigma^2$, where $r = E(S)$ and $\delta^2 = Var(S) = E(S^2) - E(S)^2$]


Delay Performance of HE-iSLIP

Packet delay: the sum of cell delay and reassembly delay
Cell delay: measured from VOQ to destination output
Reassembly delay: time spent in an ORM, often ignored in other work

[Figure: VOQ switch architecture. Each of Inputs 1-4 passes through an ISM into VOQs 1..N, then through the switch fabric to an ORM at each of Outputs 1-4.]


Performance Summary

Scheme        Complexity  Stable  Packet delay performance
iSLIP         O(log N)    No      Always higher than HE-iSLIP.
HE-iSLIP      O(log N)    Yes     Lowest when packet size is larger than 1 cell.
Derandomized  O(log N)    Yes     Highest for all traffic patterns.
SERENA        O(N)        Yes     Lower than HE-iSLIP only under nonuniform diagonal traffic.
MWM           O(N^3)      Yes     Lowest when packet size is 1 cell.


Packet Delay under Uniform Traffic

Pattern 1: packet size is 1 cell.
[Figure: packet delay curves for MWM, HE-iSLIP, SERENA, and iSLIP.]


Packet Delay under Uniform Traffic

Pattern 2: packet length is 10 cells
Pattern 3: packet length is variable, the average is 10 cells (Internet packet size distribution)
[Figure: two panels of packet delay curves, each comparing MWM, HE-iSLIP, SERENA, and iSLIP.]


When packet length is larger than 1 cell

Why does HE-iSLIP have a lower packet delay than MWM?
For example, when packet length is 10 cells:
[Figure: two panels, cell delay and reassembly delay, each comparing HE-iSLIP and MWM.]
Low cell delay + low reassembly delay needed for low packet delay
Open Problem: Which scheduler minimizes packet delay?


Packet-Based Scheduling

Packet-based scheduling algorithm
  Once it starts transmitting the first cell of a packet to an output port, it continues the transmission until the whole packet is completely received at the corresponding output port
Packet-based MWM is stable for any admissible Bernoulli i.i.d. traffic
  Lyapunov function: M. A. Marsan, A. Bianco, P. Giaccone, E. Leonardi, and F. Neri, “Packet Scheduling in Input-Queued Cell-Based Switches,” INFOCOM 2001, pp. 1085-1094.
Packet-based MWM is stable under regenerative admissible input traffic
  Fluid model: Y. Ganjali, A. Keshavarzian, D. Shah, “Input Queued Switches: Cell switching v/s Packet switching,” Proceedings of Infocom, 2003.
  Regenerative: let T be the time between two successive occurrences of the event that all ports are free, with E(T) being finite
Modified waiting PB-MWM algorithm is stable under any admissible traffic


Buffered Crossbar Switch

Distributed arbitration for inputs and outputs
  From each input, one cell can be sent to a crosspoint buffer if it has space
  One cell can be sent to an output if at least one crosspoint buffer to that output is nonempty
One buffer for each crosspoint

References
  Y. Doi and N. Yamanaka, “A High-Speed ATM Switch with Input and Cross-Point Buffers,” IEICE Trans. Commun., Vol. E76, No. 3, pp. 310-314, March 1993.
  R. Rojas-Cessa, E. Oki, Z. Jing, and H. J. Chao, “CIXB-1: Combined Input-One-Cell-Crosspoint Buffered Switch,” Proceedings of IEEE Workshop of High Performance Switches and Routers 2001.


Birkhoff-von Neumann Switch

When traffic matrix is known
  Birkhoff-von Neumann decomposition

Reference
  Cheng-Shang Chang, Wen-Jyh Chen and Hsiang-Yi Huang, "On service guarantees for input buffered crossbar switches: a capacity decomposition approach by Birkhoff and von Neumann," IEEE IWQoS'99, pp. 79-86, London, U.K., 1999.


Birkhoff-von Neumann Switch

Example
[Figure: Birkhoff-von Neumann switch example.]
High complexity, impractical
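
A Birkhoff-von Neumann decomposition writes a doubly stochastic rate matrix as a weighted sum of permutation matrices; each permutation is a switch configuration and its weight is the fraction of time slots it should be used. The sketch below is illustrative only. It assumes the input is already doubly stochastic (every row and column sums to 1), uses exact `Fraction` arithmetic, and finds each permutation by brute force, so it is practical only for very small N. The function name and the example matrix are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def bvn_decompose(matrix):
    """Return a list of (coefficient, permutation) terms that sum to matrix."""
    n = len(matrix)
    rest = [[Fraction(x) for x in row] for row in matrix]
    terms = []
    while any(x > 0 for row in rest for x in row):
        # Birkhoff's theorem guarantees that a permutation supported on the
        # positive entries exists while the remainder has equal line sums.
        perm = next(p for p in permutations(range(n))
                    if all(rest[i][p[i]] > 0 for i in range(n)))
        coeff = min(rest[i][perm[i]] for i in range(n))
        for i in range(n):
            rest[i][perm[i]] -= coeff  # at least one entry drops to zero
        terms.append((coeff, perm))
    return terms

# Hypothetical doubly stochastic 3x3 rate matrix.
half, quarter = Fraction(1, 2), Fraction(1, 4)
rates = [[half, quarter, quarter],
         [quarter, half, quarter],
         [quarter, quarter, half]]
terms = bvn_decompose(rates)
```

For this matrix the result has three terms: the identity configuration with weight 1/2 and two cyclic shifts with weight 1/4 each. A scheduler could then serve the three configurations in those proportions.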


Load-Balanced Switch

[Figure: two-stage switch with a load-balancing stage followed by a switching stage; ports 1, ..., k, ..., N.]

Load-balanced switch
  Convert the traffic to uniform, then fixed switching
  100% throughput for broad class of traffic
  No centralized scheduler needed, scalable
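
The load-balancing idea can be illustrated with a fixed, periodic connection pattern in the first stage. The sketch below assumes that in slot t input i is connected to intermediate port (i + t) mod N; this round-robin pattern is an assumption chosen for illustration, since the slide only states that the traffic is converted to uniform. The function names are hypothetical.

```python
def first_stage_port(inp, t, n):
    """Intermediate port reached by input inp in slot t (assumed round-robin pattern)."""
    return (inp + t) % n

def spread(arrivals, n):
    """Count cells per (intermediate port, output) for arrivals given as (t, input, output)."""
    counts = [[0] * n for _ in range(n)]
    for t, inp, out in arrivals:
        counts[first_stage_port(inp, t, n)][out] += 1
    return counts

# Hypothetical hot-spot traffic: input 0 sends a cell to output 3 in every slot.
N = 4
arrivals = [(t, 0, 3) for t in range(8)]
counts = spread(arrivals, N)
```

Although all eight cells come from one input and target one output, each of the four intermediate ports ends up holding two of them, so the second stage sees an evenly spread load without any scheduler.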


Original Work on LB Switch

Stability: the load-balanced switch is stable
Delay: burst reduction
Problem: unbounded out-of-sequence delays

Reference
  C.-S. Chang, D.-S. Lee and Y.-S. Jou, “Load balanced Birkhoff-von Neumann switches, Part I: one-stage buffering,” Computer Comm., Vol. 25, pp. 611-622, 2002.


LB Switch Variants

Solve the out-of-sequence problem
  FCFS (first come first served)
    Jitter control mechanism
    Increases the average delay
  EDF (earliest deadline first)
    Reduces the average delay
    High complexity
  Mailbox switch
    Prevents packets from being out-of-sequence
    Not 100% throughput

References
  C.-S. Chang, D.-S. Lee and C.-M. Lien, “Load balanced Birkhoff-von Neumann switches, Part II: multi-stage buffering,” Computer Comm., Vol. 25, pp. 623-634, 2002.
  C.-S. Chang, D. Lee, and Y. J. Shih, “Mailbox switch: A scalable two-stage switch architecture for conflict resolution of ordered packets,” in Proceedings of IEEE INFOCOM, Hong Kong, March 2004.


More LB Switch Variants

FFF (Full frames first) (Infocom 2002, McKeown)
  Frame-based
  No need for resequencing
  Requires multi-stage buffer communication: high complexity
FOFF (Full ordered frames first) (Sigcomm 2003, McKeown)
  Frame-based
  Maximum resequencing delay N^2
  Bandwidth wastage

References
  I. Keslassy and N. McKeown, “Maintaining packet order in two-stage switches,” Proc. of the IEEE Infocom, June 2002.
  I. Keslassy, S.-T. Chuang, K. Yu, D. Miller, M. Horowitz, O. Solgaard and N. McKeown, “Scaling Internet routers using optics,” ACM SIGCOMM ’03, Karlsruhe, Germany, Aug. 2003.


Byte-Focal Switch Architecture

[Figure: Byte-Focal switch architecture, showing arrivals, input VOQs, the 1st stage switch fabric, second-stage VOQs, the 2nd stage switch fabric, and re-sequencing buffers. VOQs are indexed (1,1) through (N,N) in both stages.]


Byte-Focal Switch

Packet-by-packet scheduling
  Improves the average delay performance
The maximum resequencing delay is N^2
The time complexity of the resequencing buffer is O(1)
Does not need communications between linecards

References
  Y. Shen, S. Jiang, S. S. Panwar, H. J. Chao, “Byte-Focal: a practical load-balanced switch,” HPSR 2005, Hong Kong.


Multi-Stage Switches

Single-stage switches (e.g., cross-point switch)
  Single path between each input-output pair
    No packets out-of-sequence
    Easy to design
    Lack of scalability
  Cannot meet the increasing demands of Internet traffic
Multi-stage switches (e.g., Clos-network switch)
  Multiple paths between each input-output pair
    Better tradeoff between the switch performance and complexity
    Highly scalable and fault tolerant
  Memory-less multi-stage switches
    No packets out-of-sequence, may encounter internal blocking
  Buffered multi-stage switches
    Packets may be out-of-sequence, easy scheduling


Multi-Stage Architecture


Trueway: A Multi-Plane Multi-Stage Switch

[Figure: Trueway switch. Modules TMI(0) to TMI(N-1) and TME(0) to TME(N-1) attach to p switching planes, Plane(0) to Plane(p-1). Each plane has three stages: input modules IM(0) to IM(k-1) of size n x m, center modules CM(0) to CM(m-1) of size k x k, and output modules OM(0) to OM(k-1) of size m x n.]


Trueway Switch
  The switch fabric consists of multiple switching planes, with each being a three-stage Clos network with m center modules
  Each input/output pair has multiple routing paths
  Highly scalable

[Figure: n x n module with cross-point buffered memory.]


Challenges in Multi-Stage Switching

How to efficiently allocate and share the limited on-chip memory?
How to schedule packets on multiple paths to maximize memory utilization and system performance?
How to minimize link congestion and prevent buffer overflow (i.e., stage-to-stage flow control)?
How to maintain cell/packet order if they are delivered over multiple paths (i.e., port-to-port flow control)?
How to achieve 100% throughput?


Conclusion

Introduced switch architecture trends
Many open research problems
Bottleneck keeps changing!