
Packet Switching Eng 3pp

Apr 14, 2018

  • 7/30/2019 Packet Switching Eng 3pp

    1/55

    1

    1

    Scheduling algorithms for input-queued IP routers

    Andrea Bianco, Paolo Giaccone

    Gruppo Reti di Telecomunicazioni, Dipartimento di Elettronica

    Politecnico di Torino, http://www.tlc-networks.polito.it

    Switching Architectures 2011/12

    2

    Outline

    IP routers

    OQ routers

    IQ routers

    Scheduling

    Optimal algorithms

    Heuristic algorithms

    Packet-mode algorithms

    Networks of routers

    QoS support

    CIOQ routers

    Multicast traffic

    Conclusions

    Note

    The slides marked RWP are reproduced with permission of Prof. Nick McKeown from the Electrical Engineering and Computer Science Dept. of Stanford University (CA, USA)


    The Internet is a mesh of routers

    [Figure: a mesh of core routers, access routers and enterprise routers]

    Access router:

    high number of ports at low speed (kbps/Mbps)

    several access protocols (modem, ADSL, cable)

    Enterprise router:

    medium number of ports at high speed (Mbps)

    several services (IP classification, filtering)

    Core router:

    low number of ports at very high speed (Mbps/Gbps)

    very high throughput

    few services


    Basic architecture

    [Figure: control plane (routing protocols, routing table) above the datapath for per-packet processing (forwarding table, switching)]

    Basic functions

    Routing

    computation of the output port of an incoming packet (forwarding)

    uses the routing tables computed by the routing protocols

    can be a complex procedure:

    very large routing tables

    dynamic variation of routes in the Internet

    Basic functions

    Switching

    transfer of packets from input ports to output ports

    resolution of contentions for output ports

    queueing methods: where to store

    scheduling methods: what to transfer

    Faster and faster

    Need for high-performance routers

    to accommodate the bandwidth demands for new users and new services

    to support QoS (over-provisioning)

    to reduce costs with respect to a cloud of smaller routers (maybe)

    a smaller number of fibers is needed

    a smaller number of devices (but is it less costly?)

    may be more energy hungry

    to ease the management task

    Packet processing and link speed

    [Figure: two log-scale plots, 1985 to 2000. Packet processing power (SPEC95Int CPU results): Moore's law, 2x / 18 months. Link speed (fiber capacity in Gbit/s, TDM and DWDM): 2x / 7 months.]

    Source: SPEC95Int & David Miller, Stanford.

    RWP

    Increase of electronic packet processing power cannot accommodate the increase in link speed

    Memory access time

    [Figure: log-scale plot of memory access time (ns), 1980 to 2001: 1.1x / 18 months, compared with Moore's law at 2x / 18 months]

    RWP

    Moore's law

    It's hard to keep up with Moore's law:

    the bottleneck is memory speed

    Moore's law is too slow:

    routers need to improve faster than Moore's law

    RWP

    Router capacity exceeds Moore's law

    Growth in capacity of commercial routers:

    1992 ~ 2 Gb/s

    1995 ~ 10 Gb/s

    1998 ~ 40 Gb/s

    2001 ~ 160 Gb/s

    2003 ~ 640 Gb/s

    Average growth rate: 2.2x / 18 months

    RWP
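As a sanity check of the quoted average, the growth factor per 18 months can be recomputed from the first and last data points of the list. This is only a minimal sketch: the function name `growth_per_18_months` is illustrative and does not come from the slides, and it assumes a constant geometric growth between 1992 and 2003.

```python
# Capacity figures quoted on the slide (year -> Gb/s).
capacity_gbps = {1992: 2, 1995: 10, 1998: 40, 2001: 160, 2003: 640}


def growth_per_18_months(c_start, c_end, years):
    """Geometric growth factor per 18-month period (hypothetical helper)."""
    periods = years * 12 / 18
    return (c_end / c_start) ** (1 / periods)


rate = growth_per_18_months(capacity_gbps[1992], capacity_gbps[2003], 2003 - 1992)
print(f"{rate:.2f}x / 18 months")
```

The result is close to the 2.2x figure on the slide, slightly above Moore's law at 2x per 18 months.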

    Single packet processing

    The time to process one packet is becoming shorter and shorter

    worst case: 40-byte packets (ACKs) travelling over the Internet

    3.2 µs at 100 Mbps

    320 ns at 1 Gbps

    32 ns at 10 Gbps

    3.2 ns at 100 Gbps

    320 ps at 1 Tbps
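The figures above follow from dividing the packet size in bits by the line rate. A minimal sketch, assuming the processing budget equals the transmission time of one 40-byte packet (the helper name `packet_time_s` is illustrative):

```python
PACKET_BITS = 40 * 8  # a 40-byte packet, e.g. a TCP ACK


def packet_time_s(rate_bps):
    """Time available to process one minimum-size packet at the given line rate."""
    return PACKET_BITS / rate_bps


for label, rate in [("100 Mbps", 100e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9),
                    ("100 Gbps", 100e9), ("1 Tbps", 1e12)]:
    print(f"{label:>8}: {packet_time_s(rate) * 1e9:.2f} ns")
```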

    Hardware architecture

    [Figure: physical structure (line cards LC and control processor CP attached to the switching fabric SF) and logical structure (input ports IP, output ports OP and CP attached to the switching fabric SF)]

    Hardware architecture

    Main elements

    line cards

    support input/output transmissions

    adapt packets to the internal format of the switching fabric

    support data link protocols

    in most architectures:

    store packets

    classify packets

    schedule packets

    support security

    switching fabric

    transfers packets from input ports to output ports

    control processor/network processor

    runs routing protocols

    computes and stores routing tables

    manages the overall system

    sometimes:

    store packets

    classify packets

    schedule packets

    support security

    forwarding engines

    inspect packet headers

    compute the packet destination (lookup)

    by searching routing or forwarding (caching) tables

    rewrite packet headers

    Interconnections among main elements - I

    [Figure: line cards 1..N attached to the switching fabric, with a single shared control processor & forwarding engine]

    Interconnections among main elements - II

    [Figure: line cards 1..N attached to the switching fabric, with a control processor and separate forwarding engines]

    Interconnections among main elements - II

    [Figure: N units, each combining a line card & forwarding engine, attached to the switching fabric, with a separate control processor]

    Cell-based routers

    ISM: Input-Segmentation Module

    ORM: Output-Reassembly Module

    packet: variable-size data unit

    cell: fixed-size data unit

    [Figure: packets enter ISM 1..N, are carried as cells through the cell switch (fabric), and are rebuilt into packets by ORM 1..N]

    Switching fabric

    Our assumptions:

    bufferless

    to reduce internal hardware complexity

    non-blocking

    given a non-conflicting set of inputs/outputs, it is always possible to connect inputs with outputs

    Switching fabric

    Examples:

    bus

    shared memory

    crossbar

    multi-stage

    rearrangeable Clos network

    Benes network

    Batcher-Banyan network (self-routing)

    Switching constraints

    at most one cell for each input and for each output can be transferred

    [Figure: 4x4 crossbar with inputs 1 to 4 and outputs 1 to 4]

    Switching fabric

    We do not discuss switching fabrics with internal buffers

    e.g.: crossbars with a buffer at each crosspoint

    Generic switching architecture

    [Figure: inputs 1..N with input queues read at speed Sin, the switching fabric, and output queues written at speed Sout towards outputs 1..N]

    Speedup

    The speedup limits the switch performance

    Sin = reading speed from input queues

    Sout = writing speed to output queues

    The main performance limit can be due to the maximum speedup factor:

    S = max(Sin, Sout)

    Performance comparison

    The performance of different switching systems can be studied

    with analytical models

    introducing simplifying assumptions, but obtaining general results

    with simulation models

    obtaining more detailed results

    Traffic description

    $A_{ij}(n) = 1$ if a packet arrives at time $n$ at input $i$, with destination reachable through output $j$

    $\lambda_{ij} = E[A_{ij}(n)]$

    An arrival process is admissible if:

    $\sum_i \lambda_{ij} < 1$

    $\sum_j \lambda_{ij} < 1$

    that is, no input and no output are overloaded on average

    note that OQ switches exhibit finite delays only for admissible traffic

    traffic matrix: $\Lambda = [\lambda_{ij}]$
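The admissibility condition translates directly into a check on the row and column sums of the traffic matrix. The following is a minimal sketch, assuming the matrix is given as a list of rows (inputs) with one entry per output; the function name `is_admissible` is illustrative.

```python
def is_admissible(lam):
    """True if no input (row) and no output (column) is overloaded on average."""
    n = len(lam)
    rows_ok = all(sum(row) < 1 for row in lam)
    cols_ok = all(sum(lam[i][j] for i in range(n)) < 1 for j in range(n))
    return rows_ok and cols_ok


# A 4x4 matrix with every entry equal to 0.9 / 4: each port carries load 0.9.
balanced = [[0.9 / 4] * 4 for _ in range(4)]
# A 2x2 matrix whose first input receives 0.6 + 0.5 = 1.1 packets per slot.
overloaded = [[0.6, 0.5], [0.1, 0.1]]
print(is_admissible(balanced), is_admissible(overloaded))
```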

    30

    Traffic scenarios

    Uniform traffic

    Bernoulli i.i.d. arrivals

    usual testbed in the literature

    easy to schedule

    $\Lambda = \frac{\rho}{N} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}$

    Diagonal traffic

    Bernoulli i.i.d. arrivals

    critical to schedule, since only two matchings are good

    $\Lambda = \frac{\rho}{3} \begin{bmatrix} 2 & 0 & 0 & 1 \\ 1 & 2 & 0 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 1 & 2 \end{bmatrix}$

    ($\rho$: normalized load of each port)

    Traffic scenarios

    LogDiagonal traffic

    Bernoulli i.i.d. arrivals

    more critical than uniform, less than diagonal traffic

    $\Lambda = \frac{\rho}{2^N - 1} \begin{bmatrix} 8 & 1 & 2 & 4 \\ 4 & 8 & 1 & 2 \\ 2 & 4 & 8 & 1 \\ 1 & 2 & 4 & 8 \end{bmatrix}$

    Output Queued (OQ) switches

    Sin = 1 Sout = N

    used for low bandwidth routers

    no coordination among ports

    work-conserving

    best average delays

    complete control of delays

    support of QoS scheduling

    Output Queued (OQ) switch

    [Figure: inputs 1..N, switching fabric with speedup N, output queues at outputs 1..N]

    OQ performance

    [Figure: delay versus normalized load (0% to 100%) for an OQ switch under uniform traffic]

    Note: OQ is optimal from the point of view of average delay and throughput

    Stability, throughput and delays

    Hp: stationary system, infinite queue

    for a particular $\lambda_{in}$: stable $\Leftrightarrow$ finite occupancy $\Leftrightarrow$ finite delays $\Rightarrow$ $\lambda_{in} = \lambda_{out}$

    100% throughput $\Leftrightarrow$ stable under any admissible $\lambda_{in}$

    [Figure: a queue with input rate $\lambda_{in}$ and output rate $\lambda_{out}$]

    Stability, throughput and delays

    $\lambda_{out} \le \lambda_{max}$

    stable($\lambda_{in}$) $\Rightarrow$ $\lambda_{in} = \lambda_{out}$ $\Rightarrow$ $\lambda_{in} \le \lambda_{max}$

    hence, $\lambda_{max}$ is

    the maximum throughput achievable

    the maximum offered load for stability

    unstable($\lambda_{in}$) $\Leftarrow$ $\lambda_{in} > \lambda_{max}$

    queue grows with rate $\lambda_{in} - \lambda_{max}$

    [Figure: $\lambda_{out}$ versus $\lambda_{in}$, saturating at $\lambda_{max}$; delay versus $\lambda_{in}$, diverging at $\lambda_{max}$]

    Simple Input Queued (IQ) switches

    Sin = 1 Sout = 1

    1 FIFO queue for each input port

    throughput limitations due to head of the line (HOL) blocking

    scheduling to solve contentions for the same output

    [Figure: simple IQ switch with one FIFO queue at each of the inputs 1..N, the switching fabric, and outputs 1..N]

    Head of the Line (HOL) Blocking

    RWP

    Simple IQ switch performance

    [Figure: delay versus normalized load (0% to 100%) under uniform traffic for OQ and simple IQ; the simple IQ switch saturates at $2 - \sqrt{2} \approx 58\%$]

    Single IQ switch

    Using a simple Markov chain model

    2x2 throughput 0.75

    states: (2,0), (1,1)

    3x3 throughput ?

    states: (3,0,0), (2,1,0), (1,1,1)
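The 0.75 figure for the 2x2 case can also be checked numerically. The sketch below is a small Monte Carlo simulation, not the Markov chain analysis of the slide: it assumes saturated inputs (a head-of-line cell is always present), destinations drawn uniformly at random, and contentions resolved by a random choice. The function name and parameters are illustrative.

```python
import random


def saturated_fifo_throughput(n, slots, seed=0):
    """Per-port throughput of an n x n FIFO input-queued switch under saturation."""
    rng = random.Random(seed)
    # hol[i] is the destination output of the head-of-line cell at input i.
    hol = [rng.randrange(n) for _ in range(n)]
    served = 0
    for _ in range(slots):
        # Group the inputs by the output requested by their HOL cell.
        contenders = {}
        for i, dest in enumerate(hol):
            contenders.setdefault(dest, []).append(i)
        # Each requested output serves one contender; the losers stay blocked.
        for inputs in contenders.values():
            winner = rng.choice(inputs)
            hol[winner] = rng.randrange(n)  # next cell in the FIFO reaches the head
            served += 1
    return served / (n * slots)


print(round(saturated_fifo_throughput(2, 200_000), 3))
```

Running the same function with larger `n` gives an estimate for the 3x3 case and beyond, at the cost of simulation noise.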


    Bufferless switch

    uniform i.i.d. Bernoulli arrivals

    input load p

    $\text{Throughput} = 1 - \left(1 - \frac{p}{N}\right)^{N} \to 1 - e^{-1} \approx 0.63$ (for $p = 1$, $N \to \infty$)
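The bufferless throughput is the probability that an output receives at least one cell in a slot when each of the N inputs addresses it independently with probability p/N. A minimal numeric sketch of that expression follows (the function name is illustrative):

```python
import math


def bufferless_throughput(n, p=1.0):
    """Probability that a given output is addressed by at least one of n inputs."""
    return 1.0 - (1.0 - p / n) ** n


for n in (2, 4, 16, 1024):
    print(n, round(bufferless_throughput(n), 4))
print("limit for p = 1:", round(1 - math.exp(-1), 4))
```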

    44

    Improving IQ switches performance

    Window/bypass schedulers

    the first w cells of each queue contend for outputs

    HOL blocking is reduced, not eliminated

    w = 1 means FIFO at each input

    higher complexity

    the scheduler deals with wN cells

    non-FIFO queues

    Improving IQ switches performance

    Maximum throughput in an NxN switch with variable window size w

      N    w=1   w=2   w=3   w=4   w=5   w=6   w=7   w=8
      2   0.75  0.84  0.89  0.92  0.93  0.94  0.95  0.96
      4   0.66  0.76  0.81  0.85  0.87  0.89  0.91  0.92
      8   0.62  0.72  0.78  0.82  0.85  0.87  0.88  0.89
     16   0.60  0.71  0.77  0.81  0.84  0.86  0.87  0.88
     32   0.59  0.70  0.76  0.80  0.83  0.85  0.87  0.88
     64   0.59  0.70  0.76  0.80  0.83  0.85  0.86  0.88
    128   0.59  0.70  0.76  0.80  0.83  0.85  0.86  0.88

    Improving IQ switches performance

    Virtual output queueing (VOQ)

    one queue for each input/output pair

    N queues at each input

    $N^2$ queues in the whole switch

    eliminates HOL blocking

    used in high-bandwidth routers

    scheduling implemented in hardware at very high speed

    IQ switches with VOQ

    [Figure: each input 1..N holds N virtual output queues (one per output); a scheduler configures the switching fabric towards outputs 1..N, subject to input constraints and output constraints]

    Note: from now on, we always assume VOQ at the switch inputs

    Scheduling in IQ switches

    Scheduling can be modeled as a matching problem in a bipartite graph

    the edge from node i to node j refers to packets at input i and directed to output j

    the weight of the edge can be

    binary (not empty/empty queue)

    queue length

    HOL cell waiting time, or cell age

    some other metric indicating the priority of the HOL cell to be served

    [Figure: weighted bipartite graph between inputs 1 to 4 and outputs 1 to 4, and the matching selected by the scheduler]

    Implementing schedulers

    Scheduling is a complex task

    a scheduling algorithm can be implemented in hardware if:

    it shows good performance for a wide range of traffic patterns

    it can be efficiently parallelized

    it can be efficiently pipelined

    it requires few iterations (or clock cycles)

    it requires limited control information

    Scheduling uniform traffic

    A number of algorithms give 100% throughput when traffic is uniform

    For example:

    TDM and a few variants

    iSLIP (see later)

    [Figure: example of TDM for a 4x4 switch]

    RWP

    Scheduling non-uniform traffic

    If the traffic is known and admissible, 100% throughput can be achieved by a TDM using:

    for a fraction of time $a_1$ matching $M_1$

    for a fraction of time $a_2$ matching $M_2$

    ...

    for a fraction of time $a_k$ matching $M_k$

    subject to $\sum_i a_i = 1$

    thanks to the Birkhoff-von Neumann theorem
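One way to obtain the fractions and the matchings is a greedy Birkhoff-von Neumann decomposition. The sketch below is illustrative rather than the method used by the authors: it assumes the traffic matrix has been scaled to non-negative integers and padded so that every row and column has the same sum, and it finds each matching by brute force over all permutations, which is only viable for small N.

```python
from itertools import permutations


def bvn_decompose(m):
    """Split an integer matrix with equal row/column sums into weighted matchings.

    Returns a list of (fraction_of_time, matching) pairs, where matching[i]
    is the output connected to input i.
    """
    n = len(m)
    m = [row[:] for row in m]  # work on a copy
    total = sum(m[0])          # common row/column sum
    schedule = []
    while any(any(row) for row in m):
        # Birkhoff's theorem guarantees a perfect matching on the positive entries.
        perm = next(p for p in permutations(range(n))
                    if all(m[i][p[i]] > 0 for i in range(n)))
        slots = min(m[i][perm[i]] for i in range(n))
        for i in range(n):
            m[i][perm[i]] -= slots
        schedule.append((slots / total, perm))
    return schedule


# Diagonal traffic pattern from the earlier slide, scaled by 3.
diagonal = [[2, 0, 0, 1], [1, 2, 0, 0], [0, 1, 2, 0], [0, 0, 1, 2]]
for fraction, matching in bvn_decompose(diagonal):
    print(round(fraction, 3), matching)
```

For the diagonal pattern the decomposition uses only two matchings, in line with the remark that only two matchings are good for that scenario.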


    Maximum Weight Matching

    Maximum Weight Matching (MWM)

    among all the possible N! matchings, selects the one with the highest weight (sum of the edge metrics)

    MWM is generally not unique

    MWM is too complex to be implemented in hardware at high speed

    the best MWM algorithm requires $O(N^3)$ iterations, and cannot be implemented efficiently, since it is based on a flow augmentation path algorithm

    cannot be parallelized and pipelined efficiently

    MWM has never been implemented in a commercial chipset
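To make the definition concrete, the sketch below enumerates all N! matchings and keeps the heaviest one. It is a brute-force illustration of what MWM selects, not the $O(N^3)$ algorithm mentioned above, and the function name is illustrative.

```python
from itertools import permutations


def mwm_bruteforce(w):
    """Return (matching, weight), where matching[i] is the output given to input i.

    w[i][j] is the edge weight of VOQ (i, j), e.g. its queue length.
    """
    n = len(w)

    def weight(p):
        return sum(w[i][p[i]] for i in range(n))

    best = max(permutations(range(n)), key=weight)
    return best, weight(best)


queue_lengths = [[5, 0, 0],
                 [4, 0, 3],
                 [0, 2, 0]]
print(mwm_bruteforce(queue_lengths))
```

When several matchings share the maximum weight, `max` returns the first one in enumeration order, which reflects the fact that MWM is generally not unique.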

    56

    Maximum Weight Matching

    MWM is the optimal solution of the scheduling problem when the traffic is unknown, when the weight is either the queue length or the cell age

    achieves 100% throughput under any traffic

    also under non-Bernoulli arrival processes, satisfying the law of large numbers

    achieves low average delays, very close to those of OQ switches

    possible starvation for lightly loaded packet flows

    MWM with pipeline and latency

    Let T and P be fixed

    Dt denotes the matching used at time t

    The following variations of MWM also achieve 100% throughput:

    $D_t = MWM(t - P)$: MWM with pipeline degree P

    $D_t = MWM(\lceil t/T \rceil T)$: MWM with latency T

    combinations of both

    thus, it seems easy to achieve 100% throughput!

    MWM with pipeline and latency

    But: What about throughput?

    100% throughput

    but needs the computation of a MWM

    What about delays?

    delays can be really bad!

    General consideration

    When scheduling in IQ switches, it is very difficult to achieve simultaneously

    high throughput

    low delay

    limited implementation complexity

    Maximum Size Matching

    Maximum Size Matching (MSM)

    among all the possible matchings, selects the one

    with the highest number of edges (like MWM with

    binary edge weights)

    MSM is generally not unique

    the best MSM algorithm requires $O(N^{2.5})$ iterations, and cannot be implemented efficiently, since it is based on a flow augmentation path algorithm

    Maximum Size Matching

    MSM maximizes the instantaneous throughput

    MSM may not yield 100% throughput

    short-term decisions can be inefficient in the long term

    non-binary edge weights allow MWM to maximize the long-term throughput

    Instability of MSM

    Assume:

    P(arrival at Q12) = $\lambda$

    P(arrival at Q11) = P(arrival at Q22) = $1 - \lambda - \delta$

    Q12 = B $\gg$ 0, Q11 = Q22 = 0

    in case of parity, serve Q11 and/or Q22 instead of Q12

    Observe:

    Q12 is served only when A11 = 0 and A22 = 0, i.e. with probability:

    P(serve Q12) = P(no arrivals at both Q11 and Q22) = $[1 - (1 - \lambda - \delta)]^2 = (\lambda + \delta)^2$

    P(serve Q12) < P(arrival at Q12) if $\delta$ is small enough

    Example: $\lambda = 0.5$; $\delta = 0.1$; P(serve Q12) = 0.36

    [Figure: 2x2 switch (In1, In2, Out1, Out2) with arrival rate $\lambda$ on Q12 and $1 - \lambda - \delta$ on Q11 and Q22]

    Note: this proof is due to I. Keslassy and R. Zhang, Stanford Univ.
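The numeric example can be reproduced with a few lines. In this sketch the arrival probability at Q12 is called `lam` and the slack that keeps the traffic admissible is called `delta`; both names are assumptions introduced here for illustration.

```python
def msm_service_probability(lam, delta):
    """P(serve Q12): no arrival at Q11 and none at Q22 in the same slot."""
    p_arrival_diag = 1 - lam - delta  # arrival probability at Q11 and at Q22
    return (1 - p_arrival_diag) ** 2


lam, delta = 0.5, 0.1
p_serve = msm_service_probability(lam, delta)
# Q12 is unstable when it is served less often than cells arrive.
print(round(p_serve, 2), "unstable" if p_serve < lam else "stable")
```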

    Uniform traffic

    MWM and MSM behave almost identically

    [Figure: mean delay (log scale) versus normalized load, uniform traffic; curves: MWM, MSM]

    LogDiagonal traffic

    MSM is somewhat inferior to MWM

    [Figure: mean delay (log scale) versus normalized load, LogDiagonal traffic; curves: MWM, MSM]

    Diagonal traffic

    MSM yields much longer delays than MWM at medium/high loads

    [Figure: mean delay (log scale) versus normalized load, diagonal traffic; curves: MWM, MSM]

    Approximations of MSM and MWM

    Motivation

    strong interest in scheduling algorithms with

    very low complexity

    high performance

    Usually

    implementable schedulers (low complexity): low throughput, long delays

    theoretical schedulers (high complexity): high throughput, short delays

    Some implementable algorithms

    Approximate MSM

    WFA, iSLIP, 2DRR, RC, FIRM and many others

    Approximate MWM with $w_{ij} = X_{ij}$ (queue length)

    iLQF, RPA, learning algorithms

    Approximate MWM with $w_{ij}$ = cell age

    iOCF

    Approximate MWM with $w_{ij} = \sum_i X_{ij} + \sum_j X_{ij}$

    iLPF, MUCS

    APPROXIMATIONS OF MAXIMUM SIZE MATCHING

    Wave Front Arbiter

    [Figure: 4x4 request matrix and the resulting match]

    RWP

    Wave Front Arbiter

    [Figure: request matrix and the resulting match, scanned along diagonal wave fronts]

    2N-1 steps

    RWP

    Wrapped Wave Front Arbiter

    [Figure: request matrix and the resulting match, scanned along wrapped diagonals]

    N steps instead of 2N-1

    RWP

    iSLIP

    iSLIP means iterative SLIP

    iterates among the following 3 phases

    Request

    Grant

    Accept

    iSLIP

    iSLIP demo

    from: http://tiny-tera.stanford.edu/tiny-tera/demos/index.html

    iSLIP

    3 phases:

    Request (from inputs to outputs)

    each unmatched input sends a request to every output for which it has a cell

    Grant (from outputs to inputs)

    if an unmatched output receives requests, it sends a grant to one of the inputs

    contentions solved by a round-robin mechanism

    Accept (from inputs to outputs)

    if an unmatched input receives grants, it selects a single output and it becomes matched to it

    contentions solved by a round-robin mechanism
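The three phases can be sketched in software as follows. This is an illustrative model, not the hardware implementation: the function name and data layout are assumptions, and the pointer update rule (pointers move one position past the matched port, and only for matches made in the first iteration) is taken from the published description of iSLIP rather than from these slides.

```python
def islip(requests, grant_ptr, accept_ptr, iterations=1):
    """Run iSLIP for one time slot.

    requests[i][j] is True if input i has a cell for output j.
    grant_ptr[j] and accept_ptr[i] are round-robin pointers, updated in place.
    Returns match_in, where match_in[i] is the output matched to input i (or None).
    """
    n = len(requests)
    match_in = [None] * n
    match_out = [None] * n
    for it in range(iterations):
        # Request + Grant: each unmatched output grants the first unmatched
        # requesting input found at or after its round-robin pointer.
        grants = {}  # input -> outputs that granted it
        for j in range(n):
            if match_out[j] is not None:
                continue
            for k in range(n):
                i = (grant_ptr[j] + k) % n
                if match_in[i] is None and requests[i][j]:
                    grants.setdefault(i, []).append(j)
                    break
        # Accept: each input accepts the first grant at or after its pointer.
        for i, outs in grants.items():
            j = min(outs, key=lambda o: (o - accept_ptr[i]) % n)
            match_in[i], match_out[j] = j, i
            if it == 0:  # pointers move only for first-iteration matches
                grant_ptr[j] = (i + 1) % n
                accept_ptr[i] = (j + 1) % n
    return match_in


# 2x2 switch with all four VOQs backlogged.
req = [[True, True], [True, True]]
gp, ap = [0, 0], [0, 0]
first = islip(req, gp, ap)    # both outputs grant input 0: only one match
second = islip(req, gp, ap)   # pointers have desynchronized: full match
print(first, second)
```

After the first slot the grant pointers point to different inputs, so the next slot yields a complete matching with a single iteration.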

    iSLIP

    The round-robin mechanism in iSLIP is designed so that, under uniform traffic, iSLIP emulates a dynamic TDM scheduler synchronized on the arrival pattern

    iSLIP

    iSLIP is maximal

    often, with log N iterations

    always, with N iterations

    iSLIP was implemented on a single chip in the Cisco 12000 router

    http://www.cisco.com/warp/public/cc/pd/rt/12000/tech/fasts_wp.pdf

    APPROXIMATIONS OF MAXIMUM WEIGHT MATCHING

    iLQF

    iLQF means iterative Longest Queue First

    iterates among the following 3 phases

    Request

    Grant

    Accept

    iLQF

    iLQF demo

    from: http://tiny-tera.stanford.edu/tiny-tera/demos/index.html

    iLQF

    3 phases:

    Request (from inputs to outputs)

    each unmatched input sends all its queue lengths as requests to corresponding outputs

    Grant (from outputs to inputs)

    if an unmatched output receives requests, it sends a grant to the input corresponding to the longest queue

    contentions solved by random choice

    Accept (from inputs to outputs)

    if an unmatched input receives grants, it selects the output with the longest queue

    contentions solved by random choice

    iLQF

    iLQF is maximal

    often, with log N iterations

    always, with N iterations

    iLQF is robust to non-uniform traffic

    RPA

    RPA means Reservation with Preemption and Acknowledgment

    Two phases

    Reservation (possibly preemptive)

    Acknowledgement

    Sequential accesses to a reservation vector Res

    $Urg_j$ (if set) is the urgency of the transfer from input $In_j$ to output $j$

    [Figure: vector Res with one entry ($Urg_j$, $In_j$) per output, Out 1 ... Out N; Res is sequentially accessed by all inputs (Input 1, Input 2, Input 3, Input 4)]

    RPA

    Initially, at each round: $Urg_j = 0$ for all $j$

    Reservation phase

    when input $i$ accesses Res, it

    computes $W_j = X_{ij} - Urg_j$ for all $j$

    finds $j^*$ such that $W_{j^*} = \max_j \{ W_j \}$

    if $W_{j^*} > 0$,

    reserves output $j^*$ and sets $Urg_{j^*} = X_{ij^*}$, possibly overwriting the previous reservation

    otherwise,

    leaves the current reservation
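The reservation phase can be sketched as a single sequential pass over the inputs. The code below is a hedged illustration that assumes the urgency is the queue length $X_{ij}$ and that the weight is the difference between the queue length and the current urgency; the function name and the `owner` list (standing for the $In_j$ field of Res) are introduced here for illustration, and the acknowledgement phase is not modeled.

```python
def rpa_reservation(x):
    """Reservation phase of RPA for one round.

    x[i][j] is the length of the queue at input i for output j.
    Returns (urg, owner): urg[j] is the urgency stored for output j and
    owner[j] the input currently holding the reservation (None if unreserved).
    """
    n = len(x)
    urg = [0] * n       # reset at the beginning of each round
    owner = [None] * n
    for i in range(n):  # inputs access the reservation vector sequentially
        w = [x[i][j] - urg[j] for j in range(n)]
        j_star = max(range(n), key=lambda j: w[j])
        if w[j_star] > 0:
            urg[j_star] = x[i][j_star]
            owner[j_star] = i  # may preempt an earlier reservation
    return urg, owner


queues = [[3, 1],
          [5, 0]]
print(rpa_reservation(queues))
```

In this example input 1 preempts the reservation that input 0 placed on the first output, because its queue for that output is longer.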

    RPA

    Acknowledgement phase

    if input i still finds its reservation at output j,

    it books output j

    otherwise,

    it chooses an unreserved output j and books output j

    Uniform traffic

    comparison between MWM, iSLIP, iLQF, and RPA

    [Figure: mean delay (log scale) versus normalized load, uniform traffic; curves: MWM, iSLIP, iLQF, RPA]

    LogDiagonal traffic

    iSLIP saturates close to 84% throughput

    [Figure: mean delay (log scale) versus normalized load, LogDiagonal traffic; curves: MWM, iSLIP, iLQF, RPA]

    Diagonal traffic

    RPA achieves 98% throughput, iLQF 87%, iSLIP 83%

    [Figure: mean delay (log scale) versus normalized load, diagonal traffic; curves: MWM, iSLIP, iLQF, RPA]

    LEARNING ALGORITHMS

    Learning algorithms

    Goal:

    find a good compromise among

    throughput, delay and complexity

    Learning algorithms

    Key observation

    the matchings generated by MWM show limited changes from one time slot to another

    remembering the matching from the past simplifies the computation of the new matching

    the search implemented by MWM can be enhanced

    with a randomized approach

    by observing arrivals

    by searching in parallel

    based on an extension of randomized scheduling algorithms

    Simple Randomized Schemes

    Choose a matching at random and use it as the schedule

    doesn't yield 100% throughput

    Choose 2 matchings at random and use the heavier one as the schedule

    Choose N matchings at random and use the heaviest one as the schedule

    None of these can give 100% throughput!

    Simple randomized algorithms

    [Figure: mean IQ length (log scale) versus normalized load, diagonal traffic, 32x32 switch; curves: MWM, R32, R1]

    Bounds on Maximum Throughput

    by Devavrat Shah, Stanford University

    Tassiulas scheme

    Consider the following policy

    $R_t$ = matching picked at random (uniformly) among all the possible N! matchings

    $D_t = \arg\max \{ W(D_{t-1}), W(R_t) \}$

    Complexity is very low

    O(1) iterations

    easy to pipeline

    Yields 100% throughput!

    note the boost in throughput is due to memory of the past matching $D_{t-1}$

    However, delays are very large
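A single step of the policy can be sketched as follows. The names `weight` and `tassiulas_step` are illustrative, the weight is assumed to be the sum of the queue lengths served by a matching, and ties are resolved in favor of the previous matching (an assumption, since the slide does not specify it).

```python
import random


def weight(matching, x):
    """Sum of the queue lengths x[i][j] over the edges (i, matching[i])."""
    return sum(x[i][j] for i, j in enumerate(matching))


def tassiulas_step(prev, x, rng):
    """Keep the previous matching unless a uniformly random one is heavier."""
    candidate = list(range(len(x)))
    rng.shuffle(candidate)  # uniform random permutation = random matching
    return candidate if weight(candidate, x) > weight(prev, x) else prev


rng = random.Random(1)
queues = [[3, 0, 1],
          [0, 2, 0],
          [4, 0, 5]]
d = [1, 2, 0]  # arbitrary initial matching
for _ in range(20):
    d = tassiulas_step(d, queues, rng)
print(d, weight(d, queues))
```

With fixed queue lengths the weight of the selected matching never decreases from one slot to the next, which is the memory effect noted above; in a running switch the queues change at every slot, so the comparison is repeated with fresh weights.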

    Tassiulas' scheme

    [Figure: mean IQ length (log scale) versus normalized load, diagonal traffic, 32x32 switch; curves: MWM, Tassiulas]

    Learning approach

    Properties of COMP1

    $W(D_t) \ge W(D_{t-1})$

    $W(D_t) \ge W(M_t)$

    Examples:

    COMP1 is the MAX among $D_{t-1}$ and $M_t$

    COMP1 is the MERGE among $D_{t-1}$ and $M_t$

    [Figure: COMP1 takes $D_{t-1}$ and $M_t$ as inputs and outputs $D_t$]

    The learning approach

    [Figure: COMP1 takes $D_{t-1}$ and $M_t$ as inputs and outputs $D_t$]

    Properties of $M_t$

    informally, $M_t$ should be a good sample in the space of all possible matchings

    Examples:

    $M_t$ is a matching picked uniformly at random

    $M_t$ is derived from the arrival vector $A_t$

    $M_t$ is a good neighbor of $D_{t-1}$

    Theoretical properties

    [Figure: COMP1 takes $D_{t-1}$ and $M_t$ as inputs and outputs $D_t$]

    Stability

    100% throughput under any admissible Bernoulli traffic pattern

    Delay

    the higher the weight of $M_t$, the smaller the queue lengths, and hence the smaller the delays

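    Example of practical implementation

    Exploiting parallel search:

    [Figure: the neighbors $N_1$ ... $N_K$ of $D_{t-1}$ ($N_K$ is the K-th neighbor of $D_{t-1}$), $D_{t-1}$ itself and a matching $M_t$ derived from the arrivals $A_t$ feed MAX blocks that select $D_t$]

    This scheme is called APSARA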

    What is a neighbor of a matching?

    Each neighbor differs from $D_{t-1}$ in ONLY TWO edges

    can be generated very easily in hardware

    Example: 3 x 3 switch

    [Figure: $D_{t-1}$ and its 3 neighbors $N_1$, $N_2$, $N_3$]
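Neighbors that differ in exactly two edges can be generated by swapping the outputs assigned to two inputs. The following sketch assumes a matching is stored as a tuple where position i holds the output matched to input i; the function name is illustrative.

```python
from itertools import combinations


def neighbors(matching):
    """All matchings obtained by swapping the outputs of two inputs."""
    result = []
    for a, b in combinations(range(len(matching)), 2):
        nb = list(matching)
        nb[a], nb[b] = nb[b], nb[a]  # exactly two edges change
        result.append(tuple(nb))
    return result


print(neighbors((0, 1, 2)))
```

A 3 x 3 switch has 3 such neighbors, as in the example above; in general there are N(N-1)/2 of them.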

    Switching Architectures 2011/12

    Max-APSARA

    APSARA, as described before, is not maximal

    Max-APSARA is a modified version of APSARA where a maximal size matching algorithm runs on the remaining unmatched inputs/outputs

    e.g., if k inputs/outputs are unmatched, either run iSLIP with k iterations, or select k random edges among the unmatched inputs/outputs
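
The second option (random edges among the unmatched ports) might look like the sketch below. It assumes the partial matching is a dict from input to output and that the k unmatched inputs are simply paired with the k unmatched outputs in random order; the name `complete_matching` is hypothetical and this is not taken from the APSARA authors' implementation.

```python
import random


def complete_matching(partial, n, rng=random):
    """Pair the unmatched inputs with the unmatched outputs at random."""
    free_in = [i for i in range(n) if i not in partial]
    used_out = set(partial.values())
    free_out = [j for j in range(n) if j not in used_out]
    rng.shuffle(free_out)  # random choice of the k extra edges
    full = dict(partial)
    full.update(zip(free_in, free_out))
    return full
```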

    APSARA performance

    [Figure: mean input-queue length (log scale, 0.01 to 10000) versus normalized load (0.1 to 1.0), diagonal traffic; curves: MWM, MaxAPSARA, APSARA, iSLIP, iLQF]


    Routers and switches

    IP routers deal with variable-size packets

    Hardware switching fabrics often deal with fixed-size cells

    Question: how to integrate a hardware switching fabric within an IP router?

    Router based on an IQ cell switch: cell-mode

    [Figure: inputs 1..N, each with an ISM, feed the switching fabric (an IQ cell switch); outputs 1..N each have an ORM]

    Cell-mode scheduling

    Scheduling algorithms work at cell level

    Pros: 100% throughput achievable

    Cons: interleaving of packets at the outputs of the switching fabric

    Router based on an IQ cell switch: packet-mode

    [Figure: inputs 1..N, each with an ISM, feed the switching fabric (an IQ cell switch); outputs 1..N each have an ORM]

    NO packet interleaving if packet-mode

    ORMs can be removed

    Packet-mode scheduling

    Rule: packets are transferred as trains of cells. When an input starts transferring the first cell of a packet comprising k cells, it continues to transfer in the following k-1 time slots

    Pros: no interleaving of packets at the outputs; easy extension of traditional schedulers

    Cons: starvation due to long packets; this is inherent in packet systems without preemption, and negligible at high line rates
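
The packet-mode rule can be layered on top of an existing cell scheduler. The sketch below is one possible way to do it, under stated assumptions: `in_progress` maps each input to `[output, cells_left]` for the packet being transferred, and `schedule_free` is a hypothetical callback that, given the busy inputs and busy outputs, returns the new packets to start as `{input: (output, length_in_cells)}` using only free ports.

```python
def packet_mode_slot(in_progress, schedule_free):
    """Run one time slot and return the matching used in it.

    Edges of packets still in transfer are kept; new packets can start
    only on inputs and outputs not held by an ongoing packet.
    """
    busy_inputs = set(in_progress)
    busy_outputs = {out for out, _ in in_progress.values()}
    # schedule_free is assumed to respect the busy sets it is given.
    for i, (j, cells) in schedule_free(busy_inputs, busy_outputs).items():
        in_progress[i] = [j, cells]
    matching = {i: out for i, (out, _) in in_progress.items()}
    # One cell is transferred on every matched edge; finished packets release their ports.
    for i in list(in_progress):
        in_progress[i][1] -= 1
        if in_progress[i][1] == 0:
            del in_progress[i]
    return matching
```

A packet of k cells therefore holds its input/output pair for k consecutive slots: the slot in which it starts plus the following k-1 slots.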

    Packet-mode scheduling

    Questions

    Can packet mode provide high throughput? YES!

    What about delays? It depends

    Packet-mode properties

    Main theoretical results

    MWM in packet mode yields 100% throughput

    Packet mode can provide shorter delays than cell mode, depending on the packet length distribution

    Simulation scenario

    Router with ISMs and ORMs

    Uniform packet traffic: uniform packet load; uniform (1,192) packet size distribution

    Spotted packet traffic: non-uniform packet load; bimodal (3,100) packet size distribution

    $$P = \begin{pmatrix}
    1 & 1 & 1 & 0 & 1 & 0 & 1 & 0 \\
    0 & 1 & 0 & 1 & 1 & 1 & 0 & 1 \\
    1 & 0 & 1 & 0 & 1 & 1 & 1 & 0 \\
    1 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\
    1 & 0 & 1 & 0 & 1 & 0 & 1 & 1 \\
    0 & 1 & 0 & 1 & 0 & 1 & 1 & 1 \\
    1 & 0 & 1 & 1 & 1 & 0 & 1 & 0 \\
    0 & 1 & 1 & 1 & 0 & 1 & 0 & 1
    \end{pmatrix}$$

    Packet mode features

    Packet mode scheduling is a feasible modification of schedulers

    It improves throughput

    But it can generate some unfairness between long and short packets; this is inherent to all variable-size packet networks without preemption

    It may give better packet delays than cell mode, depending on the packet size distribution


    Network of IQ routers

    Question: given a network of IQ switches running MWM and an admissible input traffic, is the network always stable?

    NO! This is quite counterintuitive, but true

    Networks of IQ routers

    Consider the acyclic network of IQ routers in the following slide

    It is derived from well-established results from adversarial queueing theory

    It is a very specific scenario, but it comprises only a few switches

    This situation may not be common, but it cannot be excluded in real networks

    Pathological network of IQ switches

    Network with 8 switches and 4 flows

    Instability of MWM

    If MWM is adopted at each IQ router, and the traffic is admissible, the system can be unstable under Bernoulli i.i.d. arrivals

    Instability of MWM

    MWM is too greedy, in the sense that it can create traffic bursts that are amplified by each scheduler

    A server can be idling when large bursts (directed to it) are blocked because of contention upstream

    The problem arises when a packet flow is subject to priority changes along its path through the network

    Stability in networks of routers

    Global policies: Oldest in the network and many others

    Problem: they require global information about the network, and synchronized clocks at the ingress of the network

    Stability in networks of routers

    Semi-local policies: MWM with local information about the router neighbors can achieve 100% throughput under i.i.d. Bernoulli arrivals

    Virtual network queue: the weights used by MWM are $w_{ij} = \max\{0,\, X_{ij} - X_{\text{down-queue}(ij)}\}$, where down-queue(ij) is the first downstream queue that receives packets from $X_{ij}$
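
The weight rule above is easy to state in code. This is a minimal sketch, assuming two matrices of the same shape: `queues[i][j]` holds $X_{ij}$ and `down[i][j]` holds the occupancy of the first downstream queue fed by $X_{ij}$ (taken here as 0 when the packets leave the network, which is an assumption). The name `vnq_weights` is hypothetical.

```python
def vnq_weights(queues, down):
    """Weights w_ij = max(0, X_ij - X_down-queue(ij)) for every VOQ."""
    return [
        [max(0, x - d) for x, d in zip(queue_row, down_row)]
        for queue_row, down_row in zip(queues, down)
    ]
```

A VOQ whose downstream queue is at least as long gets weight 0, so MWM stops pushing traffic toward an already congested neighbor.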


    IQ and QoS

    Problem: support rate guarantees, with admissible rate matrix $R$

    Birkhoff-von Neumann decomposition

    Goal: find a sequence of matchings $M_k$ and their fractions of time $\phi_k$ such that the service given to all the queues satisfies $R$

    IQ and frame scheduling

    Example: $\phi_1 = 1/4$, $\phi_2 = 1/2$, $\phi_3 = 1/4$

    Frame i: $M_1\ M_2\ M_2\ M_3$; frame i+1: $M_1\ M_2\ M_2\ M_3$
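
The frame in the example can be derived mechanically from the fractions of time. The sketch below assumes rational fractions that sum to 1 and simply repeats each matching in proportion to its fraction; the frame length is the least common multiple of the denominators. The name `build_frame` is hypothetical, and the order of the matchings inside the frame is a free design choice.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def build_frame(shares):
    """shares: list of (matching_name, Fraction); returns one frame as a list of names."""
    denominators = [share.denominator for _, share in shares]
    length = reduce(lambda a, b: a * b // gcd(a, b), denominators, 1)
    frame = []
    for name, share in shares:
        frame.extend([name] * int(share * length))  # slots given to this matching
    return frame
```

For the fractions 1/4, 1/2, 1/4 this gives a frame of 4 slots, `M1 M2 M2 M3`, which is then repeated frame after frame.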

    How to decompose R?

    $R$ doubly substochastic → augmentation algorithm → $R'$ doubly stochastic such that $R \le R'$ → BvN algorithm → decomposition of $R'$

    Augmentation algorithm
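
One common way to augment a doubly substochastic matrix is a greedy pass that fills each entry with the smaller of its row slack and column slack. The sketch below follows that idea; it is an illustration under the assumption of a square matrix with exact `Fraction` entries, and it is not necessarily the procedure shown in the original slide. The name `augment` is hypothetical.

```python
from fractions import Fraction


def augment(R):
    """Return a doubly stochastic matrix A with A[i][j] >= R[i][j]."""
    n = len(R)
    A = [[Fraction(x) for x in row] for row in R]
    row_sum = [sum(row) for row in A]
    col_sum = [sum(A[i][j] for i in range(n)) for j in range(n)]
    for i in range(n):
        for j in range(n):
            # Add as much as both the row and the column can still accept.
            slack = min(1 - row_sum[i], 1 - col_sum[j])
            if slack > 0:
                A[i][j] += slack
                row_sum[i] += slack
                col_sum[j] += slack
    return A
```

After entry (i, j) is processed, either row i or column j is saturated, so a single pass is enough to bring every row and column sum to 1.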

    BvN algorithm
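
A compact sketch of a Birkhoff-von Neumann decomposition is given below. It assumes a doubly stochastic input with exact `Fraction` entries, finds a perfect matching on the positive entries with a textbook augmenting-path search, and peels it off with the largest feasible coefficient. The function names are hypothetical and the code favors clarity over speed.

```python
from fractions import Fraction


def perfect_matching(support):
    """Augmenting-path search; returns match[i] = output of input i, or None."""
    n = len(support)
    owner = [None] * n  # owner[j] = input currently matched to output j

    def try_assign(i, seen):
        for j in range(n):
            if support[i][j] and j not in seen:
                seen.add(j)
                if owner[j] is None or try_assign(owner[j], seen):
                    owner[j] = i
                    return True
        return False

    for i in range(n):
        if not try_assign(i, set()):
            return None
    match = [None] * n
    for j, i in enumerate(owner):
        match[i] = j
    return match


def bvn(R):
    """Decompose a doubly stochastic matrix into a list of (phi_k, M_k)."""
    n = len(R)
    A = [[Fraction(x) for x in row] for row in R]
    terms = []
    while any(x > 0 for row in A for x in row):
        m = perfect_matching([[x > 0 for x in row] for row in A])
        phi = min(A[i][m[i]] for i in range(n))  # largest coefficient that keeps A >= 0
        for i in range(n):
            A[i][m[i]] -= phi
        terms.append((phi, m))
    return terms
```

Each iteration zeroes at least one entry, so the loop ends after a finite number of matchings, and the coefficients $\phi_k$ sum to 1.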


    CIOQ routers

    [Figure: inputs 1..N with VOQs connected through a switching fabric with speedup S to outputs 1..N, whose queues have occupancies $o_1, \dots, o_N$]

    CIOQ routers

    Question: if a low speedup S is allowed (and queues are available at both inputs and outputs), is it possible to design simple scheduling algorithms, capable of achieving high throughput and low delay?

    YES!

    OQ emulation

    A CIOQ switch achieves perfect OQ emulation if the departure order of all the packets from each output is the same as in the emulated OQ switch

    It is impossible to distinguish, by observing arrivals and departures, whether the switching architecture is CIOQ or OQ

    Delays are perfectly controlled

    Easy to implement scheduling algorithms born for OQ (e.g., WFQ)

    Work conservation

    A CIOQ switch is work-conserving when each output is busy at the same time as the corresponding OQ switch

    i.e., each output of the switch for which there are cells (either at the inputs or at the outputs) at the beginning of cell slot T is active at the end of cell slot T

    An output never idles whenever a packet destined to it is present

    Good delay performance: same average delays as OQ

    Note that OQ emulation implies work conservation, but not vice versa

    Speedup and performance

    Speedup 4: exact OQ emulation

    Speedup 2: exact OQ emulation; work conservation (same average delay as OQ)

    CIOQ routers with S=2

    If S = 2:

    It is easy to obtain 100% throughput: any maximal matching obtains 100% throughput

    It is less easy to obtain work conservation: LOOFA algorithm

    It is difficult to obtain perfect OQ emulation: stable marriage algorithm with special preference list

    LOOFA

    Occupancy $o_j$: number of cells currently residing at the j-th output queue

    At each time slot, $o_j$ is decremented by one because of departures

    Basic idea of LOOFA: higher priority is given to outputs with lower occupancy, thereby attempting to maintain work conservation for all outputs

    LOOFA

    If S = 2, during each of the two phases:

    Each unmatched input selects a non-empty VOQ directed to the unmatched output with the lowest occupancy, and sends a request to that output

    Each unmatched output grants one request; the selection can be round robin, random, ...

    Repeat until the matching is maximal
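
The request/grant loop of one LOOFA phase can be sketched as follows. The data layout is an assumption: `voq[i][j]` is the number of cells queued at input `i` for output `j`, and `occupancy[j]` is $o_j$. The grant rule used here (lowest input index) is only a placeholder for the round-robin or random selection mentioned above, and the name `loofa_phase` is hypothetical.

```python
def loofa_phase(voq, occupancy):
    """Return a maximal matching {input: output} built by request/grant rounds."""
    n = len(voq)
    match = {}
    free_out = set(range(n))
    while True:
        requests = {}  # output -> list of requesting inputs
        for i in range(n):
            if i in match:
                continue
            candidates = [j for j in free_out if voq[i][j] > 0]
            if candidates:
                # Request the unmatched output with the lowest occupancy.
                target = min(candidates, key=lambda j: (occupancy[j], j))
                requests.setdefault(target, []).append(i)
        if not requests:
            return match  # no unmatched input can reach an unmatched output
        for j, inputs in requests.items():
            winner = min(inputs)  # placeholder grant rule (assumption)
            match[winner] = j
            free_out.discard(j)
```

With S = 2 this phase would run twice per time slot, with the VOQ and output occupancies updated between the two phases.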

    LOOFA with S=2

    Theorem: LOOFA achieves work conservation if S = 2

    OQ emulation with S=4

    Urgency of a cell = departure time in OQ - current time

    MUCF (Most Urgent Cell First)

    During each phase:

    1. outputs request their most urgent cells from inputs

    2. each input grants the output whose requested cell is the most urgent

    3. losing outputs try to obtain their next most urgent cell

    4. when no more matchings are possible, cells are transferred and the next phase starts
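
The four steps of a MUCF phase can be sketched as a request/grant loop. The layout is an assumption: `urgency[i][j]` is the urgency of the most urgent cell stored at input `i` for output `j`, or `None` when that VOQ is empty, and a smaller value means a more urgent cell. Ties are broken by index, which is an arbitrary choice. The name `mucf_phase` is hypothetical.

```python
def mucf_phase(urgency):
    """Return the matching {input: output} produced by one MUCF phase."""
    n = len(urgency)
    match = {}
    free_in, free_out = set(range(n)), set(range(n))
    while True:
        requests = {}  # input -> list of requesting outputs
        for j in free_out:
            holders = [i for i in free_in if urgency[i][j] is not None]
            if holders:
                # Step 1: the output asks for its most urgent cell.
                source = min(holders, key=lambda i: (urgency[i][j], i))
                requests.setdefault(source, []).append(j)
        if not requests:
            return match  # step 4: no more matchings are possible
        for i, outputs in requests.items():
            # Step 2: the input grants the request for its most urgent cell.
            winner = min(outputs, key=lambda j: (urgency[i][j], j))
            match[i] = winner
            free_in.discard(i)
            free_out.discard(winner)
        # Step 3: outputs that lost retry in the next round with the remaining inputs.
```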

    OQ emulation with S=4

    Note: picture reproduced from Balaji Prabhakar and Nick McKeown, "On the Speedup Required for Combined Input and Output Queued Switching", Computer Systems Technical Report, November 1997

    OQ emulation and speedup 4

    Theorem: MUCF with speedup 4 obtains OQ emulation

    CIOQ routers

    CIOQ are very promising architectures, with many degrees of freedom in design:

    how to balance input/output buffers

    how the buffers interact (e.g., by backpressure mechanisms)

    Several currently designed architectures are supposed to be CIOQ

    Speedup S is becoming closer and closer to 1 in practical implementations of new switching architectures (CIOQ → IQ)


    Multicast traffic

    Misleading idea: observe that

    1. OQ can achieve 100% throughput under any admissible unicast and multicast traffic

    2. OQ can be perfectly emulated by CIOQ with S = 2

    then, with S = 2 it is possible to achieve 100% throughput for multicast traffic

    WRONG! Observation 2 holds only for unicast traffic

    Multicast traffic

    Question: what is the minimum speedup required to achieve 100% throughput?

    Unknown!

    Multicast traffic

    Possible implementations:

    Copy network before the switching fabric: a multicast cell with f destinations is treated as f cells; possible bandwidth inefficiency

    Dedicated queue: multicast packets are treated in some specific way

    [Figure: N x N switch architectures, labeled UC, MC and UC+MC]

    Multicast traffic: optimal queueing

    MC-VOQ queueing

    Best throughput performance

    Avoids HOL blocking

    $2^N - 1$ queues for each input, one for each fanout set

    Re-enqueuing process: out-of-sequence problem

    No re-enqueuing: some throughput degradation

    [Figure: each input of the N x N switch holds MC+UC queues numbered 1 to $2^N - 1$]
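
To see where the $2^N - 1$ count comes from, the sketch below lists every possible fanout set of an N-output switch, that is, every non-empty subset of the outputs. One MC-VOQ queue per input would be needed for each of them. The name `fanout_sets` is hypothetical.

```python
from itertools import combinations


def fanout_sets(n):
    """All non-empty subsets of the outputs 0..n-1, one per MC-VOQ queue."""
    outputs = range(n)
    return [
        frozenset(subset)
        for size in range(1, n + 1)
        for subset in combinations(outputs, size)
    ]
```

The singleton sets correspond to unicast traffic, so the same structure holds both unicast and multicast queues; the exponential growth in N is what makes this queueing scheme impractical for large switches.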

    Multicast traffic: optimal scheduling

    The optimal scheduling for multicast traffic can be defined similarly to unicast traffic

    It is a sort of max flow algorithm on all $N(2^N - 1)$ queues

    Many heuristics can be envisaged to approximate it

    Summary

    3 main ingredients for IQ scheduling algorithms: weight computation, matching computation, contention resolution

    Summary

    Weight computation: obtains the priority of each input queue; the metric can be related to queue length, waiting time of the cell at the HOL, ...

    Contention resolution: needed whenever the selection is among situations with equal weights; can be round robin, or random

    Summary

    Matching computation: computes the matching, trying to maximize its total weight

    It can be based on:

    an iterative search, like in iSLIP, iOCF, iLQF

    a matrix greedy approach, like in MUCS, WFA

    a reservation vector, like in RPA

    a learning approach, like in APSARA

    Summary

    Good IQ scheduling algorithms exist: 100% throughput, short delay, limited complexity

    Performance differences are significant only close to saturation

    Summary

    Open questions concerning IQ schedulers: QoS guarantees, stability of networks of switches, multicast traffic
