Lookahead Packet Scheduling Algorithm for CIOQ DataCenter Switch Fabrics
EE384Y Packet Switch Architectures II

April 18, 2003
Deepak Kakadia, Student ID: 4358289

Introduction

A typical datacenter N-tier network architecture is shown below in Fig. 1. Current-generation datacenter networking equipment is built from two families of components: established, high-volume, optimized layer 2 and layer 3 packet switches, and relatively new, mostly startup-produced appliances for more complex IP services such as SSL, XML, URL switching, and NAT. These appliances have evolved from functions that were previously performed on general-purpose computers.

The next logical evolutionary step is to converge and integrate these two product families to produce a device that is cost-effective and optimized specifically for datacenter traffic patterns. We know some things about these traffic patterns, which can be exploited to increase throughput: traffic can be groomed to avoid future contention on input and output resources in the switch fabric, keeping queues more evenly loaded over time. Datacenter edge traffic flows have properties that let us make better scheduling decisions and thereby increase overall throughput.

Fig. 1 Typical Datacenter Edge Network Architecture

The distributed appliance-and-switch architecture requires islands of independent serial processing. For example, client traffic is usually first diverted to a NAT function, which rewrites the packet and points it to another IP device, such as a load balancer, which again rewrites the packet. We can exploit knowledge of the fixed set of services to be performed on a flow to create one integrated device: perform a single packet classification lookup, find out all the services to be performed and in what sequence, and then make more intelligent packet scheduling decisions that proactively schedule the packet through the fabric to prevent future congestion in the switch fabric. This is different from the first stage of the multistage switches of [7]: here we are making scheduling decisions now that will impact the arrival traffic in future time slots. Fig. 2 below describes an example architecture in which the services to be performed are directly connected to the bidirectional ports of a switch fabric. In a practical example, we would have an SSL accelerator ASIC, such as the Cavium Nitrox, with a flow-through architecture: packets are received via an SPI-4.2 interface on one port, and encrypted or decrypted traffic emerges from the other port, which is connected to the same switch fabric port in the opposite direction.
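The single-lookup idea above can be sketched in a few lines. This is a hypothetical illustration, not the proposal's implementation: the flow-table format, the service names, and the service-to-port mapping are all assumptions made for the example.

```python
# Hypothetical sketch of a one-shot classification lookup that returns the
# full ordered service chain for a flow, so the scheduler can see every
# fabric port the packet will visit (the "lookahead tag").

FLOW_TABLE = {
    # (dst_port, proto) -> ordered list of services applied to the flow
    (443, "tcp"): ["nat", "ssl", "load_balance"],  # HTTPS: NAT, SSL decrypt, LB
    (80, "tcp"): ["nat", "load_balance"],          # HTTP: NAT, then LB
}

SERVICE_PORT = {"nat": 4, "ssl": 5, "load_balance": 6}  # assumed port mapping

def classify(dst_port: int, proto: str) -> list[int]:
    """One lookup yields the whole service sequence as an ordered list of
    fabric ports the packet must traverse, in order."""
    chain = FLOW_TABLE.get((dst_port, proto), [])
    return [SERVICE_PORT[s] for s in chain]

tag = classify(443, "tcp")  # ordered lookahead tag: [4, 5, 6]
```

The point of the sketch is that the tag is computed once, at ingress, rather than rediscovered hop by hop as in the distributed appliance architecture.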

[Figure 1 diagram: clients reach a level 2-3 edge switch, then Foundry server load-balancer switches (master core 192.168.10.2, standby core 192.168.10.3), which front web, application, database, and directory service modules built from Sun Fire 6800 and Sun Fire 280R servers on the 10.10.0.0 through 10.50.0.0 subnets.]


Fig. 2 Integrated services and switch fabric example architecture. The packet classifier determines a priori all services that are to be performed on a packet, and hence has more information with which to make better scheduling decisions and to groom traffic that re-enters the fabric.

Fig. 3 below describes how the lookahead information, provided by the packet classifier as a tag prepended to the packet header, can be used by the lookahead-based packet scheduler to make a better scheduling decision than Maximum Weight Matching. Suppose Packet 1 requires services at ports 4 and 5, and Packet 2 requires services at ports 4 and 6. Port 5 already has packets queued up to keep it busy. If Packet 1 is chosen before Packet 2, there is a chance that port 6 goes idle, whereas if Packet 2 is chosen first, there is less chance of an idle port, thereby increasing throughput.
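The decision in this example can be sketched as a simple scoring rule. Note that the proposal states only the intuition; the particular score (depth of the emptiest service port each packet feeds) and the function names are assumptions made for illustration.

```python
# Minimal sketch of the lookahead decision (scoring rule is an assumption).
# Each candidate packet carries its lookahead tag: the set of fabric ports
# where it still needs service. Ports with deep queues will stay busy anyway,
# so we prefer the packet that feeds the port most at risk of going idle.

def pick_packet(candidates, queue_depth):
    """candidates: {packet_id: set of service ports still to visit}
    queue_depth: {port: number of packets currently queued at that port}
    Returns the id of the packet whose emptiest future service port has the
    least work queued, i.e. the packet that best prevents a port from idling."""
    def idle_risk(ports):
        # smaller minimum depth = some port this packet feeds may soon idle
        return min(queue_depth.get(p, 0) for p in ports)
    return min(candidates, key=lambda pid: idle_risk(candidates[pid]))

# Fig. 3 scenario: Packet 1 needs ports {4, 5}; Packet 2 needs ports {4, 6}.
# Port 5 already has work queued while port 6 is empty, so the lookahead
# scheduler sends Packet 2 first to keep port 6 busy.
chosen = pick_packet({1: {4, 5}, 2: {4, 6}},
                     queue_depth={4: 2, 5: 3, 6: 0})
# chosen == 2
```

A plain MWM scheduler, by contrast, would weigh only current queue occupancies at the inputs and could legitimately pick Packet 1 first, producing the idle slot at port 6 shown in the figure.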

Fig. 3 Lookahead algorithm over Maximum Weight Matching, using the lookahead tag

Proposed Problem
1 - Describe the Lookahead packet scheduling algorithm and determine stability, throughput, and average delay analysis.
2 - Perform simulations and compare with MWM and other related algorithms.
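A skeleton for the simulation work in item 2 might look like the following. The structure is an assumption (the proposal does not specify a harness): a discrete-time loop that, each slot, asks a pluggable scheduler which packets to serve, so that MWM and the lookahead policy can later be compared on identical arrival traces.

```python
# Sketch of a discrete-time simulation harness (structure and names are
# assumptions). A scheduler is any function mapping the current queues to a
# list of (port, packet) pairs to serve this slot, at most one per port.
import random

def simulate(scheduler, arrivals, ports, slots=1000, seed=0):
    """Run `slots` time slots; `arrivals(t)` yields this slot's (port, packet)
    injections. Returns normalized throughput: served / (slots * ports)."""
    random.seed(seed)
    queues = {p: [] for p in range(ports)}
    served = 0
    for t in range(slots):
        for p, pkt in arrivals(t):          # inject this slot's arrivals
            queues[p].append(pkt)
        for p, pkt in scheduler(queues):    # serve the chosen matching
            queues[p].remove(pkt)
            served += 1
    return served / (slots * ports)

# Trivial baseline scheduler: serve the head of every non-empty queue.
def head_of_line(queues):
    return [(p, q[0]) for p, q in queues.items() if q]

# Bernoulli arrivals at rate 0.5 per port (assumed traffic model).
def bernoulli(t):
    return [(p, (t, p)) for p in range(4) if random.random() < 0.5]

print(simulate(head_of_line, bernoulli, ports=4, slots=100))
```

Swapping in an MWM scheduler and a lookahead scheduler, and replacing the Bernoulli source with datacenter-edge-like service-chained traffic, would give the comparison the proposal calls for.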

References

[1] N. McKeown, M. Izzard, A. Mekkittikul, B. Ellersick, and M. Horowitz, "The Tiny Tera: A Packet Switch Core", Hot Interconnects V, Stanford University, August 1996.
[2] N. McKeown, V. Anantharam, and J. Walrand, "Achieving 100% throughput in input queued switches", IEEE INFOCOM '98, pp. 792-799, 1998.
[3] A. K. Parekh and R. G. Gallager, "A Generalized Processor Sharing Approach to Flow Control in Integrated Services Packet Networks: The Single Node Case", IEEE/ACM Transactions on Networking, vol. 1, no. 3, pp. 344-357, June 1993.
[4] A. K. Parekh and R. G. Gallager, "A Generalized Processor Sharing Approach to Flow Control in Integrated Services Packet Networks: The Multiple Node Case", IEEE/ACM Transactions on Networking, vol. 2, no. 2, pp. 137-150, April 1994.
[5] D. Shah and M. Kopikare, "Delay Bounds for Approximate Maximum Weight Matching Algorithm for Input Queued Switches", IEEE INFOCOM 2002, New York, NY, June 23-27, 2002. http://www.ieee-infocom.org/2002/technical_programs.htm
[6] E. Leonardi, M. Mellia, F. Neri, and M. Ajmone Marsan, "Bounds on Average Delays and Queue Size Averages and Variances in Input Queued Cell-Based Switches", IEEE INFOCOM 2001, Alaska, April 2001, pp. 1095-1103.
[7] C. S. Chang et al., "Load Balanced Birkhoff-von Neumann Switches, Part II: Multistage Buffering", http://www.ee.nthu.tw/~cschang/PartII.ps

[Figure 2 diagram: a packet classifier prepends an internal tag to each packet; inputs 1..N hold virtual output queues VOQ11..VOQNN feeding output queues OQ1..OQn, with services S1..Sk attached to the bidirectional fabric ports. Figure 3 diagram: two six-port schedules shown at slots T=1, T=2, and T=4. A) MWM schedules Packet 1 before Packet 2, and at T=4 Port 6 is idle. B) LA schedules Packet 2 before Packet 1, and at T=4 Port 6 is busy, giving increased throughput.]