Titan: Fair Packet Scheduling for Commodity Multiqueue NICs
Brent Stephens, Arjun Singhvi, Aditya Akella, and Mike Swift
July 13th, 2017

Oct 09, 2020

Page 1: Titan: Fair Packet Scheduling for Commodity Multiqueue NICs

Brent Stephens, Arjun Singhvi, Aditya Akella, and Mike Swift

July 13th, 2017

Page 2: Ethernet line-rates are increasing!

Page 3: Servers need:

• To drive increasing line-rates
• Low CPU utilization networking

Page 4: Underlying mechanisms:

• Segmentation Offload
• Multiqueue NICs

Page 5: TCP Segmentation Offload (TSO)

Using large segments (64KB) instead of packets can reduce CPU load

[Diagram: two wire timelines, one labeled F1 F2 F1 F2 and one labeled F1 F2]

• Many operations performed by the OS are per-packet, not per-byte
• TSO allows the OS to send large segments to the NIC
• TSO NIC hardware generates packets from segments
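The CPU saving from TSO can be illustrated with some rough arithmetic. The sketch below counts how many per-unit OS operations are needed when the OS hands the NIC 64KB segments instead of individual packets. The 1448-byte payload per packet is an assumption made for the example (it is not stated in the slides), and both function names are hypothetical.

```python
def packets_per_segment(segment_bytes: int, mss: int = 1448) -> int:
    """Wire packets the NIC must generate from one segment (ceiling division).

    mss=1448 is an assumed per-packet payload, not a value from the slides.
    """
    return -(-segment_bytes // mss)


def os_operations(total_bytes: int, unit_bytes: int) -> int:
    """Per-unit OS operations needed to hand total_bytes to the NIC."""
    return -(-total_bytes // unit_bytes)


SEGMENT = 64 * 1024          # 64KB TSO segment, as on the slide
TRANSFER = 100 * SEGMENT     # an arbitrary transfer size for illustration

print("packets per 64KB segment:", packets_per_segment(SEGMENT))
print("OS operations, per-packet:", os_operations(TRANSFER, 1448))
print("OS operations, per-segment:", os_operations(TRANSFER, SEGMENT))
```

Under these assumptions the OS does its per-unit work once per segment rather than once per packet, which is where the reduction in CPU load comes from.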

Page 6: Multiqueue NICs enable parallelism

[Diagram: Core 1 and Core 2 sending flows F1, F2, and F3 to the wire. Labels: Locking/Polling, Multiqueue NICs, TXQ-1, TXQ-2, Packet Scheduler]

Page 7: Fairness Problems

TSO and multiqueue cause pervasive unfairness

[Diagram: Core 1 and Core 2 enqueue flows F1, F2, and F3 into TXQ-1 and TXQ-2 ahead of the Packet Scheduler. A fair packet schedule on the wire is contrasted with the actual packet schedule, which is annotated "TSO unfairness" and "Multiqueue unfairness"]

Page 8: Fairness is important

• Fairness is needed so competing applications can share the network
• Fairness is needed for predictability
  • Unfairness leads to unpredictable completion times across runs
  • Perfect fairness → perfect predictability
• Fairness can improve application performance
  • Ex: Weighted Coflow Scheduling
  • [Chowdhury SIGCOMM11, Chowdhury SIGCOMM14]

Page 9: Titan Goals:

• Drive increasing line-rates
• Low CPU utilization
• Per-flow fairness
• Work on commodity NICs

Page 10: Multiqueue Fairness in Linux:

• Flow arrivals to each transmit queue are dynamic
• The OS statically uses a per-flow hash to assign flows to queues
• The NIC scheduler statically uses deficit round-robin (DRR) to provide per-queue fairness
• In the datacenter, the OS statically chooses a TSO size
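A minimal sketch of static per-flow hashing shows why it can leave queues unevenly loaded. The CRC32 hash and the flow 5-tuples below are hypothetical stand-ins chosen for the example; they are not the hash or the flows that Linux actually uses.

```python
import zlib
from collections import Counter


def static_queue(flow, num_queues):
    """Map a flow to a TX queue with a fixed hash (stand-in for the OS hash)."""
    key = repr(flow).encode()
    return zlib.crc32(key) % num_queues


def queue_occupancy(flows, num_queues):
    """Count how many flows land on each queue under static hashing."""
    return Counter(static_queue(flow, num_queues) for flow in flows)


# Hypothetical flows: (src_ip, dst_ip, src_port, dst_port, protocol)
flows = [("10.0.0.1", "10.0.0.2", 40000 + i, 5001, "tcp") for i in range(12)]
occupancy = queue_occupancy(flows, num_queues=8)

for queue in range(8):
    print(f"TXQ-{queue + 1}: {occupancy[queue]} flow(s)")
```

Because the mapping ignores how many flows a queue already holds, some queues can end up with several flows while others stay empty, and per-queue DRR then divides bandwidth by queue rather than by flow.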

Page 11: Titan Design:

As flows dynamically arrive and complete, in Titan:

The OS dynamically:
• Assigns weights to flows
• Tracks the flow occupancy of queues
• Picks queues for flows
• Updates the NIC with queue weights

The NIC dynamically:
• Applies queue weights from the OS

Page 12: Causes of Unfairness:

• Multiqueue unfairness
• TSO unfairness

Page 13: Problem: Hash collisions

[Diagram: flows F1, F2, and F3 hashed onto transmit queues TXQ-1, TXQ-2, and TXQ-3 ahead of the Packet Scheduler. The resulting wire schedule is labeled "Multiqueue unfairness"]

Page 14: Problem: Hash collisions

[Diagram: flows F1, F2, and F3 across TXQ-1, TXQ-2, and TXQ-3 ahead of the Packet Scheduler]

Solution: Dynamic Queue Assignment (DQA)
• OS assigns a weight to each flow
• DQA picks the queue with the lowest occupancy when a flow starts
• Queue occupancies are updated:
  • Any time a flow starts enqueuing data
  • Any time a flow has no enqueued bytes (at most each TX interrupt)
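The DQA steps can be sketched in a few lines. This is an illustrative model, not the Titan kernel code: the class and method names are hypothetical, and occupancy is assumed here to be the sum of the weights of the flows currently assigned to a queue.

```python
class DynamicQueueAssigner:
    """Toy model of DQA: place each new flow on the least-occupied queue."""

    def __init__(self, num_queues):
        self.occupancy = [0] * num_queues   # assumed: sum of flow weights per queue
        self.flows = {}                     # flow_id -> (queue index, weight)

    def flow_start(self, flow_id, weight=1):
        """Called when a flow starts enqueuing data."""
        # min() returns the lowest-index queue when occupancies tie.
        queue = min(range(len(self.occupancy)), key=self.occupancy.__getitem__)
        self.occupancy[queue] += weight
        self.flows[flow_id] = (queue, weight)
        return queue

    def flow_idle(self, flow_id):
        """Called when a flow has no enqueued bytes left."""
        queue, weight = self.flows.pop(flow_id)
        self.occupancy[queue] -= weight


dqa = DynamicQueueAssigner(num_queues=3)
for flow in ("F1", "F2", "F3"):
    print(flow, "-> queue index", dqa.flow_start(flow))
print("occupancy:", dqa.occupancy)
```

With three flows and three queues, each flow gets its own queue, so no two flows collide the way they can under a static hash.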

Page 15: Problem: Hash collisions

Solution: Dynamic Queue Assignment (DQA)

[Diagram: with DQA, flows F1, F2, and F3 are spread across TXQ-1, TXQ-2, and TXQ-3 ahead of the Packet Scheduler, and F1, F2, and F3 alternate on the wire]

Page 16: Problem: Asymmetric Oversubscription

[Diagram: four flows (F1 to F4) on three transmit queues (TXQ-1, TXQ-2, TXQ-3), each queue with weight W:1, ahead of the Packet Scheduler. Wire schedule: F1 F3 F4 F1 F3 F4 F2 F3 F4 F2 F3 F4]

F1 and F2 receive half throughput
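A small deficit round-robin (DRR) simulation reproduces this effect. It is a simplified model under stated assumptions: F1 and F2 share one queue (as the wire schedule on the slide suggests), every flow is always backlogged, all packets are 1500 bytes, and flows within a queue are served in turn. The function name and parameters are hypothetical.

```python
from collections import Counter, deque


def drr_bytes_per_flow(queues, weights, rounds, quantum=1500, packet=1500):
    """Simulate per-queue DRR and return the bytes sent by each flow."""
    sent = Counter()
    deficits = [0] * len(queues)
    backlog = [deque(flows) for flows in queues]
    for _ in range(rounds):
        for q, flows in enumerate(backlog):
            if not flows:
                continue
            # Each round, a queue earns credit in proportion to its weight.
            deficits[q] += quantum * weights[q]
            while deficits[q] >= packet:
                flow = flows[0]
                flows.rotate(-1)        # flows sharing a queue take turns
                sent[flow] += packet
                deficits[q] -= packet
    return sent


queues = [["F1", "F2"], ["F3"], ["F4"]]
sent = drr_bytes_per_flow(queues, weights=[1, 1, 1], rounds=100)
for flow in sorted(sent):
    print(flow, sent[flow], "bytes")
```

With equal queue weights, each queue receives one third of the link, so the two flows sharing a queue each send half as many bytes as the flows that have a queue to themselves.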

Page 17: Problem: Asymmetric Oversubscription

Solution: Dynamic Queue Weight Assignment (DQWA)

[Diagram: flows F1 to F4 on TXQ-1, TXQ-2, and TXQ-3 with queue weights W:2, W:1, W:1 ahead of the Packet Scheduler. The OS updates the weights through ndo_set_tx_weight]

• OS assigns weights to flows
• OS updates the NIC scheduler with queue occupancies as flows start and stop (at most each TX interrupt)
• NIC updates DRR weights

This is implementable on existing commodity NICs because it only needs to update DRR weights!
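The effect of DQWA on the four-flow example can be worked through with exact fractions. This sketch assumes that a queue's weight is the sum of the weights of its flows, that F1 and F2 share TXQ-1, and that flows within a queue split that queue's share in proportion to their weights. The helper names are hypothetical, and nothing here calls the real ndo_set_tx_weight hook.

```python
from collections import defaultdict
from fractions import Fraction


def queue_weights(flow_queue, flow_weight):
    """DQWA-style queue weight: the sum of the weights of the flows on it."""
    weights = defaultdict(int)
    for flow, queue in flow_queue.items():
        weights[queue] += flow_weight[flow]
    return dict(weights)


def flow_shares(flow_queue, flow_weight, q_weights):
    """Fraction of the link each flow gets under weighted per-queue DRR."""
    occupancy = queue_weights(flow_queue, flow_weight)
    total = sum(q_weights[q] for q in occupancy)
    return {
        flow: Fraction(q_weights[q], total) * Fraction(flow_weight[flow], occupancy[q])
        for flow, q in flow_queue.items()
    }


# Assumed placement: F1 and F2 collide on TXQ-1.
flow_queue = {"F1": "TXQ-1", "F2": "TXQ-1", "F3": "TXQ-2", "F4": "TXQ-3"}
flow_weight = {flow: 1 for flow in flow_queue}

static = flow_shares(flow_queue, flow_weight, {"TXQ-1": 1, "TXQ-2": 1, "TXQ-3": 1})
dqwa = flow_shares(flow_queue, flow_weight, queue_weights(flow_queue, flow_weight))

print("equal queue weights:", {f: str(s) for f, s in static.items()})
print("DQWA queue weights: ", {f: str(s) for f, s in dqwa.items()})
```

With equal queue weights, F1 and F2 each get 1/6 of the link while F3 and F4 get 1/3. With weights 2, 1, 1, every flow gets 1/4.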

Page 18: Problem: Asymmetric Oversubscription

Solution: Dynamic Queue Weight Assignment (DQWA)

[Diagram: flows F1 to F4 on TXQ-1, TXQ-2, and TXQ-3 with queue weights W:2, W:1, W:1, updated through ndo_set_tx_weight, and the resulting wire schedule of F1 to F4]

DQA and DQWA provide long-term fairness

This is implementable on existing commodity NICs because it only needs to update DRR weights!

Page 19: Problem: TSO Unfairness

[Diagram: flows F1 to F4 on TXQ-1, TXQ-2, and TXQ-3 with queue weights W:2, W:1, W:1 ahead of the Packet Scheduler. The wire schedule is labeled "Short-term unfairness"]

• Short-term unfairness can cause bursts of congestion in the network
• Short-term unfairness can increase latency

Page 20: Problem: TSO Unfairness

Solution: Dynamic Segmentation Offload Sizing (DSOS)

[Diagram: flows F1 to F4 on TXQ-1, TXQ-2, and TXQ-3 with queue weights W:2, W:1, W:1 ahead of the Packet Scheduler, with F1 and F2 sent as smaller segments on the wire]

• DSOS dynamically changes the segment size during oversubscription
  • Same implementation as GSO
• CPU vs fairness tradeoff
  • Segmenting after the TCP/IP stack reduces CPU costs
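The idea behind DSOS can be sketched as a size decision followed by a software split. This is a hedged illustration: it assumes a queue counts as oversubscribed when more than one flow shares it, uses 64KB as the default segment size and 16KB as the reduced size (the 16KB value appears in the evaluation slides), and both function names are hypothetical.

```python
def dsos_segment_size(flows_in_queue, max_seg=64 * 1024, small_seg=16 * 1024):
    """Pick a segment size: shrink it only when the queue is shared.

    Treating "more than one flow on the queue" as oversubscription is an
    assumption made for this sketch.
    """
    return small_seg if flows_in_queue > 1 else max_seg


def split_segment(length, seg_size):
    """Split one large segment into chunks of at most seg_size bytes."""
    return [min(seg_size, length - offset) for offset in range(0, length, seg_size)]


for flows in (1, 2):
    size = dsos_segment_size(flows)
    chunks = split_segment(64 * 1024, size)
    print(f"{flows} flow(s) on queue: {len(chunks)} segment(s) of up to {size} bytes")
```

Smaller segments let the NIC scheduler switch between flows more often, at the price of more per-segment work for the CPU, which is the trade-off the slide names.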

Page 21: Implementation

• DQA, DQWA, and DSOS are implemented in Linux 4.4.6
• Support for ndo_set_tx_weight is implemented in the Intel ixgbe driver for the Intel 82599 10 Gbps NIC
• Titan is open source!

https://github.com/bestephe/titan

Page 22: Evaluation

• Microbenchmarks
  • 2 servers, 1 switch
  • 8 queue NICs
  • Vary number of flows (level of oversubscription)
• Incremental fairness benefits of DQA, DQWA, and DSOS
  • DQA and DQWA: expected to improve long-term fairness
  • DSOS: expected to improve short-term fairness

Page 23: Evaluation – Fairness Metric

Metrics:
• Normalized fairness metric (NFM) inspired by Shreedhar and Varghese:
  • NFM = 0 is fair
  • NFM > 1 is very unfair

[Diagram: an ideal packet schedule on the wire (NFM = 0) and an unfair packet schedule (NFM = 1) for flows F1, F2, and F3]

NFM = (Bytes(MaxFlow) – Bytes(MinFlow)) / Bytes(FairShare)
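The metric is straightforward to compute from per-flow byte counts measured over one window (for example 1 s or 1 ms). The sketch below assumes equal flow weights, so the fair share is the total bytes divided by the number of flows. The function name and the byte counts are made up for the example.

```python
def nfm(bytes_per_flow):
    """Normalized fairness metric over one measurement window.

    Assumes equal flow weights: fair share = total bytes / number of flows.
    """
    sent = list(bytes_per_flow.values())
    fair_share = sum(sent) / len(sent)
    return (max(sent) - min(sent)) / fair_share


print(nfm({"F1": 200, "F2": 200, "F3": 200}))  # every flow sent its fair share
print(nfm({"F1": 100, "F2": 200, "F3": 300}))  # spread equals one fair share
```

In the second call, the gap between the largest and smallest flow (200 bytes) equals the fair share (200 bytes), which is exactly the NFM = 1 case.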

Page 24: Microbenchmarks – 1s Timescale

[Bar chart: NFM at a 1 s timescale (y-axis, 0 to 2.5) vs. Number of Flows (6, 12, 24, 48). Series: Linux, DQA, DQA+DQWA, DQA+DQWA+DSOS (16KB)]

• Linux is unfair at all subscription levels
• DQA often significantly improves fairness
  • At 48 flows, flow churn prevents DQA from evenly spreading flows
• DQWA improves fairness when DQA cannot evenly spread flows across queues
• DSOS does not have a significant impact on long-term fairness

Page 25: Microbenchmarks – 1ms Timescale

[Bar chart: NFM at a 1 ms timescale (y-axis, 0 to 6) vs. Number of Flows (6, 12, 24, 48). Series: Linux, DQA, DQA+DQWA, DQA+DQWA+DSOS (16KB)]

• At short timescales and under oversubscription, DQA and DQWA do not significantly improve fairness
  • TSO is the primary cause of unfairness
• DSOS (16KB) often reduces unfairness by >2x

Page 26: Cluster Experiments

CDF of completion times in a 1GB all-to-all shuffle (24 servers)

[Figure: CDFs of flow completion time. Panels: (a) 6 servers, (b) 12 servers, (c) 24 servers. x-axis: Flow Completion Time (s); y-axis: Cumulative Probability. Legend: Vanilla, Vanilla (Cmax), Titan. The slide additionally labels curves "Linux" and "Titan"]

Titan improves fairness without changing the network core!

• Ideal CDF would be a vertical line
• Titan makes performance more predictable
• Titan improves tail performance (>90th percentile)

Page 27: Additional Evaluation

Additional performance metrics:
• Throughput: line-rate
• Latency: no significant change
• CPU Utilization:
  • DQA and DQWA: increase <10%
  • DSOS is better than statically decreasing the TSO size
  • DSOS motivates creating a better TSO implementation (zero-copy)

Linux network configuration trade-off study
• See paper

Page 28: Summary

• Multiqueue NICs can lead to significant flow-level unfairness
• Titan significantly improves fairness by allowing the OS to dynamically interact with the NIC packet scheduler
• Titan is implementable on commodity NICs!

https://github.com/bestephe/titan