DRFQ: Multi-Resource Fair Queueing for Packet Processing
Ali Ghodsi (1,3), Vyas Sekar (2), Matei Zaharia (1), Ion Stoica (1)
(1) UC Berkeley, (2) Intel ISTC/Stony Brook, (3) KTH


Feb 24, 2016





Increasing Network Complexity

Packet processing is becoming ever more sophisticated:
  Software Defined Networking (SDN)
  Middleboxes
  Software routers (e.g. RouteBricks)
  Hardware acceleration (e.g. SSLShader)

The data plane is no longer merely forwarding:
  WAN optimization
  Caching
  IDS
  VPN

Notes: Packet processing is becoming increasingly sophisticated, and several trends point in this direction. We just heard about Software Defined Networking, which is, among other things, increasingly used for access control and VPN services. We see a profusion of middleboxes, both within enterprises and, as we saw at the last SIGCOMM, at ISPs and cellular network providers for customer data. We see the use of hardware accelerators such as SSLShader, and finally the rise of software routers such as RouteBricks.

The data plane thus performs a variety of functions, including WAN optimization, HTTP caching, intrusion detection, and VPN.

Motivation

Flows increasingly have heterogeneous resource consumption:
  Intrusion detection bottlenecks on CPU
  Small packets bottleneck on memory bandwidth
  Unprocessed large packets bottleneck on link bandwidth

Scheduling based on a single resource is insufficient.

Notes: NetPiculet, cellular networks (Morley Mao); Theo Benson, ISPs' network-based services.

Problem

How to schedule packets from different flows when packets consume multiple resources?
How to generalize fair queueing to multiple resources?

Contribution

                            Allocation in Space    Allocation in Time
Single-Resource Fairness    Max-Min Fairness       Fair Queueing
Multi-Resource Fairness     DRF                    DRFQ

Generalize virtual time to multiple resources.

Outline
  Analysis of Natural Policies
  DRF allocations in Space
  DRFQ: DRF allocations in Time
  Implementation/Evaluation

Desirable Multi-Resource Properties

Share guarantee: each flow can get 1/n of at least one resource.

Strategy-proofness: a flow shouldn't be able to finish faster by increasing the resources required to process it.

Generalization of a key single-resource fair queueing property.

Notes: throughout the talk, weights are ignored.

Violation of Share Guarantee

Example using traditional FQ:
  Two resources, CPU and NIC, used serially
  Two flows with profiles [...] and [...]
  FQ based on NIC alternates one packet from each flow
  CPU is bottlenecked due to more aggregate demand
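As a rough check of the example above, the sketch below computes each flow's share of each resource when the scheduler simply alternates one packet per flow. The per-packet profiles <2, 1> and <1, 1> (time units of CPU and NIC) are an assumption, chosen because they are consistent with the 66%/33% shares shown in the slide's figure; the slide text itself does not preserve them.

```python
from fractions import Fraction

def shares_under_alternation(profiles):
    """Resource shares when one packet per flow is served in each round
    and the most-demanded resource is kept 100% busy."""
    n_res = len(profiles[0])
    # Busy time of each resource during one round (one packet per flow).
    busy = [sum(p[r] for p in profiles) for r in range(n_res)]
    # The bottleneck resource determines how long a round lasts.
    round_len = max(busy)
    return [[Fraction(p[r], round_len) for r in range(n_res)] for p in profiles]

# ASSUMED <CPU, NIC> processing times per packet; not taken from the slide text.
profiles = [(2, 1), (1, 1)]
shares = shares_under_alternation(profiles)

for i, (cpu, nic) in enumerate(shares, start=1):
    print(f"flow {i}: CPU {float(cpu):.2f}, NIC {float(nic):.2f}")
```

Under these assumed profiles the second flow ends up with one third of each resource, which is below the 1/2 of at least one resource that the share guarantee would promise with two flows.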

[Figure: Share guarantee violated by single-resource FQ. Bars show the CPU and NIC shares (0% to 100%) of Flow 1 and Flow 2; labeled shares are 66%, 33%, 33%, 33%.]

Notes: emphasize units, and say we assume there is a single CPU and a single NIC, but return to this issue later.

Violation of Strategy-Proofness

Bottleneck fairness (from related work):
  Determine which resource is bottlenecked
  Apply FQ to that resource

Example with bottleneck fairness: 2 resources (CPU, NIC) and 3 flows with profiles [...], [...], [...]. CPU is bottlenecked and split equally.

Flow 1 changes its profile to [...]. NIC is now bottlenecked and split equally.

[Figure: Bottleneck fairness violates strategy-proofness. Two bar charts of CPU and NIC shares (0% to 100%) for flows 1, 2 and 3; labeled shares are 48% in the first chart and 33%, 33% in the second.]

Is strategy-proofness important?

Lack of strategy-proofness encourages wastage:
  Decreases the goodput of the system

Networking applications are especially savvy:
  Peer-to-peer apps manipulate their demands to get more resources

Strategy-proofness is trivially guaranteed for single-resource fairness, but not for multi-resource fairness.

Notes: packet size manipulation, regexp manipulations.

Dominant Resource Fairness

DRF was originally proposed in the cloud computing context:
  Satisfies the share guarantee
  Satisfies strategy-proofness

DRF Allocations

The dominant resource of a user is the resource she is allocated most of.
The dominant share is the user's share of her dominant resource.

DRF: apply max-min fairness to dominant shares, i.e. equalize the dominant share of all users.

Total resources: [...]
User 1 demand: [...], dominant resource: CPU
User 2 demand: [...], dominant resource: memory
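To make "equalize the dominant share" concrete, here is a minimal sketch of DRF by progressive filling. The totals and demands are illustrative assumptions (9 CPUs and 18 GB, with per-task demands of <3 CPUs, 1 GB> and <1 CPU, 4 GB>), not the numbers from this slide, which were lost in extraction. The function name `drf_allocate` is likewise hypothetical.

```python
from fractions import Fraction

def drf_allocate(capacity, demands):
    """Progressive filling: repeatedly grant one task to the user with the
    lowest dominant share; stop when that user's next task does not fit."""
    n_res = len(capacity)
    used = [0] * n_res
    tasks = [0] * len(demands)

    def dominant_share(u):
        # Largest fraction of any single resource held by user u.
        return max(Fraction(tasks[u] * demands[u][r], capacity[r])
                   for r in range(n_res))

    while True:
        u = min(range(len(demands)), key=dominant_share)
        if any(used[r] + demands[u][r] > capacity[r] for r in range(n_res)):
            break
        for r in range(n_res):
            used[r] += demands[u][r]
        tasks[u] += 1
    return tasks, [dominant_share(u) for u in range(len(demands))]

capacity = (9, 18)            # ASSUMED totals: 9 CPUs, 18 GB of memory
demands = [(3, 1), (1, 4)]    # ASSUMED per-task demands <CPUs, GB>
tasks, shares = drf_allocate(capacity, demands)
print(tasks, [str(s) for s in shares])
```

With these assumed inputs, user 1 is CPU-dominant and user 2 is memory-dominant, and both end with the same dominant share even though they hold different amounts of each resource.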

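[Figure: DRF allocation. Bars show the CPU and memory shares (0% to 100%) of User 1 and User 2; labels: 3 CPUs, 12 GB, 12 CPUs, 4 GB, and a 66% dominant share for each user.]

Allocations in Space vs Time

DRF provides allocations in space:
  Given 1000 CPUs and 1 TB of memory, how much to allocate to each user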

DRFQ provides DRF allocations in time:
  Multiplex packets to achieve DRF allocations over time

Packet Resource Consumption

Link usage of packets is trivial in FQ:
  Packet size divided by the throughput of the link

Packet processing time is unknown a priori with multiple resources:
  Depends on the modules that process the packet

Leverage Start-time Fair Queueing (SFQ):
  Schedules based on the virtual start time of packets
  Start time of packet p is independent of the resource consumption of packet p

Notes: to develop DRFQ, let's start by looking at some time-multiplexing requirements.

Memoryless Requirement

Lesson from Virtual Clock:
  Simulates flows being dedicated a predefined 1/n share

Problem:
  During light load a flow might get more than 1/n
  A flow receiving more than 1/n gets punished later

Requirement: memoryless scheduling
  A flow's share of resources should be independent of its share in the past

Virtual Time

[Figure: virtual time V(t) plotted against real time.]

Virtual time tracks the amount of service received:
  A unit of virtual time always corresponds to the same amount of service

Example with 2 flows:
  Time 20: one backlogged flow
  Time 40: two backlogged flows

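Schedule the packets according to V(t):
  Assign virtual start/finish times when a packet arrives

[Figure annotations: real-time ticks 20, 40, 60, 80; virtual-time ticks 20, 40, 60; segments labeled slope 2 and slope 1.]

Notes: stress at the beginning that one time unit is the same; stress at the end that it is the same time but different service.

Dove-tailing Requirement

Packet size doesn't affect the service received in FQ:
  A flow with 10 1 kb packets gets the same service as a flow with 5 2 kb packets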

Use flow processing time, not packet processing time.
Example: give the same service to these flows:
  Flow 1: p1 [...], p2 [...], p3 [...], p4 [...]
  Flow 2: p1 [...], p2 [...], p3 [...], p4 [...]

Requirement: dove-tailing
  Packet processing times should be independent of how resource consumption is distributed in a flow

Notes: why dovetailing? Dovetailing is important because of buffering.

Tradeoff

Dovetailing and the memoryless property are at odds:
  Dovetailing needs to remember past consumption

DRFQ is developed in three steps:
  Memoryless DRFQ: uses a single virtual time
  Dovetailing DRFQ: uses a virtual time per resource
  DRFQ: generalizes both

Memoryless DRFQ

Attach a virtual start and finish time to every packet.

Computing the virtual finish time:
  finish time = start time + packet-max-processing-time

Computing the virtual start time:
  The start time of the first packet in a burst equals the start time of the packet currently serviced (zero if none)
  For a backlogged flow, the start time of a packet equals the finish time of the previous packet

Service the packet with minimum virtual start time
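The tagging rules above can be sketched in a few lines. This is only an illustration, not the authors' implementation: the function name `memoryless_tags` is hypothetical, and the per-packet profiles are assumptions inferred from the start and finish times listed in the slides' own example (flow 1 alternating <1, 2> and <2, 1>, flow 2 using <3, 3>).

```python
def memoryless_tags(packets, first_start=0):
    """Virtual (start, finish) tags for one continuously backlogged flow.

    packets: per-packet processing times, one entry per resource.
    first_start: start time of the packet in service when the burst begins
                 (zero if none).
    """
    tags, start = [], first_start
    for profile in packets:
        finish = start + max(profile)  # finish = start + max processing time
        tags.append((start, finish))
        start = finish                 # backlogged: next start = previous finish
    return tags

# ASSUMED <resource 1, resource 2> processing times per packet.
flow1 = [(1, 2), (2, 1), (1, 2), (2, 1), (1, 2)]
flow2 = [(3, 3)] * 3

tags1 = memoryless_tags(flow1)
tags2 = memoryless_tags(flow2)

# Serve packets in order of increasing virtual start time.
order = sorted(
    [(s, "flow 1", k + 1) for k, (s, _) in enumerate(tags1)]
    + [(s, "flow 2", k + 1) for k, (s, _) in enumerate(tags2)]
)
print(tags1, tags2, order, sep="\n")
```

Because only the maximum processing time of each packet is charged, flow 1 is billed 2 units per packet even though its two resources are never both busy for 2 units on the same packet.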

Notes: rule 2 is essential for the memoryless property.

Memoryless DRFQ example

Two flows become backlogged at time 0:
  Flow 1 alternates [...] and [...] packet processing
  Flow 2 uses [...] packet processing time

Rules:
  finish time = start time + packet-max-processing-time
  The start time of the first packet in a burst equals the start time of the packet currently serviced (zero if none)
  For backlogged flows, the start time is the finish time of the previous packet

Flow 1:  P1 (S: 0, F: 2)   P2 (S: 2, F: 4)   P3 (S: 4, F: 6)   P4 (S: 6, F: 8)   P5 (S: 8, F: 10)
Flow 2:  P1 (S: 0, F: 3)   P2 (S: 3, F: 6)   P3 (S: 6, F: 9)

Flow 1 gets worse service than Flow 2.

Dovetailing DRFQ

Keep track of start and finish time per resource:
  Dovetail by keeping track of all resource usage
  For each packet use the maximum start time

Dovetailing DRFQ example

Two flows become backlogged at time 0:
  Flow 1 alternates [...] and [...] packet processing
  Flow 2 uses [...] per packet

Flow 1, packet 1:  S1: 0  F1: 1   S2: 0  F2: 2
Flow 1, packet 2:  S1: 1  F1: 3   S2: 2  F2: 3
Flow 1, packet 3:  S1: 3  F1: 4   S2: 3  F2: 5
Flow 1, packet 4:  S1: 4  F1: 6   S2: 5  F2: 6
Flow 1, packet 5:  S1: 6  F1: 7   S2: 6  F2: 8
Flow 2, packet 1:  S1: 0  F1: 3   S2: 0  F2: 3
Flow 2, packet 2:  S1: 3  F1: 6   S2: 3  F2: 6
Flow 2, packet 3:  S1: 6  F1: 9   S2: 6  F2: 9

Dovetailing ensures both flows get the same service.
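The per-resource bookkeeping in this example can be reproduced with a short sketch. The function names are hypothetical, and the profiles (flow 1 alternating <1, 2> and <2, 1>, flow 2 using <3, 3>) are assumptions inferred from the S1/F1/S2/F2 values in the example rather than values stated in the slide text.

```python
def dovetailing_tags(packets):
    """Per-resource (start, finish) tags for one continuously backlogged flow."""
    n_res = len(packets[0])
    prev_finish = [0] * n_res
    tags = []
    for profile in packets:
        # On each resource, a packet starts where the previous one finished.
        start = list(prev_finish)
        finish = [s + t for s, t in zip(start, profile)]
        tags.append((start, finish))
        prev_finish = finish
    return tags

def scheduling_keys(tags):
    """Each packet is scheduled by its maximum start time over all resources."""
    return [max(start) for start, _ in tags]

# ASSUMED <resource 1, resource 2> processing times per packet.
flow1 = [(1, 2), (2, 1), (1, 2), (2, 1), (1, 2)]
flow2 = [(3, 3)] * 3

tags1 = dovetailing_tags(flow1)
tags2 = dovetailing_tags(flow2)
print(scheduling_keys(tags1), scheduling_keys(tags2))
```

After four packets of flow 1 and two packets of flow 2, both flows have finish times of 6 on both resources, which is the sense in which the two flows receive the same service.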

DRFQ algorithm

DRFQ bounds dovetailing to Δ processing time:
  Dovetail up to Δ processing time units
  Memoryless beyond

DRFQ is a generalization:
  When Δ = 0, DRFQ = memoryless DRFQ
  When Δ = ∞, DRFQ = dovetailing DRFQ
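One way to picture the interpolation between the two variants is the sketch below. The slides do not spell out the bounding rule, so the rule used here is an assumption: after each packet, no resource's next start time may lag the largest finish time by more than the bound, called `delta` in the code. The function name and the packet profiles are hypothetical as well.

```python
import math

def bounded_dovetailing_tags(packets, delta):
    """Per-resource tags where dovetailing is limited to `delta` time units.

    ASSUMED rule: the next start time on each resource is
    max(own finish time, largest finish time - delta).
    """
    n_res = len(packets[0])
    next_start = [0] * n_res
    tags = []
    for profile in packets:
        start = list(next_start)
        finish = [s + t for s, t in zip(start, profile)]
        tags.append((start, finish))
        latest = max(finish)
        # A resource may trail the most-used resource by at most delta.
        next_start = [max(f, latest - delta) for f in finish]
    return tags

def keys(tags):
    """Schedule each packet by its maximum start time over all resources."""
    return [max(start) for start, _ in tags]

# ASSUMED <resource 1, resource 2> processing times per packet.
flow1 = [(1, 2), (2, 1), (1, 2), (2, 1), (1, 2)]

memoryless_like = keys(bounded_dovetailing_tags(flow1, 0))
dovetailing_like = keys(bounded_dovetailing_tags(flow1, math.inf))
print(memoryless_like, dovetailing_like)
```

Under this assumed rule, a bound of zero charges every packet its maximum processing time on all resources, and an infinite bound leaves the per-resource finish times untouched, matching the two limiting cases named above.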

Set Δ to a few packets' worth of processing.

Isolation Experiment

DRFQ implementation in Click:
  2 elephants: 40K/sec basic, 40K/sec IPSec
  2 mice: 1/sec basic, 0.5/sec basic

Non-backlogged flows are isolated from backlogged flows.

Notes: Basic (link), IPSec (CPU), Basic, Basic. Queues are large for the elephants, affecting latency. Mice send few packets and do not get queues.

Simulating Bottleneck Fairness

2 flows and 2 resources. Demands [...] and [...]: bottleneck unclear.

Especially bad for TCP and video/audio traffic

Summary

Packet processing is becoming ever more sophisticated:
  Packets consume multiple resources

Natural policies are not suitable:
  Per-Resource Fairness (PRF) is not strategy-proof
  Bottleneck Fairness doesn't provide isolation

Proposed Dominant Resource Fair Queueing (DRFQ):
  Generalization of FQ to multiple resources
  Generalizes virtual time to multiple resources
  Provides a tradeoff between memoryless and dovetailing
  Provides share guarantee (isolation) and strategy-proofness

Natural Policy

Per-Resource Fairness (PRF):
  Have a buffer between each resource
  Apply fair queueing to each resource

PRF abandoned in favor of DRFQ:
  Not strategy-proof
  Requires per-resource buffers

Overhead

350 MB trace run through our Click implementation.

Evaluate the overhead of two modules:
  Intrusion detection: 2% overhead
  Flow monitoring: 4% overhead

Determining Resource Consumption

Resource consumption is obvious in routers:
  Packet size divided by link rate

Generalize consumption to processing time:
  Normalized time a resource takes to process a packet

Normalized processing time:
  E.g. if 1 core takes 20 μs to service a packet, on a quad-core the packet processing time is 5 μs
  Packet processing time ≠ packet service time

Module Consumption Estimation

Linear estimation of processing time:
  For module m and resource r, as a function of packet size

R² > 0.90 for most modules
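A linear estimate of this kind can be obtained with ordinary least squares. The sketch below is illustrative only: the function name `fit_line` is hypothetical, and the packet sizes and CPU times are made-up sample measurements for one module and one resource, not data from the talk.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b; returns (a, b, r_squared)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return a, b, 1 - ss_res / ss_tot

# MADE-UP measurements: packet size in bytes, CPU time in microseconds.
sizes = [64, 256, 512, 1024, 1500]
cpu_us = [2.1, 3.0, 4.4, 6.9, 9.2]

a, b, r2 = fit_line(sizes, cpu_us)

def estimate_cpu_us(size):
    """Estimated CPU time for a packet of the given size."""
    return a * size + b

print(f"slope={a:.5f} us/byte, intercept={b:.3f} us, R^2={r2:.3f}")
```

A scheduler could call such an estimator per module and per resource when a packet arrives, falling back to measured times where the fit is poor.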

Simulating Bottleneck Fairness

2 flows and 2 resources. Demands [...] and [...]: bottleneck unclear.

CPU bottleneck: 7 [...] + [...] = [...]
NIC bottleneck: [...] + 6 [...] = [...]
The bottleneck periodically oscillates.

TCP and oscillations

Implemented Bottleneck Fairness in Click:
  20 ms artificial link delay added to simulate a WAN
  Bottleneck determined every 300 ms
  1 BW-bound flow and 1 CPU-bound flow

Oscillations in the bottleneck degrade the performance of TCP.

Multi-Resource Consumption Contexts

Different modules within a middlebox:
  E.g. Bro modules for HTTP, FTP, telnet

Different apps on a consolidated middlebox:
  Different applications consume different resources

Other contexts:
  VM scheduling in hypervisors
  Requests to a shared service (e.g. HDFS)
