Towards a Redundancy-Aware Network Stack for Data Centers Ali Musa Iftikhar (Tufts) Ihsan Ayyub Qazi (LUMS) Fahad R. Dogar (Tufts)

Source file: musa/uploads/hotnets_2016_talk.pdf

May 21, 2020


The Problem of Tail Latency in Data Centers!

High fan-out + Stragglers → Long tail latency

Sources of stragglers:
• Load imbalance
• Background tasks
• Failures, etc.
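Why high fan-out amplifies stragglers can be illustrated with a short calculation. This is only a sketch under an assumption that is not on the slides: each server a request fans out to is slow independently with the same probability. The function name and the parameter values are hypothetical.

```python
def prob_request_hits_straggler(fan_out: int, p_slow: float) -> float:
    """Probability that at least one of `fan_out` parallel sub-requests is slow.

    Assumption (not from the slides): servers are slow independently,
    each with probability `p_slow`. Real stragglers are often correlated.
    """
    if fan_out < 1:
        raise ValueError("fan_out must be at least 1")
    if not 0.0 <= p_slow <= 1.0:
        raise ValueError("p_slow must be in [0, 1]")
    # The request is fast only if every sub-request is fast.
    return 1.0 - (1.0 - p_slow) ** fan_out


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, round(prob_request_hits_straggler(n, 0.01), 3))
```

Under this assumption, a request that fans out to 100 servers, each slow 1% of the time, waits on at least one straggler about 63% of the time.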

How to avoid stragglers?

Reactively
• Examples: Hopper (SIGCOMM ’15), C3 (NSDI ’15), Sinbad (SIGCOMM ’13)
• PRO: low overhead
• CON: requires straggler detection (slow and inaccurate)

Proactively
• Examples: Dolly (NSDI ’13), Low latency via redundancy (CoNEXT ’13)
• PRO: fast and accurate
• CON: requires determining threshold load (non-trivial)

Can we achieve the benefits of both without their limitations?

Overview
• Duplicate-Aware Scheduling Framework (generic framework)
• Redundancy-Aware Network Stack (new network stack for DC)
• Preliminary Results

Duplicate-aware scheduling

[Diagram: a client sends a request to Replica 1 and Replica 2. Each replica has a high-priority and a low-priority queue. The primary copy (P) is served from the high-priority queue at Replica 1; the backup copy (B) waits in the low-priority queue at Replica 2. Once the primary has been served, the backup is purged.]

1. Priority Queues
2. Purging
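The two mechanisms in the diagram, priority queues and purging, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the class and function names are hypothetical, and purging at the moment a copy is served is an assumption, since the slides do not say exactly when the purge is sent.

```python
from collections import deque


class Replica:
    """A server with a high-priority and a low-priority FIFO queue."""

    def __init__(self, name):
        self.name = name
        self.high = deque()  # primary copies
        self.low = deque()   # backup (duplicate) copies

    def next_request(self):
        """Strict priority: serve a backup only when no primary is waiting."""
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None

    def purge(self, request_id):
        """Remove any queued copy of a request that was answered elsewhere."""
        for queue in (self.high, self.low):
            if request_id in queue:
                queue.remove(request_id)


def send(request_id, primary, backup):
    """Client side: primary copy at high priority, duplicate at low priority."""
    primary.high.append(request_id)
    backup.low.append(request_id)


def serve_one(replica, all_replicas):
    """Serve one request at `replica`, then purge its copies at the others."""
    request_id = replica.next_request()
    if request_id is not None:
        for other in all_replicas:
            if other is not replica:
                other.purge(request_id)
    return request_id


if __name__ == "__main__":
    r1, r2 = Replica("replica1"), Replica("replica2")
    send("req-A", primary=r1, backup=r2)
    send("req-B", primary=r2, backup=r1)
    print(serve_one(r2, [r1, r2]))  # r2 serves its own primary first
    print(serve_one(r2, [r1, r2]))  # r2 is otherwise idle, so it serves the backup
    print(serve_one(r1, [r1, r2]))  # both copies at r1 were purged
```

In the usage example, Replica 2 finishes its own primary and then picks up the backup of the other request, so Replica 1 never has to serve it.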

Need for Priority Queuing

[Diagram: the primary copy sits in the high-priority queue and the backup copy in the low-priority queue.]

• Duplication has an overhead!

Properties required:
✓ Strict priorities
✓ Work conservation
✓ Preemption

PQ makes the overhead of duplication low. (essential)
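The three required properties (strict priorities, work conservation, preemption) can be seen together in a toy single-server simulation. This is a sketch under simplifying assumptions that are not from the slides: time is divided into unit slots, preemption is free, and each job is a tuple of arrival slot, work units, and a primary/backup flag. All names are hypothetical.

```python
def run_preemptive_priority(jobs, horizon):
    """Simulate one server in unit time slots.

    jobs: dict mapping name -> (arrival_slot, work_units, is_primary)
    Returns a dict mapping name -> completion time (end of the last slot used).
    """
    remaining = {name: work for name, (_, work, _) in jobs.items()}
    finished = {}
    for t in range(horizon):
        ready = [name for name, (arrival, _, _) in jobs.items()
                 if arrival <= t and remaining[name] > 0]
        if not ready:
            continue  # idle only when nothing is waiting (work conservation)
        # Strict priority: any ready primary beats every backup.
        # Deciding again in every slot means a running backup is
        # preempted as soon as a primary arrives.
        ready.sort(key=lambda name: (not jobs[name][2], jobs[name][0]))
        chosen = ready[0]
        remaining[chosen] -= 1
        if remaining[chosen] == 0:
            finished[chosen] = t + 1
    return finished


if __name__ == "__main__":
    jobs = {
        "backup": (0, 4, False),   # duplicate already running at slot 0
        "primary": (1, 2, True),   # primary arrives one slot later
    }
    print(run_preemptive_priority(jobs, horizon=10))
```

In this example the primary finishes two slots after it arrives, exactly as if the backup were not there, while the backup still uses every slot the primary leaves free.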

Importance of Purging

[Diagram: high- and low-priority queues; the low-priority queue holds req1 and req2, with one request marked stale.]

• Stale requests block new requests.

Purging makes the system more efficient! (optimization)

Realizing Duplicate-Aware Scheduling at every potential bottleneck resource in a DC

Resources:
• Network
• Compute
• Memory
• Storage
• Filesystem/Database (GFS, HDFS, BigTable)

Challenges:
• In-network purging
• Prioritization
• Purging + preemption

Redundancy Aware Network Stack (RANS)

Layer        New Role
Application  Duplicate-Awareness
             (Expressive Interface between Application and Transport)
Transport    Point to multipoint
Network      Priority Queues
Link         Same as before
Physical     Same as before

[The slide marks "+purging" at two places in the stack.]

• Challenge: Applications need to be modified.
• Opportunity: Expressive interface allows rich communication between App and Transport, e.g. DAG.
• Challenge: Hard to implement per-packet purging.
• Opportunity: Adds support for existing PQs in DC switches.

RANS Transport: Point to Multi-point

[Diagram: Sender 1 (replica 1) and Sender 2 (replica 2) both connect to the Receiver (client).]

✓ Multipath
✓ Multi-destination
• Enables: Rich transport, e.g. improved fault tolerance

RANS Transport: Byte Aggregation

[Diagram: Sender 1 (replica 1) and Sender 2 (replica 2) each send a response to the Receiver (client).]

✓ Two or more response streams
✓ Aggregate bytes at receiver side
• Opportunity: Receiver-driven transport, e.g. more efficient congestion control (2x or more)
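Aggregating bytes at the receiver can be pictured as one reassembly buffer fed by several response streams. The sketch below is an illustration only: the slide does not specify how senders choose which bytes to send or how segments are framed, so the class name, the (offset, data) segment format, and the overlapping ranges in the example are all assumptions.

```python
class ByteAggregator:
    """Receiver-side buffer that merges segments from several replicas."""

    def __init__(self, size):
        self.buffer = bytearray(size)
        self.have = [False] * size  # which byte offsets have arrived
        self.missing = size

    def on_segment(self, offset, data):
        """Store a segment from any sender; return how many bytes were new."""
        if offset < 0 or offset + len(data) > len(self.buffer):
            raise ValueError("segment falls outside the response")
        new = 0
        for i, byte in enumerate(data, start=offset):
            if not self.have[i]:  # first copy of a byte wins, repeats are ignored
                self.buffer[i] = byte
                self.have[i] = True
                new += 1
        self.missing -= new
        return new

    def complete(self):
        """True once every byte has arrived from at least one sender."""
        return self.missing == 0


if __name__ == "__main__":
    payload = b"0123456789"
    agg = ByteAggregator(len(payload))
    agg.on_segment(0, payload[0:6])    # stream from replica 1
    agg.on_segment(4, payload[4:10])   # stream from replica 2, overlapping
    print(agg.complete(), bytes(agg.buffer))
```

The response is complete as soon as the union of both streams covers it, so neither stream has to deliver the whole response on its own.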

RANS Transport: Priority Assignment

[Diagram: Sender 1 (replica 1) and Sender 2 (replica 2) send Response + Feedback to the Receiver (client).]

✓ Fine-grained monitoring of congestion window
✓ Dynamically reprioritize flows
✓ Feedback to Application
• Dynamic replica assignment, e.g. improved replica assignment


Preliminary Evaluation: ns-2 setup details

• Storage scenario

[Diagram: a client, 10 servers, Replica 1 and Replica 2, with the bottlenecks marked.]

Traffic Details
Total requests               20K
Arrival process              Poisson
Server & replica selection   Uniformly random

The only source of stragglers is load imbalance.
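The traffic parameters above can be turned into a small workload generator. This is a sketch and not the authors' ns-2 script: the arrival rate is a hypothetical parameter (the slides vary load but give no rates), and "uniformly random" selection is assumed to mean picking two distinct servers per request, one for the primary copy and one for the backup.

```python
import random


def generate_requests(num_requests, arrival_rate, num_servers, seed=0):
    """Yield (arrival_time, primary_server, backup_server) tuples.

    arrival_rate: mean requests per second (hypothetical; not from the slides).
    """
    rng = random.Random(seed)
    now = 0.0
    for _ in range(num_requests):
        # A Poisson arrival process has exponentially distributed gaps.
        now += rng.expovariate(arrival_rate)
        # Two distinct servers, chosen uniformly at random.
        primary, backup = rng.sample(range(num_servers), 2)
        yield now, primary, backup


if __name__ == "__main__":
    requests = list(generate_requests(20_000, arrival_rate=100.0, num_servers=10))
    print(len(requests), requests[0])
```

With random selection and no other disturbance, some servers receive more requests than others at any moment, which is the load imbalance that produces stragglers in this setup.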

[Figure: average request completion time (s) versus load (%) for: No duplicates (baseline); 2-copies (proactive w/o PQ); + PQs; + Purging; + Byte Aggregation (RANS). One gap in the plot is annotated "~2X".]

• 50-80% improvement over the baseline across all loads.
• Expecting more gains even at lower loads with additional straggler sources.

Summary & Future work
• The Issue of Stragglers
• Duplicate-Aware Scheduling Framework: simple yet challenging solution
• RANS: a first step towards a duplicate-aware network
• Implementing in HDFS and Cassandra

RANS: Feedback and Discussion

• Ali Musa Iftikhar ([email protected])
• Fahad R. Dogar ([email protected])
• Ihsan A. Qazi ([email protected])

Layer        New Role
Application  Duplicate-Awareness
             (Expressive Interface between Application and Transport)
Transport    Point to multipoint, Byte aggregation, Priority assignment
Network      Priority Queues
Link         Same as before
Physical     Same as before

[The slide marks "+purging" at two places in the stack.]

Possible questions (backup slide)

• Preemption overhead
  • Not really an issue in the network because packets are small.
• Packet purging
  • PFC (backpressure, build queues at the end hosts and purge them)
  • Drop the entire duplicate queue (easier than per-packet drops)
  • Recent trend towards programmable switches
• Gains with PQ
  • More gains with failures as stragglers (primary undergoes a failure)
  • Also more benefits with different resources
• Duplication overhead at client
  • Client is usually not the bottleneck
• Non-idempotent requests
  • We are targeting the class of apps that have flexible endpoints and require at-least-once semantics
• Replicating only small packets and prioritizing them
  • Only beneficial with bursty small flows
  • HDFS has a typical chunk size between 64 MB and 128 MB
• Quorum systems
  • RANS complements such systems: they can use this technique and send K out of N requests at high priority and the remaining N-K as backups
• Can't we just implement this at the app and get the same benefits?
  • Network could be a bottleneck
  • Fine-grained control, much more control
• Root causes of performance improvement
  • PQ avoids overheads
  • Now we can easily get the benefits of duplication, like aggregation etc.
  • Purging will also at times purge the primary, making the system more efficient.
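The quorum-systems answer above (K out of N requests at high priority, the remaining N-K as backups) can be sketched as a simple labeling step. The function name is hypothetical, and marking the first K replicas in list order as high priority is an arbitrary assumption; the slide does not say how the K are chosen.

```python
def assign_priorities(replicas, k):
    """Label K replicas "high" (quorum requests) and the other N-K "low" (backups)."""
    if not 0 < k <= len(replicas):
        raise ValueError("k must be between 1 and the number of replicas")
    return {replica: ("high" if index < k else "low")
            for index, replica in enumerate(replicas)}


if __name__ == "__main__":
    print(assign_priorities(["r1", "r2", "r3"], k=2))
```

For a quorum of 2 out of 3, two requests travel at high priority and the third only consumes capacity that would otherwise sit idle.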

Food for thought

Inter-DC Duplicate-Aware Scheduling
• DC Primary, DC Failover
• e.g. Google’s Geo-Distributed Database “Spanner” (OSDI ’12)

Prefetch, search suggestions, spell check
• Search engines drop spell check, suggestions, etc. at high loads.
• Can benefit from duplicate-aware scheduling.

When does RANS work best?

• Application fan-out is high and stragglers are frequent.
• End-points are flexible and “at least once” semantics are sufficient.
• Client is not the bottleneck.
• Request sizes are small (or preemption overhead is minimal).