Symbiotic Routing in Future Data Centers

Hussam Abu-Libdeh, Paolo Costa, Antony Rowstron, Greg O’Shea, Austin Donnelly

Cornell University, Microsoft Research Cambridge

Mar 30, 2015



Data center networking
• Network principles evolved from Internet systems
  • Multiple administrative domains
  • Heterogeneous environment
• But data centers are different
  • Single administrative domain
  • Total control over all operational aspects
• Re-examine the network in this new setting


Rethinking DC networks
• New proposals for data center network architectures
  • DCell, BCube, Fat-tree, VL2, PortLand …
• Network interface has not changed!

[Figure labels: Performance Isolation, Bandwidth, Fault Tolerance, Graceful Degradation, Scalability, TCO, Commodity Components, Modular Design, ...; Network Interface]


Challenge
• The network is a black box to applications
  • Must infer network properties
    • Locality, congestion, failure, etc.
  • Little or no control over routing
• Applications are a black box to the network
  • Must infer flow properties
    • E.g. traffic engineering (Hedera)
• In consequence
  • Today’s data centers and proposals use a single protocol
  • Routing trade-offs made in an application-agnostic way
    • E.g. latency, throughput, etc.


CamCube
• A new data center design
  • Nodes are commodity x86 servers with local storage
  • Container-based model: 1,500-2,500 servers
• Direct-connect 3D torus topology
  • Six Ethernet ports per server
  • Servers have (x,y,z) coordinates
    • Defines coordinate space
• Simple 1-hop API
  • Send/receive packets to/from 1-hop neighbours
  • Not using TCP/IP
• Everything is a service
  • Run on all servers
• Multi-hop routing is a service
  • Simple link state protocol
  • Route packets along shortest paths from source to destination
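
The coordinate space and the 1-hop neighbour relation can be sketched in a few lines. This is a minimal illustration in Python, not the CamCube API: the function names `neighbours` and `hop_distance` are made up for this sketch, and it assumes a cube of side `k` whose links wrap around at the edges.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]

def neighbours(c: Coord, k: int) -> List[Coord]:
    """Return the six 1-hop neighbours of c in a k x k x k torus."""
    x, y, z = c
    result = []
    for axis in range(3):
        for step in (-1, 1):
            n = [x, y, z]
            n[axis] = (n[axis] + step) % k  # wrap around at the cube edges
            result.append(tuple(n))
    return result

def hop_distance(a: Coord, b: Coord, k: int) -> int:
    """Shortest-path hop count: per axis, take the shorter way round the ring."""
    return sum(min((p - q) % k, (q - p) % k) for p, q in zip(a, b))

print(neighbours((0, 2, 0), 3))
print(hop_distance((0, 0, 0), (10, 10, 10), 20))
```

Under these assumptions the longest shortest path in a 20x20x20 cube is 10 hops per axis, 30 hops in total.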

[Figure: 3D torus with x, y, and z axes; one server labelled with coordinate (0,2,0)]

Development experience
• Built many data center services on CamCube
• E.g.
  • High-throughput transport service
    • Desired property: high throughput
  • Large-file multicast service
    • Desired property: low link load
  • Aggregation service
    • Desired property: distribute computation load over servers
  • Distributed object cache service
    • Desired property: per-key caches, low path stretch


Per-service routing protocols
• Higher flexibility
• Services optimize for different objectives
  • High-throughput transport → disjoint paths
    • Increases throughput
  • File multicast → non-disjoint paths
    • Decreases network load
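
The difference between disjoint and shared paths can be made concrete with a small check. The Python sketch below is an illustration only and does not reproduce how the transport or multicast services actually pick paths. It assumes that paths are built by walking one axis at a time, ignores wrap-around links for brevity, and uses the hypothetical helpers `axis_ordered_path` and `link_disjoint`. When source and destination differ on all three axes, rotating the axis order yields three link-disjoint paths.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]
Link = Tuple[Coord, Coord]

def axis_ordered_path(src: Coord, dst: Coord, order: Tuple[int, int, int]) -> List[Link]:
    """Walk from src to dst one axis at a time, in the given axis order."""
    cur = list(src)
    links = []
    for axis in order:
        while cur[axis] != dst[axis]:
            nxt = list(cur)
            nxt[axis] += 1 if dst[axis] > cur[axis] else -1
            links.append((tuple(cur), tuple(nxt)))
            cur = nxt
    return links

def link_disjoint(paths: List[List[Link]]) -> bool:
    """True if no undirected link is used more than once across the paths."""
    seen = [frozenset(link) for path in paths for link in path]
    return len(seen) == len(set(seen))

src, dst = (0, 0, 0), (2, 1, 1)
rotations = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
paths = [axis_ordered_path(src, dst, order) for order in rotations]
print(link_disjoint(paths))                 # three separate paths
print(link_disjoint([paths[0], paths[0]]))  # the same path used twice
```

Disjoint paths let one flow use several links in parallel, while a multicast tree benefits from the opposite: receivers that share links reduce the total number of links carrying the file.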


What is the benefit?
• Prototype testbed
  • 27 servers, 3x3x3 CamCube
  • Quad core, 4 GB RAM, six 1 Gbps Ethernet ports
• Large-scale packet-level discrete event simulator
  • 8,000 servers, 20x20x20 CamCube
  • 1 Gbps links
• Service code runs unmodified on cluster and simulator


Service-level benefits
• High-throughput transport service
  • 1 sender → 2,000 receivers
    • Sequential iteration
  • 10,000 packets/flow
  • 1,500 bytes/packet
• Metric: throughput
  • Shown: custom/base ratio

[Figure: CDF of flows (y-axis, 0 to 1) vs. custom/base throughput ratio (x-axis, 0 to 5)]

Service-level benefits
• Large-file multicast service
  • 8,000-server network
  • 1 multicast group
  • Group size: 0% to 100% of servers
• Metric: # of links in multicast tree
  • Shown: custom/base ratio

[Figure: links reduction (y-axis, 0 to 0.4) vs. number of servers in the group (x-axis, 0% to 100%)]

Service-level benefits
• Distributed object cache service
  • 8,000-server network
  • 8,000,000 key-object pairs
    • Evenly distributed among servers
  • 800,000 lookups
    • 100 lookups per server
    • Keys picked by Zipf distribution
  • 1 primary + 8 replicas per key
    • Replicas unpopulated initially
• Metric: path length to nearest hit
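
The workload figures above fit together arithmetically, and the Zipf key selection can be sketched with the standard library. The Python below is an illustration, not the simulator's code: the exponent `s = 1.0`, the seed, and the helper name `zipf_sample` are assumptions, since the slides do not give the Zipf parameters.

```python
import random
from typing import List

SERVERS = 8_000
KEYS = 8_000_000
LOOKUPS_PER_SERVER = 100

keys_per_server = KEYS // SERVERS             # keys stored per server
total_lookups = SERVERS * LOOKUPS_PER_SERVER  # lookups across the network

def zipf_sample(num_keys: int, n: int, s: float = 1.0, seed: int = 0) -> List[int]:
    """Draw n key ranks (0 = most popular) with probability proportional to 1/rank^s."""
    rng = random.Random(seed)
    weights = [1.0 / (rank ** s) for rank in range(1, num_keys + 1)]
    return rng.choices(range(num_keys), weights=weights, k=n)

# A scaled-down draw: one server's 100 lookups over 1,000 keys.
sample = zipf_sample(num_keys=1_000, n=LOOKUPS_PER_SERVER)
print(keys_per_server, total_lookups, len(sample))
```

With a skewed distribution like this a few keys receive most lookups, which is the case where per-key caches close to the requester shorten paths the most.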

[Figure: CDF of lookups (y-axis, 0 to 1) vs. path length (x-axis, 0 to 30), with series Custom Routing and Base Routing]

Network impact
• Ran all services simultaneously
  • No correlation in link usage
  • Reduction in link utilization
• Take-away: custom routing reduced network load and increased service-level performance

[Figure: fraction of links (y-axis, 0 to 0.6) vs. services per link (x-axis, 0 to 4 services)]

[Figure: change in link utilization, shown as custom/base packet ratio (0 to 1) for Key-value Cache, Multicast, Fixed Path, Aggregation, and High-Throughput Transport]

Symbiotic routing relations
• Multiple routing protocols running concurrently
  • Routing state shared with base routing protocol
• Services
  • Use one or more routing protocols
  • Use base protocol to simplify their custom protocols
• Network failures
  • Handled by base protocol
  • Services route for common case

[Figure: layers labelled Network; Base Routing Protocol; Routing Protocol 1, 2, 3; Service A, B, C]

Building a routing framework
• Simplify building custom routing protocols
• Routing:
  • Build routes from set of intermediate points
    • Coordinates in the coordinate space
  • Services provide forwarding function ‘F’
  • Framework routes between intermediate points
    • Use base routing service
  • Consistently remap coordinate space on node failure
• Queuing:
  • Services manage packet queues per link
  • Fair queuing between services per link
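
Fair queuing between services on one link can be pictured as a scheduler that takes turns between per-service queues. The slides do not say which fair-queuing algorithm the framework uses, so the Python sketch below substitutes plain packet-count round-robin, and the class name `LinkScheduler` is hypothetical.

```python
from collections import OrderedDict, deque

class LinkScheduler:
    """One outgoing link: a queue per service, served round-robin."""

    def __init__(self):
        self.queues = OrderedDict()  # service name -> deque of packets

    def enqueue(self, service, packet):
        self.queues.setdefault(service, deque()).append(packet)

    def dequeue(self):
        """Serve the first non-empty queue, then move that service to the back."""
        for service in list(self.queues):
            queue = self.queues[service]
            if queue:
                self.queues.move_to_end(service)
                return service, queue.popleft()
        return None  # nothing to send on this link

link = LinkScheduler()
for packet in ("a1", "a2", "a3"):
    link.enqueue("A", packet)
link.enqueue("B", "b1")
order = [link.dequeue() for _ in range(5)]
print(order)
```

Service A has three packets waiting and service B has one, yet B's packet is sent second rather than last, which is the property the per-link fairness is meant to give.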

[Figure: forwarding function F, labelled with packet, local coord, and next coord]

Example: cache service
• Distributed key-object caching
• Key-space mapped onto CamCube coordinate space
• Per-key caches
  • Evenly distributed across coordinate space
  • Cache coordinates easily computable based on key
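
One way to make cache coordinates "easily computable based on key" is to hash the key together with a replica index. This is an assumed scheme for illustration; the slides do not describe the actual mapping, and `key_to_coord`, `primary_coord`, and `cache_coords` are hypothetical names. The sketch uses SHA-256 only because it is available in the Python standard library.

```python
import hashlib
from typing import List, Tuple

Coord = Tuple[int, int, int]

def key_to_coord(key: int, index: int, k: int) -> Coord:
    """Hash (key, index) to a coordinate in a k x k x k space."""
    digest = hashlib.sha256(f"{key}:{index}".encode()).digest()
    # Use three separate 4-byte slices of the digest, one per axis.
    return tuple(int.from_bytes(digest[i:i + 4], "big") % k for i in (0, 4, 8))

def primary_coord(key: int, k: int) -> Coord:
    """Index 0 is reserved for the primary in this sketch."""
    return key_to_coord(key, 0, k)

def cache_coords(key: int, k: int, replicas: int = 8) -> List[Coord]:
    """Indices 1..replicas give the cache locations for the key."""
    return [key_to_coord(key, i, k) for i in range(1, replicas + 1)]

print(primary_coord(42, 20))
print(cache_coords(42, 20))
```

Any server can recompute these coordinates from the key alone, so no lookup table is needed. A hash spreads caches roughly uniformly but can place two replicas on the same coordinate, which a real mapping would have to avoid.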


Cache service routing
• Routing
  • Source → nearest cache or primary
  • On cache miss: cache → primary
  • Populate cache: primary → cache
• F function computed at
  • Source
  • Cache
  • Primary
• Different packets can use different links
  • Accommodate network conditions
    • E.g. congestion

[Figure: nodes labelled source/querier, nearest cache, and primary server, with F evaluated at each]

Handling failures
• On link failure
  • Base protocol routes around failure
• On replica server failure
  • Key space consistently remapped by framework
• F function does not change
  • Developer only targets common case
  • Framework handles corner cases
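
"Consistently remapped" means every server must pick the same substitute for a failed coordinate without coordinating. The slides do not state the remapping rule, so the Python sketch below assumes one: choose the live coordinate with the smallest hop distance, breaking ties by coordinate order. The names `remap` and `hop_distance` are illustrative.

```python
from itertools import product
from typing import Set, Tuple

Coord = Tuple[int, int, int]

def hop_distance(a: Coord, b: Coord, k: int) -> int:
    """Torus hop count: per axis, take the shorter way round the ring."""
    return sum(min((p - q) % k, (q - p) % k) for p, q in zip(a, b))

def remap(target: Coord, failed: Set[Coord], k: int) -> Coord:
    """Return target if it is alive, else the closest live coordinate.

    Ties are broken by tuple order, so every server computes the same answer.
    Raises ValueError if every coordinate has failed.
    """
    if target not in failed:
        return target
    live = [c for c in product(range(k), repeat=3) if c not in failed]
    return min(live, key=lambda c: (hop_distance(target, c, k), c))

print(remap((1, 1, 1), {(1, 1, 1)}, 3))
print(remap((1, 1, 1), {(1, 1, 1), (0, 1, 1)}, 3))
```

Because the substitution happens below F, a service's forwarding function keeps returning the original key and never needs failure-handling code of its own.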

[Figure: nodes labelled source/querier, nearest cache, and primary server, with F]

Cache service F function

    // Requires: using System.Collections.Generic; (for List<T>)
    protected override List<ulong> F(int neighborIndex, ulong currentDestinationKey, Packet packet)
    {
        // Extract packet details.
        List<ulong> nextKeys = new List<ulong>();
        ulong itemKey = LookupPacket.GetItemKey(packet);
        ulong sourceKey = LookupPacket.GetSourceKey(packet);

        // If at the source, route to the nearest cache or the primary.
        if (currentDestinationKey == sourceKey) // am I the source?
        {
            // get the list of caches (using KeyValueStore static method)
            ulong[] cachesKey = ServiceKeyValueStore.GetCaches(itemKey);

            // iterate over all cache nodes and keep the closest ones
            int minDistance = int.MaxValue;
            foreach (ulong cacheKey in cachesKey)
            {
                int distance = node.nodeid.DistanceTo(LongKeyToKeyCoord(cacheKey));
                if (distance < minDistance)
                {
                    nextKeys.Clear();
                    nextKeys.Add(cacheKey);
                    minDistance = distance;
                }
                else if (distance == minDistance)
                    nextKeys.Add(cacheKey);
            }
        }
        // On a cache miss, route to the primary.
        else if (currentDestinationKey != itemKey) // am I the cache?
            nextKeys.Add(itemKey);

        return nextKeys;
    }
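
How a framework might drive a forwarding function like this can be sketched as a loop that keeps asking F for the next intermediate key. The Python below is a simplified restatement, not the framework's code: keys are plain integers, `distance` is a stand-in for the coordinate distance, the cache list is passed in explicitly, every cache is assumed to miss, the source is assumed not to be one of the caches, and `cache_f` and `route` are hypothetical names.

```python
from typing import Callable, List

def cache_f(current: int, source: int, item: int,
            caches: List[int], distance: Callable[[int, int], int]) -> List[int]:
    """Restatement of F: candidate next intermediate keys from the current point."""
    if current == source:                      # at the source
        best = min(distance(current, c) for c in caches)
        return [c for c in caches if distance(current, c) == best]
    if current != item:                        # at a cache that missed
        return [item]
    return []                                  # at the primary: nothing further

def route(source: int, item: int, caches: List[int],
          distance: Callable[[int, int], int]) -> List[int]:
    """Follow F from the source; at most source -> cache -> primary."""
    path = [source]
    for _ in range(2):                         # bounded: two F-driven legs at most
        candidates = cache_f(path[-1], source, item, caches, distance)
        if not candidates:
            break
        # A real framework could pick among candidates by link load;
        # this sketch simply takes the first.
        path.append(candidates[0])
    return path

def line_distance(a: int, b: int) -> int:
    return abs(a - b)

print(route(0, 50, [10, 40], line_distance))
print(route(0, 50, [50], line_distance))
```

Each leg between consecutive keys in the returned path would be carried by the base routing service, which is where link failures are handled.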


Framework overhead
• Benchmark performance
  • Single server in testbed
  • Communicate with all six 1-hop neighbors (Tx + Rx)
  • Sustained 11.8 Gbps throughput
    • Out of upper bound of 12 Gbps
• User-space routing overhead
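
The 12 Gbps upper bound follows from the testbed hardware: six 1 Gbps ports, each counted once for transmit and once for receive. A short Python check of that arithmetic, using only figures from the slides (the variable names are illustrative):

```python
LINKS = 6          # Ethernet ports per server
LINK_GBPS = 1.0    # capacity of each port, per direction
DIRECTIONS = 2     # Tx + Rx

upper_bound = LINKS * LINK_GBPS * DIRECTIONS
measured = 11.8
efficiency = measured / upper_bound

print(f"{upper_bound:.0f} Gbps bound, {efficiency:.1%} achieved")
```

The measured rate is therefore a little over 98% of what the links can carry.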

[Figure: CPU utilization (%, 0 to 100) for Baseline and Framework]

What have we done
• Services only specify a routing “skeleton”
  • Framework fills in the details
• Control messages and failures handled by framework
  • Reduce routing complexity for services
• Opt-in basis
  • Services define custom protocols only if they need to


Network requirements
• Per-service routing not limited to CamCube
• Network need only provide:
  • Path diversity
    • Providing routing options
  • Topology awareness
    • Expose server locality and connectivity
  • Programmable components
    • Allow per-service routing logic


Conclusions
• Data center networking from the developer’s perspective
  • Custom routing protocols to optimize for application-level performance requirements
• Presented a framework for custom routing protocols
  • Applications specify a forwarding function (F) and queuing hints
  • Framework manages network state, control messages, and remapping on failure
• Multiple routing protocols running concurrently
  • Increase application-level performance
  • Decrease network load


Thank you


Cache service: insert throughput

[Figure: insert throughput (Gbps, 0 to 4) vs. concurrent insert requests (0 to 140), with series F=3, disk; F=27, disk; F=3, no disk; F=27, no disk. Annotations: "Disk I/O bounded" and "Ingress bandwidth bounded (3 front-ends)"]

Cache service: lookup requests/second

[Figure: lookup rate (reqs/s, 0 to 140,000) vs. concurrent lookup requests (0 to 140), with series F=3 and F=27. Annotation: "Ingress bandwidth bounded"]

Cache service: CPU utilization on FEs

[Figure: CPU utilization (%, 0 to 100) vs. concurrent requests (0 to 140), with series lookup (F=3); insert (F=3, no disk); insert (F=27, no disk); lookup (F=27). Annotations: "3 front-ends" and "27 front-ends"]

CamCube link latency

[Figure: round trip time (microseconds, 0 to 1000) for 1,500-byte and 9,000-byte packets, with series UDP (x-cable), CamCube (1 hop), UDP (switch), TCP (x-cable), TCP (switch)]