2nd Symposium on Networked Systems Design & Implementation (NSDI)
Boston, MA, May 2-4, 2005
Guohui Wang³, David G. Andersen², Michael Kaminsky¹, Michael Kozuch¹,
T. S. Eugene Ng³, Dina Papagiannaki¹, Madeleine Glick¹ and Lily Mummert¹
¹ Intel Labs Pittsburgh, ² Carnegie Mellon University, ³ Rice University
1
Your Data Center Is a Router: The Case for Reconfigurable Optical Circuit Switched Paths
2
Data Center Network
Today’s Data Center Network
Data-intensive applications (e.g. video data processing, MapReduce, …) experience bandwidth bottlenecks in tree-structured data center networks.
[Figure: data center network with core switch, end-of-row switches and top-of-rack switches]
Picture from: James Hamilton, Architecture for Modular Data Centers
3
Full bisection bandwidth solutions
Re-structure data center network to provide full bisection bandwidth among all the servers.
Complicated network structure, hard to construct and expand.
[Figure: Tree, FatTree and BCube topologies]
Picture from: Ken Hall, Green Data Centers
4
Full bisection bandwidth may not be necessary
Spatial traffic locality
– Nodes only communicate with a small number of partners.
– e.g. Earthquake simulation
Temporal traffic locality
– Applications might hit CPU, disk IO or sync bounds.
– e.g. MapReduce
Many measurement studies have suggested evidence of traffic locality.
– [SC05][WREN09][IMC09][HotNets09]
Full bisection bandwidth solutions provide more bandwidth than needed, at high cost.
5
An alternative design: hybrid data center network
A hybrid network may give us the best of both worlds:
– Optical circuit-switched paths for data-intensive transfer.
– Electrical packet-switched paths for timely delivery.
[Figure: nodes A-F attached to both an optical circuit-switched network and an electrical packet-switched network]
6
Optical Circuit Switching
MEMS Optical Switching Module
Switching at whatever rate modulated on input/output ports
Up to tens of ms physical reconfiguration time
Picture from: http://www.ntt.co.jp/milab/en/project/pr05_3Dmems.html
7
Optical Channels
Ultra-high bandwidth
Dropping prices
40 Gbps and 100 Gbps technologies have been developed.
15.5Tbps over a single fiber!
[Figure: Price of optical transceivers. x-axis: Year (2000 to 2012); y-axis: Costs (0 to 1.2). Series: OC-192, 10 Gbps (Ovum RHK); OC-192, 10 Gbps (Lightcounting); OC-768, 40 Gbps, VSR.]
Price data from: Joe Berthold, Hot Interconnects’09
8
Optical circuits in datacenters
[Figure: nodes A-F with three example optical circuit configurations:
A - E, B - D, C - F
A - D, B - E, C - F
A - F, B - E, C - D]
Advantages:
– Simple and flexible: easy to construct, expand and manage
– Ultra-high bandwidth
– Low power
Disadvantages:
– Fat pipes are not all-to-all.
– Reconfiguration overhead
9
Research questions
• Is there enough traffic locality in data centers to leverage optical paths?
• Can optical paths be reconfigured fast enough to meet dynamic traffic?
• How to integrate optical circuits into data centers at low cost?
• How to manage and leverage optical paths?
• How do applications behave over the hybrid network?
10
Is there enough traffic locality?
Analyzing a production data center traffic trace:
– 7 racks, 155 servers, 1060 cores
– One week of NetFlow traces collected at all servers
– Configure 3 optical paths, out of a total of 21 cross-rack paths, to carry the maximum optical traffic; reconfigure every 10s.
Traffic locality: a few optical paths have the potential to offload a significant amount of traffic from electrical networks.
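[Figure: a traffic matrix (TM) is computed for each successive 10s interval along the time axis. Bar chart: fraction of optical traffic (on average), y-axis 0 to 0.7, comparing evenly distributed traffic with real traffic; real traffic is skewed.]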
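As a rough illustration of the analysis on this slide: 7 racks give 7*6/2 = 21 cross-rack pairs, so with perfectly even traffic 3 optical paths would carry 3/21, about 14%, of the traffic. The Python sketch below computes that baseline and the offloaded fraction for a skewed matrix. It assumes each rack terminates at most one optical path at a time (consistent with 3 paths for 7 racks, but not stated on the slide), and the skewed weights are invented for illustration; they are not taken from the trace.

```python
from itertools import combinations

def best_offload_fraction(weights, num_paths):
    """Brute force: choose num_paths rack pairs that share no rack and
    carry the most traffic; return their share of the total traffic."""
    total = sum(weights.values())
    best = 0.0
    for choice in combinations(list(weights), num_paths):
        used = [rack for pair in choice for rack in pair]
        if len(set(used)) != len(used):
            continue  # a rack would need two optical paths; skip
        best = max(best, sum(weights[pair] for pair in choice))
    return best / total

racks = range(1, 8)                       # 7 racks, as on the slide
pairs = list(combinations(racks, 2))      # 21 cross-rack pairs

# Evenly distributed traffic: every pair carries the same volume.
even = {pair: 1.0 for pair in pairs}
even_fraction = best_offload_fraction(even, 3)

# Hypothetical skewed traffic: three pairs are much heavier than the rest.
skewed = dict(even)
for hot_pair in [(1, 2), (3, 4), (5, 6)]:
    skewed[hot_pair] = 20.0
skewed_fraction = best_offload_fraction(skewed, 3)

print(f"even: {even_fraction:.3f}, skewed: {skewed_fraction:.3f}")
```

The exhaustive search is only practical for a handful of racks; it is used here to keep the example short, not because the talk uses it.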
11
Can optical paths be reconfigured fast enough? - Optical Path Configuration Algorithm
[Figure: the traffic matrix among racks R1-R8 is mapped to a graph G = (V, E) whose vertices are the racks. Edge weight: w_xy = vol(Rx, Ry) + vol(Ry, Rx). Edges shown: w12, w14, w43, w38, w68, w36, w35, w27, w47.]
Optical path configuration is a maximum weight perfect matching on graph G.
It can be solved in polynomial time by Edmonds’ algorithm [1].
[1] J. Edmonds, Paths, trees and flowers, Canadian J. of Mathematics, pp 449-467, 1965
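To make the matching formulation concrete, here is a minimal Python sketch that finds a maximum weight perfect matching on the eight-rack graph from the slide. Two things are assumptions: the numeric weights are made up (the slide only names the edges), and the search is a plain exhaustive recursion with exponential running time, not Edmonds' polynomial-time algorithm. It is only meant to show what the optimizer's input and output look like.

```python
from functools import lru_cache

def max_weight_perfect_matching(nodes, weights):
    """Exhaustive search over perfect matchings (NOT Edmonds' algorithm).
    Returns (total_weight, matching) or None if no perfect matching exists."""
    def w(a, b):
        # Edges are undirected; look the pair up in either order.
        return weights.get((a, b), weights.get((b, a)))

    @lru_cache(maxsize=None)
    def solve(remaining):
        if not remaining:
            return 0, ()
        first, rest = remaining[0], remaining[1:]
        best = None
        for partner in rest:
            weight = w(first, partner)
            if weight is None:
                continue  # no traffic edge between these racks
            others = tuple(r for r in rest if r != partner)
            sub = solve(others)
            if sub is None:
                continue  # the remaining racks cannot all be paired
            total = weight + sub[0]
            if best is None or total > best[0]:
                best = (total, ((first, partner),) + sub[1])
        return best

    return solve(tuple(sorted(nodes)))

# Edges named on the slide; the weights are hypothetical traffic volumes.
weights = {
    (1, 2): 5, (1, 4): 9, (4, 3): 7, (3, 8): 8, (6, 8): 6,
    (3, 6): 1, (3, 5): 4, (2, 7): 2, (4, 7): 3,
}
total, matching = max_weight_perfect_matching(range(1, 9), weights)
print(total, matching)
```

Each pair in the returned matching corresponds to one optical path to set up between two racks.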
12
Can optical paths be reconfigured fast enough? - Optical Path Configuration Time
Several time factors
– Computation time
• 640ms for a 1000-rack data center using Edmonds’ algorithm.
– Signaling time
• < 1ms in data centers
– Physical reconfiguration time
• Up to tens of ms for MEMS optical switches
Even in very large data centers, optical paths can still be reconfigured at small time scales (< 1 sec).
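The sub-second claim follows from adding the three factors. The short Python sketch below does that arithmetic; it assumes the three steps run one after another, takes 1 ms as an upper bound for signaling, and uses 100 ms as a deliberately conservative stand-in for "tens of ms". The 10s interval is the reconfiguration period used in the trace analysis.

```python
# Time factors from the slide, in milliseconds.
computation_ms = 640   # Edmonds' algorithm, 1000-rack data center
signaling_ms = 1       # slide says "< 1ms"; 1 ms taken as an upper bound
switching_ms = 100     # ASSUMPTION: conservative stand-in for "tens of ms"

# ASSUMPTION: the three steps happen sequentially, so the times add up.
total_ms = computation_ms + signaling_ms + switching_ms

# Share of a 10 s reconfiguration interval spent reconfiguring.
interval_ms = 10_000
overhead_fraction = total_ms / interval_ms

print(f"total: {total_ms} ms, overhead: {overhead_fraction:.1%}")
```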
13
How to manage optical paths in data centers?
How to measure application traffic demand?
Extensive buffering at servers
– Traffic demand measurement
– Aggregate traffic and batch for optical transfer
Per-rack virtual output queuing:
– Avoid head-of-line blocking
[Figure: server stack. Apps in user space; per-rack virtual output queues and a scheduler in the kernel; network interface below.]
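A per-rack virtual output queue can be sketched in a few lines of Python. The class and method names below (`PerRackVOQ`, `enqueue`, `demand`, `drain`) are hypothetical and chosen for illustration; the talk does not specify an interface. The sketch keeps one FIFO per destination rack, so the queued byte counts double as a traffic demand measurement, and traffic for one rack can be sent without waiting behind traffic for another.

```python
from collections import defaultdict, deque

class PerRackVOQ:
    """One FIFO queue per destination rack (hypothetical interface)."""

    def __init__(self):
        self.queues = defaultdict(deque)
        self.queued_bytes = defaultdict(int)

    def enqueue(self, dst_rack, packet):
        # Buffer the packet in the queue for its destination rack.
        self.queues[dst_rack].append(packet)
        self.queued_bytes[dst_rack] += len(packet)

    def demand(self):
        # Buffered bytes per destination rack: the demand estimate.
        return {rack: n for rack, n in self.queued_bytes.items() if n > 0}

    def drain(self, dst_rack, budget_bytes):
        # Send whole packets for one rack, in order, within the byte budget.
        sent = []
        queue = self.queues[dst_rack]
        while queue and len(queue[0]) <= budget_bytes:
            packet = queue.popleft()
            budget_bytes -= len(packet)
            self.queued_bytes[dst_rack] -= len(packet)
            sent.append(packet)
        return sent

voq = PerRackVOQ()
voq.enqueue("B", b"x" * 1500)
voq.enqueue("C", b"y" * 500)
voq.enqueue("B", b"z" * 1000)
demand_before = voq.demand()

# Rack C's packet goes out even though a packet for rack B arrived first.
sent_to_c = voq.drain("C", budget_bytes=1000)
demand_after = voq.demand()
```

With a single shared FIFO, the 500-byte packet for rack C would have been stuck behind the first packet for rack B; the per-destination queues are what avoid that head-of-line blocking.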
14
How to manage optical paths in data centers?
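[Figure: system architecture. Labels: servers with apps and a daemon in user space, and a scheduler, per-rack virtual output queues and the network interface in the kernel; a configuration manager exchanging stats and config with the servers; switches with VLAN settings receiving config and carrying traffic.]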
How to configure optical paths and schedule traffic to them?
A centralized manager to control the optical path configuration.
Configurable virtual output queue scheduler to control traffic to optical paths.
A B C D
15
Challenges
• TCP/IP reacting to optical path reconfiguration.
• Potential long delays caused by extensive queuing at servers.
• Collecting traffic demand from a million servers.
• Choosing the right buffer sizes and reconfiguration intervals.
16
Summary
Adding optical circuit switched paths into data centers.
Potential benefits:
• A simpler and more flexible data center network design.
• Relieving data intensive applications from network bottlenecks.