Silicon Photonic Switch-Enabled Server Regrouping Using Bandwidth Steering for Distributed Deep Learning Training
Ziyi Zhu, Shijia Yan, Madeleine Strom Glick, Min Yee Teh, and Keren Bergman
Lightwave Research Lab, Columbia University, New York, NY, US
Email: [email protected]
Rev PA1
Motivation
[Figure: three-tier network of Electrical Packet Switches (EPS) with core, aggregation, and ToR layers (EPSs numbered 1-7) above servers 1-16. Legend: servers; logically grouped servers.]
• Under a top-of-rack (ToR) switch the full bandwidth can be utilized, but traffic across racks experiences constrained bandwidth
• Distributed deep learning workloads can require many server nodes and show strong communication patterns between these nodes
Motivation
[Figure: the same core, aggregation, and ToR network (EPSs 1-7, servers 1-16) with a SiP OCS providing bandwidth steering above the ToR.]
• Previous work [1-2] shows that silicon photonic (SiP)-based bandwidth steering can mitigate the bottleneck at the core network level
• However, it is not ideal, and it does not improve job locality
[Figure legend: servers; logically grouped servers; EPS; OCS: Optical Circuit Switch]
[1] Michelogiannakis, George, et al. "Bandwidth steering in HPC using silicon nanophotonics." Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. 2019.
[2] Shen, Yiwen, et al. "Accelerating of high performance data centers using silicon photonic switch-enabled bandwidth steering." 2018 European Conference on Optical Communication (ECOC). IEEE, 2018.
Motivation
[Figure: core, aggregation, and ToR network (EPSs 1-7, servers 1-16) with SiP OCSs used both for bandwidth steering above the ToR and for server regrouping. Legend: fixed servers; regrouped servers.]
• In this work, we propose also inserting SiP OCSs between the ToR EPSs and the servers
• SiP-enabled bandwidth steering above the ToR switches can still be applied when the port count of the OCS is limited
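The effect of regrouping on job locality can be illustrated with a small sketch. It assumes 16 servers and 4 ToR EPSs with 4 server-facing ports each, treats the SiP OCS layer as an ideal, fully reconfigurable server-to-ToR mapping, and uses a hypothetical job on servers 5, 6, 9, and 10. The function names and the simple packing rule are illustrative assumptions, not the control plane described in the slides.

```python
from itertools import combinations

SERVERS_PER_TOR = 4  # assumption: 4 server-facing ports per ToR EPS


def fixed_mapping(num_servers=16):
    """Baseline wiring: servers 1-4 under ToR 1, 5-8 under ToR 2, and so on."""
    return {s: (s - 1) // SERVERS_PER_TOR + 1 for s in range(1, num_servers + 1)}


def cross_rack_pairs(job, mapping):
    """Count server pairs of one job whose traffic must leave their ToR."""
    return sum(1 for a, b in combinations(job, 2) if mapping[a] != mapping[b])


def regroup(job, mapping):
    """Return a new server-to-ToR mapping with the job's servers packed first.

    Models an ideal OCS between servers and ToRs (hypothetical): the job's
    servers take the first ToR ports, all other servers fill the rest.
    """
    job_set = set(job)
    ordered = sorted(job_set) + [s for s in sorted(mapping) if s not in job_set]
    return {s: i // SERVERS_PER_TOR + 1 for i, s in enumerate(ordered)}


job = [5, 6, 9, 10]                      # hypothetical logically grouped servers
baseline = fixed_mapping()
regrouped = regroup(job, baseline)
print("cross-rack pairs, baseline: ", cross_rack_pairs(job, baseline))
print("cross-rack pairs, regrouped:", cross_rack_pairs(job, regrouped))
```

With the fixed wiring, four of the six server pairs in this job span two racks; after regrouping, all four servers sit under one ToR and no pair crosses racks. A real OCS with a limited port count could only realize a subset of such mappings.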
Silicon Photonic Switches and Switching
2 x 2 x 2λ microring-assisted space-and-wavelength selective switch, reprinted from [3]
64 x 64 Mach–Zehnder interferometer (MZI)-based switch, reprinted from [4]
240 x 240 switch implemented by micro-electromechanical system (MEMS)-actuated directional couplers, reprinted from [5]
• CMOS-compatible manufacturing processes
• Small footprint
• Promise of power-efficient, low-fabrication-cost interconnects
• Various switching types: spatial, wavelength selective, and space-and-wavelength selective
* Demonstrate and develop the control plane of SiP-based switching for datacenter networks
[3] Huang, Yishen, et al. "Push-pull microring-assisted space-and-wavelength selective switch." Optics Letters 45.10 (2020): 2696-2699.
[4] Chu, Tao, et al. "Fast, high-radix silicon photonic switches." 2018 Optical Fiber Communications Conference and Exposition (OFC). IEEE, 2018.
[5] Seok, Tae Joon, et al. "Wafer-scale silicon photonic switches beyond die size limit." Optica 6.4 (2019): 490-494.
• torch.distributed: initialize a process group, ip[rank0]:port
• Initialize another process group, ip[rank0']:port
• torch.distributed.broadcast: distribute weights from the PS to the workers
• torch.distributed.reduce: collect gradients from the workers to the PS
[Figure labels: model; M: Machine; W: Weights; G: Gradients]
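The data flow behind these calls can be sketched as one parameter-server step: weights are broadcast from the PS to the workers, and gradients are reduced from the workers back to the PS. The sketch below mimics that flow in plain Python so it runs without a cluster. It does not call torch.distributed, and the `ProcessGroup` class, the addresses and port, and the toy gradient are all hypothetical. Creating a second group with a different rank-0 address stands in for initializing another process group after the servers are regrouped.

```python
class ProcessGroup:
    """Toy stand-in for a process group; rank 0 plays the PS role."""

    def __init__(self, rank0_addr, members):
        self.rank0_addr = rank0_addr          # "ip:port" of rank 0 (hypothetical)
        self.members = list(members)          # machine names, rank = list index
        self.weights = {m: None for m in self.members}

    def broadcast(self, weights):
        """Copy the PS weights to every member (W: PS -> workers)."""
        for m in self.members:
            self.weights[m] = list(weights)

    def reduce(self, grads_by_member):
        """Element-wise sum of all members' gradients (G: workers -> PS)."""
        return [sum(column) for column in zip(*grads_by_member.values())]


def toy_gradient(weights, sample):
    """Gradient of 0.5 * (w - sample)**2 per element; illustrative only."""
    return [w - sample for w in weights]


def ps_step(group, ps_weights, samples, lr=0.1):
    """One training step: broadcast weights, compute locally, reduce, update."""
    group.broadcast(ps_weights)
    grads = {m: toy_gradient(group.weights[m], samples[m]) for m in group.members}
    total = group.reduce(grads)
    n = len(group.members)
    return [w - lr * g / n for w, g in zip(ps_weights, total)]


# First group, then a second group with a new rank 0 after regrouping.
group_a = ProcessGroup("10.0.0.1:29500", ["M1", "M2"])
w = ps_step(group_a, [1.0, 1.0], {"M1": 0.0, "M2": 2.0})

group_b = ProcessGroup("10.0.0.5:29500", ["M5", "M6", "M9", "M10"])
w = ps_step(group_b, w, {m: 0.0 for m in group_b.members})
print(w)
```

In this toy version every member, including rank 0, contributes a gradient to the reduction, and the PS averages the sum before updating the weights.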
Testbed – Baseline, Server Regrouping, and Bandwidth Steering
[Figure: three testbed panels, each showing EPS1-EPS4, EPS5-EPS6, and EPS7 with servers 5, 6, 9, and 10. Panels: Baseline (labeled "Aggregated"); Server Regrouping (one SiP OCS, labeled "Released"); Server Regrouping + Bandwidth Steering Above the ToR (two SiP OCSs, labeled "Released"). Legend: Electronic Packet Switches; Servers; GPU Servers.]
Experimental Setup
[Figure: optical spectra for Configuration 1 and Configuration 2, six panels each, labeled RX1-RX6. x-axis: Wavelength (nm), 1544 to 1556; y-axis: Power (dBm), -70 to -20.]
PC – Polarization Controller
EDFA – Erbium-Doped Fiber Amplifier
DUX – Optical Multiplexer
AMP – Electrical Amplifier
DAC – Digital to Analog Converter
λ1 = 1545.32 nm, λ2 = 1546.92 nm, λ3 = 1553.33 nm, λ4 = 1554.94 nm, λ5 = 1554.94 nm, λ6 = 1556.55 nm
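As a quick arithmetic check on the channel plan, the listed wavelengths can be converted to optical frequencies with f = c/λ. The snippet below is only a sketch; the dictionary and function names are made up for illustration, and the values are copied from the legend as given, including the identical λ4 and λ5.

```python
C_NM_THZ = 299_792.458  # speed of light expressed in nm * THz

# Wavelengths in nm, copied from the legend as given
WAVELENGTHS_NM = {
    "λ1": 1545.32,
    "λ2": 1546.92,
    "λ3": 1553.33,
    "λ4": 1554.94,
    "λ5": 1554.94,
    "λ6": 1556.55,
}


def to_thz(wavelength_nm):
    """Convert a vacuum wavelength in nm to an optical frequency in THz."""
    return C_NM_THZ / wavelength_nm


for name, nm in WAVELENGTHS_NM.items():
    print(f"{name}: {nm:.2f} nm -> {to_thz(nm):.2f} THz")
```

The results land within about 0.01 THz of 194.0, 193.8, 193.0, 192.8, and 192.6 THz, so adjacent channels in each cluster are roughly 200 GHz apart.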
Acknowledgement:
This work was partly supported by the U.S. Department of Energy (DoE) SBIR Photonic-Storage Subsystem Input/Output (P-SSIO) Interface Project, by the Advanced Research Projects Agency-Energy (ARPA-E) under the Enlightened Project, and by the National Security Agency (NSA) Laboratory for Physical Sciences (LPS) Research Initiative.