Exploring the Performance Impact of Virtualization on an HPC Cloud
Nuttapong Chakthranont†, Phonlawat Khunphet†, Ryousei Takano‡, Tsutomu Ikegami‡
† King Mongkut's University of Technology North Bangkok, Thailand
‡ National Institute of Advanced Industrial Science and Technology, Japan
IEEE CloudCom 2014 @ Singapore, 18 Dec. 2014
Outline
• HPC Cloud
• AIST Super Green Cloud (ASGC)
• Experiment
• Conclusion
Benchmark Programs
• Micro benchmark
  – Intel MPI Benchmarks (IMB) version 3.2.4
• Application-level benchmarks
  – HPC Challenge (HPCC) version 1.4.3
    • G-HPL
    • EP-STREAM
    • G-RandomAccess
    • G-FFT
  – OpenMX version 3.7.4
  – Graph 500 version 2.1.4
MPI Point-to-Point Communication (IMB)
[Figure: Throughput (GB/s) vs. message size (KB), Physical Cluster vs. Virtual Cluster; peak throughput 5.85 GB/s (physical) vs. 5.69 GB/s (virtual)]
The overhead is less than 3% for large messages, though it is up to 25% for small messages.
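For reference, the throughput above comes from the IMB PingPong pattern: two ranks bounce a message back and forth and the transferred bytes are divided by the elapsed time. A minimal sketch of that pattern (not the IMB source; the message size, repetition count, and output format are arbitrary choices here):

    /* Minimal MPI ping-pong throughput sketch; run with at least 2 ranks. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        const int reps = 100;
        const int msg_size = 1 << 20;        /* 1 MiB per message (example size) */
        char *buf = malloc(msg_size);
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < reps; i++) {
            if (rank == 0) {
                MPI_Send(buf, msg_size, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, msg_size, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, msg_size, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Send(buf, msg_size, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0) {
            /* Each repetition moves the message twice (out and back). */
            double gbps = 2.0 * reps * msg_size / (t1 - t0) / 1e9;
            printf("throughput: %.2f GB/s\n", gbps);
        }
        free(buf);
        MPI_Finalize();
        return 0;
    }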
MPI Collectives (64 bytes, IMB)
[Figure: Execution time (usec) vs. number of nodes for Allgather, Allreduce, and Alltoall; Physical Cluster vs. Virtual Cluster. Annotated overheads: +77%, +88%, +43% (Allgather, Allreduce, Alltoall, respectively)]
The overhead becomes significant as the number of nodes increases.
… load imbalance?
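The collective results are likewise measured with IMB. A rough sketch of how the per-call execution time of a 64-byte Allreduce can be timed (illustrative only; the repetition count and output format are arbitrary):

    /* Sketch of timing a 64-byte MPI_Allreduce across all ranks. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        const int reps = 1000;
        double in[8] = {0}, out[8];          /* 8 doubles = 64 bytes */
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < reps; i++)
            MPI_Allreduce(in, out, 8, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("Allreduce (64 B): %.1f usec per call\n", (t1 - t0) / reps * 1e6);

        MPI_Finalize();
        return 0;
    }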
G-HPL (LINPACK), HPCC
[Figure: Performance (TFLOPS) vs. number of nodes, Physical Cluster vs. Virtual Cluster]
Performance degradation: approx. 5.4–6.6%
Efficiency* on 128 nodes: Physical 90%, Virtual 84%
*) Rmax / Rpeak
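As a reminder of what the efficiency footnote means, a small worked example of Rmax / Rpeak; the node specification and the Rmax value below are placeholders, not the ASGC hardware figures:

    /* Worked example of the HPL efficiency metric (Rmax / Rpeak). */
    #include <stdio.h>

    int main(void)
    {
        /* Rpeak: theoretical peak = nodes x cores/node x clock (GHz) x FLOPs/cycle.
           All four values here are hypothetical. */
        double nodes = 128, cores = 20, ghz = 2.8, flops_per_cycle = 8;
        double rpeak_tflops = nodes * cores * ghz * flops_per_cycle / 1000.0;

        /* Rmax: best measured HPL result (placeholder value). */
        double rmax_tflops = 50.0;

        printf("Rpeak = %.1f TFLOPS\n", rpeak_tflops);
        printf("Efficiency = Rmax / Rpeak = %.0f%%\n", 100.0 * rmax_tflops / rpeak_tflops);
        return 0;
    }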
EP-STREAM and G-FFT (HPCC)
[Figure: EP-STREAM, Performance (GB/s) vs. number of nodes; G-FFT, Performance (GFLOPS) vs. number of nodes; Physical Cluster vs. Virtual Cluster]
The overheads are negligible.
EP-STREAM: memory intensive with no communication.
G-FFT: all-to-all communication with large messages.
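EP-STREAM measures per-node memory bandwidth, which is why virtualization barely affects it. A minimal triad-style kernel in the same spirit (not the HPCC source; the array size is an arbitrary choice):

    /* Memory-bandwidth-bound STREAM-triad-style kernel, no MPI communication. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    #define N (1 << 26)   /* ~64M doubles per array; chosen to exceed cache */

    int main(void)
    {
        double *a = malloc(N * sizeof *a);
        double *b = malloc(N * sizeof *b);
        double *c = malloc(N * sizeof *c);
        double scalar = 3.0;

        #pragma omp parallel for
        for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

        double t0 = omp_get_wtime();
        #pragma omp parallel for
        for (long i = 0; i < N; i++)
            a[i] = b[i] + scalar * c[i];     /* triad: a = b + s*c */
        double t1 = omp_get_wtime();

        /* Three 8-byte accesses per iteration (two reads, one write). */
        printf("triad bandwidth: %.2f GB/s\n",
               3.0 * N * sizeof(double) / (t1 - t0) / 1e9);

        free(a); free(b); free(c);
        return 0;
    }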
Graph500 (replicated-csc, scale 26)
[Figure: Performance (TEPS, log scale) vs. number of nodes, Physical Cluster vs. Virtual Cluster]
Performance degradation: approx. 2% (64 nodes)
Graph500 is a hybrid parallel program (MPI + OpenMP). We used a combination of 2 MPI processes and 10 OpenMP threads.
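For context, a skeleton of the hybrid MPI + OpenMP structure used by such a Graph500 run; the 2-process / 10-thread combination would be set at launch time (e.g. via mpirun and OMP_NUM_THREADS), and the BFS kernel itself is omitted here:

    /* Hybrid MPI + OpenMP skeleton (illustrative, not the Graph500 reference code). */
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided, rank;

        /* Request FUNNELED support: only the master thread makes MPI calls. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        #pragma omp parallel
        {
            /* Per-rank, per-thread work (e.g. a BFS traversal) would go here. */
            #pragma omp single
            printf("rank %d running %d OpenMP threads\n", rank, omp_get_num_threads());
        }

        MPI_Finalize();
        return 0;
    }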
Findings
• PCI passthrough is effective in improving I/O performance; however, it still cannot match the low communication latency of a physical cluster because of virtual interrupt injection.
• VCPU pinning improves the performance of HPC applications.
• Almost all MPI collectives suffer from a scalability issue.
• The overhead of virtualization has less impact on actual applications than on micro benchmarks.
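Regarding vCPU pinning: in these experiments the pinning happens at the hypervisor level (guest vCPU threads bound to physical cores), not in application code. The sketch below only illustrates the underlying Linux CPU-affinity mechanism, not the actual host configuration, and the core number is arbitrary:

    /* Linux CPU-affinity sketch: bind the calling process to one core. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t mask;
        CPU_ZERO(&mask);
        CPU_SET(0, &mask);                   /* core 0 chosen for illustration */

        if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
            perror("sched_setaffinity");
            return 1;
        }
        printf("pinned to core 0\n");
        return 0;
    }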
Outline
• HPC Cloud
• AIST Super Green Cloud (ASGC)
• Experiment
• Conclusion
Conclusion and Future Work
• HPC Cloud is promising.
  – Micro benchmarks: MPI collectives have a scalability issue.
  – Application-level benchmarks: the negative impact is limited; the virtualization overhead is about 5%.
  – Our HPC Cloud has been in operation since July 2014.
• Virtualization can contribute to improving system utilization.
  – SR-IOV
  – VM placement optimization based on the workloads of virtual clusters
Questions?
Thank you for your attention!
Acknowledgments: The authors would like to thank Assoc. Prof. Vara Varavithya, KMUTNB, and Dr. Yoshio Tanaka, AIST, for their valuable guidance and advice. The authors would also like to thank the ASGC support team for preparing the experiments and troubleshooting. This work was partly supported by JSPS KAKENHI Grant Number 24700040.