HPC Trends: Opportunities and Challenges
François Robin, GENCI and CEA, PRACE/WP7 leader
PRACE Industrial Seminar, Amsterdam, September 3, 2008
2
Outline
• Introduction
• HPC [hardware] trends
• Challenges ahead
• PRACE & HPC trends
• Conclusion
3
Introduction
• Computer simulation is essential:
– For scientific discovery and for addressing societal challenges
– For competitiveness of industry
4
Introduction
• Computer simulation is essential:
– For scientific discovery and for addressing societal challenges
– For competitiveness of industry
(Images: Cerfacs, CEA/DSV, Dassault Aviation)
• Progress will, in most cases, depend on being able to perform larger simulations (or faster simulations)
– General trend: multiscale and multiphysics simulations
• This can be achieved with:
– Larger supercomputers (more compute power, more memory) + storage, visualisation, … ← topic of this presentation
– Simulation codes and tools able to take advantage of such systems
– Experts in HPC, numerical analysis, algorithms, applications, and in the phenomenon to be simulated, working together
5
A long history of continuous increase of supercomputer performance
Warning: this is Linpack performance (close to peak performance), not the performance of real applications:
– the performance of a real application is usually much lower than the Linpack performance
– algorithmic improvements often contribute a lot to the improvement over time of real application performance
CRAY XMP / CRAY 2: 1 Gflops, 4 processors
ASCI Red: 1 Tflops, 4,500 processors
LANL RoadRunner: 1 Pflops, ≈ 13,000 cores + ≈ 100,000 SPE
6
[Chart: peak performance (Gflops, log scale from 0.01 to 1,000,000) of some CEA production systems, 1975–2010]
7
Complementary and interconnected centres
Tier-0
Tier-1
Tier-2
…
8
9
Main HPC trends - Findings from PRACE meetings with Top 50 system manufacturers (Feb. 08)
• Several factors have an important impact on computer architecture:
– Performance is never enough, while processor speed is reaching a limit
– Power consumption has become a primary concern
– Network complexity/latency is a main hindrance
– There is still the memory wall
• Different architectures are scalable to Petaflop/s in 2009/2010:
– MPP, cluster of SMP (thin-nodes / fat-nodes), hybrid system (coarse, fine grain)
– None of them is likely to be optimal for all applications
– For a specific architecture, no configuration (number of nodes, memory size, topology of the interconnect, …) is likely to be optimal for all applications
• The evolution of technology (constrained by acceptable power consumption) will lead to the need for a very high level of parallelism to reach 1 Petaflop/s
• The memory hierarchy gets more complex, with widening gaps
10
Architecture: a generic view
[Diagram: compute nodes, each with cores sharing memory plus an accelerator, and IO/service nodes (IO nodes, services node), all linked by interconnection network(s)]
11
Ingredients for a Petaflop/s system in 2009/2010
• Processors:
– Low power µp, e.g. IBM BG/P, ≈ 3.5 Gflops/core
– Commodity µp, e.g. Intel Xeon, ≈ 10 Gflops/core
– High performance µp, e.g. IBM Power6, ≈ 20 Gflops/core
– Vector processors, e.g. NEC SX9, ≈ 100 Gflops/core
• Accelerators (GPU, Cell, FPGA, …):
– e.g. AMD FS 9250, ≈ 200 Gflops DP, ≈ 150 W
– e.g. IBM PowerXCell 8i, ≈ 100 Gflops DP, ≈ 90 W
– e.g. ClearSpeed CSX700, ≈ 100 Gflops DP, ≈ 25 W
• Nodes:
– Commodity thin-nodes, ≈ 2-4 sockets/node
– High performance fat-nodes, > 2-4 sockets/node
• Storage:
– SSD storage, ≈ 10s of GB
– SAS disks, ≈ 1 TB
– SATA disks, ≈ 2 TB
– Software RAID, intelligent storage
• Interconnect:
– Commodity interconnect, e.g. IB-DDR or QDR
– Specific interconnect, e.g. SGI/NL, IBM BG/x, CRAY XTx
12
Evolution of processors
[Chart: performance vs. programmability, from single-core to multi-core to many-core, CPU and GPU — based on an Intel presentation at SIGGRAPH 2008]
CPU:
• Evolving toward multi-core
• Motivated by energy-efficient performance and by the limitation of ILP
• Trend: 2x #cores every 18 months
Intel (Richard Dracott, Intel, ISC 2008): 2005/2006: 2 cores, 2.5/3.7 GHz; 2006/2007: 2 cores, 1.8/3 GHz; 2007/2008: 4 cores, 2/3.2 GHz; 2008/2009: 2-8 cores
AMD (Randy Allen, AMD, May 2008): 2005: 2 cores; 2008: 4 cores; 2009: 6 cores; 2010: 12 cores
13
Evolution of Accelerators
[Chart: performance vs. programmability — CPU and many-core fully programmable, GPU partially programmable, FPGA fixed function — based on an Intel presentation at SIGGRAPH 2008]
• Outstanding performance/price and performance/electricity ratios for well-suited and well-programmed applications
GPU:
• Evolving toward general-purpose computing
• Addressing the HPC market (data-parallel programming)
FPGA:
• Less flexible but best performance/watt
14
Integration of accelerators into compute nodes
[Chart: performance vs. programmability, CPU vs. GPU/FPGA — based on an Intel presentation at SIGGRAPH 2008]
Goal: reduce overhead by speeding up / limiting data transfers
• AMD: Torrenza initiative: use a HyperTransport connection
• INTEL: plug accelerators into processor sockets
15
Future trends (1/2)
• Processors and accelerators:
– Hybrid multicore: large/small cores; cores and accelerators, …
– Many-core: a large number of small cores on a chip, possibly with specific hardware (graphic operations)
• Memory:
– Possible future contenders: Magnetic RAM (MRAM), fast and permanent; Z-RAM, fast and very dense
– Closer integration between processor and memory: PIM, 3D stacking or flex cables
[Diagram: package integrating CPU and DRAM under the heat-sink]
16
Future trends (2/2)
• Increased use of optics for interconnect:
– Optical interconnect
– Silicon photonics: integration of photonics on chip
– Products from Luxtera and Lightfleet
• Increasing importance of data and IOs:
– Towards data-centric computing centres
– Distributed and parallel file systems
– Data integrity (ZFS)
• Heterogeneous configuration:
– Nodes
– Interconnection network
17
18
Typical configuration of a Petaflop/s system in 2009/2010
• 10 Gflops/core, 6-8 cores/socket, 2-4 sockets/node
• 100,000 cores, 1000's of nodes
• Possibly accelerators:
– Coarse grain (vector-scalar)
– Fine grain (GPU, Cell, FPGA, …)
• Possibly heterogeneous:
– Node configuration
– Network architecture
– …
• 80 to 300 m², 2 to 10 MW, mostly water cooling
• Floor space doesn't include:
– Space for storage systems (disks / tapes)
– Space needed for installation (unpacking/testing/…) and during the installation of a new system
– Space for electrical and mechanical rooms
• Electricity doesn't include:
– Storage and other peripheral systems
– Cooling, losses in the power supply (UPS, …)
19
Challenges
• Major challenges:
– Performance (computation and IO): single processor performance
– TCO (total cost of ownership): electricity
– Programmability
– Scalability
– Reliability
• Some ways to address these challenges:
– Programming Petaflop/s systems
– Cost of electricity
– Reliability
– The PRACE approach
– Areas of interest of PRACE prototypes: accelerators, many-cores, languages for programming accelerators, parallel languages, advanced IO, low power systems
20
Programming Petaflop/s systems (1/2)
• Accelerators:
– Accelerators are potentially faster than processors because they give users complete control over:
• Scheduling: multiple data-parallel units
• Transfer of data between memories and caches
– Specific languages: CUDA for Nvidia, ClearSpeed SDK, …
– Some languages are trying to target several accelerators and multicore chips: CAPS/HMPP, RapidMind, OpenCL (Apple + AMD), …
• Parallel programming languages:
– OpenMP and MPI
– PGAS languages gaining acceptance and performance with hardware support:
• Simpler distributed memory programming
• CAF (Co-Array Fortran, included in the Fortran 2008 standard) and UPC (Unified Parallel C)
– tab(i)[j]: element i of tab on processor j
– Longer term: DARPA/HPCS: Chapel (CRAY), X10 (IBM), [Fortress (SUN)]
21
Programming Petaflop/s systems (2/2)
• Tools:
– Debuggers: TotalView, DDT, …
– Profilers: OPT, OpenSpeedShop, …
• Libraries:
– Application: mathematical, …
– Run-time (threads, …)
• ISV applications: strong trend towards parallelism
– PRACE is willing to cooperate with ISVs to foster this trend
• Challenges:
– Parallelism: how to scale to 100,000 ways?
– Dealing with a complex memory hierarchy
– Complexity and heterogeneity
– Portability and durability of applications
– Training and education
22
Cost of electricity
Source: NUS Consulting Group, study on the cost of electricity in Europe in 1997, May 1997
Cost of 1 MW-year in 1997:
– Germany: 860,000 €
– Netherlands: 830,000 €
– UK: 730,000 €
– Spain: 680,000 €
– France (deregulated market): 560,000 €
– France (regulated market): 460,000 €
[Chart: cost of 1 MW-year at the CEA computing centre, 2005–2012]
23
Energy efficiency
• IT equipment:
– "Green" (power-efficient) components: processors, disks, power supplies, fans, …
– Power chain efficiency: avoid multiple voltage conversions
– Water cooling: cooling doors or direct cooling of electronic components
• Computer centre (PUE = Total_Facility_Power / IT_Electrical_Power):
– Power supply:
• Power-efficient UPS (5-10% power loss)
• UPS only for critical elements
– Cooling system:
• Power-efficient chillers
• Increased operation margins (temperature/hygrometry)
• "Free" cooling using outside air
• Use of the heat produced to heat offices
24
Power efficiency of systems in the Top50
[Chart: power efficiency (Mflops/W, scale 0–500) of Top50 systems, labelled by site and vendor — IBM Cell-based and IBM BG/P systems at the high end, IBM/Xeon, IBM BG/L and SGI/Xeon systems in the middle, the Earth Simulator (NEC) at the low end]
Warning: the electricity consumption depends greatly on the memory and disk configuration, while the performance of some systems may not be usable by all applications (accelerators).
Recommendation: in the context of a procurement, use TCO as the selection criterion.
Examples of electricity cost (5-year lifetime, 0.7 M€/MW-year):
– LANL (Cell): 4.2 M€
– Earth Simulator: 22 M€
25
Reliability
• What is important is the availability of the resources from the user point of view (computing and data):
– IT equipment
– Infrastructure
• Reaching a good availability ratio:
– Avoid failures:
• Reliable components
• Monitoring / preventive maintenance: repair failures before they are visible to the users
– Make the failure of a component transparent:
• Redundancy / fail-over
– Limit the impact of a failure to a subset of the system
– Use checkpoint-restart to reduce the impact of a job crash:
• Application checkpoint-restart
• System checkpoint-restart with migration (not there yet)
• There are always faulty components in a large supercomputer!
26
27
PRACE and HPC trends
• For PRACE, understanding the major HPC trends is of critical importance for:
– taking advantage of future promising technologies and architectures
– playing an active role in the European HPC ecosystem and its relationship with international high-end initiatives
– interacting with vendors and fostering their presence in Europe
• Main actions:
– Meetings with vendors (WP7+WP8): system manufacturers (Feb. 08), processors and accelerators (Sep. 08), network and IO (Oct. 08), …
– Prototypes: 2009/2010 target (WP7/WP5), beyond-2010 target (WP8)
– AHTP (WP8): Advanced HPC Technology Platform
– Survey of HPC centres and installation requirements for Petaflop/s systems
– Exchanges between PRACE partners
28
PRACE prototypes: essential tools for preparing the design and the deployment of the future HPC infrastructure
• Initial deployment of the PRACE infrastructure (2009/2010):
– Match current and future application requirements with vendor roadmaps
– Ensure an easy integration of the future supercomputers in the PRACE computing centres and in the PRACE infrastructure
– Six prototypes have been selected and will be deployed in the coming months
• Preparation of the evolution of the PRACE infrastructure after 2010:
– Early evaluation of technologies and architectures
– Interaction with vendors on the basis of a common set of European requirements
– Opportunities for higher performance through evolutions of applications: programming models and languages, models and numerical kernels
– Selection is in progress
• Coordinated selection and experimentation of the prototypes, and feedback towards vendors
29
30
HPC trends: opportunities and challenges
• Future leading-edge supercomputers will provide greatly increased performance and will be fantastic tools for research and industry
• PRACE is working on the challenges that must be addressed in order to influence, and make the best use of, future architectures and technologies
• PRACE gathers the best competencies of European HPC centres. The collaboration between computing centres, the sharing of experience and joint work are key to the success of PRACE
• PRACE and the European HPC ecosystem would benefit from extending this collaboration to the HPC industry and industrial HPC users
• This will contribute to strengthening this HPC ecosystem, which is of strategic importance to the competitiveness of Europe.
31
Credits
Aad van der Steen, Stéphane Requena, Alain Lichnewsky, Jean-Philippe Nominé, Jean-Marie Normand, Thomas Eickermann, Hervé Lozach, Jacques-Charles Lafoucrière, Jean-Pierre Delpouve, Jacques David, …