SAN DIEGO SUPERCOMPUTER CENTER
at the UNIVERSITY OF CALIFORNIA, SAN DIEGO
Gateways to Discovery:
Cyberinfrastructure for the Long Tail of Science
XSEDE’14 (16 July 2014)
R. L. Moore, C. Baru, D. Baxter, G. Fox (Indiana U), A. Majumdar, P. Papadopoulos, W. Pfeiffer, R. S. Sinkovits, S. Strande (NCAR), M. Tatineni, R. P. Wagner, N. Wilkins-Diehr, M. L. Norman
UCSD/SDSC (except as noted)
HPC for the 99%
Comet responds to NSF’s solicitation (13-528) to
• “… expand the use of high end resources to a much larger and more diverse community
• … support the entire spectrum of NSF communities
• … promote a more comprehensive and balanced portfolio
• … include research communities that are not users of traditional HPC systems.”
The long tail of science needs HPC.
Jobs and SUs at various scales across NSF resources

[Figure: cumulative usage vs. job size (1 to 16K cores) across NSF resources in 2012. Left axis: fraction of all jobs charged in 2012; right axis: millions of XD SUs charged. A marker indicates the one-node job size.]

• 99% of jobs run on NSF’s HPC resources in 2012 used <2048 cores
• And consumed ~50% of the total core-hours across NSF resources
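The 99%-of-jobs vs. ~50%-of-SUs split is a cumulative-distribution calculation over a job-size histogram. A minimal Python sketch illustrates the shape of it; the histogram numbers below are made-up placeholders, not the actual 2012 XD accounting data:

```python
# Illustrative histogram: job size in cores -> (job count, millions of XD SUs).
# These values are invented for the example, not real XD data.
hist = {
    1: (1_200_000, 50),
    16: (800_000, 300),
    256: (120_000, 700),
    2048: (20_000, 600),
    16384: (500, 900),
}

def cumulative_shares(hist, max_size):
    """Fraction of all jobs, and of all SUs, from jobs using <= max_size cores."""
    total_jobs = sum(jobs for jobs, _ in hist.values())
    total_sus = sum(sus for _, sus in hist.values())
    jobs = sum(j for size, (j, _) in hist.items() if size <= max_size)
    sus = sum(s for size, (_, s) in hist.items() if size <= max_size)
    return jobs / total_jobs, sus / total_sus

job_frac, su_frac = cumulative_shares(hist, 2048)
```

With this toy data, small jobs dominate the job count while consuming only part of the SUs charged, mirroring the pattern in the 2012 figure.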
Comet Will Serve the 99%
Comet: System Characteristics
• Available January 2015
• Total flops ~1.8-2.0 PF
• Dell primary integrator
• Intel next-generation processors (formerly codenamed Haswell) with AVX2
• Aeon storage vendor
• Mellanox FDR InfiniBand
• Standard compute nodes
  • Dual Haswell processors
  • 128 GB DDR4 DRAM (64 GB/socket!)
  • 320 GB SSD (local scratch)
• GPU nodes
  • Four NVIDIA GPUs/node
• Large-memory nodes (Mar 2015)
  • 1.5 TB DRAM
  • Four Haswell processors/node
• Hybrid fat-tree topology
  • FDR (56 Gbps) InfiniBand
  • Rack-level (72 nodes) full bisection bandwidth
  • 4:1 oversubscription cross-rack
• Performance Storage
  • 7 PB, 200 GB/s
  • Scratch & persistent storage
• Durable Storage (reliability)
  • 6 PB, 100 GB/s
• Gateway hosting nodes and VM image repository
• 100 Gbps external connectivity to Internet2 & ESnet
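The ~1.8-2.0 PF total is consistent with a back-of-envelope peak-flops calculation. The node count, cores per node, and clock below are assumptions for illustration (the slide states only the total); Haswell with AVX2 sustains 16 double-precision flops per cycle per core (two FMA units × 4 doubles × 2 ops):

```python
# Back-of-envelope peak flops estimate. All inputs except the AVX2
# flops/cycle figure are assumed, not taken from the slide.
nodes = 1944          # assumed number of standard compute nodes
cores_per_node = 24   # assumed: dual socket x 12 cores/socket
clock_hz = 2.5e9      # assumed base clock
flops_per_cycle = 16  # AVX2: 2 FMA units x 4 doubles x 2 ops/FMA

peak_pf = nodes * cores_per_node * clock_hz * flops_per_cycle / 1e15
print(f"Estimated peak: {peak_pf:.2f} PF")
```

Under these assumptions the estimate lands inside the quoted 1.8-2.0 PF range.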
Comet Architecture
[Diagram: Juniper 100 Gbps router, Arista 40GbE switches (2x), data movers (4x), R&E network access to Internet2.]

7x 36-port FDR switches in each rack, wired as a full fat-tree; 4:1 oversubscription between racks.
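In bandwidth terms, the 4:1 cross-rack oversubscription can be read as follows, using the slide's 72 nodes/rack and 56 Gbps FDR links; the uplink figure is derived from the ratio, not a measured value:

```python
# What 4:1 oversubscription between racks means for aggregate bandwidth.
nodes_per_rack = 72
fdr_gbps = 56          # FDR InfiniBand link rate
oversubscription = 4   # 4:1 cross-rack, per the slide

# Aggregate injection bandwidth within a rack (full bisection at rack level).
injection_gbps = nodes_per_rack * fdr_gbps
# At 4:1, the uplink capacity leaving the rack is a quarter of that.
uplink_gbps = injection_gbps / oversubscription
```

So each rack can inject 4032 Gbps internally but carries roughly 1008 Gbps to other racks, a common cost/performance trade-off for workloads that are mostly rack-local.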