Specialized Supercomputers Piero Vicini INFN Istituto Nazionale di Fisica Nucleare Italian National Institute for Nuclear Physics

Mar 26, 2015

Page 1

Specialized Supercomputers

Piero Vicini
INFN

Istituto Nazionale di Fisica Nucleare
Italian National Institute for Nuclear Physics

Page 2

Dedicated SuperComputing

• WHY
  – The Scientific Case
  – Custom vs Commodity
  – Italian Experience

• APE project

• HOW TO
  – The international scenario
  – Petaflops machine

• Some ideas

• TOOLS
  – EU funding
  – National funding

Page 3

SuperComputing: the Scientific Case

Large Scale numerical applications

– Astrophysics and Plasma Physics

• Today: 70-100 TF/s, 2009: >500 TF/s

• Dedicated architecture: Grape (Japan/Europe)

– High-Energy Physics (LQCD)

• Today: 10-50 TF/s, several projects; 2009: 500-1000 TF/s

• Dedicated architecture: APE (Europe), QCDOC(USA/UK)

– Weather, Climatology, Earth sciences

• Today: 10-30 TF/s, 2009: several projects for 200-300 TF/s of aggregate power

• Dedicated architecture: Earth Simulator (Japan)

– Life Sciences (molecular dynamics, protein folding, in silico drug design,…)

• Today:…., 2009-2010: > N*Petaflops

• Dedicated architecture: IBM Blue/Gene (USA)

– .........

Page 4

Dedicated vs General Purpose Parallel Machine

• Processor level -> very well balanced architecture
  – Computing unit designed to be very efficient on the kernels of (several) classes of applications
  – Integration of “unusual” memory interfaces based on large register file, huge multiport, …
  – Integration of an optimized interconnection network (low latency, high bandwidth)

[Table: QCD benchmarks. Eff. (H): 0.56, 0.53, 0.27*, 0.11, 0.42, 0.05]

* Communication overhead not included

Page 5

Dedicated vs General Purpose Parallel Machine(2)

• System level: dense, safe and cheap systems
  – Very high ratio of Flops/Watt
  – Very high ratio of Flops/Volume
  – Cost effective systems
    • 0.5 €/Mflops
    • Very low maintenance cost


Page 6

The Italian experience: ape project

Our line of Home Made Computers …

                     APE (1988)      APE100 (1993)   APEmille (1999)     apeNEXT (2004)

Team                 Italian         Italian         European research   European research
                     research team   research team   team + industry     team + industry
                                                     (QSW, Eurotech)     (Eurotech)

Architecture         SIMD            SIMD            SIMD                SIMD++

Comp. nodes          16              2048            2048                4096

Interc. topology     flexible 1D     rigid 3D        flexible 3D         flexible 3D

Memory size          256 MB          8 GB            64 GB               1 TB

Registers (w. size)  64 (x32)        128 (x32)       512 (x32)           512 (x64)

Clock speed          8 MHz           25 MHz          66 MHz              200 MHz

Peak power           1 GFlops        100 GFlops      1 TFlops            7 TFlops

Page 7

apeNEXT architecture

• 3D mesh of computing nodes

• Custom VLSI processor, 200 MHz (J&T)

• 1.6 GFlops per node (complex “normal” operation, a*b+c)

• 256 MB (1 GB) memory per node

• First-neighbor communication network, “loosely synchronous”

• YZ internal, X on cables

• r = 8/16 => 200 MB/s per channel

• Scalable from 25 GFlops to 6 TFlops
  – Processing Board: 4 x 2 x 2, ~26 GF
  – Crate (16 PB): 4 x 8 x 8, ~0.5 TF
  – Rack (32 PB): 8 x 8 x 8, ~1 TF
  – Large systems: (8*n) x 8 x 8

• Linux PCs as host system

[Figure: apeNEXT processing board with 16 nodes (0-15); each node is a J&T processor with DDR memory and links X+ … Z-. Y+ and Z+ run on the backplane (bp), X+ on cables.]
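The scaling figures on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the 200 MHz clock, the 1.6 GFlops per node and the grid shapes come from the slide, while the value of 8 flops per cycle is an assumption inferred from one complex a*b+c (4 multiplies + 4 adds) per clock. The function and variable names are hypothetical.

```python
# Sketch: peak performance of apeNEXT partitions from their 3D node grid.
# Clock, per-node peak and grid shapes are taken from the slide.
# ASSUMPTION: 8 flops per cycle, i.e. one complex a*b+c per clock
# (4 real multiplies + 4 real adds).

CLOCK_HZ = 200e6
FLOPS_PER_CYCLE = 8
GFLOPS_PER_NODE = CLOCK_HZ * FLOPS_PER_CYCLE / 1e9  # 1.6 GFlops


def peak_gflops(nx, ny, nz):
    """Peak GFlops of an nx x ny x nz mesh of computing nodes."""
    return nx * ny * nz * GFLOPS_PER_NODE


# Grid shapes as listed on the slide; n = 8 is an illustrative choice
# for the "(8*n) x 8 x 8" large-system case.
PARTITIONS = {
    "processing board": (4, 2, 2),
    "crate (16 PB)": (4, 8, 8),
    "rack (32 PB)": (8, 8, 8),
    "large system, n = 8": (64, 8, 8),
}

for name, (nx, ny, nz) in PARTITIONS.items():
    nodes = nx * ny * nz
    print(f"{name}: {nodes} nodes, {peak_gflops(nx, ny, nz):.1f} GFlops peak")
```

Under these assumptions a board gives 25.6 GFlops, a crate about 0.41 TFlops and a rack about 0.82 TFlops, which the slide quotes in rounded form; 4096 nodes give about 6.5 TFlops.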

Page 8

Evaluating the success of APE(1)

APEmille (2000), installations:
  Italy     1365 GF
  Germany    650 GF
  UK          65 GF
  France      16 GF
  Total        2 TF

apeNEXT (2005):
  Development costs = 2000 k€
    – 1100 k€ VLSI NRE
    – 250 k€ non-VLSI NRE
    – 650 k€ prototype procurement
  Manpower = 20 man-years
  Mass production cost ~ 0.5 €/Mflops
  Installations:
    Italy     10.6 TF
    Germany    8.0 TF
    France     1.6 TF
    Total     20.2 TF
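As a quick consistency check, the totals on this slide can be recomputed from their parts. The sketch below uses only the figures quoted above; the last value (production cost of the whole installed base) is an assumption, obtained by applying the quoted 0.5 €/Mflops uniformly to all apeNEXT installations, and is not a number given in the talk.

```python
# Figures as quoted on the slide.
apemille_gf = {"Italy": 1365, "Germany": 650, "UK": 65, "France": 16}
dev_costs_keur = {
    "VLSI NRE": 1100,
    "non-VLSI NRE": 250,
    "prototype procurement": 650,
}
apenext_tf = {"Italy": 10.6, "Germany": 8.0, "France": 1.6}
COST_EUR_PER_MFLOPS = 0.5

apemille_total_gf = sum(apemille_gf.values())   # about 2 TF
total_dev_keur = sum(dev_costs_keur.values())   # development costs in k€
total_tf = sum(apenext_tf.values())             # installed apeNEXT power

# 1 TFlops = 1e6 Mflops, so €/Mflops converts to k€/TFlops as follows.
cost_keur_per_tf = COST_EUR_PER_MFLOPS * 1e6 / 1e3

# ASSUMPTION: the mass production cost applies to every installed TFlops.
production_meur = total_tf * cost_keur_per_tf / 1e3

print(f"APEmille total: {apemille_total_gf} GF")
print(f"apeNEXT development costs: {total_dev_keur} k€")
print(f"apeNEXT installed: {total_tf:.1f} TF")
print(f"Cost per TFlops: {cost_keur_per_tf:.0f} k€")
print(f"Estimated production cost of installed base: {production_meur:.1f} M€")
```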

Page 9

Evaluating the success of APE(2)

• Scientific, technological and social impacts:
  – APE is the “de facto” standard in the European LQCD computing area
  – Huge number of scientific and technological (HW, SW, architecture) papers
  – Establishment of an international computing facility fully dedicated to scientific numerical computing
    • Laboratorio di Calcolo apeNEXT: 12 TF installed, opening on February 8th
  – Strategic opportunities to increase national (European) industry capability
    • Eurotech: INFN collaboration -> HPC division, market expansion, international visibility
    • Finmeccanica/QSW
  – Training, dissemination and establishment of spin-off companies
    • Atmel/Ipitec
    • Nergal
    • Digital Video
    • Venere

Page 10

What’s next after apeNEXT?: scenario

• In the future (2010), the computing platform required for “large-scale” numerical applications will be of the order of PetaFlops

• The international scenario
  – Today (www.top500.org):
    • IBM Blue/Gene: dedicated architecture (very similar to APE…), N*100 TFlops
    • Earth Simulator: N*10 TFlops
    • PC cluster approach: N*10 TFlops
  – Future (2010 and beyond):
    • USA: IBM, Blue/Gene evolution, N*Petaflops
    • Japan: NEC/Hitachi/University, 3 Petaflops for biotech and nanotech, custom silicon, custom interconnect
    • Japan: Fujitsu, 3 Petaflops, cluster approach with optical interconnection

• Europe?

Page 11

Brainstorming

• Silicon shrink
  – apeNEXT: 0.18 um
  – Today: 0.13 um
  – Next years: 0.09 – 0.065 um

[Figure: Die area per FP Node. X axis: Silicon Process (um): 0.18, 0.13, 0.09, 0.06. Y axis: Area (mm2), 0 to 90. Data labels: 78, 39, 19, 13.]

Worst case: 6 computing Nodes per chip (Tiled architecture)

Page 12

Brainstorming(2)

• Performance scaling
  – Clock frequency scales with silicon process
  – Power consumption decreases with silicon process (est. 0.3 W/Gflops)
  – Architecture: Multi-Tiles versus Single-Tile

[Figure: Performance (GFs, 0 to 30) vs. Silicon Process (um): 0.18, 0.13, 0.09, 0.06. Multi-Tiles perf (GFs): 1.6, 4.8, 13.1, 24.0. Single-Tile perf (GFs): 1.6, 2.4, 3.2, 4.0.]
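One way to read the Multi-Tiles curve is as the Single-Tile performance multiplied by the number of FP nodes that fit in the original 0.18 um die area. The sketch below tests that reading; it is an assumption, not a model stated in the talk. The die areas are the labels of the "Die area per FP Node" chart, read here as 78, 39, 19 and 13 mm2 (the extracted labels are partly fused, so this reading is itself an assumption), and all names are hypothetical.

```python
# Values read from the two Brainstorming charts.
PROCESS_UM = [0.18, 0.13, 0.09, 0.06]
AREA_MM2 = [78, 39, 19, 13]            # ASSUMED reading of the die-area labels
SINGLE_TILE_GF = [1.6, 2.4, 3.2, 4.0]
CHART_MULTI_TILE_GF = [1.6, 4.8, 13.1, 24.0]

DIE_BUDGET_MM2 = 78  # ASSUMPTION: the chip keeps the 0.18 um die area


def multi_tile_gflops(area_mm2, single_tile_gf, die_budget_mm2=DIE_BUDGET_MM2):
    """Per-chip performance if the die budget is filled with FP nodes."""
    tiles = die_budget_mm2 / area_mm2
    return tiles * single_tile_gf


model = [multi_tile_gflops(a, s) for a, s in zip(AREA_MM2, SINGLE_TILE_GF)]

for process, tiles_area, value, chart in zip(
    PROCESS_UM, AREA_MM2, model, CHART_MULTI_TILE_GF
):
    print(
        f"{process} um: {DIE_BUDGET_MM2 / tiles_area:.1f} tiles, "
        f"model {value:.1f} GFs, chart {chart} GFs"
    )
```

With these inputs the model gives 6 tiles at 0.06 um, in line with the "6 computing Nodes per chip" remark, and stays within rounding of the chart labels at every process node.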

Page 13

Brainstorming(3)

• “Smart memory architecture” and new “3D Engineering”
  – Large, hierarchical on-chip memory buffers -> reduction of components per board
  – Processing board “sandwich” (stacked) -> surface-distributed network connectors
  – 512 FP Nodes per board

Worst case: Factor 100 in 5 years…

[Figure: Rack computing power (TFs, logarithmic scale from 0.1 to 1000) vs. Silicon Process: 0.18, 0.13, 0.09, 0.06. Legend: "3D Eng Rack Comp. Power", "apeNEXT Eq. Rack Comp. Power", "3D Eng. Multi Tiles", "apeNEXT rack". Data labels: 26.2, 39.3, 52.4, 65.5; 0.8, 1.2, 1.6, 2.0; 26.2, 78.6, 215.2, 393.2.]
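The rack-level numbers in the chart can be reproduced from the per-node figures under a few assumptions, sketched below. The 512 FP nodes per board and the per-node performance values come from the slides; the assumptions are that a "3D Engineering" rack keeps the 32 boards of an apeNEXT rack, that an apeNEXT-equivalent rack has 8 x 8 x 8 = 512 nodes, and that multi-tile performance scales with the number of nodes fitting in a 78 mm2 die. The mapping of series to legend entries is also assumed, and all names are hypothetical.

```python
# Per-node inputs taken from the Brainstorming slides.
SINGLE_TILE_GF = [1.6, 2.4, 3.2, 4.0]
AREA_MM2 = [78, 39, 19, 13]            # ASSUMED reading of the die-area chart
MULTI_TILE_GF = [78 / a * s for a, s in zip(AREA_MM2, SINGLE_TILE_GF)]

APENEXT_NODES_PER_RACK = 8 * 8 * 8     # rack (32 PB) of 16-node boards
NODES_PER_3D_BOARD = 512               # from the slide
BOARDS_PER_RACK = 32                   # ASSUMPTION: same as an apeNEXT rack
NODES_PER_3D_RACK = NODES_PER_3D_BOARD * BOARDS_PER_RACK


def rack_tflops(nodes, gflops_per_node):
    """Peak TFlops of a rack with the given node count."""
    return nodes * gflops_per_node / 1000.0


apenext_rack = [rack_tflops(APENEXT_NODES_PER_RACK, g) for g in SINGLE_TILE_GF]
rack_3d = [rack_tflops(NODES_PER_3D_RACK, g) for g in SINGLE_TILE_GF]
rack_3d_multi = [rack_tflops(NODES_PER_3D_RACK, g) for g in MULTI_TILE_GF]

for label, series in (
    ("apeNEXT-equivalent rack", apenext_rack),
    ("3D Eng. rack", rack_3d),
    ("3D Eng. rack, multi tiles", rack_3d_multi),
):
    print(label, [round(v, 1) for v in series])
```

Under these assumptions the three series land within rounding of the chart labels (0.8 to 2.0, 26.2 to 65.5 and 26.2 to 393.2 TFs).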

Page 14

PetaFlops class computer proposal

• Leverage European leadership in embedded processor technology

• European collaboration (research + industry) to design a new computing architecture for scientific and engineering numerical applications

• Parameters:
  – (Less) dedicated architecture suitable for future grand-challenge applications
  – 0.5/1 PetaFlops system (factor 50 better than apeNEXT)
  – 300 W/TeraFlops
  – 10 kEuro/TeraFlops (factor 50 better than apeNEXT)
  – Programming environment to produce parallel code with very high efficiency
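The "factor 50" claims can be related to the apeNEXT figures quoted earlier in the talk (0.5 €/Mflops, 20.2 TF installed). The sketch below is a rough check under stated assumptions: the size factor compares a 1 PetaFlops system with the total installed apeNEXT power, which is only one possible reading of the slide, and the system-level cost and power simply scale the per-TeraFlops targets linearly.

```python
# apeNEXT figures quoted earlier in the talk.
APENEXT_EUR_PER_MFLOPS = 0.5
APENEXT_INSTALLED_TF = 20.2

# Proposal parameters from this slide.
TARGET_KEUR_PER_TF = 10
TARGET_W_PER_TF = 300
TARGET_SYSTEM_TF = 1000  # upper end of the 0.5/1 PetaFlops range

# 1 TFlops = 1e6 Mflops, so 0.5 €/Mflops is 500 k€/TFlops.
apenext_keur_per_tf = APENEXT_EUR_PER_MFLOPS * 1e6 / 1e3
cost_factor = apenext_keur_per_tf / TARGET_KEUR_PER_TF

# ASSUMPTION: "factor 50" in size is relative to all installed apeNEXT systems.
size_factor = TARGET_SYSTEM_TF / APENEXT_INSTALLED_TF

# ASSUMPTION: cost and power scale linearly with peak performance.
system_cost_meur = TARGET_SYSTEM_TF * TARGET_KEUR_PER_TF / 1e3
system_power_kw = TARGET_SYSTEM_TF * TARGET_W_PER_TF / 1e3

print(f"Cost improvement over apeNEXT: x{cost_factor:.0f}")
print(f"Size relative to installed apeNEXT: x{size_factor:.1f}")
print(f"1 PetaFlops system: {system_cost_meur:.0f} M€, {system_power_kw:.0f} kW")
```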

Page 15

Tools(1)

• EU level
  – FP6 and beyond
    • SHAPES (Scalable sw/Hw Architecture Platform for Embedded Systems)
      – FP6-2004-IST-4 2.3.4(viii) Advanced Computing Architectures
      – Partners: INFN-Roma, ATMEL-ROMA, ST, TIMA (FR), TARGET COMPILER (BE), …
      – Target: technology R&D to study the feasibility of a 2 TF board in 4 years (tiled architecture, NoC, off-chip network and 3D Engineering multi-board system)
    • HPC Europe Initiative
      – Joint action at EU level (France, Germany, UK, Spain + Netherlands, Finland, Italy) to consolidate the European role in supercomputing applications and to ensure the availability of the most advanced supercomputer systems in the EU
      – Main target: in 2010, 4 computing centres in Europe equipped with “general purpose” (-> non-European) supercomputers
      – 800 MEuro (!) partially funded by the EU and national governments

Page 16

Tools(2)

• National level
  – PNR
    • “High Performance Computing for scientific and engineering applications: architecture, hardware, development software and selected applications”
      – Partners: INFN, EUROTECH, CNR (MI), CILEA, SISSA, UNI MI BICOCCA, UNI PADOVA
      – Target: Petaflops supercomputer suitable for engineering and scientific applications
      – 200 W/Teraflops, 10 kEuro/Teraflops
      – Development cost (+ prototype procurement): 20 MEuro + 40 man-years
      – Project duration: 4 years

Page 17

Backup slides

Page 18

apeNET: status and future activities

• A cluster of 16 PCs interconnected via apeNET was successfully tested
  – Measured performance >800 MB/s (send-receive) per direction

• SW/HW/FW optimizations
  – RDMA, network driver, LAM/MPI

• INFN Roma2 has funded a 128-node cluster (dual Xeon + apeNET)
  – Delivery expected in September 2005

• Future activities:
  – PCI-X -> PCI Express
  – Integration of uP cores on FPGA
  – Development of QCD and molecular dynamics applications

Page 19

apeNEXT: the latest generation

• Loosely synchronous first-neighbor communication network

• System scalable from 25 GF to 6 TF
  – 16 processors per board (PB)
  – Systems of 8x8x16=1024 nodes or 8x8x64=4096 nodes

• “Host system” built from PCs (Linux)

[Figure: apeNEXT processing board with 16 nodes (0-15); each node is a J&T processor with DDR memory and links X+ … Z-. Y+ and Z+ run on the backplane (bp), X+ on cables.]

• 3D mesh of 4096 “computing nodes” (6.5 TF)
  – Custom VLSI processor, 200 MHz (J&T)
  – 1.6 GFlops per node (a*b+c on complex data)

• 4Q03: !! apeNEXT running !!

Page 20

apeNEXT

J&T Chip layout

PCI Host interface

Processing board

Next Rack

Page 21

APENET

• Interconnection network for PC clusters with 3D torus topology
  – apeLINK: PCI-X (133 MHz) board
    • 6 LVDS links, bidirectional and full-duplex
    • 700 MB/s per link per direction (-> 8.4 GByte/s)
    • Links based on National Instr. SERDES
  – Integrated routing and switching capability
  – High bandwidth and low latency thanks to the adoption of a “lightweight” protocol
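To make the 3D torus topology and the aggregate bandwidth figure concrete, here is a minimal sketch. It does not describe the apeNET firmware or driver: the function name, the link labels and the 4 x 2 x 2 example cluster are hypothetical, and only the 6 links and 700 MB/s per link per direction come from the slide.

```python
def torus_neighbors(coord, dims):
    """First neighbors of a node in a 3D torus, with wrap-around on each axis.

    coord: (x, y, z) position of the node; dims: (nx, ny, nz) torus size.
    Returns a dict mapping a link label ("X+", "X-", ...) to a coordinate.
    """
    neighbors = {}
    for axis, name in enumerate("XYZ"):
        for step, sign in ((1, "+"), (-1, "-")):
            neighbor = list(coord)
            neighbor[axis] = (neighbor[axis] + step) % dims[axis]
            neighbors[name + sign] = tuple(neighbor)
    return neighbors


# Figures from the slide.
LINKS_PER_BOARD = 6
MB_PER_S_PER_LINK_PER_DIRECTION = 700
DIRECTIONS = 2  # full-duplex: send and receive at the same time

aggregate_gb_s = (
    LINKS_PER_BOARD * DIRECTIONS * MB_PER_S_PER_LINK_PER_DIRECTION / 1000
)

# HYPOTHETICAL example: a 16-node cluster arranged as a 4 x 2 x 2 torus.
print(torus_neighbors((0, 0, 0), (4, 2, 2)))
print(f"Aggregate bandwidth per board: {aggregate_gb_s:.1f} GB/s")
```

In a torus of size 2 along an axis, the "+" and "-" neighbors coincide, which the example shows for the Y and Z links. The 8.4 GByte/s on the slide corresponds to all 6 links driven in both directions at once.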