Angèle Simard Canadian Meteorological Center Meteorological Service of Canada MSC Computing Update

Jan 01, 2016

Page 1

Angèle Simard

Canadian Meteorological Center

Meteorological Service of Canada

MSC Computing Update

Page 2

Canadian Meteorological Centre

National Meteorological Centre for Canada

Provide services to National Defence and Emergency organizations

International WMO commitments (emergency response, telecommunications, data, etc.)

Supercomputer used for NWP, climate, air quality, etc., in both operations and R&D

Page 3

Outline

Current Status

Supercomputer Procurement

Conversion Experience

Scientific Direction

Page 4

Current Status

Page 5

MSC’s Supercomputing History

[Figure: peak performance in MFLOPS (log scale, 1 to 1,000,000) by year, 1974 to 2002. Systems labelled on the plot: CDC 7600, CDC 176, Cray 1S, Cray XMP 28, Cray XMP 416, NEC SX-3/44, NEC SX-3/44R, NEC SX-4/80M3, NEC SX-5/32M2, NEC SX-6/80M10.]

Page 6

CMC Supercomputer Infrastructure

[Diagram. Labels in the order they were extracted from the slide:]

Supercomputer

High Speed Network

1000 Mb/s switched, 128 ports. Each host has links to ops & dev nets

SGI O300: 8 PEs

1 TB, 4 x FC

Central File Server

LSI RAID

ADIC AML-E (Tape Robot)

145 TB, 4 DST drives, 20 MB/s ea.

Front Ends

Netapp F840, .05 TB

Linux Cluster

12 DL380

NEC SX-6/80M10

3.6 TB RAID, FC switched

128 ports

SGI Origin 3000

Page 7

Supercomputers: NEC

NEC

– 10 nodes

– 80 PE

– 640 GB Memory

– 2 TB disks

– 640 Gflops (peak)

Page 8

Archive Statistics

[Figure: archive size, new data written, and data read, in TB (axis 0 to 200), by year from 1996 to 2002.]

Page 9

Automated Archiving System

22,000 tape mounts/month

11 Terabytes growth/month

maximum of 214 TB
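The growth and capacity figures above imply a limited amount of headroom. The sketch below works that out under a constant-growth assumption. The starting archive size is not stated on this slide; 145 TB is borrowed from the tape robot label on the infrastructure diagram and is only an assumption, as are the function names.

```python
# Figures quoted on the slide.
GROWTH_TB_PER_MONTH = 11
MAX_CAPACITY_TB = 214
TAPE_MOUNTS_PER_MONTH = 22_000


def months_until_full(current_tb, growth=GROWTH_TB_PER_MONTH,
                      capacity=MAX_CAPACITY_TB):
    """Months of headroom left, assuming constant linear growth."""
    return max(capacity - current_tb, 0) / growth


def mounts_per_hour(mounts_per_month=TAPE_MOUNTS_PER_MONTH, days_per_month=30):
    """Average tape mounts per hour over a 30-day month."""
    return mounts_per_month / (days_per_month * 24)


if __name__ == "__main__":
    # ASSUMPTION: 145 TB taken from the tape robot label, not from this slide.
    assumed_current_tb = 145
    print(f"Headroom: {months_until_full(assumed_current_tb):.1f} months")
    print(f"Average mounts per hour: {mounts_per_hour():.1f}")
```

Under that assumed starting point the archive would reach its maximum in roughly six months, which is consistent with storage being listed among the challenges at the end of the talk.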

Page 10

Supercomputer Procurement

Page 11

RFP/Contract Process

Competitive process

Formal RFP released in mid-2002

Bidders had to meet all mandatory requirements to advance to the next phase

Bids above the funding envelope were rejected

Bidders were rated on performance (90%) and price (10%)

IBM came first

Live preliminary benchmark at IBM facility

Contract awarded to IBM in November 2002
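The evaluation steps above (mandatory screening, funding-envelope cut-off, then a 90/10 performance/price rating) can be sketched as follows. Only the two weights come from the slide. The vendor names, prices, performance values, the 0 to 1 performance scale, and the "lowest price scores 1.0" rule are all hypothetical choices made for illustration, not the actual RFP scoring method.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    name: str
    meets_mandatory: bool
    price: float        # same units as the funding envelope
    performance: float  # ASSUMPTION: already normalised to the range 0..1


def price_score(bid, eligible):
    """ASSUMPTION: cheapest eligible bid scores 1.0, others proportionally less."""
    lowest = min(b.price for b in eligible)
    return lowest / bid.price


def rank_bids(bids, funding_envelope, w_perf=0.90, w_price=0.10):
    """Screen bids, then rank the survivors by weighted score (best first)."""
    eligible = [b for b in bids
                if b.meets_mandatory and b.price <= funding_envelope]
    scored = [(w_perf * b.performance + w_price * price_score(b, eligible), b.name)
              for b in eligible]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Entirely made-up bids, for illustration only.
    bids = [
        Bid("Vendor A", True, 90.0, 1.0),
        Bid("Vendor B", True, 80.0, 0.7),
        Bid("Vendor C", True, 120.0, 1.0),   # above the envelope: rejected
        Bid("Vendor D", False, 70.0, 0.9),   # fails a mandatory requirement
    ]
    for score, name in rank_bids(bids, funding_envelope=100.0):
        print(f"{name}: {score:.3f}")
```

With a 90% weight on performance, a cheaper bid has little chance of overtaking a faster one once both are inside the envelope.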

Page 12

IBM Committed Sustained Performance in MSC TFlops

[Figure: bar chart of IBM's committed sustained TFlops for the Initial, Upgrade and Optional Upgrade configurations: 1.042, 2.47996 and 5.89251 respectively. Horizontal axis ticks at 0, 2.5 and 5; vertical axis "Sustained TFlops", 0 to 6.]
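To put the three committed figures in proportion, the short sketch below expresses each one as a multiple of the initial system. The mapping of the three values to the initial, upgrade and optional upgrade phases assumes they are listed in that order; the function name is illustrative.

```python
# Committed sustained performance from the slide, in MSC TFlops.
# ASSUMPTION: the values map to the three phases in the order listed.
COMMITTED_TFLOPS = {
    "initial": 1.042,
    "upgrade": 2.47996,
    "optional upgrade": 5.89251,
}


def growth_factors(committed=COMMITTED_TFLOPS, baseline="initial"):
    """Return each phase's performance as a multiple of the baseline phase."""
    base = committed[baseline]
    return {phase: tflops / base for phase, tflops in committed.items()}


if __name__ == "__main__":
    for phase, factor in growth_factors().items():
        print(f"{phase:>16}: {COMMITTED_TFLOPS[phase]:.3f} TFlops, "
              f"{factor:.3f}x initial")
```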

Page 13

On-going Support

The systems must achieve a minimum of 99% availability.

Remedial maintenance 24/7: 30-minute response time between 8 a.m. and 5 p.m. on weekdays; one-hour response time outside those hours.

Preventive maintenance or engineering changes: maximum of eight (8) hours a month, in blocks of time not exceeding two (2) or four (4) hours per day (subject to certain conditions).

Software support: Provision of emergency and non-emergency assistance.
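As a rough illustration of what these terms mean in practice, the sketch below converts the 99% availability floor into hours of allowed downtime and encodes the two response-time tiers. It assumes a 30-day month and does not model whether preventive maintenance counts against availability, which the slide does not say; the function names are illustrative.

```python
from datetime import datetime


def allowed_downtime_hours(availability, period_hours):
    """Maximum downtime in a period for a given availability fraction."""
    return (1.0 - availability) * period_hours


def response_minutes(when):
    """Response-time target for a fault reported at `when`.

    30 minutes on weekdays from 8 a.m. up to 5 p.m., 60 minutes otherwise.
    """
    business_hours = when.weekday() < 5 and 8 <= when.hour < 17
    return 30 if business_hours else 60


if __name__ == "__main__":
    hours_in_month = 30 * 24  # ASSUMPTION: 30-day month
    print(f"Allowed downtime: "
          f"{allowed_downtime_hours(0.99, hours_in_month):.1f} h/month")
    print(response_minutes(datetime(2003, 6, 2, 10, 0)), "minutes")  # a Monday
    print(response_minutes(datetime(2003, 6, 7, 10, 0)), "minutes")  # a Saturday
```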

Page 14

CMC Supercomputer Infrastructure

[Diagram. Labels in the order they were extracted from the slide:]

High Speed Network

1000 Mb/s switched, 128 ports. Each host has links to ops & dev nets

SGI O300: 8 PEs

1 TB, 4 x FC

Central File Server

LSI RAID

ADIC AML-E (Tape Robot)

145 TB, 4 DST drives, 20 MB/s ea.

Front Ends

Netapp F840, .05 TB

Linux Cluster

12 DL380

SGI O3000's, 24 PEs (MIPS R12K)

Supercomputers

IBM SP: 832 PE & 1.6 TB

Page 15

Supercomputers: Now and Then…

NEC

– 10 nodes

– 80 PE

– 640 GB Memory

– 2 TB disks

– 640 Gflops (peak)

IBM

– 112 nodes

– 928 PE

– 2.19 TB Memory

– 15 TB Disks

– 4.825 Tflops (peak)
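The two specification lists can be compared directly. The sketch below derives peak Gflops per PE and the IBM-to-NEC ratios from the slide's figures, assuming decimal units (1 TB = 1000 GB, 1 Tflops = 1000 Gflops); the dictionary layout and function names are illustrative.

```python
# Figures from the slide, converted to common units.
# ASSUMPTION: decimal prefixes (1 TB = 1000 GB, 1 Tflops = 1000 Gflops).
NEC = {"nodes": 10, "pe": 80, "memory_gb": 640,
       "disk_tb": 2, "peak_gflops": 640}
IBM = {"nodes": 112, "pe": 928, "memory_gb": 2190,
       "disk_tb": 15, "peak_gflops": 4825}


def peak_per_pe(system):
    """Peak Gflops per processing element."""
    return system["peak_gflops"] / system["pe"]


def ratios(new, old):
    """Ratio new/old for every attribute the two systems share."""
    return {key: new[key] / old[key] for key in old}


if __name__ == "__main__":
    print(f"NEC peak per PE: {peak_per_pe(NEC):.2f} Gflops")
    print(f"IBM peak per PE: {peak_per_pe(IBM):.2f} Gflops")
    for key, value in ratios(IBM, NEC).items():
        print(f"IBM/NEC {key}: {value:.2f}x")
```

The PE count grows faster than the peak rate, so each IBM PE has a lower peak than each NEC PE; peak figures say nothing about sustained performance on real codes.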

Page 16

Conversion Experience

Page 17

Transition to IBM

Prior to conversion:

Developed and tested code on multiple platforms (MPP & vector)

Where possible, avoided proprietary tools

When required, hid proprietary routines under a local API to centralize changes

Following contract award, obtained early access to an IBM system (thanks to NCAR)
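The idea of hiding proprietary routines under a local API can be illustrated with a small dispatch layer. This is a generic sketch of the pattern, not MSC's code: the module layout, the `register`/`select` helpers, and the "vendor_x" platform name are all hypothetical, and a plain Python dot product stands in for a tuned vendor routine.

```python
"""A thin local layer that hides platform-specific routines (illustrative)."""


def _dot_portable(x, y):
    # Portable fallback that runs on any platform.
    return sum(a * b for a, b in zip(x, y))


# Registry of implementations keyed by platform name. A vendor-tuned routine
# would be registered here; "portable" is the only real entry in this sketch.
_IMPLEMENTATIONS = {"portable": _dot_portable}
_active = "portable"


def register(platform, func):
    """Make a platform-specific implementation available."""
    _IMPLEMENTATIONS[platform] = func


def select(platform):
    """Choose which implementation the local API dispatches to."""
    global _active
    if platform not in _IMPLEMENTATIONS:
        raise KeyError(f"no implementation registered for {platform!r}")
    _active = platform


def dot(x, y):
    """The only entry point application code is meant to call."""
    return _IMPLEMENTATIONS[_active](x, y)


if __name__ == "__main__":
    print(dot([1, 2, 3], [4, 5, 6]))
    # Hypothetical vendor routine, stood in for by the portable one.
    register("vendor_x", _dot_portable)
    select("vendor_x")
    print(dot([1, 2, 3], [4, 5, 6]))
```

Because application code only ever calls `dot`, moving to a new platform means changing one registration rather than every call site, which is the centralization the slide describes.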

Page 18

Transition to IBM

Application Conversion Plan:

The plan was designed to meet a tight schedule

A key element of the plan was the rapid implementation of an operational “core” (to test the mechanics)

Phased approach for optimized version of models

End-to-end testing scheduled to begin mid-September

Page 19

Scalability

Rule of Thumb

To obtain required wall-clock timings:

About 4 times as many IBM processors are needed to match the performance of the NEC SX-6
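The rule of thumb translates into a simple sizing calculation. The sketch below applies the factor of 4 and shows one way to lay the result out as MPI tasks with OpenMP threads, keeping the MPI task count unchanged. That layout choice mirrors the "8 MPI x 4 OpenMP (32 total)" configuration in the preliminary results, but the function names and the general applicability of the factor are assumptions.

```python
import math

IBM_PER_NEC = 4  # rule-of-thumb factor from the slide


def ibm_processors_for(nec_processors, factor=IBM_PER_NEC):
    """IBM processors needed to match a given number of NEC SX-6 processors."""
    return math.ceil(nec_processors * factor)


def hybrid_layout(nec_mpi_tasks, threads_per_task=IBM_PER_NEC):
    """Keep the MPI task count and add OpenMP threads under each task."""
    return {
        "mpi_tasks": nec_mpi_tasks,
        "omp_threads_per_task": threads_per_task,
        "total_processors": nec_mpi_tasks * threads_per_task,
    }


if __name__ == "__main__":
    for nec in (4, 8):
        print(nec, "NEC processors ->", ibm_processors_for(nec), "IBM processors")
    print(hybrid_layout(8))
```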

Page 20

Preliminary Results

                                Timing (min)    Configuration
                                NEC    IBM      NEC      IBM
GEMDM regional (48 hour)        38     34       8 MPI    8 MPI x 4 OpenMP (32 total)
GEMDM Global (240 hour)         35     36       4 MPI    4 MPI x 4 OpenMP (16 total)
3D-Var (R1) (without AMSU-B)    11     9        4 CPU    6 CPU OpenMP
3D-Var (G2) (with AMSU-B)       16     12       4 CPU    6 CPU OpenMP
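One way to read these preliminary timings is in processor-minutes, which combines the wall-clock time with the number of processors used. The sketch below computes that for each run. It assumes the 3D-Var entries mean 4 NEC CPUs versus 6 IBM CPUs, which is an interpretation of the flattened table, and the names used are illustrative.

```python
# (run, NEC minutes, IBM minutes, NEC processors, IBM processors)
# ASSUMPTION: the 3D-Var rows are read as 4 NEC CPUs versus 6 IBM CPUs.
RESULTS = [
    ("GEMDM regional (48 hour)", 38, 34, 8, 32),
    ("GEMDM Global (240 hour)", 35, 36, 4, 16),
    ("3D-Var (R1)", 11, 9, 4, 6),
    ("3D-Var (G2)", 16, 12, 4, 6),
]


def processor_minutes(minutes, processors):
    """Wall-clock minutes multiplied by processors in use."""
    return minutes * processors


def summarize(results=RESULTS):
    """IBM/NEC ratios of wall-clock time, processors, and processor-minutes."""
    rows = []
    for run, nec_min, ibm_min, nec_p, ibm_p in results:
        rows.append({
            "run": run,
            "time_ratio": ibm_min / nec_min,
            "processor_ratio": ibm_p / nec_p,
            "processor_minute_ratio": (processor_minutes(ibm_min, ibm_p)
                                       / processor_minutes(nec_min, nec_p)),
        })
    return rows


if __name__ == "__main__":
    for row in summarize():
        print(f"{row['run']:<26} time x{row['time_ratio']:.2f}  "
              f"processors x{row['processor_ratio']:.2f}  "
              f"processor-minutes x{row['processor_minute_ratio']:.2f}")
```

On this reading the two GEMDM runs follow the 4-to-1 rule of thumb, while the 3D-Var runs reach similar or better wall-clock times with only 1.5 times as many processors.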

Page 21

SCIENTIFIC DIRECTIONS

Page 22

Global System

Now:

– Uniform resolution of 100 km (400 X 200 X 28)

– 3D-Var at T108 on model levels, 6-hr cycle

– Use of raw radiances from AMSUA, AMSUB and GOES

2004:

– Resolution to 35 km (800 X 600 X 80)

– Top at 0.1 hPa (instead of 10 hPa) with additional AMSUA and AMSUB channels

– 4D-Var assimilation, 6-hr time window with 3 outer loops at full model resolution and inner loops at T108 (CPU equivalent of a 5-day forecast of full resolution model)

– New datasets: profilers, MODIS winds, QuikScat

2005+:

– Additional datasets (AIRS, MSG, MTSAT, IASI, GIFTS, COSMIC)

– Improved data assimilation
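The grid dimensions quoted for the global system give a quick sense of how much the 2004 configuration grows. The sketch below simply multiplies out the grid points; it is a lower bound on cost growth, since a finer grid typically also needs a shorter time step, and 4D-Var adds further work that is not counted here. The dictionary keys and function name are illustrative.

```python
from math import prod

# Grid dimensions (x, y, levels) quoted on the slide.
GLOBAL_GRIDS = {
    "now (100 km)": (400, 200, 28),
    "2004 (35 km)": (800, 600, 80),
}


def grid_points(dims):
    """Total number of grid points for a (nx, ny, nlevels) grid."""
    return prod(dims)


if __name__ == "__main__":
    base = grid_points(GLOBAL_GRIDS["now (100 km)"])
    for label, dims in GLOBAL_GRIDS.items():
        points = grid_points(dims)
        print(f"{label}: {points:,} points ({points / base:.1f}x current)")
```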

Page 23

Regional System

Now:

– Variable resolution, uniform region at 24 km (354 X 415 X 28)

– 3D-Var assimilation on model levels at T108, 12-hr spin-up

2004:

– Resolution to 15 km in uniform region (576 X 641 X 58)

– Inclusion of AMSU-B and GOES data in assimilation cycle

– Four model runs a day (instead of two)

– New datasets: profilers, MODIS winds, QuikScat

2006:

– Limited area model at 10 km resolution (800 X 800 X 60)

– LAM 4D-Var data assimilation

– Assimilation of Radar data

Page 24

Ensemble Prediction System

Now:

– 16 members global system (300 X 150 X 28)

– Forecasts up to 10 days once a day at 00Z

– Optimal Interpolation assimilation system, 6-hr cycle, use of derived radiance data (Satems)

2004:

– Ensemble Kalman Filter assimilation system, 6-hr cycle, use of raw radiances from AMSUA, AMSUB and GOES

– Two forecast runs per day (12Z run added)

– Forecasts extended to 15 days (instead of 10)

Page 25

...Ensemble Prediction System

2005:

– Increased resolution to 100 km (400 X 200 X 58)

– Increased members to 32

– Additional datasets such as in global deterministic system

2007:

– Prototype regional ensemble

– 10 members LAM (500 X 500 X 58)

– No distinct data assimilation; initial and boundary conditions from global EPS
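For the ensemble, the forecast workload scales with members, grid size, forecast length and runs per day together. The sketch below multiplies those factors for the current and 2005 global configurations. It assumes the 2004 changes (two runs a day, 15-day forecasts) are still in place in 2005, ignores time-step changes and the assimilation cost, and uses an illustrative function name.

```python
from math import prod


def daily_load(members, dims, forecast_days, runs_per_day):
    """Member x grid-point x forecast-day products computed per day."""
    return members * prod(dims) * forecast_days * runs_per_day


if __name__ == "__main__":
    # Current system: 16 members, 300 x 150 x 28, 10 days, one run a day.
    now = daily_load(16, (300, 150, 28), 10, 1)
    # ASSUMPTION: 2005 keeps the 2004 schedule (two runs a day, 15 days).
    planned = daily_load(32, (400, 200, 58), 15, 2)
    print(f"Current: {now:,}")
    print(f"2005:    {planned:,}")
    print(f"Growth:  {planned / now:.1f}x")
```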

Page 26

Mesoscale System

Now: Variable resolution, uniform region at 10 km (290 X 371 X 35); two windows; no data assimilation

2004: Prototype limited area model at 2.5 km (500 X 500 X 45) over one area

2005: Five Limited area model windows at 2.5 km (500 X 500 X 45)

2008: 4D data assimilation

Page 27

Coupled Models

Today

In R&D: coupled atmosphere, ocean, ice, wave, hydrology, biology & chemistry

In Production: storm surge, wave

Future

Global coupled model for both prediction & data assimilation

Page 28

Estuary & Ice Circulation

Page 29

Merchant Ship Routing

[Figure: power versus time (time axis marked "…2 days...") for two routes, a southern route and a northern route.]

Page 30

Challenges

IT Security

Accelerated Storage Needs

Replacement of Front-end & Storage Infrastructure