Sponge v2

The Monash Campus Grid Programme

Enhancing Research with High-Performance/High-Throughput Computing

What is HPC?

High-performance computing is about leveraging the best and most cost-effective technologies, from processors, memory chips, disks and networks, to provide aggregated computational capabilities beyond what is typically available to the end user

high-performance -- running a program as quickly as possible

high-throughput -- running as many programs as possible within a unit of time

HPC/HTC are enabling technologies for larger experiments, more complex data analyses, and higher accuracy in computational models

Information Technology Services Division

Office of the CIO

The Monash Campus Grid

LaRDS (peta-scale storage)

Monash Gigabit Network

Monash Sun Grid HPC Cluster

Monash SPONGE Condor Pool

Nimrod Grid-enabled Middleware

https://confluence-vre.its.monash.edu.au/display/mcgwiki/Monash+MCG

GT2 / GT4

Secure Shell

Secure Copy

GridFTP



WWW

http://www.sun.com/servers/entry/x2100/

Monash Sun Grid

Central high-performance compute cluster (HPC + HTC capable)

Monash eResearch Centre and Information Technology Services Division

Key features:

dedicated Linux cluster with ~205 computers providing ~1,650 CPU cores

processor configurations from 2 CPU cores to up to 48 CPU cores per computer

primary memory configurations from 4 GB to up to 1,024 GB per machine

broad range of applications and development environments

flexibility in addressing our customer requirements

https://confluence-vre.its.monash.edu.au/display/mcgwiki/Monash+Sun+Grid+Overview




Monash Sun Grid node types

[Figure: node types labelled by year of introduction: 2005, 2006, 2008-10, 2009, 2010]


Monash Sun Grid 2010 Very-Large RAM nodes

2010

Dell R910 - four eight-core Intel Xeon (Nehalem) CPUs per node

two nodes [ 64 cores total ]

1,024 GB RAM / node

16 x 600GB 10k RPM SAS disk drives

over 640 Gflop/s

redundant 1.1 kW PSU on each

~300 Mflop/W

http://www.dell.com/us/business/p/poweredge-r910/pd




Monash Sun Grid 2011 Partnership Nodes with Engineering

2010-11

Dell R815 - four 12-core AMD Opteron CPUs per node

five nodes [ 240 cores ]

128 GB RAM / node

10G Ethernet

> 2,400 Gflop/s

redundant 1.1 kW PSUs

~400 Mflop/W

http://www.dell.com/us/business/p/poweredge-r815/pd




Monash Sun Grid Summary

Name      Vintage  Node       Core Count  Gflop/s         Power Req’t  Mflop/W
MSG-I     2005     V20z       70          336             ~17 kW       20
MSG-II    2006     X2100      64          332             ~11 kW       42
MSG-IIIe  2007     X6220      120         624             ~7.2 kW      65
MSG-IV    2008     X4600      96          885             ~3.6 kW      250
MSG-III   2009     X6250      720         7200            ~23 kW       330
MSG-III   2010     X6250      240         2400            ~7 kW        330
MSG-gpu   2010     Dell       80          > 800 + 18,660  ??           ??
MSG-vlm   2010     Dell R910  64          > 640           ~2.2 kW      290
MSG-pn    2011     Dell R815  240         > 2400          ~5.5 kW      436

Monash Sun Grid HPC Cluster has 1,694 cores and clocks at over 12.5 (+ 18.6) Tflop/s with > 5.7 TB of RAM
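The core total and the efficiency figures for the two Dell node types can be cross-checked with simple arithmetic. The sketch below is only an illustration: the function name `mflops_per_watt` is hypothetical, the inputs are copied from the summary table, and the ">" and "~" qualifiers in the table are treated as plain values.

```python
# Core counts copied from the summary table, one entry per row.
CORE_COUNTS = {
    "MSG-I": 70,
    "MSG-II": 64,
    "MSG-IIIe": 120,
    "MSG-IV": 96,
    "MSG-III (2009)": 720,
    "MSG-III (2010)": 240,
    "MSG-gpu": 80,
    "MSG-vlm": 64,
    "MSG-pn": 240,
}


def mflops_per_watt(gflops: float, kilowatts: float) -> float:
    """Convert a Gflop/s rating and a kW power draw into Mflop/s per watt."""
    mflops = gflops * 1_000      # 1 Gflop/s = 1,000 Mflop/s
    watts = kilowatts * 1_000    # 1 kW = 1,000 W
    return mflops / watts


total_cores = sum(CORE_COUNTS.values())
vlm_efficiency = mflops_per_watt(640, 2.2)    # MSG-vlm row (assumed exact values)
pn_efficiency = mflops_per_watt(2400, 5.5)    # MSG-pn row (assumed exact values)

print(f"total cores: {total_cores}")
print(f"MSG-vlm: {vlm_efficiency:.0f} Mflop/W")
print(f"MSG-pn:  {pn_efficiency:.0f} Mflop/W")
```

Because the power figures are approximate, results of this kind are best read as rough indicators rather than measured efficiencies.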



Software Stack

• S/W Development, Environments, Libraries

• gcc, Intel C/Fortran, Intel MKL, IMSL numerical library, Ox, python, openmpi, mpich2, NetCDF, java, PETsc, FFTW, BLAS, LAPACK, gsl, mathematica, octave, matlab (limited)

• Statistics, Econometrics

• R, Gauss, Stata

• Computational Chemistry

• Gaussian 09, GaussView 5, Molden, GAMESS, Siesta, Materials Studio (Discovery), AMBER 10

• Molecular Dynamics

• NAMD, LAMMPS, Gromacs

• Underworld

• CFD Codes

• OpenFOAM, ANSYS Fluent, CFX, viper (user-installed)

• CUDA toolkit, Qt, VirtualGL, itksnap, drishti, Paraview

• CrystalSpace

• FSL

• Meep

• CircuitScape

• Structure and Beast

• XMDS

• ENViMET (via wine)

• ENVI/IDL

• etc etc etc



Growing List!

Specialist Support and Advice

Initial Engagement

Account Creation

General Advice & Startup Tutorial

Requirements Analysis

Follow Up

Maintenance

Customised Solutions


Specialist Support and Advice

• Cluster Queue Configuration and Management

• Compute job preparation

• Custom scripting

• Software installation and tuning

• Job performance and/or error diagnosis

• etc

Growth of CPU Usage

2008: 859K CPU hours

2009: 3,300K CPU hours

2010: 6,863K CPU hours


783 CPU years!!!

Projected
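The "783 CPU years" figure is consistent with dividing the 2010 CPU-hour figure by the number of hours in a year. The sketch below assumes a 365-day year (8,760 hours) and assumes the "Projected" label applies to the 2010 figure; the slides do not state the conversion that was actually used.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap years ignored (assumption)

# CPU hours per calendar year, as quoted on the slide.
CPU_HOURS = {2008: 859_000, 2009: 3_300_000, 2010: 6_863_000}


def cpu_years(hours: float) -> float:
    """Express a CPU-hour total as CPU years."""
    return hours / HOURS_PER_YEAR


for year, hours in CPU_HOURS.items():
    print(f"{year}: {hours:>9,} CPU hours = {cpu_years(hours):6.1f} CPU years")

# Growth of the 2010 figure relative to 2008.
growth_2008_to_2010 = CPU_HOURS[2010] / CPU_HOURS[2008]
print(f"2008 to 2010 growth: {growth_2008_to_2010:.1f}x")
```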

Active Users

2008: 71

2009: 145

2010: 169 (as of 24-Aug)



What to expect in the future?

Continued refresh of hardware and software

decommissioning older machines

More grid nodes (CPU cores) to meet growing demand

Scalable and high-performance storage architecture without sacrificing data availability

Custom grid nodes &/or subclusters with special configurations to meet user requirements

Better integration with Grid tools and middleware



Green IT Strategy

Monash Sun Grid Beginnings

MSG-I

2005

Sun V20z AMD Opteron (dual core)

initially 32 nodes = 64 cores, with 3 new nodes added in 2007 making a total of 70 cores

4 GB RAM / node

336 Gflop/s

~17kW

20 Mflop/W

http://www.sun.com/servers/entry/v20z/index.jsp


Monash Sun Grid MSG-II

2006

Sun X2100 AMD Opteron (dual core)

initially 24 nodes = 48 cores with 8 nodes added in 2007 making 64 cores at present

4 GB RAM / node

332 Gflop/s

~11 kW

42 Mflop/W

Picture on the right is from Jason Callaway’s Flickr page: http://www.flickr.com/photos/29925031@N07/


Monash Sun Grid Big Mem Boxes

MSG-III (now named MSG-IIIe)

2008

Sun X6220 Blades - two dual core AMD Opterons per node

currently 20 nodes = 80 cores with 10 nodes to be added in 2010 making 120 cores

40 GB RAM / node

624 Gflop/s

~7.2 kW

330 Mflop/W

http://www.sun.com/servers/blades/x6220/

http://www.sun.com/servers/blades/x6220/datasheet.pdf



Monash Sun Grid 2010

MSG-III expansion and GPU nodes

2010

Sun X6250 - two quad-core Intel Xeon CPUs per node

240 cores

24 GB RAM / node

Dell nodes connected to two Tesla C1060 GPU cards

Ten nodes = 20 GPU cards

48 GB and 96 GB RAM configs

http://www.sun.com/servers/blades/x62520/

http://www.nvidia.com/object/product_tesla_c1060_us.html



Monash Sun Grid 2009

MSG-III

2009

Sun X6250 - two quad-core Intel Xeon CPUs per node

as of 2009: 720 cores

16 GB RAM / node

> 7 Tflop/s

~23 kW

~330 Mflop/W

http://www.sun.com/servers/blades/x62520/


Monash Sun Grid Big SMP boxes

MSG-IV

2009

Sun X4600 - eight quad-core AMD Opteron CPUs per node

currently three nodes = 96 cores

96 GB RAM / node

885 Gflop/s

~3.6 kW

250 Mflop/W

http://www.sun.com/servers/blades/x4600/


Benefits of using a cluster

job characteristic

parallel

shared memory: use 2, 4, 8, 32 cores (a single node)

distributed memory: use multiple nodes

sequential

multiple scenarios or cases? use multiple cores; use tools like Nimrod



SPONGE

Introduction

Serendipitous Processing on Operating Nodes in Grid Environment (SPONGE)

Core Idea and Motivation: Resource Harnessing, Accessibility and Utilization

How SPONGE achieves this.

What SPONGE Can do at the Moment

What SPONGE cannot do at the moment.

Infrastructure and Usage statistics (Pretty Pictures).

Acknowledgements

Core Idea and Motivation

The core idea is to harness the tremendous amount of un/under-utilized computational power to perform high-throughput computing.

Motivation: large (giga, tera, peta?) scale computational problems that need high throughput, generally embarrassingly parallel applications, e.g. PSAs.

Latin Squares (Mathematics) – Dr. Ian Wanless and Judith Egan; Department of Mathematics.

Molecular Replacement (Biology, Chemistry) – Jason Schmidberger and Dr. Ashley Buckle; Department of Biochemistry and Molecular Biology.

Bayesian Estimation of Bandwidth in Multivariate Kernel Regression with an Unknown Error Density (Business, Economics) – Han Shang, Dr. Xibin Zhang and Dr. Maxwell King; Department of Business and Economics.

HPC Solution for Optimization of Transit Priority in Transportation Networks; Dr. Mahmoud Mesbah, Department of Civil Engineering.

Short running applications that do not require specialized software/hardware and can be easily parallelized.

Single point of submission, monitoring and control.

Core Idea and Motivation Contd…

Key Focus Areas

Resource Harnessing: tapping “existing” infrastructure (no new hardware) that can contribute to solving the computational problem.

Student labs in different faculties, ITS, EWS, etc.

Staff computers, personal contributions included.

Accessibility

How to access these facilities -> Middleware.

When to access these facilities -> Access and Usage Policies.

Utilization: how to properly utilize these facilities.

Implementation abstraction.

Single System Image.

Job submission, monitoring and control.

How are we achieving this…

Caulfield Campus

Clayton Campus

Peninsula Campus

Condor Submission Node

User Submits Jobs directly to Condor Submission Node or Via Nimrod, Globus

Condor Execute Node

Condor Head Node or Central Manager

Using Condor: the goal of the Condor Project is to develop, implement, deploy and evaluate mechanisms and policies that support High Throughput Computing on large collections of distributively owned computing resources.

Submission and Execution Nodes constantly update the Central Manager
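To make direct submission more concrete, the sketch below generates a Condor submit description for a batch of independent jobs. It is an illustration only, not the SPONGE configuration: the helper name `build_submit_description`, the executable `./compModel` and the file names are hypothetical, and it assumes the vanilla universe with Condor's file transfer enabled.

```python
def build_submit_description(executable: str, n_jobs: int) -> str:
    """Return the text of a Condor submit description for n_jobs independent runs.

    Each run receives its index via the $(Process) macro, which Condor
    expands to 0, 1, ..., n_jobs - 1.
    """
    lines = [
        "universe   = vanilla",
        f"executable = {executable}",
        "arguments  = $(Process)",
        "output     = out.$(Process)",   # per-job stdout
        "error      = err.$(Process)",   # per-job stderr
        "log        = sweep.log",        # shared event log (hypothetical name)
        "should_transfer_files   = YES",
        "when_to_transfer_output = ON_EXIT",
        f"queue {n_jobs}",
    ]
    return "\n".join(lines) + "\n"


submit_text = build_submit_description("./compModel", 16)
print(submit_text)
```

The resulting text would typically be saved to a file and passed to `condor_submit` on the submission node; site-specific requirements are deliberately left out here.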

How are we achieving this…contd

Caulfield Campus

Clayton Campus

Peninsula Campus

Condor Submission Node

Sponge Works – Configuration Layer

User Submits Jobs directly to Condor Submission Node or Via Globus

Condor Execute Node

Condor Head Node

Default configuration can be modified centrally, up to node level.

• Queue Management

• Resource Reservation

What SPONGE can do…

Execute a large number of short-running, embarrassingly parallel jobs by leveraging un/under-utilized existing computational resources. Sounds simple.

Advantages

Leveraging idle CPU time that would otherwise remain unused.

Single point of job submission, monitoring, control and collation of results.

Remote job submission using Nimrod/G, Globus.

What SPONGE cannot do at the moment

The SPONGE pool consists of:

Mostly non-dedicated computers.

Distributed ownerships.

Limited availability.

This restricts execution of Jobs that:

Require specialized software/hardware: high memory, large storage space, additional software

Take a long time to execute (several days or weeks)

Perform Inter-Process Communication

Some Statistics

User Name CPU Hrs Used

shikha 2012437.67

jegan 1534528.43

kylee 1166358.76

pxuser 414972.76

iwanless 371833.24

zatsepin 257631.86

hanshang 77930.72

llopes 66747.09

iwanless 30930.82

jvivian 29611.87

User Name CPU Hrs Used

jirving 13258.09

nice-user.pcha13 13205.38

wojtek 7095.26

nice-user.wojtek 6890.78

mmesbah 5562.53

transport 5379

philipc 3733.35

shahaan 3251.94

zatsepin 3069.35

kylee 2988.84

jegan 1937.55

transport 1308.44

Total 688 + CPU Years to date…
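The quoted total is consistent with adding up every row in the two tables above and dividing by the hours in a year. The sketch below assumes a 365-day year (8,760 hours); the rows are copied as listed, including user names that appear more than once.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap years ignored (assumption)

# (user name, CPU hours) rows copied from both tables, in order.
USAGE = [
    ("shikha", 2012437.67), ("jegan", 1534528.43), ("kylee", 1166358.76),
    ("pxuser", 414972.76), ("iwanless", 371833.24), ("zatsepin", 257631.86),
    ("hanshang", 77930.72), ("llopes", 66747.09), ("iwanless", 30930.82),
    ("jvivian", 29611.87),
    ("jirving", 13258.09), ("nice-user.pcha13", 13205.38), ("wojtek", 7095.26),
    ("nice-user.wojtek", 6890.78), ("mmesbah", 5562.53), ("transport", 5379.0),
    ("philipc", 3733.35), ("shahaan", 3251.94), ("zatsepin", 3069.35),
    ("kylee", 2988.84), ("jegan", 1937.55), ("transport", 1308.44),
]

total_hours = sum(hours for _, hours in USAGE)
total_years = total_hours / HOURS_PER_YEAR

print(f"total: {total_hours:,.2f} CPU hours = {total_years:.1f} CPU years")
```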

Statistics contd…

Acknowledgements

Wojtek Goscinski

Philip Chan

Jefferson Tan

Nimrod Tools for e-Research

Monash e-Science & Grid Engineering Laboratory

Faculty of Information Technology

Overview

Supporting a Software Lifecycle

Software Lifecycle Tools

The Nimrod family

Nimrod/G

Grid Middleware

Nimrod/O Nimrod/E

Nimrod Portal

Actuators

Plan File

parameter pressure float range from 5000 to 6000 points 4
parameter concent float range from 0.002 to 0.005 points 2
parameter material text select anyof "Fe" "Al"

task main
    copy compModel node:compModel
    copy inputFile.skel node:inputFile.skel
    node:substitute inputFile.skel inputFile
    node:execute ./compModel < inputFile > results
    copy node:results results.$jobname
endtask
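The plan file above describes a parameter sweep: every combination of the three parameters becomes one job. The sketch below enumerates those combinations in Python. It is not Nimrod code; it assumes that "points N" means N evenly spaced values including both endpoints and that "select anyof" uses every listed value, and the helper name `evenly_spaced` is hypothetical.

```python
from itertools import product


def evenly_spaced(start: float, stop: float, points: int) -> list:
    """Return `points` evenly spaced values from start to stop inclusive."""
    if points == 1:
        return [start]
    step = (stop - start) / (points - 1)
    return [start + i * step for i in range(points)]


# Parameter domains taken from the plan file (interpretation is an assumption).
pressure = evenly_spaced(5000.0, 6000.0, 4)
concent = evenly_spaced(0.002, 0.005, 2)
material = ["Fe", "Al"]

# One job per combination: 4 x 2 x 2 combinations in total.
jobs = list(product(pressure, concent, material))

for job_id, (p, c, m) in enumerate(jobs):
    print(f"job {job_id}: pressure={p:.2f} concent={c:.3f} material={m}")
```

Under these assumptions the sweep yields 16 jobs, each of which would run the task block once with its own substituted input file.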


From one workstation ..


.. Scaled Up


Why is this challenging?

Develop, Deploy, Test…


Why is this challenging?

Build, Schedule & Execute virtual application


Approaches to Grid programming

General Purpose Workflows

Generic solution

Workflow editor

Scheduler

Special Purpose Workflows

Solve one class of problem

Specification language

Scheduler


Nimrod Development Cycle

Prepare Jobs using Portal

Jobs Scheduled Executed Dynamically

Sent to available machines

Results displayed & interpreted


Acknowledgements

Message Lab: Colin Enticott, Slavisa Garic, Blair Bethwaite, Tom Peachy, Jeff Tan

MeRC: Shahaan Ayyub, Philip Chan

Funding & Support: CRC for Enterprise Distributed Systems (DSTC), Australian Research Council, GrangeNet (DCITA), Australian Research Collaboration Service (ARCS), Microsoft, Sun Microsystems, IBM, Hewlett Packard, Axceleon

Message Lab Wiki: https://messagelab.monash.edu.au/nimrod
