Page 1:

Tools at Scale - Requirements and Experience

Mary Zosel, LLNL

ASCI / PSE

ASCI Simulation Development Environment

Tools Project

Prepared for SciComp 2000

La Jolla, Ca.

Aug 14-16, 2000

UCRL: VG - 139702

Page 2:

Presentation Outline:

Overview of Systems

Requirements for Scale

Experience/Progress in debugging and tuning

Page 3:

ASCI WHITE
• 8192 P3 CPUs
• NightHawk 2 nodes
• Colony switch
• 12.3 TF peak
• 160 TB disk
• 28 tractor trailers
• Classified network

Full system at IBM

120 nodes in their new home at LLNL - remainder due late Aug.

Page 4:

White joins these IBM platforms at LLNL

• 128 cpu - SNOW - (8-way P3 NH 1 nodes - Colony)
– Experimental software development platform - Unclassified

• 1344 cpu - BLUE - (4-way 604e silver nodes / TB3MX)
– Production unclassified platform

• 16 cpu - BABY - (4-way 604e silver nodes / TB3MX)
– Experimental development platform - first stop for new system software

• 64 cpu - ER - (4-way 604e silver nodes / TB3MX)
– Backup production system “parts” - and experimental software

• 5856 cpu - SKY - (3 sectors of 488 silver nodes - connected with TB3MX and 6 HPGN IP routers)
– Classified production system

• When White is complete, ~2/3 of SKY will become the unclassified production system

Page 5:

Why the big machines?

• The purpose of ASCI is new 3-D codes for use in place of testing for Stockpile Certification.

• ASCI program plan calls for a series of application milepost demonstrations of increasingly complex calculations, which require the very large platforms.
– Last year: 1000 cpu requirement
– This year: 1500 cpu requirement
– Next year: ~4000 cpu requirement

• Tri-lab resource -> multiple code teams with large scale requirements

Page 6:

What does this imply for the development environment?
Pressure - Stress - Pressure

• Deadlines: multiple code teams working against time

• Long Calculations: need to understand and optimize time requirements of each component to plan for production runs

• Large Scale: easy to push past the knee of scalability - and past the Troutbeck US limit of 1024 tasks

• Large Memory: n**2 buffer management schemes hurt

• Access Contention: not easy to get large test runs - especially for tool work
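The n**2 buffer problem above is easy to see with back-of-the-envelope arithmetic: if every task keeps one buffer per peer, per-task memory grows linearly and machine-wide memory quadratically with the task count. A quick sketch (the 64 KB per-pair buffer size is an assumed illustration value, not an ASCI configuration):

```python
# Sketch: per-task and machine-wide memory for an all-pairs (n**2)
# buffer scheme as the task count grows.  BUF_BYTES is an assumed
# illustration value, not a real system setting.
BUF_BYTES = 64 * 1024

def all_pairs_memory(ntasks, buf_bytes=BUF_BYTES):
    """Each task keeps one buffer per peer: linear per task,
    quadratic across the whole machine."""
    per_task = (ntasks - 1) * buf_bytes
    whole_machine = ntasks * per_task
    return per_task, whole_machine

for n in (64, 1024, 8192):
    per_task, total = all_pairs_memory(n)
    print(f"{n:5d} tasks: {per_task / 2**20:8.1f} MiB/task, "
          f"{total / 2**30:10.1f} GiB machine-wide")
```

At 8192 tasks the per-task cost alone reaches roughly half a gigabyte, which is why such schemes hurt at ASCI scale.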

Page 7:

What Tools are in use?
Staying with standards helps make tools usable

• Languages/Compilers:
– C, C++, Fortran from both IBM and KAI

• Runtime: OpenMP and MPI
– Production codes not using pvm, shmem, direct LAPI use, etc., and direct use of pthreads is very limited

• Debugging / Tuning:
– TotalView, LCF, Great Circle, ZeroFault, Guide, Vampir, xprofiler, pmapi / papi, and hopefully new IBM tools

Page 8:

Debugging --- LLNL Experience

• Users DO want to use the debugger with large # cpus

• There have been lots of frustrations - but there is progress and expectation of further improvements
– Slow to attach / start … what was hours is now minutes
– Experience / education helps avoid some problems ...
• Need large memory settings in ld
• Now have MP_SYNC_ON_CONNECT off by default
• Set startup timeouts (MP_TIMEOUT)
– “Sluggish but tolerable” describes a recent 512 cpu session

• Local feature development aimed at scale ...
– Subsetting, collapse, shortcuts, filtering, … both CLI and X versions

• Etnus continuing to address scalability
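In a job script, the two environment settings named above might look like the following (a sketch only; the timeout value is illustrative, not a site recommendation):

```shell
# Illustrative POE settings for large debugger sessions.
# Values are examples only, not site recommendations.
export MP_SYNC_ON_CONNECT=no   # skip sync at attach: faster debugger startup
export MP_TIMEOUT=1200         # allow extra time for large-job startup (seconds)
```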

Page 9:

New Attach Option to get subset of tasks

Page 10:

Root window collapsed: shows task 4 in a different state.

Same Root window opened to show all tasks.

Page 11:

Example of thumb-screw on message window

Cycle through message state

Page 12:

Performance … status quo is less promising

• MPI scale is an issue - OpenMP reduces the problem

• Understanding thread performance is an issue

• Users DO want to use the tools - this is new
– They need estimates for their large code runs …
– Is my job running or hung?

• Tools aren’t yet ready for scale - including size-of-code scaling

• Several tools do not support threads

• Problems often not in the user’s code
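One low-tech answer to the "running or hung?" question above is a heartbeat file: the job touches a file each timestep, and an external monitor checks the file's age. A minimal sketch (a hypothetical illustration, not one of the tools named in this talk; the filename and threshold are made up):

```python
# Sketch of a heartbeat check: the application touches a file once per
# timestep; an external monitor compares the file's age against a
# threshold to tell "slow" from "hung".  (Hypothetical illustration.)
import os
import time

HEARTBEAT = "job.heartbeat"   # assumed path, for illustration only

def beat(path=HEARTBEAT):
    """Called by the job once per timestep."""
    with open(path, "w") as f:
        f.write(str(time.time()))

def seems_hung(path=HEARTBEAT, max_age=600.0):
    """Called by an external monitor: True if no beat within max_age seconds."""
    try:
        age = time.time() - os.path.getmtime(path)
    except OSError:           # no heartbeat file yet
        return True
    return age > max_age

beat()
print("hung?", seems_hung(max_age=600.0))
```

The design choice here is that the check needs no access to the job itself, so it works even when the parallel runtime is wedged and a debugger attach would be too slow.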

Page 13:

List of sample problems

User observes that …

• … as the number of tasks grows, the code becomes relatively slower and slower. The sum of the CPU time and the system time doesn't add up to wall-clock time – and this missing time is the component growing the fastest. [Diagnosis – bad adaptor software configuration was causing excessive fragmentation and retransmission of MPI messages]

• … unexplained code slow-down from previous runs and nothing in the code has changed. [Diagnosis – orphaned processes on one node slowed down the entire code.]

• … threaded version of code much slower than straight MPI. [Diagnosis – code had many small malloc calls and was serializing through the malloc code.]

• … certain part of code takes 10 seconds to run while the problem is small – and then after a call to a memory-intensive routine – the same portion of code takes 18 seconds to run. [Diagnosis – not sure – but believed to be memory heap fragmentation causing paging.]

• … job runs faster on Blue (604e system) than it does on Snow (P3 system). [Diagnosis – not yet known – wonder about flow-control default setting].

• … a non-blocking message-test code is taking up to 15 times longer to run on Snow than it does on Blue. [Diagnosis - not yet known - flow control setting doesn’t help.]
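The first symptom above - CPU time plus system time not adding up to wall-clock time - can be bookkept directly from the process's own timers; the "missing" time is whatever the process spent blocked off-CPU (for example, waiting on fragmented and retransmitted messages). A minimal sketch of that bookkeeping, not the diagnosis tool actually used:

```python
# Sketch: compare user + system CPU time against wall-clock time for a
# region of code.  The difference ("missing time") is time spent
# blocked off-CPU -- the quantity the first diagnosis above tracked.
import os
import time

def timed_region(fn):
    """Run fn() and return (wall, cpu, missing) times in seconds."""
    t0 = os.times()
    w0 = time.monotonic()
    fn()
    wall = time.monotonic() - w0
    t1 = os.times()
    cpu = (t1.user - t0.user) + (t1.system - t0.system)
    return wall, cpu, wall - cpu

# A sleep stands in for a blocked MPI wait: almost all wall time is "missing".
wall, cpu, missing = timed_region(lambda: time.sleep(0.5))
print(f"wall {wall:.2f}s  cpu {cpu:.2f}s  missing {missing:.2f}s")
```

If the missing component grows fastest as the task count rises, the time is going somewhere outside the application's own computation, which is what pointed to the adaptor configuration rather than the user's code.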

Page 14:

What are we doing about this?

• PathForward contracts: KAI/Pallas, Etnus, MSTI

• Infrastructure development: to facilitate new tools / probes
– supports click-back to source
– currently QT on DPCL … future???

• Probe components: memory usage, MPI classification

• Lightweight CoreFile … and Performance Monitors

• External observation … Monitor, PS, VMSTAT …

• Testing new IBM beta tools

• Sys admins starting performance regression database

Page 15:

Tool Work In Progress

[Chart: microseconds spent per MPI operation (User code, Wait, Send, Irecv, Init, Comm_size, Comm_rank, Bcast, Barrier, Allreduce) vs. number of processors, 4 to 256.]

Page 16:

the faster I go, the behinder I get

… we ARE making progress, but the problems are getting harder and coming in faster ...

It’s a Team Effort

Rich Zwakenberg - debugging
Karen Warren
Bor Chan
John May - performance tools
Jeff Vetter
John Gyllenhaal
Chris Chambreau
Mike McCracken
John Engle - compiler support
Linda Stanberry - mpi related
Bronis deSupinski
Susan Post - system testing
Brian Carnes - general
Mary Zosel
Scott Taylor - emeritus
John Ranelletti