Transcript
Page 1:

SDM Integration Framework in the Hurricane of Data

SDM AHM, 10/07/2008

Scott A. Klasky, klasky@ornl.gov

ANL: Ross; CalTech: Cummings; GT: Abbasi, Lofstead, Schwan, Wolf, Zheng; LBNL: Shoshani, Sim, Wu; LLNL: Kamath; ORNL: Barreto, Hodson, Jin, Kora, Podhorszki; NCSU: Breimyer, Mouallem, Nagappan, Samatova, Vouk; NWU: Choudhary, Liao; NYU: Chang, Ku; PNNL: Critchlow; PPPL: Ethier, Fu, Samtaney, Stotler; Rutgers: Bennett, Docan, Parashar, Silver; SUN: Di; Utah: Kahn, Parker, Silva; UCI: Lin, Xiao; UCD: Ludaescher; UCSD: Altintas, Crawl

Page 2:

Outline

• Current success stories and future customers of SEIF.

• The problem statement.
• Vision statement.
• SEIF.
• ADIOS.
• Workflows.
• Provenance.
• Security.
• Dashboard.
• The movie.
• Vision for the future.

Page 3:

Current success stories

• Data Management/Workflow
• “Finally, our project benefits greatly from the expertise of the computational science team at NCCS in extracting physics out of a large amount of simulation data, in particular, through collaborations with Dr. Klasky on visualization, data management, and work flow…” HPCWire 2/2007

• Success in using workflows. Used several times with GTC and GTS groups. (Working on everyday uses).

• ADIOS
• July 14 -- A team of researchers from the University of California-Irvine (UCI), in conjunction with staff at Oak Ridge National Laboratory's National Center for Computational Sciences (NCCS), has just completed what it says is the largest run in fusion simulation history. “This huge amount of data needs fast and smooth file writing and reading,” said Xiao. “With poor I/O, the file writing takes up precious computer time and the parallel file system on machines such as Jaguar can choke. With ADIOS, the I/O was vastly improved, consuming less than 3 percent of run time and allowing the researchers to write tens of terabytes of data smoothly without file system failure.” (HPC Wire 7/2008)

• “Chimera code ran 1000x faster I/O with ADIOS on Jaguar” Messer SciDAC08.

• S3D will include ADIOS. J. Chen.

• ESMF team looking into ADIOS. C. DeLuca

• R. Harrison (ORNL) looking into ADIOS.

• XGC1 code using ADIOS every day.

• GTS code now working with ADIOS.

Page 4:

Current success stories

• Workflow automation.

• S3D data archiving workflow moved 10TB of data from NERSC to ORNL. J. Chen

• GTC workflow automation saved valuable time during early simulations on Jaguar and during simulations on Seaborg.

• “From a data management perspective, the CPES project is using state-of-the-art technology and driving development of the technology in new and interesting ways. The workflows developed under this project are extremely complex and are pushing the Kepler infrastructure into the HPC arena. The work on ADIOS is a novel and exciting approach to handling high-volume I/O and could be extremely useful for a variety of scientific applications. The current dashboard technology is using existing technology, and not pushing the state of the art in web-based interfaces. Nonetheless, it is providing the CPES scientists an important capability in a useful and intuitive manner.” CPES reviewer 4 (CPES SciDAC review).

• There are many approaches to creating user environments for suites of computational codes, data and visualization. This project made choices that worked out well. The effort to construct and deploy the dashboard and workflow framework has been exemplary. It seems clear that the dashboard has already become the tool of choice for interacting with and managing multi-phase edge simulations. The integration of visualization and diagnostics into the dashboard meets many needs of the physicists who use the codes. The effort to capture all the dimensions of provenance is also well conceived, however the capabilities have not reached critical mass yet. Much remains to be done to enable scientists to get the most from the provenance database. This project is making effective use of the HPC resources at ORNL and NERSC. The Dashboard provides run time diagnostics and visualization of intermediate results which allows physicists to determine whether the computation is proceeding as expected. This capability has the potential to save considerable computer time by identifying runs that are going awry so they can be canceled before they would otherwise have finished. (CPES SciDAC review, reviewer #1).

Page 5:

Current success stories

• The connection to ewok.ccs.ornl.gov is unstable.
• I got an error message like “ssh_exchange_identification: Connection closed by remote host” when I tried ‘ssh ewok’ from jaguar. This error doesn't always happen (about a 30% chance), but running the Kepler workflow is interrupted by this error.
• Please check the system.
• Thank you,
• Seung-Hoe, 10/3/2008

• Hi,
• I have a few things to ask you to consider.
• Currently the workflow copies all *.bp files, including restart.*.bp. And it seems that these restart*.bp files are converted into HDF5 and sent to AVS (not sure; tell me if I am wrong). This takes a really long time for large runs. Could you make the workflow not do all of this, or do it after the movies of the other variables are generated?
• Currently, the dashboard shows movies only after the simulation ends. However, generating movies takes a certain amount of time after the simulation ends. If restart files exist, the time is very long. If I use the –t option for an already completed simulation, it takes about a few hours to generate all AVS graphs. So, if you can make a button which makes the dashboard think the simulation has not ended, it would be useful.
• The min/max routine doesn't seem to catch the real min/max for very tiny numbers. Some of the data has values of ~1E-15 and the global plot is just zero. One example is ‘ion__total_E_flux(avg)’ of shot j38.
• When I tried the workflow on a restarted xgc run with the same shot number and a different jobID, the txt files are overwritten.
• When I run the workflow for a franklin job, txt_file is ignored. Maybe I set something wrong. Could you check that the txt_file copy for franklin is working OK?
• Maybe some of these are not problems with the ADIOS version of the workflow. If so, please let me know.
• Thanks,
• Seung-Hoe

Page 6:

Current and Future customers

• GTC
  • Monitoring workflows.
  • Will want post-processing/data analysis workflows.
• GTS
  • Similar to GTC, but will need to get data from an experimental source (MDS+).
• XGC1
  • Monitoring workflows.
• XGC0-M3D code coupling.
  • Code coupling.
• XGC1 – GEM code coupling.

• GEM

• M3D-K / GKM

• S3D

• Chimera.

• Climate (with ESMF team). Still working out details.

Page 7:

The problem

• DOE and NSF open science have opened the floodgates on leadership-class computing.

• ANL: 550 TF (this year).

• ORNL: 1.x PF (by end of year).

• NERSC: >200TF (this year).

• It’s not about these numbers, it’s about CPU hours/year.

• In 5 years we have gone up 100x in our computing power/year.

• INCITE allows us to run >50M hours/year.

• Data from a 1-day simulation on a 270 TF machine: >60 TB.

• How do we get science from (and manage) all of this data?

Page 8:

Vision

• Problem: Managing the data from a petascale simulation, debugging the simulation, and extracting the science involves:
  • Tracking
    • the codes: simulation, analysis
    • the input files/parameters
    • the output files, from the simulation and analysis programs
    • the machines and environment the codes ran on
  • Gluing all of the pieces together with workflow automation to automate the mundane tasks.
  • Monitoring the simulation in real time.
  • Analyzing and visualizing the results without requiring users to know all of the file names, and making this happen in the same place where we can monitor the codes.
  • Fast I/O which can be easily tracked.
  • Moving data to remote locations without babysitting data movement.

Page 9:

Vision

• Requirements:
  • Want all enabling technologies to play well together.
  • Componentized approach for building pieces in SEIF.
  • Components should work well by themselves, and work inside of the framework.
  • Fast.
  • Scalable.
  • Easy to use!!!!
  • Simplify the life of application scientists!!!

Page 10:

General Architecture

[Architecture diagram: Computations, Adaptable IO, Orchestration (Kepler), Data/Databases/Provenance/Storage, Analytics, and Control Panel (Dashboard) & Display, connected by local/remote networking (“cloud”).]

Page 11:

SEIF (SDM End-to-end Integration Framework)

• Workflow engine
  • Kepler
• Provenance support
  • Code provenance
  • Data provenance
  • System provenance
  • Workflow provenance
• Wide-area data movement
  • SRM
  • SRM-lite
• Code coupling
  • Provenance tracking.
• Visualization
  • In situ with ADIOS
  • Monitoring with VisIt, Express
  • Analysis with VisTrails
• Advanced analysis
  • Parallel R
  • VisTrails
• Adaptable I/O
  • Fast I/O
• Dashboard
  • Collaboration.

[Diagram: enabling technologies (Visualization, Code Coupling, Wide-area Data Movement) built on foundation technologies (Dashboard, Workflow, Adaptable I/O, Provenance and Metadata).]

Approach: place highly annotated, fast, easy-to-use I/O methods in the code, which can be monitored and controlled; have a workflow engine record all of the information; visualize this on a dashboard; move desired data to the user's site; and have everything reported to a database.

Page 12:

ADIOS Overview

• Allows plug-ins for different I/O implementations.

• Abstracts the API from the method used for I/O.

• Simple API, almost as easy as an F90 write statement.
• Best-practice/optimized IO routines for all supported transports "for free".
• Componentization:
  • Thin API
  • XML file
    • data groupings with annotation
    • IO method selection
    • buffer sizes
  • Common tools
    • Buffering
    • Scheduling
  • Pluggable IO routines

[Diagram: scientific codes write through the ADIOS API, with common buffering, scheduling, and feedback, and an external metadata (XML) file selecting among pluggable transports: MPI-CIO, LIVE/DataTap, MPI-IO, POSIX IO, pHDF-5, pnetCDF, viz engines, and others.]
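To make the thin-API-plus-XML idea concrete, below is a minimal sketch of a write path. It assumes an ADIOS-1.x-style C interface and an illustrative XML group definition; the exact function signatures and XML syntax are assumptions here and changed across releases, so treat this as an approximation rather than the released API.

    /* Illustrative config.xml (assumed syntax):
     *   <adios-config>
     *     <adios-group name="restart">
     *       <var name="NX"          type="integer"/>
     *       <var name="temperature" type="double" dimensions="NX"/>
     *     </adios-group>
     *     <method group="restart" method="MPI"/>  <!-- swap to POSIX/DataTap here,
     *                                                  no source changes needed -->
     *     <buffer size-MB="100"/>
     *   </adios-config>
     */
    #include <stdint.h>
    #include <mpi.h>
    #include "adios.h"              /* ADIOS public header; adios_init("config.xml")
                                       is assumed to have been called at startup */

    void write_restart(MPI_Comm comm, int NX, double *temperature)
    {
        int64_t  fd;
        uint64_t group_size, total_size;

        /* Open the "restart" group declared in the XML; the transport method
           (MPI-IO, POSIX, DataTap, ...) is chosen by the XML, not by this code. */
        adios_open(&fd, "restart", "restart.bp", "w", comm);

        /* Tell ADIOS how much this process will write so it can buffer/schedule. */
        group_size = sizeof(int) + (uint64_t)NX * sizeof(double);
        adios_group_size(fd, group_size, &total_size);

        adios_write(fd, "NX", &NX);
        adios_write(fd, "temperature", temperature);

        adios_close(fd);   /* data is flushed here, or handed to an async transport */
    }

Switching from synchronous to asynchronous I/O, or redirecting a group to a staging/visualization transport, then becomes a one-line edit to the method element in the XML rather than a source change, which is the runtime flexibility described on the next slide.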

Page 13:

ADIOS Overview

• ADIOS is an IO componentization, which allows us to
  • Abstract the API from the IO implementation.
  • Switch from synchronous to asynchronous IO at runtime.
  • Change from real-time visualization to fast IO at runtime.
• Combines:
  • Fast I/O routines.
  • Easy to use.
  • Scalable architecture (100s of cores to millions of procs).
  • QoS.
  • Metadata-rich output.
  • Visualization applied during simulations.
  • Analysis and compression techniques applied during simulations.
  • Provenance tracking.

Page 14:

Initial ADIOS performance.

• June 7, 2008: 24-hour GTC run on Jaguar at ORNL
  • 93% of the machine (28,672 cores)
  • MPI-OpenMP mixed model on quad-core nodes (7,168 MPI procs)
  • Three interruptions total (simple node failures), with two 10+ hour runs
  • Wrote 65 TB of data at >20 GB/sec (25 TB for post analysis)
  • IO overhead ~3% of wall-clock time
  • Mixed IO methods of synchronous MPI-IO and POSIX IO configured in the XML file
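A quick consistency check on these numbers (the arithmetic is mine, not from the slide):

    t_{IO} \approx \frac{65\,\mathrm{TB}}{20\,\mathrm{GB/s}} \approx 3.3\times 10^{3}\,\mathrm{s} \approx 0.9\,\mathrm{h},
    \qquad \frac{t_{IO}}{t_{wall}} \approx \frac{0.9\,\mathrm{h}}{24\,\mathrm{h}} \approx 4\%,

so a sustained rate somewhat above 20 GB/sec is consistent with the reported ~3% I/O overhead.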

DART: <2% overhead for writing 2 TB/hour with the XGC code.

DataTap vs. Posix:
– 1 file per process (Posix)
– 5 secs for GTC computation
– ~25 seconds for Posix IO
– ~4 seconds with DataTap

Page 15:

Chimera IO Performance

2x scaling

• Plot: minimum value from 5 runs with 9 restarts/run.
• Error bars show the maximum time for the method.

Page 16:

ADIOS challenges

• Faster reading.
• Faster writing on new petascale/exascale machines.
• Work with file system experts to refine our file format for ‘optimal’ performance.
• More characteristics in the file by working with analysis and application experts.
• Index files better using FastBit.
• In situ visualization methods.

Page 17:

Controlling metadata in simulations.

• Problem: Codes are producing large amounts of
  • files,
  • data in files,
  • information in data.
• Need to keep track of this data for extracting the science from the simulations.
• Workflows need to keep track of file locations, what information is in the files, etc.
• It is easier to develop generic (template) workflows if we can ‘gain’ this information inside of Kepler.
• Solution:
  • Provide a link from ADIOS into the provenance system (Kepler, ….).
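As an illustration only (hypothetical function and field names, not the project's actual interface): the smallest useful version of such a link is a per-output-group record that the workflow and provenance recorder can poll, so Kepler actors never need hard-coded file names.

    #include <stdio.h>
    #include <time.h>

    /* Hypothetical hook, called after an output group is closed: append one
     * "timestamp group file timestep" record to a log the workflow watches. */
    void provenance_notify(const char *logpath, const char *group,
                           const char *filename, int timestep)
    {
        FILE *log = fopen(logpath, "a");
        if (!log)
            return;               /* monitoring must never break the simulation */
        fprintf(log, "%ld %s %s %d\n", (long)time(NULL), group, filename, timestep);
        fclose(log);
    }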

Page 18:

Current Workflows.

• Pre-production (Uston)
  • Starting to work on this.
• Production workflows (Norbert)
  • Currently our most active area of use for SEIF.
• Analysis workflows (Ayla, Norbert)
  • Current area of interest.
  • Need to work with 3D graphics + parallel data analysis.

Page 19:

Monitoring a simulation + archiving (XGC)

• NetCDF files
  • Transfer files to the e2e system on the fly
  • Generate plots using the grace library
  • Archive NetCDF files at the end of the simulation
• Binary files (BP, ADIOS output)
  • Transfer to the e2e system using bbcp
  • Convert to HDF5 format
  • Start up the AVS/Express service
  • Generate images with AVS/Express
  • Archive HDF5 files in large chunks to HPSS
• Generate movies from the images

Page 20:

Coupling Fusion codes for Full ELM, multi-cycles

• Run XGC until unstable conditions
• M3D coupling data from XGC
  • Transfer to end-to-end system
• Execute M3D: compute new equilibrium
• Transfer back the new equilibrium to XGC
• Execute ELITE: compute growth rate and test linear stability
• Execute M3D-MPP: study unstable states (ELM crash)
• Restart XGC with new equilibrium from M3D-MPP
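The cycle above is a loop with a stability decision in the middle. The toy driver below (all script and file names are hypothetical; the real orchestration is done by Kepler actors and shell scripts) is only meant to show the control flow:

    #include <stdlib.h>

    /* Toy ELM-cycle driver: each command stands in for a Kepler actor,
     * a bbcp/scp transfer, or a batch submission on the target machine. */
    int main(void)
    {
        for (int cycle = 0; cycle < 10; cycle++) {
            system("./run_xgc.sh");                    /* run XGC until unstable conditions    */
            system("bbcp coupling.bp e2e:/coupling/"); /* M3D coupling data from XGC -> e2e    */
            system("./run_m3d.sh");                    /* M3D: compute new equilibrium         */
            system("bbcp e2e:/coupling/eq.dat .");     /* new equilibrium back to the XGC side */

            /* ELITE: growth rate + linear stability test; nonzero exit = unstable */
            if (system("./run_elite.sh") != 0)
                system("./run_m3d_mpp.sh");            /* study the unstable state (ELM crash) */

            system("./restart_xgc.sh");                /* restart XGC with the new equilibrium */
        }
        return 0;
    }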

Page 21:

External tools orchestrated by the workflow

• C codes
  – Transfer files between hosts (bbcp, scp)
  – BP to HDF5 conversion tool
  – NetCDF split and merge tools
  – NetCDF 1D variable to 1D plot (xmgrace)
• Bash shell scripts
  – Variables in NetCDF to plots
  – Implement single complete steps in coupling
    • M3D-OMP run, ELITE run + stability decision + images, M3D-MPP preparation and run
• Python scripts
  – Variables in HDF5 to images
  – Info generation for the dashboard
• AVS/Express
  – HDF5 2D variable to 2D image
• Other viz tools
  – gnuplot
  – IDL

Page 22:

Provenance + Data Movement

• Process provenance
  • the steps performed in the workflow, the progress through the workflow control flow, etc.
• Data provenance
  • history and lineage of each data item associated with the actual simulation (inputs, outputs, intermediate states, etc.)
• Workflow provenance
  • history of the workflow evolution and structure
• System provenance
  • machine and environment information
  • compilation history of the codes
  • information about the libraries
  • source code
  • run-time environment settings

[Diagram: SRM-lite, configured by srmlite.xml and driven by local commands on the dashboard site, issues SSH requests to an SSH server at the remote (user's) site and performs GridFTP/FTP/SCP transfers between the two disk caches.]

• Data Movement
  • Given an OTP firewall at one site, where local files reside
  • Need a client program that “pushes” files to users, which …
    • Automates movement of multiple files
    • Concurrent transfers: utilize bandwidth
    • Uses various transfer protocols
    • Supports entire directory transfers
    • Recovers from mid-transfer interruptions
  • Can be invoked from the Dashboard
    • Shows what files are to be transferred
    • Provides an asynchronous service – user can log out of the Dashboard
    • Has a way to show transfer progress asynchronously
    • Supports monitoring from anywhere
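A minimal sketch of the push-with-retry idea (hypothetical paths, and a plain scp call standing in for SRM-lite's pluggable protocols; the real tool reads its targets from srmlite.xml and also handles concurrency, whole directories, and progress reporting):

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Push one file to the user's site, retrying after mid-transfer interruptions. */
    static int push_with_retry(const char *local, const char *remote, int max_tries)
    {
        char cmd[1024];
        snprintf(cmd, sizeof cmd, "scp %s %s", local, remote);

        for (int attempt = 1; attempt <= max_tries; attempt++) {
            if (system(cmd) == 0)
                return 0;                               /* transfer completed        */
            fprintf(stderr, "transfer of %s failed (attempt %d), retrying...\n",
                    local, attempt);
            sleep(30);                                  /* back off before retrying  */
        }
        return -1;
    }

    int main(void)
    {
        /* Hypothetical file list for illustration. */
        const char *files[] = { "run_001/plot1.png", "run_001/plot2.png" };
        for (unsigned i = 0; i < sizeof files / sizeof files[0]; i++)
            push_with_retry(files[i], "user@remote.site:/incoming/", 5);
        return 0;
    }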

Page 23:

Security Problem with OTP

• Currently some leadership-class machines do not allow certificates.
• Work with ORNL to support this functionality.
  • Enable long-running workflows.
  • Enable job submission through the dashboard.
  • Enable HPSS retrieval for data analysis workflows via the dashboard.

Page 24:

Security

[Diagram: the dashboard (apache user) on ewok-web takes the user's PASSCODE, obtains a proxy certificate from a MyProxy server holding the grid certificate, and uses GSISSH to reach the Ewok and Jaguar (NCCS) login nodes, where the Kepler workflow and the simulation run as PBS jobs on the compute nodes.]

Page 25:

[Diagram: the same security architecture as the previous page, with GRAM added alongside GSISSH.]

GRAM will be added in a second phase to support workflow systems other than Kepler and to allow Kepler to use the “standard” way of accessing resources in Grids.

Page 26:

Workflow and other challenges

• Make workflows easier to build and debug!

• Finish the link from ADIOS to Kepler.

• Create 1 workflow that works with S3D, GTC, GTS, and XGC1 for code monitoring.
• No changes in Kepler for these codes!
• Analysis workflows integrated with advanced visualization and data analysis.

• Queries of simulations.

• We want to query data from multiple simulations.

Page 27:

Dashboard

Page 28:

Machine monitoring.

• Allow for secure logins with OTP.

• Allow for job submission.

• Allow for killing jobs.

• Search old jobs.
• See collaborators' jobs.

Page 29:

Dashboard challenges

• Run simulations through the dashboard.

• Allow interaction with data from HPSS.

• Run advanced analysis and visualization on the dashboard.
• Access to more plug-ins for data analysis.
• 3D visualization.
• More interactive 2D visualization.
• Query multiple simulations/experimental data through the dashboard for comparative analysis.

• Collaboration.

Page 30:

SEIF Movie

Page 31:

Vision for the future.

• Tiger teams
  • Work on 1 code with several experts in all areas of DM.
  • Replace their IO with high-performance IO.
  • Integrate analysis routines in their analysis workflows.
  • Create monitoring and analysis workflows.
  • Track codes.
  • Integrate into SEIF.
  • 1 code at a time, 4 months per code.
• Rest of team works on core technologies.

[Diagram: a team leader coordinating workflow, dashboard, IO, analysis, and provenance experts.]

Page 32:

Long term approach for SDM

• Grow the core technologies for exascale computing.
  • Grow the core by working with more applications.
  • Don't build infrastructure if we can't see ‘core applications’ benefitting from it after 1.5 years of development (1 at a time).
• Teamwork!
  • Create our team of R&D which works with codes.
• Build mature tools.
  • Better software testing before we release our software.
  • Need the framework to live with just a few support people.
• Need to componentize everything together.
  • Allow separate pieces to live without SEIF.
• Make sure software scales to yottabytes!
  • But first make it work on MBs.
• Need to look into better searching of data across multiple simulations.