Astronomy Applications in the TeraGrid Environment. Roy Williams, Caltech, with thanks for material to: Sandra Bittner, ANL; Sharon Brunett, Caltech; Derek Simmel, PSC; John Towns, NCSA; Nancy Wilkins-Diehr, SDSC

Mar 27, 2015


Astronomy Applications in the TeraGrid Environment

Roy Williams, Caltech

with thanks for material to: Sandra Bittner, ANL; Sharon Brunett, Caltech; Derek Simmel, PSC; John Towns, NCSA; Nancy Wilkins-Diehr, SDSC


NVO Summer School Sept 2004

The TeraGrid Vision
Distributing the resources is better than putting them at one site

Build new, extensible, grid-based infrastructure to support grid-enabled scientific applications
  New hardware, new networks, new software, new practices, new policies
Expand centers to support cyberinfrastructure
  Distributed, coordinated operations center
  Exploit unique partner expertise and resources to make the whole greater than the sum of its parts
Leverage homogeneity to make the distributed computing easier and simplify initial development and standardization
  Run single job across entire TeraGrid
  Move executables between sites


What is Grid Really?

A set of powerful Beowulf clusters
Lots of disk storage
Fast interconnection
Unified account management
Interesting software

The Grid is not
  Magic
  Infinite
  Simple
  A universal panacea
  The hype that you have read


Grid as Federation

Teragrid as a federation

independent centers

flexibility

unified interface

power and strength

Large/small state compromise


TeraGrid Wide Area Network


Grid Astronomy


Quasar Science
An NVO-Teragrid project
PennState, CMU, Caltech

• 60,000 quasar spectra from Sloan Sky Survey
• Each is 1 cpu-hour: submit to grid queue
• Fits complex model (173 parameters)
  derive black hole mass from line widths

[Diagram labels: manager; globusrun; clusters; NVO data services]


N-point galaxy correlation
An NVO-Teragrid project
Pitt, CMU

Finding triple correlation in 3D SDSS galaxy catalog (RA/Dec/z)

Lots of large parallel jobs

kd-tree algorithms


Palomar-Quest Survey
Caltech, NCSA, Yale

Transient pipeline: computing reservation at sunrise for immediate followup of transients
Synoptic survey: massive resampling (Atlasmaker) for ultrafaint detection
NCSA and Caltech and Yale run different pipelines on the same data

[Diagram labels: P48 Telescope; Caltech; Yale; NCSA; TG; 50 Gbyte/night; 5 Tbyte; ALERT]


Transient from PQ
from catalog pipeline


PQ stacked images
from image pipeline


Wide-area Mosaicking (Hyperatlas)
An NVO-Teragrid project
Caltech

High-quality: flux-preserving, spatial accuracy
Stackable: Hyperatlas
Edge-free: Pyramid weight

Mining AND Outreach

DPOSS 15°

Griffith Observatory "Big Picture"


2MASS Mosaicking portal
An NVO-Teragrid project
Caltech IPAC


Teragrid hardware


TeraGrid Components

Compute hardware
  Intel/Linux Clusters, Alpha SMP clusters, POWER4 cluster, …
Large-scale storage systems
  hundreds of terabytes for secondary storage
Very high-speed network backbone
  bandwidth for rich interaction and tight coupling
Grid middleware
  Globus, data management, …
Next-generation applications


Overview of Distributed TeraGrid Resources

[Diagram: four sites (NCSA/PACI, SDSC, Caltech, Argonne), each with Site Resources, archival storage (HPSS or UniTree) and External Networks]

NCSA/PACI: 10.3 TF, 240 TB
SDSC: 4.1 TF, 225 TB


Compute Resources – NCSA
2.6 TF ~10.6 TF w/ 230 TB

[Diagram labels]
GbE Fabric; Myrinet Fabric
2.6 TF Madison, 256 nodes
8 TF Madison, 667 nodes
Node types: 2p 1.3 GHz, 4 or 12 GB memory, 73 GB scratch; 2p Madison, 4 GB memory, 2x73 GB
Interactive+Spare Nodes: 8 4p Madison nodes; Login, FTP
Storage I/O over Myrinet and/or GbE
Brocade 12000 Switches; 256 2x FC; 92 2x FC; 230 TB
250 MB/s/node * 670 nodes; 250 MB/s/node * 256 nodes
30 Gbps to TeraGrid Network


Compute Resources – SDSC
1.3 TF ~4.3 + 1.1 TF w/ 500 TB

[Diagram labels]
GbE Fabric; Myrinet Fabric
1.3 TF Madison, 128 nodes
3 TF Madison, 256 nodes
Node types: 2p 1.3 GHz, 4 GB memory, 73 GB scratch; 2p Madison, 4 GB memory, 2x73 GB
Interactive+Spare Nodes: 6 4p Madison nodes; Login, FTP
Brocade 12000 Switches; 256 2x FC; 128 2x FC (x3); 500 TB
128 250 MB/s (x3)
30 Gbps to TeraGrid Network


Compute Resources – Caltech
~ 100 GF w/ 100 TB

[Diagram labels]
GbE Fabric; Myrinet Fabric
34 GF Madison, 17 HP/Intel nodes; 17 250 MB/s
72 GF Madison, 36 IBM/Intel nodes; 36 250 MB/s
Node types: 2p Madison, 6 GB memory, 73 GB scratch; 2p Madison, 6 GB memory, 2x73 GB
Interactive Node: 2p IBM Madison node; Login, FTP
HPSS Datawulf: 6 Opteron nodes; 4p Opteron, 8 GB memory; 66 TB RAID5
33 IA32 storage nodes; 2p ia32, 6 GB memory; 100 TB /pvfs; 33 250 MB/s
13 tape drives, 1.2 PB silo raw capacity; 13 2xFC
30 Gbps to TeraGrid Network


Using Teragrid


Wide Variety of Usage Scenarios

Tightly coupled jobs storing vast amounts of data, performing visualization remotely as well as making data available through online collections (ENZO)

Thousands of independent jobs using data from a distributed data collection (NVO)

Science Gateways – "not a Unix prompt"!
  from web browser with security
  from application, e.g. IRAF, IDL


Traditional Parallel Processing

Single executables to be run on a single remote machine
  big assumptions
    runtime necessities (e.g. executables, input files, shared objects) available on remote system!
  login to a head node, choose a submission mechanism

Direct, interactive execution
  mpirun -np 16 ./a.out

Through a batch job manager
  qsub my_script
  where my_script describes executable location, runtime duration, redirection of stdout/err, mpirun specification…
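
The kind of information my_script carries can be sketched by generating one from Python. This is only an illustration under stated assumptions, not the script used in the talk: the helper name make_pbs_script and its parameters are hypothetical, it assumes one MPI process per node, and it uses only the common PBS directives -N, -l nodes, -l walltime, -o and -e.

```python
def make_pbs_script(name, executable, nprocs, walltime="1:00:00",
                    stdout="pbs.out", stderr="pbs.err"):
    """Return the text of a PBS batch script (hypothetical helper)."""
    lines = [
        "#!/bin/sh",
        "#PBS -N %s" % name,               # job name
        "#PBS -l nodes=%d" % nprocs,       # assumes one MPI process per node
        "#PBS -l walltime=%s" % walltime,  # runtime duration
        "#PBS -o %s" % stdout,             # redirection of stdout
        "#PBS -e %s" % stderr,             # redirection of stderr
        "mpirun -np %d %s" % (nprocs, executable),  # mpirun specification
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_pbs_script("demo", "./a.out", 16))
```

The generated text could be written to a file and handed to qsub; queue names and resource limits are site-specific and are left out here.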


Traditional Parallel Processing II

Through globus
  globusrun -r [some-teragrid-head-node].teragrid.org/jobmanager -f my_rsl_script
  where my_rsl_script describes the same details as in the qsub my_script!

Through Condor-G
  condor_submit my_condor_script
  where my_condor_script describes the same details as the globus my_rsl_script!
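
To make "the same details" concrete, here is a minimal sketch that assembles an RSL string from Python. The helper make_rsl is hypothetical; the attribute names (executable, count, maxtime, stdout, stderr) are assumed to be accepted by the site's job manager, so check local documentation before relying on them.

```python
def make_rsl(executable, count=1, maxtime=10,
             stdout="job.out", stderr="job.err"):
    """Build a Globus RSL string (hypothetical helper, assumed attributes)."""
    attrs = [
        ("executable", executable),  # executable location on the remote side
        ("count", count),            # number of processes requested
        ("maxtime", maxtime),        # runtime limit, in minutes
        ("stdout", stdout),          # redirection of stdout
        ("stderr", stderr),          # redirection of stderr
    ]
    return "&" + "".join("(%s=%s)" % (key, value) for key, value in attrs)


if __name__ == "__main__":
    print(make_rsl("/wd/doit"))
```

The resulting string would be saved as my_rsl_script for globusrun -f, or passed through the globusrsl line of a Condor-G description file.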


Distributed Parallel Processing

Decompose application over geographically distributed resources
  functional or domain decomposition fits well
  take advantage of load balancing opportunities
  think about latency impact

Improved utilization of many resources

Flexible job management


Pipelined/dataflow processing

Suited for problems which can be divided into a series of sequential tasks where
  multiple instances of the problem need executing
  a series of data needs processing with multiple operations on each series
  information from one processing phase can be passed to the next phase before the current phase is complete
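
The last point, passing results to the next phase before the current phase has finished, can be illustrated on a single machine with Python generators. This is a toy sketch, not a grid implementation; the function names and the "upper-case" processing step are invented for the example.

```python
def read_items(names, log):
    """Phase 1: produce items one at a time."""
    for name in names:
        log.append("read " + name)
        yield name


def process(items, log):
    """Phase 2: consume each item as soon as phase 1 yields it."""
    for item in items:
        log.append("process " + item)
        yield item.upper()


def run_pipeline(names):
    """Chain the two phases and record the order in which work happened."""
    log = []
    results = list(process(read_items(names, log), log))
    return results, log


if __name__ == "__main__":
    results, log = run_pipeline(["a", "b"])
    print(results)
    print(log)
```

The log shows the two phases interleaved item by item, rather than phase 1 running to completion first.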


Security

ssh with password
  Too much password-typing
  Not very secure: big break-in at TG April 04
    One failure is a big failure: all of TG!
  Caltech and Argonne no longer allow this
  SDSC does not allow password change


Security

ssh with public key: single sign-on!
  use ssh-keygen on Unix or puttygen on Windows
    public key file (eg id_rsa.pub) AND
    private key file (eg id_rsa) AND
    passphrase
  on remote machine, put public key in .ssh/authorized_keys
  on local machine, combine private key and passphrase
  ATM card model

On TG, can put public key on application form
  immediate login, no snailmail


Security

X.509 certificates: single sign-on!
  from a Certificate Authority (eg verisign, US navy, DOE, etc etc)

It is:
  Distinguished Name (DN) AND
    /C=US/O=National Center for Supercomputing Applications/CN=Roy Williams
  Private file (usercert.p12) AND
  passphrase

Remote machine needs entry in gridmap file (maps DN to account)
  use gx-map command
Can create certificate with ncsa-cert-request etc
Certificates can be lodged in web browser
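
The DN-to-account mapping can be sketched as a small parser. It assumes the usual grid-mapfile layout of one quoted DN followed by a local account name per line; the function name parse_gridmap and the account name "roy" are illustrative only.

```python
import shlex


def parse_gridmap(text):
    """Map each Distinguished Name to a local account (illustrative parser)."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the DN
        if len(parts) >= 2:
            mapping[parts[0]] = parts[1]
    return mapping


if __name__ == "__main__":
    sample = ('"/C=US/O=National Center for Supercomputing Applications'
              '/CN=Roy Williams" roy\n')
    print(parse_gridmap(sample))
```

On TeraGrid itself the entry is managed with gx-map rather than edited by hand; the parser only shows what the file expresses.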


3 Ways to Submit a Job

1. Directly to PBS Batch Scheduler
   Simple, scripts are portable among PBS TeraGrid clusters

2. Globus common batch script syntax
   Scripts are portable among other grids using Globus

3. Condor-G
   Nice interface atop Globus, monitoring of all jobs submitted via Condor-G
   Higher-level tools like DAGMan


PBS Batch Submission

ssh tg-login.[caltech|ncsa|sdsc|uc].teragrid.org
qsub flatten.sh -v "FILE=f544"
qstat or showq
ls *.dat
pbs.out, pbs.err files


globus-job-submit

For running of batch/offline jobs
  globus-job-submit       Submit job
    same interface as globus-job-run
    returns immediately
  globus-job-status       Check job status
  globus-job-cancel       Cancel job
  globus-job-get-output   Get job stdout/err
  globus-job-clean        Cleanup after job
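
The submit, poll, and fetch cycle implied by these commands might be scripted as below. This is a sketch under assumptions: it assumes globus-job-submit prints a job contact string, that globus-job-status prints a single word such as PENDING, ACTIVE, DONE or FAILED, and that the commands are on the PATH. The names run_command and submit_and_wait are hypothetical, and the command runner is injectable so the logic can be exercised without a grid.

```python
import subprocess
import time


def run_command(argv):
    """Run a command and return its standard output with whitespace stripped."""
    done = subprocess.run(argv, capture_output=True, text=True, check=True)
    return done.stdout.strip()


def submit_and_wait(contact, executable, run=run_command,
                    poll_seconds=30, sleep=time.sleep):
    """Submit a job, poll until it finishes, and return (status, output)."""
    job_id = run(["globus-job-submit", contact, executable])
    while True:
        status = run(["globus-job-status", job_id])
        if status in ("DONE", "FAILED"):  # assumed terminal states
            break
        sleep(poll_seconds)
    output = ""
    if status == "DONE":
        output = run(["globus-job-get-output", job_id])
    return status, output
```

Cleanup with globus-job-clean is left out of the sketch and would follow once the output has been saved.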


Condor-G Job Submission

[Diagram labels: mickey.disney.edu (Condor-G); Globus API; tg-login.sdsc.teragrid.org (Globus job manager, PBS)]

executable=/wd/doit
universe=globus
globusscheduler=<…>
globusrsl=(maxtime=10)
queue


Condor-G

Combines the strengths of Condor and the Globus Toolkit
Advantages when managing grid jobs
  full featured queuing service
  credential management
  fault-tolerance
  DAGMan (== pipelines)


Condor DAGMan

Manages workflow interdependencies
Each task is a Condor description file
A DAG file controls the order in which the Condor files are run
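
A DAG file is plain text with JOB lines naming each Condor description file and PARENT ... CHILD lines giving the order. The sketch below writes one from Python; the helper make_dag and the job and file names are hypothetical.

```python
def make_dag(jobs, edges):
    """Return DAGMan file text.

    jobs  -- dict mapping a job name to its Condor description file
    edges -- list of (parent, child) job-name pairs
    """
    lines = ["JOB %s %s" % (name, jobs[name]) for name in sorted(jobs)]
    lines += ["PARENT %s CHILD %s" % (parent, child) for parent, child in edges]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical two-step pipeline: flatten an image, then mosaic it.
    print(make_dag({"flatten": "flatten.sub", "mosaic": "mosaic.sub"},
                   [("flatten", "mosaic")]))
```

The resulting file would be handed to condor_submit_dag, which runs each child only after its parents have completed.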


Where’s the disk

Home directory $TG_CLUSTER_HOME

example /home/roy

Shared writeable global areas $TG_CLUSTER_PFS

example /pvfs/MCA04N009/roy


GridFTP

Moving a Test File

% globus-url-copy "`grid-cert-info -subject`" \
    gsiftp://localhost:5678/tmp/file1 \
    file:///tmp/file2

Also uberftp and scp


Storage Resource Broker (SRB)

Single logical namespace while accessing distributed archival storage resources

Effectively infinite storage (first to 1TB wins a t-shirt)

Data replication
Parallel Transfers
Interfaces: command-line, API, web/portal.


Storage Resource Broker (SRB): Virtual Resources, Replication

[Diagram labels: workstation; SRB Client (cmdline, or API); NCSA; SDSC; hpss-sdsc; sfs-tape-sdsc; hpss-caltech]


Allocations Policies

TG resources allocated via the PACI allocations and review process
  modeled after NSF process
  TG considered as single resource for grid allocations

Different levels of review for different size allocation requests
  DAC: up to 10,000
  PRAC/AAB: <200,000 SUs/year
  NRAC: 200,000+ SUs/year

Policies/procedures posted at: http://www.paci.org/Allocations.html

Proposal submission through the PACI On-Line Proposal System (POPS)
  https://pops-submit.paci.org/

minimal review, fast turnaround
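
The review thresholds above can be written as a small lookup. This is a sketch only: the function name review_level is hypothetical, and it assumes the DAC limit of 10,000 is in the same units (SUs per year) as the other two levels.

```python
def review_level(su_per_year):
    """Return the review body for a request size, per the thresholds above."""
    if su_per_year <= 10000:
        return "DAC"        # assumed to mean up to 10,000 SUs
    if su_per_year < 200000:
        return "PRAC/AAB"   # below 200,000 SUs/year
    return "NRAC"           # 200,000 SUs/year and above


if __name__ == "__main__":
    for request in (5000, 60000, 250000):
        print(request, review_level(request))
```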


Requesting a TeraGrid Allocation

http://www.paci.org


24/7 Consulting Support

[email protected]
  advanced ticketing system for cross-site support
  staffed 24/7
866-336-2357, 9-5 Pacific Time
http://news.teragrid.org/
Extensive experience solving problems for early access users
Networking, compute resources, extensible TeraGrid resources


Links

www.teragrid.org/userinfo
  getting an account
[email protected]
news.teragrid.org
  site monitors


Demo
Data intensive computing with NVO services


DPOSS flattening

2650 x 1.1 Gbyte files

Cropping borders

Quadratic fit and subtract

Virtual data

Source Target


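Driving the Queues

import os

def filetime(path):
    # not shown on the slide; assumed to return the file's modification time
    return os.path.getmtime(path)

# inputDirectory and outputDirectory are set elsewhere
for f in os.listdir(inputDirectory):
    # if the target exists, with the right size and age, then we keep it
    ifile = inputDirectory + "/" + f
    ofile = outputDirectory + "/" + f
    if os.path.exists(ofile):
        osize = os.path.getsize(ofile)
        if osize != 1109404800:
            print(" -- wrong target size, remaking %d" % osize)
        else:
            time_tgt = filetime(ofile)
            time_src = filetime(ifile)
            if time_tgt < time_src:
                print(" -- target too old, remaking")
            else:
                print(" -- already have target file")
                continue
    # target is missing, the wrong size, or out of date: submit a job
    cmd = "qsub flat.sh -v \"FILE=" + f + "\""
    print(" -- submitting batch job: " + cmd)
    os.system(cmd)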

Here is the driver that makes and submits jobs
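
The keep-or-remake decision in the driver can be isolated into a function that is easy to try out on ordinary files. This is a restatement of the logic above as a sketch; the name needs_remake is hypothetical and the expected size is passed in rather than hard-coded.

```python
import os


def needs_remake(src, tgt, expected_size):
    """Return True if the target is missing, the wrong size, or older than the source."""
    if not os.path.exists(tgt):
        return True
    if os.path.getsize(tgt) != expected_size:
        return True
    # compare modification times: a target older than its source is stale
    return os.path.getmtime(tgt) < os.path.getmtime(src)
```

Only files for which this returns True would be handed to qsub, which is what lets the driver be re-run safely after a partial failure.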


PBS script

#!/bin/sh
#PBS -N dposs
#PBS -V
#PBS -l nodes=1
#PBS -l walltime=1:00:00

cd /home/roy/dposs-flat/flat
./flat \
  -infile /pvfs/mydata/source/${FILE}.fits \
  -outfile /pvfs/mydata/target/${FILE}.fits \
  -chop 0 0 1500 23552 \
  -chop 0 0 23552 1500 \
  -chop 0 22052 23552 23552 \
  -chop 22052 0 23552 23552 \
  -chop 18052 0 23552 4000

A PBS script. Can do: qsub script.sh -v "FILE=f345"


Atlasmaker
a service-oriented application on Teragrid

[Diagram labels: VO Registry; SIAP; Hyperatlas]

Federated Images: wavelength, time, ...
  source detection
  average/max
  subtraction


Hyperatlas
Standard naming for atlases and pages

TM-5-SIN-20
Page 1589

Standard Scales: scale s means 2^(20-s) arcseconds per pixel
Standard Projections: SIN projection, TAN projection
Standard Layout: TM-5 layout, HV-4 layout
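
The scale rule is simple arithmetic: scale s corresponds to 2^(20-s) arcseconds per pixel, so scale 20 is 1 arcsecond per pixel, or 1/3600 degree, about 2.778E-4. A minimal sketch, with hypothetical function names:

```python
def arcsec_per_pixel(scale):
    """Pixel size in arcseconds for a Hyperatlas scale number."""
    return 2.0 ** (20 - scale)


def degrees_per_pixel(scale):
    """Pixel size in degrees, the unit used in a FITS-style header."""
    return arcsec_per_pixel(scale) / 3600.0


if __name__ == "__main__":
    for s in (18, 19, 20):
        print(s, arcsec_per_pixel(s), degrees_per_pixel(s))
```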


Hyperatlas is a Service

All Pages: <baseURL>/getChart?atlas=TM-5-SIN-20
0 2.77777778E-4 'RA---SIN' 'DEC--SIN' 0.0 -90.0
1 2.77777778E-4 'RA---SIN' 'DEC--SIN' 0.0 -85.0
2 2.77777778E-4 'RA---SIN' 'DEC--SIN' 36.0 -85.0
...
1731 2.77777778E-4 'RA---SIN' 'DEC--SIN' 288.0 85.0
1732 2.77777778E-4 'RA---SIN' 'DEC--SIN' 324.0 85.0
1733 2.77777778E-4 'RA---SIN' 'DEC--SIN' 0.0 90.0

Best Page: <baseURL>/getChart?atlas=TM-5-SIN-20&RA=182&Dec=62
1604 2.77777778E-4 'RA---SIN' 'DEC--SIN' 184.61538 60.0

Numbered Page: <baseURL>/getChart?atlas=TM-5-SIN-20&page=1604
1604 2.77777778E-4 'RA---SIN' 'DEC--SIN' 184.61538 60.0

Replicated Implementations
baseURL = http://mercury.cacr.caltech.edu:8080/hyperatlas (try services)
baseURL = http://virtualsky.org/servlet


GET services from Python

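import urllib.request

# fragment of a method: self.hyperatlasServer, atlas, center1 and center2 are set elsewhere
hyperatlasURL = self.hyperatlasServer + "/getChart?atlas=" + atlas \
    + "&RA=" + str(center1) + "&Dec=" + str(center2)
stream = urllib.request.urlopen(hyperatlasURL)

# result is a tab-separated line, so use split() to tokenize
tokens = stream.readline().decode().split('\t')
print("Using page %s of atlas %s" % (tokens[0], atlas))

self.scale = float(tokens[1])   # pixel scale
self.CTYPE1 = tokens[2]         # projection type, first axis
self.CTYPE2 = tokens[3]         # projection type, second axis
rval1 = float(tokens[4])        # page center, first coordinate
rval2 = float(tokens[5])        # page center, second coordinate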

This code uses a service to find the best hyperatlas page for a given sky location
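
The tokenizing step can be tried without a network connection by feeding it a line in the format the service returns. The function parse_chart_line and its dictionary keys are hypothetical; the sample line is the page 1604 response quoted earlier, assumed to be tab-separated.

```python
def parse_chart_line(line):
    """Split one tab-separated getChart response line into named fields."""
    tokens = line.rstrip("\n").split("\t")
    return {
        "page": int(tokens[0]),
        "scale": float(tokens[1]),
        "ctype1": tokens[2].strip("'"),  # drop the surrounding quotes
        "ctype2": tokens[3].strip("'"),
        "rval1": float(tokens[4]),
        "rval2": float(tokens[5]),
    }


if __name__ == "__main__":
    sample = "1604\t2.77777778E-4\t'RA---SIN'\t'DEC--SIN'\t184.61538\t60.0\n"
    print(parse_chart_line(sample))
```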


VOTable parser in Python

import urllib.request
import xml.dom.minidom

stream = urllib.request.urlopen(SIAP_URL)
doc = xml.dom.minidom.parse(stream)

# Make a dictionary mapping each column's UCD to its column index
col_ucd_dict = {}
for XML_TABLE in doc.getElementsByTagName("TABLE"):
    for col_counter, XML_FIELD in enumerate(XML_TABLE.getElementsByTagName("FIELD")):
        col_ucd = XML_FIELD.getAttribute("ucd")
        col_ucd_dict[col_ucd] = col_counter

urlColumn = col_ucd_dict["VOX:Image_AccessReference"]
formatColumn = col_ucd_dict["VOX:Image_Format"]
raColumn = col_ucd_dict["POS_EQ_RA_MAIN"]
deColumn = col_ucd_dict["POS_EQ_DEC_MAIN"]

From a SIAP URL, we get the XML, and extract the columns that have the image references, image format, and image RA/Dec

(need exception catching here)


VOTable parser in Python

table = []
for XML_TABLE in doc.getElementsByTagName("TABLE"):
    for XML_DATA in XML_TABLE.getElementsByTagName("DATA"):
        for XML_TABLEDATA in XML_DATA.getElementsByTagName("TABLEDATA"):
            for XML_TR in XML_TABLEDATA.getElementsByTagName("TR"):
                row = []
                for XML_TD in XML_TR.getElementsByTagName("TD"):
                    # concatenate the text nodes inside this cell
                    data = ""
                    for child in XML_TD.childNodes:
                        data += child.data
                    row.append(data)
                table.append(row)

Table is a list of rows, and each row is a list of table cells
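
The two parsing steps, finding a column by its UCD and then reading that column from every row, can be combined and tried on a small in-memory document. This is a sketch assuming a VOTable with a single TABLE; the SAMPLE document, its URLs, and the function name column_values are invented for the example.

```python
import xml.dom.minidom

# A tiny stand-in for a SIAP response (invented data, single TABLE assumed).
SAMPLE = """<VOTABLE><RESOURCE><TABLE>
<FIELD name="url" ucd="VOX:Image_AccessReference"/>
<FIELD name="ra" ucd="POS_EQ_RA_MAIN"/>
<DATA><TABLEDATA>
<TR><TD>http://example.org/a.fits</TD><TD>182.0</TD></TR>
<TR><TD>http://example.org/b.fits</TD><TD>183.5</TD></TR>
</TABLEDATA></DATA>
</TABLE></RESOURCE></VOTABLE>"""


def column_values(votable_text, ucd):
    """Return the cell text of the column whose FIELD has the given UCD."""
    doc = xml.dom.minidom.parseString(votable_text)
    ucds = [f.getAttribute("ucd") for f in doc.getElementsByTagName("FIELD")]
    index = ucds.index(ucd)  # raises ValueError if the UCD is absent
    values = []
    for tr in doc.getElementsByTagName("TR"):
        cell = tr.getElementsByTagName("TD")[index]
        values.append("".join(node.data for node in cell.childNodes
                              if node.nodeType == node.TEXT_NODE))
    return values


if __name__ == "__main__":
    print(column_values(SAMPLE, "VOX:Image_AccessReference"))
```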


SOAP client in Python

from SOAPpy import SOAPProxy

# fitsheader must hold the FITS header as a string
# x1, x2 must hold the pixel coordinates on the image
server = SOAPProxy("http://mercury.cacr.caltech.edu:9091")
wcsR = server.xy2sky(fitsheader, x1, x2)

ra = wcsR["c1"]
dec = wcsR["c2"]
status = wcsR["status"]
message = wcsR["message"]

print("Sky coordinates are: %s %s" % (ra, dec))
print("status is: %s" % status)
print("Message is: %s" % message)

WCSTools (xy2sky and sky2xy) as web services


Future: Science Gateways


Teragrid Impediments

Learn Globus
Learn MPI
Learn PBS
Port code to Itanium
Get certificate
Get logged in
Wait 3 months for account
Write proposal

and now do some science....


A better way: Graduated Security for Science Gateways

Web form - anonymous: some science....
Register - logging and reporting: more science....
Authenticate X.509 - browser or cmd line: big-iron computing....
Write proposal - own account: power user


Secure Web services for Teragrid Access

web form (browser has certificate)
auto-generated client API for scripted submission (certificate in .globus/)
Embedded in existing client application (Root, IRAF, IDL, ...)
Embedded as part of other service (proxy agent)

Clarens, BOSS, PBS, Gridport, Xforms

distribute jobs on grid


Shell command

List files, get files

Submit job to TG queue (Condor / Dagman / globusrun)

Monitor running jobs

Secure Web services for Teragrid Access


Teragrid Wants YOU!

Your astronomy applications
Your science gateway projects
Teragrid has 100's of processors and 100's of terabytes