Australian Virtual Observatory Pacific Rim Applications and Grid Middleware Assembly The 4th Workshop 5th-6th June 2003 Monash University David Barnes School of Physics, The University of Melbourne
Jan 15, 2016
Australian Virtual Observatory
Pacific Rim Applications and Grid Middleware Assembly
The 4th Workshop 5th-6th June 2003 Monash University
David Barnes, School of Physics, The University of Melbourne
What is a Virtual Observatory?
• A Virtual Observatory (VO) is a distributed, uniform interface to the data archives of the world’s major astronomical observatories.
• A VO is explored with advanced data mining and visualisation tools which exploit the unified interface to enable cross-correlation and combined processing of distributed and diverse datasets.
• VOs will rely on, and provide motivation for, the development of national and international computational and data grids.
Scientific motivation
• Understanding of astrophysical processes depends on multi-wavelength observations and input from theoretical models.
• As telescopes and instruments grow in complexity, surveys generate massive databases which require increasing expertise to comprehend.
• Theoretical modeling codes are growing in sophistication to consume available compute time.
• Major advances in astrophysics will be enabled by transparently cross-matching, cross-correlating and inter-processing otherwise disparate data.
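As a minimal sketch of the positional cross-matching mentioned in the last point, the following Python pairs each source in one catalogue with the nearest source in a second catalogue inside a matching radius. The catalogue layout (name, RA, Dec in degrees), the function names and the coordinates are illustrative assumptions, not part of any Aus-VO software. A production service would likely use a spatial index rather than this brute-force loop.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))

def cross_match(cat_a, cat_b, radius_deg):
    """Return (name_a, name_b, separation) for the nearest cat_b source within radius_deg."""
    matches = []
    for name_a, ra_a, dec_a in cat_a:
        best = None
        for name_b, ra_b, dec_b in cat_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (name_b, sep)
        if best is not None:
            matches.append((name_a, best[0], best[1]))
    return matches

# Hypothetical catalogues: (name, RA in degrees, Dec in degrees)
radio = [("R1", 10.000, -30.000), ("R2", 150.000, -45.000)]
optical = [("O1", 10.010, -30.005), ("O2", 200.000, 10.000)]
matches = cross_match(radio, optical, radius_deg=0.05)
```

Here only R1 has an optical counterpart inside the radius, so `matches` holds a single pair.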
Sample multi-wavelength data for the galaxy IC5332 (Ryan-Weber)
Visible blue light - young hot stars
Infrared light - old cooler stars
H-alpha spectral line - star forming sites
HI spectral line - gas to form stars
HI velocity field - kinematics
HI velocity dispersion - gas stability, parameters
Integrated HI spectrum - total neutral gas mass, distance from redshift
And this is just the data on one object from three Australian telescopes!
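To illustrate how an integrated HI spectrum yields a distance and a neutral gas mass, here is a short Python sketch using the low-redshift Hubble law, D = cz/H0, and the standard optically thin relation M_HI = 2.356e5 D² ∫S dv (D in Mpc, flux integral in Jy km/s, mass in solar masses). The Hubble constant of 70 km/s/Mpc, the function names and the spectrum values are assumptions for illustration and are not measurements of IC5332.

```python
C_KM_S = 299792.458  # speed of light in km/s

def hubble_distance_mpc(redshift, h0=70.0):
    """Low-redshift distance D = c z / H0; h0 (km/s/Mpc) is an assumed value."""
    return C_KM_S * redshift / h0

def integrate_spectrum(flux_jy, channel_width_km_s):
    """Integrated flux in Jy km/s for equally spaced velocity channels."""
    return sum(flux_jy) * channel_width_km_s

def hi_mass_solar(distance_mpc, integrated_flux_jy_km_s):
    """Optically thin HI mass in solar masses: 2.356e5 * D^2 * integral(S dv)."""
    return 2.356e5 * distance_mpc ** 2 * integrated_flux_jy_km_s

# Hypothetical spectrum: five 10 km/s channels, flux densities in Jy
spectrum = [0.0, 0.1, 0.2, 0.1, 0.0]
integrated = integrate_spectrum(spectrum, channel_width_km_s=10.0)
distance = hubble_distance_mpc(redshift=0.002)
mass = hi_mass_solar(distance, integrated)
```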
Fundamental VO Challenges
• Data description: multi-wavelength, multi-resolution, multi-dimensional, multi-domain (optical, radio, X-ray, …), world coordinate systems, limited period ownership, …
• Data provision: distributed mass storage, high-bandwidth networks, registries, …
• Data processing: high performance clusters as grid nodes, data to code versus code to data, mountains of legacy software!, …
• Interface: portals, visual data flow control, analysis tools, display tools, …
Aus-VO structure 2003
• Phase A funded AUD 260K by a 2003 ARC grant:
– The University of Melbourne
– The University of Sydney
– CSIRO Australia Telescope National Facility
– Anglo-Australian Observatory
• Additional institutes participating w/o direct funding from the ARC grant:
– ANU, Mount Stromlo Observatory & APAC
– CSIRO Mathematical and Information Sciences
– University of Queensland
– VPAC & GridBus (Melb)
• Lead investigator Rachel Webster (Melb)
• Project scientist David Barnes (Melb)
Aus-VO projects 2003
• Common format on-line archive projects:
– HIPASS catalog: HI Parkes All Sky Survey: neutral hydrogen spectral line survey, ~4,300 sources with 138 parameters and 1024-channel spectra
– SUMSS catalog: Sydney University Molonglo Sky Survey: radio continuum survey at 843 MHz, >100,000 sources
– 2dFGRS QSO catalog: 2-degree Field Galaxy Redshift Survey: optical spectra of >20,000 southern quasi-stellar objects
– ATCA archive: Australia Telescope Compact Array archive: all observations since 1988, circa 1.5 TB from more than 1,000 separate observing projects! A massive exercise in describing data with metadata.
– MACHO archive: Massive Compact Halo Objects archive: 8-year light curves for >18M stars
Aus-VO projects 2003
• Server-based visualisation tools:
– client Java canvas for legacy software package AIPS++ to draw on from a remote server (ATNF)
– grid-service implementation of distributed volume rendering - remote data transferred to remote cluster, with display canvas applet supplied by coordinating portal
• Pipelines to enable on-line reprocessing of archived raw or pre-processed telescope data:
– Molonglo Observatory Synthesis Telescope
• Interfaces: beta testers for the AstroGrid consortium software
VO Interface & Portal
• Agreement with AstroGrid (UK e-Science project) to be testers for their data publication and portal creation code.
• We are collecting the necessary resources and intend to have an AstroGrid-based portal serving HIPASS catalog data for demonstration at the IAU General Assembly in July 2003.
• Separately testing IBM Lotus Notes and Domino Server for publication of astronomical catalogs.
Grid-based Visualisation
• ATNF will build a Java PixelCanvas so that AIPS++ visualisation applications can be deployed as Web-Service and Grid-Service Java Applets
• AIPS++ is modern, OpenSource software for reducing (radio) astronomy data, 1.6M lines of code.
Grid-based Volume Rendering
• Agreement between Melbourne and AstroGrid to develop our existing distributed-data volume rendering code into a fully-fledged Grid-Service. [see my talk at GridBus this Saturday]
• Challenge is to interactively render a multi-GB cube at the IAU GA 2003, using GridFTP to transfer the data volume from a remote data warehouse to a remote rendering cluster and display and control the rendering from an applet.
[Figure: time to render a 512x512 view of a 1024x1024x1024 volume, in seconds (logarithmic axis, 1 to 1000), against number of nodes (0 to 40)]
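One common way to spread volume rendering over several nodes is to cut the data cube into slabs along the line of sight, render each slab independently, and composite the partial images. The Python sketch below uses maximum-intensity projection, for which a per-pixel maximum is an exact compositing step. It is an assumed illustration of the general technique, not the Melbourne rendering code, and it runs the "nodes" sequentially in one process.

```python
def render_slab(slab):
    """Maximum-intensity projection of one slab along the line of sight (first axis)."""
    ny, nx = len(slab[0]), len(slab[0][0])
    return [[max(plane[y][x] for plane in slab) for x in range(nx)] for y in range(ny)]

def composite(images):
    """Combine the partial images from all nodes with a per-pixel maximum."""
    ny, nx = len(images[0]), len(images[0][0])
    return [[max(img[y][x] for img in images) for x in range(nx)] for y in range(ny)]

def split_into_slabs(volume, n_nodes):
    """Divide the volume into contiguous, non-empty slabs, at most one per node."""
    depth = len(volume)
    bounds = [round(i * depth / n_nodes) for i in range(n_nodes + 1)]
    return [volume[bounds[i]:bounds[i + 1]]
            for i in range(n_nodes) if bounds[i] < bounds[i + 1]]

# Hypothetical 4x2x2 test volume, indexed as volume[z][y][x]
volume = [[[(7 * z + 3 * y + 5 * x) % 11 for x in range(2)] for y in range(2)]
          for z in range(4)]

single_node = render_slab(volume)
# Each render_slab call below stands in for work done on a separate node.
distributed = composite([render_slab(s) for s in split_into_slabs(volume, 3)])
```

Because each slab is rendered without reference to the others, only the small partial images need to travel to the compositing step, not the data volume itself.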
The near future: data grids for Aus-VO
• Australian archives range from ~10 GB to ~10 TB in processed (reduced) size.
• Providing just the processed images and spectra on-line requires a distributed, high-bandwidth network of data servers – that is, a data grid.
• Users may want some simple operations, such as smoothing or filtering, applied at the data server. This is a virtual data grid.
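A virtual data grid request can be pictured as "return this dataset, optionally after applying an operation at the server". The following Python is a minimal sketch under that assumption: `handle_request`, the in-memory `archive` dictionary and the dataset identifier are hypothetical names for illustration, and boxcar (running mean) smoothing stands in for whatever operations a real data server would offer.

```python
def boxcar_smooth(spectrum, width):
    """Running mean with an odd window width; edges use a truncated window."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    smoothed = []
    for i in range(len(spectrum)):
        window = spectrum[max(0, i - half):i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed

def handle_request(archive, dataset_id, operation=None, **params):
    """Hypothetical server-side entry point: return stored data or a derived product."""
    data = archive[dataset_id]
    if operation is None:
        return data
    if operation == "smooth":
        return boxcar_smooth(data, params.get("width", 3))
    raise ValueError(f"unsupported operation: {operation}")

# Hypothetical archive holding one five-channel spectrum
archive = {"spec-001": [0.0, 0.0, 3.0, 0.0, 0.0]}
result = handle_request(archive, "spec-001", operation="smooth", width=3)
```

Only the smoothed product leaves the server, which is the point of applying the operation next to the data.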
The near future: compute grids for Aus-VO
• More complex operations may be applied requiring significant processing:
– source detection and parameterisation
– reprocessing of raw or intermediate data products with new calibration algorithms
– combined processing of raw, intermediate or "final product" data from different archives
• These operations require a distributed, high-bandwidth network of computational nodes – that is, a compute grid.
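As a rough sketch of what source detection and parameterisation involve, the Python below finds contiguous runs of channels above a threshold in a one-dimensional spectrum and records a few parameters for each run. The threshold method, the parameter names and the data are illustrative assumptions (with a positive threshold assumed), not the algorithm of any Aus-VO pipeline.

```python
def detect_sources(spectrum, threshold):
    """Find contiguous runs of channels above threshold and parameterise each run."""
    sources = []
    start = None
    # A trailing sentinel below any threshold closes a run that reaches the last channel.
    for i, value in enumerate(list(spectrum) + [float("-inf")]):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            run = spectrum[start:i]
            total = sum(run)
            sources.append({
                "first_channel": start,
                "last_channel": i - 1,
                "peak": max(run),
                "sum": total,
                # Flux-weighted mean channel of the run
                "centroid": sum((start + k) * v for k, v in enumerate(run)) / total,
            })
            start = None
    return sources

# Hypothetical spectrum in arbitrary flux units
spectrum = [0.1, 0.2, 2.0, 4.0, 2.0, 0.1, 0.0, 3.0]
sources = detect_sources(spectrum, threshold=1.0)
```

On a compute grid, independent regions of a survey could each be passed through a function like this on a different node, with the resulting source lists merged afterwards.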
2004 ARC LIEF grant
• 10 partners!
• more data archives on-line
• more tools developed with special focus on server-based visualisation
• construction of the Australian Astronomy Grid…
The Australian Astronomy Grid 2004
http://www.aus-vo.org