JetStream OPeNDAP Geoscience Data Web Delivery
Topics
• JetStream Overview
• Tour of Some Existing Sites
• Data Management Issues
• Downloads, compression, format, …
• System Administration / Back Office
• Thick Client support
• Summary – JetStream Advantages
Web Data Delivery
Two key components …
• Data Provider - with data to serve
• Client – using standard web-pages and standard file downloading protocols
Goal: Easy access by all / Reliable data transfer
JetStream – the Intelligent Data Server
[Diagram: Provider’s Data (and metadata), JetStream, Server, HTTP, Browser, User’s Software; labelled flows: Web pages, Downloads, User Input]
Client Side 1: Standard Web Browser
Client Side 2: Simple Inputs
Client Side 3: Standard File Download!
JetStream Summary
1. Prepare Data
2. Serve Web Pages / Client Access
3. Process Request
4. Deliver Data
5. Analyse
[Diagram repeated for each step: Browser, HTTP, Server, JetStream, Cat, OPeNDAP; step 2 also shows Web pages and User Input, step 4 also shows Downloads]
Client Query based on Metadata
• A typical implementation of JetStream’s web-page interface would allow a Client to request datasets which meet criteria …
– Spatial extents (Lat/Long)
– Primary theme
– Depending on the theme, a second level of criteria … e.g. Date>1990, Spacing<300
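The query criteria above can be sketched as a simple catalogue filter. This is only an illustration: the `CatalogueEntry` record, its field names, and the `query` function are hypothetical and are not JetStream's actual catalogue schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogueEntry:
    """Hypothetical catalogue record; field names are illustrative only."""
    name: str
    theme: str
    west: float
    south: float
    east: float
    north: float
    attributes: dict = field(default_factory=dict)


def overlaps(entry, west, south, east, north):
    """True when the dataset extent intersects the search rectangle."""
    return not (entry.east < west or entry.west > east
                or entry.north < south or entry.south > north)


def query(catalogue, bbox, theme, min_year=None, max_spacing=None):
    """Filter by extents and theme, then by optional second-level criteria."""
    results = []
    for entry in catalogue:
        if entry.theme != theme or not overlaps(entry, *bbox):
            continue
        # Second-level criteria, e.g. Date>1990, Spacing<300.
        if min_year is not None and entry.attributes.get("year", 0) <= min_year:
            continue
        if (max_spacing is not None
                and entry.attributes.get("spacing", float("inf")) >= max_spacing):
            continue
        results.append(entry.name)
    return results


catalogue = [
    CatalogueEntry("survey_a", "magnetics", 135.0, -32.0, 137.0, -30.0,
                   {"year": 1995, "spacing": 200}),
    CatalogueEntry("survey_b", "magnetics", 135.5, -31.0, 136.5, -30.5,
                   {"year": 1985, "spacing": 400}),
    CatalogueEntry("survey_c", "gravity", 135.0, -32.0, 137.0, -30.0,
                   {"year": 2001}),
]
print(query(catalogue, (136.0, -31.5, 138.0, -29.0), "magnetics",
            min_year=1990, max_spacing=300))  # prints ['survey_a']
```

Keeping the second-level criteria optional mirrors the slide's point that they depend on the chosen theme.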
Some Australian JetStream sites
• The SARIG site is hosted in an MS Windows environment and integrated with their ArcIMS application
• The GA JetStream site is hosted on a dual Solaris system and has been integrated with their MapServer-based web application, which is based on the ROSA applet
The SARIG site
Use sophistication of an existing interface
Grid data
- report metadata
- a preview bitmap
- choose download?
JetStream at Geoscience Australia
The GA site
Current Sites (Cont)
Site      Amount of online data   URL
SARIG     7 Gbytes                https://info.pir.sa.gov.au/geoserver/sarig/frameSet.jsp
GA        44 Gbytes               http://www.ga.gov.au/bin/mapserv36?map=/public/http/www/jetstream/jetstream.map&mode=browse
Intrepid  2 Gbytes                http://www2.dfa.com.au/jetstream
JetStream - Data Management Issues (1)
• JetStream can be integrated with any existing system of data management
– Requires a catalogue … which can be generated (and refreshed) by …
  • Either gathering the relevant metadata from the dataset’s metadata files
  • Or (with customisation) by drawing on the metadata contained within existing Departmental databases
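As a rough sketch of the first option, a catalogue could be generated by walking the data area and reading one metadata file per dataset. The `*.meta` extension and the `key = value` layout below are assumptions made purely for illustration; the slides do not describe the real metadata file format.

```python
import tempfile
from pathlib import Path


def parse_metadata(text):
    """Parse 'key = value' lines; blank lines and '#' comments are skipped."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        record[key.strip().lower()] = value.strip()
    return record


def build_catalogue(data_root):
    """Gather one record per '*.meta' file (hypothetical layout) under data_root."""
    catalogue = {}
    for meta_path in sorted(Path(data_root).rglob("*.meta")):
        record = parse_metadata(meta_path.read_text(encoding="utf-8"))
        record["path"] = str(meta_path.with_suffix(""))
        catalogue[meta_path.stem] = record
    return catalogue


# Demo with two throwaway metadata files.
with tempfile.TemporaryDirectory() as root:
    Path(root, "survey_a.meta").write_text(
        "Theme = magnetics\nYear = 1995\n", encoding="utf-8")
    Path(root, "survey_b.meta").write_text(
        "# comment\nTheme = gravity\n", encoding="utf-8")
    demo_catalogue = build_catalogue(root)

print(sorted(demo_catalogue))
print(demo_catalogue["survey_a"]["theme"])
```

Refreshing the catalogue in this sketch is just a matter of calling `build_catalogue` again after datasets are added or changed.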
JetStream - Data Management Issues (1a)
Metadata stored with the data …
JetStream - Data Management Issues (1b)
Metadata stored in a database …
GDADS
Located Data Management
• JetStream’s management of ‘located data issues’ occurs on two levels …
– Firstly, dataset extents are fields in the catalogue, so the initial search-for-data simply interrogates the catalogue to find datasets located within a defined area of search
– Then either whole-datasets or subsets-of-datasets can be delivered. The data-serving organisation may make this choice … or allow their clients the ability to make that choice.
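The two levels can be illustrated with a small extent calculation. The function names and the `(west, south, east, north)` tuple layout are assumptions for this sketch, not part of JetStream.

```python
def intersect(extent, area):
    """Overlap of two (west, south, east, north) rectangles, or None if disjoint."""
    west = max(extent[0], area[0])
    south = max(extent[1], area[1])
    east = min(extent[2], area[2])
    north = min(extent[3], area[3])
    if west >= east or south >= north:
        return None
    return (west, south, east, north)


def delivery_extent(extent, area, allow_subset=True):
    """Level 1: is the dataset inside the search area? Level 2: whole or subset?"""
    overlap = intersect(extent, area)
    if overlap is None:
        return None  # dataset is not a search hit
    return overlap if allow_subset else extent


dataset = (135.0, -32.0, 137.0, -30.0)
search = (136.0, -31.0, 138.0, -29.0)
print(delivery_extent(dataset, search))                      # subset delivery
print(delivery_extent(dataset, search, allow_subset=False))  # whole dataset
```

The `allow_subset` flag stands in for the choice that the slide says may rest with either the serving organisation or the client.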
System Administration Back Office
• Customisable catalogue
• Configurable data-delivery – add licence files, disclaimer files, associated report files, etc.
• Job queue management …
– Priorities
– Abort
– Report/log – record of what was downloaded
• Automatic metadata creation and collection
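The job queue features listed above (priorities, abort, report/log) can be sketched with a small priority queue. The `JobQueue` class and its method names are hypothetical; they only illustrate the idea and say nothing about how JetStream implements its queue.

```python
import heapq
import itertools


class JobQueue:
    """Illustrative queue: lower priority number runs first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # keeps equal priorities in FIFO order
        self._aborted = set()
        self.log = []  # record of what happened to each job

    def submit(self, job_id, priority=5):
        heapq.heappush(self._heap, (priority, next(self._counter), job_id))

    def abort(self, job_id):
        # Aborted jobs are skipped lazily when they reach the front.
        self._aborted.add(job_id)
        self.log.append(("aborted", job_id))

    def next_job(self):
        while self._heap:
            _, _, job_id = heapq.heappop(self._heap)
            if job_id in self._aborted:
                continue
            self.log.append(("started", job_id))
            return job_id
        return None


queue = JobQueue()
queue.submit("grid-export", priority=5)
queue.submit("urgent-lines", priority=1)
queue.submit("big-mosaic", priority=9)
queue.abort("grid-export")
order = [queue.next_job(), queue.next_job(), queue.next_job()]
print(order)  # prints ['urgent-lines', 'big-mosaic', None]
```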
System Administration cont.
• Provides User with Time estimates
• Reporting of total zip file size
• Site Administrator Logging of requests
• Site Administrator Queue Management
• Uses Tomcat and Apache as well as JSP to create and manage individual sessions.
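A time estimate of the kind mentioned above can be derived from the total zip file size and an assumed link speed. The function names and the example link speed are illustrative assumptions; the slides do not say how JetStream computes its estimates.

```python
def estimate_download_seconds(zip_bytes, link_kbit_per_s):
    """Rough transfer time for a zip of zip_bytes over a link of the given speed."""
    if link_kbit_per_s <= 0:
        raise ValueError("link speed must be positive")
    bits = zip_bytes * 8
    return bits / (link_kbit_per_s * 1000)


def describe(zip_bytes, link_kbit_per_s):
    """Human-readable summary of zip size and estimated transfer time."""
    seconds = estimate_download_seconds(zip_bytes, link_kbit_per_s)
    megabytes = zip_bytes / 1_000_000
    return (f"{megabytes:.1f} MB zip, about {seconds / 60:.1f} min "
            f"at {link_kbit_per_s} kbit/s")


print(describe(10_000_000, 512))
# prints: 10.0 MB zip, about 2.6 min at 512 kbit/s
```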
Downloads (1)
• JetStream uses the industry-standard file download protocols of the web …
• Choice of HTTP or FTP protocols configurable in JetStream
• Simply click on a URL … and initiate a routine save-file-to-disk download!
• For a more complex request … receive an emailed URL link when data are ready
Downloads (2)
… Don’t wait … receive an email when ready
Downloads (3)
… then a standard file download
Downloads
• The client …
– Can be on any web-enabled computer system (Windows, Linux, UNIX, Mac, …)
– Minimal client needs …
  • any browser
  • an un-zipper … WinZip, pkunzip, gzip, jar
– Download, unzip … and use your preferred spatial analysis software (GIS, Profile Analyst, etc.)
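The "download, unzip" step needs nothing beyond a standard zip tool. As one possible client-side sketch, Python's standard `zipfile` module can check and unpack a delivered archive; the file names used here are placeholders, not real JetStream output.

```python
import tempfile
import zipfile
from pathlib import Path


def unpack_delivery(zip_path, target_dir):
    """Check the archive, extract it, and return the member names."""
    with zipfile.ZipFile(zip_path) as archive:
        damaged = archive.testzip()  # name of first corrupt member, or None
        if damaged is not None:
            raise ValueError(f"corrupt member in download: {damaged}")
        archive.extractall(target_dir)
        return sorted(archive.namelist())


# Demo: build a stand-in for a downloaded zip, then unpack it.
with tempfile.TemporaryDirectory() as work:
    zip_path = Path(work, "delivery.zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("lines.gdf", "placeholder data\n")
        archive.writestr("licence.txt", "placeholder licence\n")
    names = unpack_delivery(zip_path, Path(work, "unpacked"))
    extracted_ok = Path(work, "unpacked", "licence.txt").exists()

print(names, extracted_ok)
```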
Compression Ratios
1. Choose the subset you need … do not download the entire dataset
– Raster: Spatial subset
– Vector: Selected fields
2. WinZip … industry standard
• Raster (ERMapper) up to 50%
• Vector (ASCII) … up to 94%
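A percentage saving like those quoted above can be measured by zipping a payload and comparing sizes. This sketch uses Python's standard `zipfile` module with synthetic ASCII lines as input, so its result says nothing about real survey data; the 50% and 94% figures are the presentation's own.

```python
import io
import zipfile


def zip_saving_percent(member_name, payload):
    """Percentage by which zip (deflate) shrinks payload; 0.0 for empty input."""
    if not payload:
        return 0.0
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(member_name, payload)
        info = archive.getinfo(member_name)
        return 100.0 * (1.0 - info.compress_size / info.file_size)


# Synthetic, highly repetitive ASCII records (illustrative only).
ascii_lines = ("500000.00 6500000.00 51234.5\n" * 2000).encode("ascii")
saving = zip_saving_percent("lines.txt", ascii_lines)
print(f"{saving:.1f}% smaller")
```

ASCII vector data tends to compress well because digits and separators repeat, which is consistent with the higher figure the slide gives for vector data.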
Export File Formats (1)
• Current implementations …
– Raster: ERMapper binary file format
– Vector: ASCII in ASEG GDF2 format
• Potential implementations …
– Rapidly implement any of the formats that are currently available in Intrepid …
– Working on SWATH & SEGY
Export File Formats (2)
• Vector …
– ASCII ASEG GDF2 format
– Binary: Intrepid and Geosoft *.gdb
– ASCII: XYZ, CSV
– GIS: Arc Shape and MapInfo TAB / MIF
– Others … ECS, Geosolutions
Export File Formats (3)
• Raster (Grid) …
– ERMapper binary format
– Binary: Geosoft *.grd, NetCDF
– ASCII: GXF, Image & Image_XYZ
– GIS: ESRI format bil-file, MapInfo GeoTiff
– Others … AGSO, ECS, Geopak, GIPSI, LCT, Geosolutions, ZMap
Projection Conversion
• Two main options …
– Compute-on-the-fly by the server data-extraction process – extract (subset), projection conversion, reformat to requested format, zip and place in the ‘to-be-delivered’ area.
– Pre-computed
  • Additional (X, Y) fields pre-computed in vector datasets
  • A choice of 2 or 3 prepared grid file options
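The compute-on-the-fly option is a four-stage pipeline: extract, convert projection, reformat, zip. The sketch below shows that ordering only. The record layout, the `extract_and_package` function, and the `transform` callable are hypothetical, and the demo's coordinate shift is a stand-in for a real datum/projection conversion, which would need a proper projection library.

```python
import csv
import io
import zipfile


def extract_and_package(records, bbox, transform, fields,
                        member_name="extract.csv"):
    """Subset, convert coordinates, reformat as CSV, and zip in memory."""
    west, south, east, north = bbox
    # 1. Extract (subset) the records inside the requested area.
    subset = [r for r in records
              if west <= r["x"] <= east and south <= r["y"] <= north]
    # 2 and 3. Convert coordinates and reformat to the requested format.
    text = io.StringIO()
    writer = csv.writer(text, lineterminator="\n")
    writer.writerow(["x", "y", *fields])
    for r in subset:
        x, y = transform(r["x"], r["y"])
        writer.writerow([x, y, *(r[f] for f in fields)])
    # 4. Zip, ready to be placed in a 'to-be-delivered' area.
    package = io.BytesIO()
    with zipfile.ZipFile(package, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(member_name, text.getvalue())
    return package.getvalue()


def shift(x, y):
    """Placeholder transform; NOT a real projection conversion."""
    return x + 100.0, y + 200.0


records = [{"x": 1.0, "y": 1.0, "mag": 50.0},
           {"x": 5.0, "y": 5.0, "mag": 60.0}]
payload = extract_and_package(records, (0.0, 0.0, 2.0, 2.0), shift, ["mag"])
with zipfile.ZipFile(io.BytesIO(payload)) as archive:
    delivered = archive.read("extract.csv").decode("utf-8")
print(delivered)
```

The pre-computed option trades this per-request work for storage: the converted (X, Y) fields or alternative grids already exist, so only the subset, reformat, and zip stages remain.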
Thick Client Support
• OPeNDAP compliance allows the possibility of direct access to the JetStream Catalog
• OPeNDAP is a data standard sponsored by the oceanographic community
• The JetStream Engine can support direct and immediate delivery of data into an application.
Thick Client Support (Cont)
• For example, the Intrepid Project Manager can browse the JetStream catalogs directly
• Add facility for a base GIF so that the catalog can be viewed as a map as well as a table
• An Intrepid Data Extraction wizard guides the user through the same kind of dialog as the existing web browser based JetStream sites
Thick Client Support (Cont)
• The Data Extract can “register” its capabilities so that it can do some of the processing.
• Data formatting and Coordinate System conversions for example
• JetStream Site administrators decide whether or not this type of service is to be supported for their site.
JetStream Advantages (1)
• Your clients see your web-interface … which will typically be highly integrated with other elements of your web-interface to the world …
… and thus you retain control of the corporate image that your organisation projects to your client-base.
JetStream Advantages (2)
• System independent …
– Server-side: JetStream is currently implemented on Windows 2000, Linux and Sun Solaris servers
– Client-side: Your clients can log in to your site from any computer with a web-browser … ensuring the widest possible public access
JetStream Advantages (3)
• No proprietary software …
– Client access does not require proprietary software. Public client access to datasets is achieved by anyone with a web-browser and an un-zipper …
… after which the client can proceed to use datasets within their preferred spatial analysis package (GIS, imaging, processing, etc.)
JetStream Advantages (4)
• Based on Open Standards …
… and proven industry standards
– JetStream uses an OPeNDAP compatible catalogue
– Can serve OPeNDAP compatible datasets
– Can serve data from a range of industry-standard file formats (Intrepid, Geosoft)
– Can serve data directly from an organisation’s corporate relational databases (Oracle, etc.)
JetStream Advantages (5)
• Based on Open Standards …
… and proven industry standards
– Can deliver data in a wide range of industry-standard file formats …
– Uses the universally available zip for compression
– Allows a wide range of Datum/Projection options
JetStream Advantages (6)
• Based on Open Standards …
… and proven industry standards
– Allows the serving organisation to develop standard server-client web pages, with all the conventional firewall security of any web-server system
– Uses standard HTTP or FTP protocols for file-download data delivery
The End
Thank You
Client Query – Further Choices (1)
• Having retrieved a list of dataset objects which match the search criteria, the Client would then make further choices …
– Retrieve or don’t retrieve
– Choose subset ‘channels’ of a dataset
– Choose the spatial subset – or the whole dataset
– Generate derived ‘products’ from the dataset
JetStream Example (vector data)
- report metadata
- a preview bitmap?
- select fields
- choose download?
Client Query – Further Choices (2)
• The Client may also be given further options, such as …
– A choice of Datum & Projection
– File format choices … Intrepid, Oasis ‘gdb’, ASCII, GIS formats (ArcView, MapInfo), various grid formats, etc.
Streaming Data Delivery (1)
• Currently JetStream does not use a data-streaming delivery mechanism.
• At the present time we see strong advantages in making use of various industry-standard communications and delivery protocols
• Server Organisation: Standard web-pages are readily developed and integrated into an organisation’s existing web-interface
• Client: Does not have to stay on line and wait. Data can be prepared, and then delivered, as background processes.
Streaming Data Delivery (2)
• As bandwidth continues to improve, it is clear that ‘streaming’ data delivery will become practical even for huge datasets …
… and Intrepid Geophysics would introduce ‘streaming’ in line with emerging standards for streaming data protocols … developing both server capability and client-side applications to embrace such standards