AUKEGGS Workshop, ANU, Canberra, 29 November 2006

Implementing CSML Feature Types in applications within the NERC DataGrid

Dominic Lowe, British Atmospheric Data Centre, CCLRC

and NDG team (Andrew Woolf, Bryan Lawrence et al.)

http://ndg.nerc.ac.uk

Complexity + Volume + Remote Access = Grid Challenge

British Atmospheric Data Centre

British Oceanographic Data Centre

NCAR

NERC DataGrid

CSML Feature Model

• Intermediary XML data format

• Represents Atmospheric, Oceanographic data

• GML Application schema

• Features based on geometry, e.g.:

• GridSeriesFeature

• PointFeature

• TrajectoryFeature

• ...

CSML Parser

XML elements

<GridFeature>...</GridFeature>

Python classes

class GridFeature: ...

Provides mappings (and conversion) between XML model and python object model.

toXML()

fromXML()

Implemented in Python; uses the C module cElementTree.

Performance:

Length of XML (lines)    Time to parse (CPU secs)
87                       0.010
152                      0.010
369                      0.040
11,482                   0.570
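
As a rough illustration of how CPU-time figures of this kind can be measured, here is a minimal sketch. It times the standard-library xml.etree.ElementTree parser, not the CSML parser itself; the time_parse helper and the synthetic one-element-per-line document are assumptions made for the example.

```python
import time
import xml.etree.ElementTree as ET


def time_parse(xml_text):
    """Parse xml_text and return (root element, CPU seconds used)."""
    start = time.process_time()
    root = ET.fromstring(xml_text)
    return root, time.process_time() - start


# Synthetic document with one element per line (not real CSML).
lines = ["<Dataset>"]
lines += ['  <PointFeature id="p%d"/>' % i for i in range(1000)]
lines += ["</Dataset>"]

root, cpu_secs = time_parse("\n".join(lines))
print(len(lines), "lines parsed in", round(cpu_secs, 3), "CPU secs")
```

time.process_time() counts CPU time of the current process only, so the result is less sensitive to other load on the machine than wall-clock timing.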

CSML Parser

Hierarchical: calling the fromXML() or toXML() method of the root element calls the corresponding methods of its child elements.

<Dataset>
  <FeatureCollection>
    <GridSeriesFeature>
      <rangeSet>...</rangeSet>
      <domain>...</domain>
      <...></...>
    </GridSeriesFeature>
    and so on ...
  </FeatureCollection>
</Dataset>

Dataset.toXML()
  FeatureCollection.toXML()
    GridSeriesFeature(AbstractFeature).toXML()
      RangeSet.toXML()
        ..... and so on
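
The delegation pattern can be sketched roughly as follows. This is a simplified stand-in, not the CSML parser: the three classes reuse names from the slides, but their attributes, the plain-text domain and rangeSet, and the absence of XML namespaces are all assumptions made to keep the example short.

```python
import xml.etree.ElementTree as ET


class GridSeriesFeature:
    def __init__(self, id=None, domain=None, rangeSet=None):
        self.id, self.domain, self.rangeSet = id, domain, rangeSet

    def fromXML(self, elem):
        self.id = elem.get("id")
        self.domain = elem.findtext("domain")
        self.rangeSet = elem.findtext("rangeSet")
        return self

    def toXML(self):
        elem = ET.Element("GridSeriesFeature", {"id": self.id})
        ET.SubElement(elem, "domain").text = self.domain
        ET.SubElement(elem, "rangeSet").text = self.rangeSet
        return elem


class FeatureCollection:
    def __init__(self, members=None):
        self.members = members or []

    def fromXML(self, elem):
        # Each child element is handed to the class that understands it.
        self.members = [GridSeriesFeature().fromXML(child)
                        for child in elem.findall("GridSeriesFeature")]
        return self

    def toXML(self):
        elem = ET.Element("FeatureCollection")
        for member in self.members:
            elem.append(member.toXML())
        return elem


class Dataset:
    def __init__(self, id=None, featureCollection=None):
        self.id, self.featureCollection = id, featureCollection

    def fromXML(self, elem):
        self.id = elem.get("id")
        self.featureCollection = FeatureCollection().fromXML(
            elem.find("FeatureCollection"))
        return self

    def toXML(self):
        elem = ET.Element("Dataset", {"id": self.id})
        elem.append(self.featureCollection.toXML())
        return elem


source = ('<Dataset id="ds1"><FeatureCollection>'
          '<GridSeriesFeature id="gs1"><domain>d</domain>'
          '<rangeSet>r</rangeSet></GridSeriesFeature>'
          '</FeatureCollection></Dataset>')

# One call on the root object walks the whole tree in each direction.
dataset = Dataset().fromXML(ET.fromstring(source))
roundtrip = ET.tostring(dataset.toXML(), encoding="unicode")
print(roundtrip == source)
```

Because each class only knows about its direct children, a fragment such as a single GridSeriesFeature can be converted on its own with the same two methods.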

CSML Parser

Process the whole document or parts of the document

To convert a whole document:

From XML to Python:

    # Standard-library ElementTree; the slides name the cElementTree module.
    from xml.etree.ElementTree import ElementTree
    import csml.parser

    tree = ElementTree(file='mycsmlfile.xml')
    dataset = csml.parser.Dataset()
    dataset.fromXML(tree.getroot())

From Python to XML:

    csmldoc = dataset.toXML()

Or just convert a fragment:

    gsFeature = GridSeriesFeature()
    gsFeature.fromXML(xmlFragment)
    gsXML = gsFeature.toXML()

• Allows for easy editing of documents

• Or addition of new features

• Or redefining existing features, e.g. subsetting

CSML Parser - Creating CSML

Can be used to create CSML documents (or fragments) from scratch:

    gs = GridSeriesFeature(id='myGS', domain=mydomain, rangeSet=myRS)
    ps = PointFeature(id='mypoint', domain=mydomain2, rangeSet=myrs2)
    fc = FeatureCollection(members=[gs, ps, ...])
    ds = Dataset(id='mycsmldocument', featureCollection=fc)
    ds.toXML()  # -> mycsmldocument.xml

No need for data providers to understand XML APIs

Very useful for performing operations on features

CSML Parser

Two-level API to the parser object model:

Parser level (mainly used by me + NDG data providers):

    dataset.featureCollection.members[3].profileSeriesDomain

Wrapped by a higher-level interface (used by client applications):

    getListOfFeatures(csmldoc)      # get list of available features
    getDomain(feature)              # get domain info
    getAffordances(feature)         # get available operations
    subsetFeature(feature, subset)  # operation: request subset
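
To show how a thin functional layer can hide attribute paths like the one above, here is a minimal sketch. The four function names come from the slides, but their bodies, return types, the dictionary-based domain, the affordances attribute and the (low, high) subset format are all assumptions; SimpleNamespace objects stand in for real parser objects.

```python
from types import SimpleNamespace


def getListOfFeatures(csmldoc):
    """Return the feature objects held in the document's feature collection."""
    return list(csmldoc.featureCollection.members)


def getDomain(feature):
    """Return a copy of the domain as {axis name: list of values}."""
    return {axis: list(values) for axis, values in feature.domain.items()}


def getAffordances(feature):
    """Return the names of the operations the feature supports."""
    return list(feature.affordances)


def subsetFeature(feature, subset):
    """Return a new feature restricted to inclusive (low, high) axis ranges."""
    new_domain = {}
    for axis, values in feature.domain.items():
        if axis in subset:
            low, high = subset[axis]
            values = [v for v in values if low <= v <= high]
        new_domain[axis] = list(values)
    return SimpleNamespace(id=feature.id + "_subset", domain=new_domain,
                           affordances=list(feature.affordances))


# Stand-in for a parsed CSML document (hypothetical structure).
grid = SimpleNamespace(
    id="temperature",
    domain={"latitude": [-10, 0, 10, 20], "time": [0, 6, 12, 18]},
    affordances=["subsetFeature"],
)
csmldoc = SimpleNamespace(featureCollection=SimpleNamespace(members=[grid]))

feature = getListOfFeatures(csmldoc)[0]
small = subsetFeature(feature, {"latitude": (0, 10)})
print(getDomain(small))
```

Client code written against functions like these does not need to change if the underlying object model is reorganised.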

Access to underlying data

Multiple I/O libraries: cdms, NAppy, others...

CSML code talks to a single DataInterface class that provides a uniform wrapper for different file access methods.

Easy to add more data formats: just write the corresponding wrapper methods (getData, getSubsetOfData, getVariable ...)

Similar interface needed for RDBMS access (not yet implemented)
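
One way such a uniform wrapper could be organised is sketched below. The method names getVariable, getData and getSubsetOfData are taken from the slide, but their signatures, the AbstractWrapper base class, the register/open methods and the in-memory backend are assumptions for illustration; the real DataInterface may look quite different.

```python
from abc import ABC, abstractmethod


class AbstractWrapper(ABC):
    """Methods every format-specific wrapper has to provide (assumed set)."""

    @abstractmethod
    def getVariable(self, name):
        """Select the variable that later calls read from."""

    @abstractmethod
    def getData(self):
        """Return all values of the selected variable."""

    @abstractmethod
    def getSubsetOfData(self, start, stop):
        """Return values start..stop-1 of the selected variable."""


class InMemoryWrapper(AbstractWrapper):
    """Toy backend: the 'file' is a dict of variable name -> list of values."""

    def __init__(self, source):
        self._variables = source
        self._current = None

    def getVariable(self, name):
        self._current = self._variables[name]

    def getData(self):
        return list(self._current)

    def getSubsetOfData(self, start, stop):
        return list(self._current[start:stop])


class DataInterface:
    """Single entry point; hands out the wrapper registered for a format."""

    def __init__(self):
        self._wrappers = {}

    def register(self, fmt, wrapper_class):
        self._wrappers[fmt] = wrapper_class

    def open(self, fmt, source):
        return self._wrappers[fmt](source)


interface = DataInterface()
interface.register("memory", InMemoryWrapper)

data = interface.open("memory", {"temperature": [271.0, 273.5, 280.2, 285.1]})
data.getVariable("temperature")
subset = data.getSubsetOfData(1, 3)
print(subset)
```

Supporting a new format then means writing one more AbstractWrapper subclass and registering it; calling code stays the same.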

CSML tooling - Use Case #1: Scanning at BADC

Multiple data formats: NetCDF, NASA Ames, GRIB, PP

Feature identification challenges

Scanner has concept of a FeatureFileMap + Config options

Creates parser objects, then calls csml.parser.dataset.toXML() to create the document.

By using the parser, the scanner does not have to deal with XML details.

CSML tooling - Use Case #2: Scanning at BODC

Metadata in Oracle Database.

Python-Oracle link to extract metadata

Create parser objects, then call toXML()

By using the parser, the scanner does not have to deal with XML details.
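
A database-driven scan of this kind might look roughly like the sketch below. It uses an in-memory sqlite3 database as a stand-in for Oracle, and the stations table, its columns, and the simplified PointFeature and Dataset classes are all hypothetical; only the overall flow (query metadata, build objects, call toXML()) follows the slide.

```python
import sqlite3
import xml.etree.ElementTree as ET


class PointFeature:
    """Simplified stand-in for a parser class (not the real CSML class)."""

    def __init__(self, id, domain):
        self.id, self.domain = id, domain

    def toXML(self):
        elem = ET.Element("PointFeature", {"id": self.id})
        ET.SubElement(elem, "domain").text = self.domain
        return elem


class Dataset:
    def __init__(self, id, members):
        self.id, self.members = id, members

    def toXML(self):
        elem = ET.Element("Dataset", {"id": self.id})
        collection = ET.SubElement(elem, "FeatureCollection")
        for member in self.members:
            collection.append(member.toXML())
        return elem


def scan_metadata(conn):
    """Build one PointFeature per row of the (hypothetical) stations table."""
    rows = conn.execute(
        "SELECT station_id, latitude, longitude FROM stations "
        "ORDER BY station_id")
    members = [PointFeature(sid, "%s %s" % (lat, lon))
               for sid, lat, lon in rows]
    return Dataset("example_dataset", members)


# sqlite3 stands in for the Oracle metadata database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stations (station_id TEXT, latitude REAL, longitude REAL)")
conn.executemany("INSERT INTO stations VALUES (?, ?, ?)",
                 [("buoy1", 50.1, -4.2), ("buoy2", 57.3, 1.8)])

doc = scan_metadata(conn).toXML()
print(ET.tostring(doc, encoding="unicode"))
```

The scanning function itself never builds XML; serialisation stays inside the toXML() methods.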

CSML tooling - Use Case #3: Subsetting operation

Query CSML document, getFeatureList() etc.

Subset a CSML dataset; return a CSML document + new NetCDF file

Subset multiple datasets; return a CSML document describing both + NetCDF files

Subset datasets from different data providers and supply them in a single CSML file + NetCDF files

All simplified by use of parser 'objects'.

CSML tooling - Use Case #4: CSML Updates

Datasets change!

BADC has automatic ingest scripts.

Datasets change often, and without warning!

Feasible to automate metadata updates:

• Using the parser to update the existing CSML document when the dataset changes

• Or rescanning the dataset periodically and writing a new CSML document

Not yet implemented.
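
Since the slides describe this only as feasible, the following is a purely hypothetical sketch of the periodic-rescan option: it fingerprints a data directory from file names, sizes and modification times, and reports when a rescan would be needed. The fingerprint and needs_rescan helpers and the file names are invented for illustration.

```python
import hashlib
import os
import tempfile


def fingerprint(directory):
    """Hash relative paths, sizes and modification times under directory."""
    digest = hashlib.sha256()
    for root, dirs, files in os.walk(directory):
        dirs.sort()  # make the traversal order deterministic
        for name in sorted(files):
            path = os.path.join(root, name)
            info = os.stat(path)
            record = "%s|%d|%d\n" % (os.path.relpath(path, directory),
                                     info.st_size, info.st_mtime_ns)
            digest.update(record.encode("utf-8"))
    return digest.hexdigest()


def needs_rescan(directory, last_fingerprint):
    """Return (changed?, current fingerprint) for the directory."""
    current = fingerprint(directory)
    return current != last_fingerprint, current


with tempfile.TemporaryDirectory() as data_dir:
    with open(os.path.join(data_dir, "day1.nc"), "w") as f:
        f.write("placeholder")
    known = fingerprint(data_dir)

    # Nothing has been ingested since the last scan.
    rescan_before, known = needs_rescan(data_dir, known)

    # An ingest script drops a new file into the dataset.
    with open(os.path.join(data_dir, "day2.nc"), "w") as f:
        f.write("placeholder")
    rescan_after, known = needs_rescan(data_dir, known)

print(rescan_before, rescan_after)
```

A scheduled job could call needs_rescan() and only regenerate the CSML document when it returns True.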

CSML tooling - Use Case #5: NDG power user

Writes bespoke Python scripts to access data

e.g. for a dataset that is updated daily: uses the CSML API to download a subset every day, e.g. temperature at certain locations.

CSML tooling - Use Case #6: Integrate CSML into Applications

High-level API is easy to use

Integrated with:

• BADC DataExtractor

• TPAC WCS

• Meteorologisk institutt (Norway)

BADC Data Extractor

TPAC WCS

Norwegian Met Office

Summary

• Modular set of tools

• “Features as Objects” instead of just XML

• Many use cases simplified by object model

• High level API – easy integration of features with applications

• CSML v2 parser under development; more sophisticated than v1 parser & may be adaptable for other domains.
