Unidata’s Common Data Model and the THREDDS Data Server John Caron Unidata/UCAR, Boulder CO Jan 6, 2006 ESIP Winter 2006

Page 1: Unidata’s Common Data Model and the THREDDS Data Server

Unidata’s Common Data Model and the THREDDS Data Server

John Caron
Unidata/UCAR, Boulder CO

Jan 6, 2006
ESIP Winter 2006

Page 2: Unidata’s Common Data Model and the THREDDS Data Server

Outline

• Definitions

• Creating a Common Data (Access) Model from NetCDF, HDF5, OPeNDAP

• CDM Coordinate Systems, Data Types

• CDM implementation

• NetCDF Markup Language (NcML)

• The THREDDS Data Server

Page 3: Unidata’s Common Data Model and the THREDDS Data Server

NetCDF-3

• Machine and OS independent file format for “self-describing” scientific data

• C library (Fortran, C++, Perl, IDL, MatLab, Python, Ruby), Java library

• Efficient subsetting of multidimensional arrays.

• > 20,000 downloads last year

Page 4: Unidata’s Common Data Model and the THREDDS Data Server

HDF5

• Machine and OS independent file format for “self-describing” scientific data

• C library (Fortran, Java, PyTables)

• Evolution from HDF4, but different.

• HDF-EOS, HDF5-EOS, standard formats for EOSDIS, ASCI, NPOESS

• Parallel-IO, chunked storage, compression filters, many data types.

• Developed at NCSA, now independent

Page 5: Unidata’s Common Data Model and the THREDDS Data Server

NetCDF-4

• Project funded by NASA to create new version of netCDF using the HDF5 file format.

• “Extend and merge” netCDF and HDF5
  – Widespread use and simplicity of netCDF
  – Generality and performance of HDF5

Page 6: Unidata’s Common Data Model and the THREDDS Data Server

NetCDF-Java 2.2 (nj22)

• 100% Java library

• Prototype implementation of CDM

• File formats:
  – General: NetCDF, HDF5, OPeNDAP
  – Grids: GRIB1, GRIB2
  – Radar: NEXRAD, NIDS, DORADE
  – Satellite: DMSP, GINI

• Access to THREDDS catalogs

Page 7: Unidata’s Common Data Model and the THREDDS Data Server

OPeNDAP

• Client-server protocol for scientific data access

• C++ client and server, Java client and server libraries.

• Current version 2.0; NASA ESE standard

• Working on new 4.0 protocol spec

Page 8: Unidata’s Common Data Model and the THREDDS Data Server

THREDDS

• Originally funded by NSDL
  – “discovery and use of scientific data”
  – Middleware between data providers and users
  – Dataset Inventory Catalogs (XML)

• Now part of Unidata core funding
  – Data Serving (pull)

Page 9: Unidata’s Common Data Model and the THREDDS Data Server

What’s a Data Model?

• It’s about scientific data: storing and accessing it

• It’s an abstraction

• Equivalent to an abstract object model in OOP

• An Abstract Data Model describes data objects and what methods you can use on them

Page 10: Unidata’s Common Data Model and the THREDDS Data Server

What’s a Data Model?

• An API is the interface to the Data Model for a specific programming language

• A file format is a way to persist the objects in the Data Model.

• A data access protocol plays the role of a file format.

• The Abstract Data Model removes the details of any particular API and the persistence format.
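
One way to picture the distinction is a small Java sketch in which the abstract model is an interface and each "file format" is just another implementation behind it. All names here (DataModelSketch, Dataset, PlainDataset, PackedDataset) are hypothetical illustrations and are not taken from any Unidata library.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; none of these names come from a real library. */
class DataModelSketch {

    /** The abstract data model: which objects exist and what you can do with them. */
    interface Dataset {
        double[] read(String variableName);
    }

    /** One "persistence format": values stored directly as doubles. */
    static class PlainDataset implements Dataset {
        private final Map<String, double[]> vars = new HashMap<>();
        void put(String name, double[] values) { vars.put(name, values.clone()); }
        @Override public double[] read(String name) { return vars.get(name).clone(); }
    }

    /** Another "persistence format": values packed as shorts with scale and offset. */
    static class PackedDataset implements Dataset {
        private final Map<String, short[]> vars = new HashMap<>();
        private final double scale;
        private final double offset;
        PackedDataset(double scale, double offset) { this.scale = scale; this.offset = offset; }
        void put(String name, short[] packed) { vars.put(name, packed.clone()); }
        @Override public double[] read(String name) {
            short[] packed = vars.get(name);
            double[] out = new double[packed.length];
            for (int i = 0; i < packed.length; i++) {
                out[i] = packed[i] * scale + offset;   // unpack on read
            }
            return out;
        }
    }

    /** Client code is written against the model only, never against the storage. */
    static double mean(Dataset ds, String name) {
        double[] values = ds.read(name);
        double sum = 0;
        for (double v : values) sum += v;
        return sum / values.length;
    }

    public static void main(String[] args) {
        PlainDataset plain = new PlainDataset();
        plain.put("T", new double[] {10.0, 20.0, 30.0});
        PackedDataset packed = new PackedDataset(0.5, 0.0);
        packed.put("T", new short[] {20, 40, 60});
        System.out.println(mean(plain, "T") + " " + mean(packed, "T"));
    }
}
```

The same mean() call works on both datasets because it depends only on the interface, which is the role the slides assign to the Abstract Data Model.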

Page 11: Unidata’s Common Data Model and the THREDDS Data Server

Creating a Common Data Access Model

from NetCDF, HDF5, OPeNDAP

Page 12: Unidata’s Common Data Model and the THREDDS Data Server

NetCDF-3 Data Model

Page 13: Unidata’s Common Data Model and the THREDDS Data Server

OPeNDAP Data Model (DAP-2)

Page 14: Unidata’s Common Data Model and the THREDDS Data Server

HDF5 Data Model

Page 15: Unidata’s Common Data Model and the THREDDS Data Server

Common Data (Access) Model

Page 16: Unidata’s Common Data Model and the THREDDS Data Server

Coordinate Systems and Scientific Data Types

Page 17: Unidata’s Common Data Model and the THREDDS Data Server

Common Data Model Layers

• Scientific Datatypes: Grid, Point, Radial, Trajectory, Swath, Station

• Coordinate Systems

• Data Access

Page 18: Unidata’s Common Data Model and the THREDDS Data Server

Coordinate Systems needed

• NetCDF, OPeNDAP, HDF data models do not have integrated coordinate systems
  – so georeferencing not part of API
  – Need conventions to specify (e.g. CF-1, COARDS, etc.)

• Contrast GRIB, HDF-EOS, other specialized formats

• Must be done in a general way

Page 19: Unidata’s Common Data Model and the THREDDS Data Server

Coordinate Systems

• Same underlying mathematics as VisAD, ASCII

Page 20: Unidata’s Common Data Model and the THREDDS Data Server

Scientific DataTypes

• Based on datasets Unidata is familiar with
  – APIs are evolving

• How are data points connected?

• Intended to scale to large, multifile collections

• Intended to support “specialized queries”
  – Space, Time

• Corresponding “standard” NetCDF file conventions

Page 21: Unidata’s Common Data Model and the THREDDS Data Server

Point Observation Data

Page 22: Unidata’s Common Data Model and the THREDDS Data Server

PointObsDataset Methods

// Collection of StructureData
Collection getData(LatLonRect boundingBox, Date start, Date end);
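
A space and time query of this shape can be sketched over an in-memory list. This is a minimal illustration under stated assumptions: Obs and Box are hypothetical stand-ins for StructureData and LatLonRect, the box ignores dateline wrap-around, and both time bounds are treated as inclusive.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

/** Hypothetical sketch of a bounding-box plus time-range query. */
class PointQuerySketch {

    /** Stand-in for one observation record (the slides use StructureData). */
    static class Obs {
        final double lat;
        final double lon;
        final Date time;
        final double value;
        Obs(double lat, double lon, Date time, double value) {
            this.lat = lat; this.lon = lon; this.time = time; this.value = value;
        }
    }

    /** Simplified stand-in for LatLonRect; does not handle dateline crossing. */
    static class Box {
        final double latMin, latMax, lonMin, lonMax;
        Box(double latMin, double latMax, double lonMin, double lonMax) {
            this.latMin = latMin; this.latMax = latMax;
            this.lonMin = lonMin; this.lonMax = lonMax;
        }
        boolean contains(double lat, double lon) {
            return lat >= latMin && lat <= latMax && lon >= lonMin && lon <= lonMax;
        }
    }

    /** Returns the observations inside the box with start <= time <= end. */
    static List<Obs> getData(List<Obs> all, Box box, Date start, Date end) {
        List<Obs> result = new ArrayList<>();
        for (Obs o : all) {
            boolean inTime = !o.time.before(start) && !o.time.after(end);
            if (inTime && box.contains(o.lat, o.lon)) {
                result.add(o);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Obs> all = new ArrayList<>();
        all.add(new Obs(40.0, -105.0, new Date(1000L), 1.5));   // inside box and time range
        all.add(new Obs(10.0, -105.0, new Date(1000L), 2.5));   // outside box
        all.add(new Obs(40.0, -105.0, new Date(9000L), 3.5));   // outside time range
        Box box = new Box(35.0, 45.0, -110.0, -100.0);
        System.out.println(getData(all, box, new Date(0L), new Date(5000L)).size());
    }
}
```

A linear scan is only for illustration; a dataset meant to scale to large multifile collections would presumably need an index over space and time.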

Page 23: Unidata’s Common Data Model and the THREDDS Data Server

Trajectory Data

Page 24: Unidata’s Common Data Model and the THREDDS Data Server

TrajectoryObs Methods

int getNumPoints();

StructureData getData(int point);

Page 25: Unidata’s Common Data Model and the THREDDS Data Server

Station Data

Page 26: Unidata’s Common Data Model and the THREDDS Data Server

StationObs Methods

// return List of Station
List getStations();

// return List of StructureData
List getData(Station s, Date start, Date end);

Page 27: Unidata’s Common Data Model and the THREDDS Data Server

Radial Data

Page 28: Unidata’s Common Data Model and the THREDDS Data Server

Radial methods

interface Radial {
  int getNumGates();
  float getData(int gate);
  float getStartingGate();
  float getGateSize();
  float getElevation();
  float getAzimuth();
  double getTime();
}
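
The accessors above are enough to place each gate in space. The sketch below shows one way to do that, under assumptions the slides do not state: the starting gate is a distance along the beam in the same units as the gate size, azimuth is in degrees clockwise from north, elevation is in degrees above the horizon, and the beam is treated as a straight line over a flat earth. The class and method names are hypothetical.

```java
/** Hypothetical sketch: locating a radar gate from the Radial accessors. */
class RadialGeometrySketch {

    /** Distance along the beam to the given gate index. */
    static double rangeToGate(double startingGate, double gateSize, int gate) {
        return startingGate + gate * gateSize;
    }

    /**
     * Converts (range, azimuth, elevation) to {east, north, up} offsets from the radar.
     * Flat-earth, straight-beam approximation; no refraction or earth curvature.
     */
    static double[] toCartesian(double range, double azimuthDeg, double elevationDeg) {
        double az = Math.toRadians(azimuthDeg);
        double el = Math.toRadians(elevationDeg);
        double ground = range * Math.cos(el);   // horizontal distance from the radar
        return new double[] {
            ground * Math.sin(az),   // east
            ground * Math.cos(az),   // north
            range * Math.sin(el)     // up
        };
    }

    public static void main(String[] args) {
        double range = rangeToGate(1000.0, 250.0, 4);
        double[] enu = toCartesian(range, 90.0, 0.0);
        System.out.println(range + " " + enu[0] + " " + enu[1] + " " + enu[2]);
    }
}
```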

Page 29: Unidata’s Common Data Model and the THREDDS Data Server

Gridded Data

Page 30: Unidata’s Common Data Model and the THREDDS Data Server

Grid methods

interface GridCoordSys {
  CoordinateAxis getTaxis();
  CoordinateAxis getXaxis();
  CoordinateAxis getYaxis();
  CoordinateAxis getZaxis();
  Projection getProjection();
}

Array getDataCube(Range time, Range z, Range y, Range x);
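
What a four-range subset does can be sketched over a flat, row-major array. This is an illustration only: the Range class here is a hypothetical simplification with an inclusive first and last index and a stride of 1, and the data is a plain double[] rather than the library's Array type.

```java
/** Hypothetical sketch of subsetting a (time, z, y, x) grid stored in row-major order. */
class DataCubeSketch {

    /** Simplified index range: first and last are inclusive, stride is always 1. */
    static class Range {
        final int first;
        final int last;
        Range(int first, int last) { this.first = first; this.last = last; }
        int length() { return last - first + 1; }
    }

    /**
     * Copies the requested hyperslab out of data.
     * shape is {nt, nz, ny, nx}; data.length must equal nt * nz * ny * nx.
     */
    static double[] getDataCube(double[] data, int[] shape,
                                Range time, Range z, Range y, Range x) {
        int nz = shape[1];
        int ny = shape[2];
        int nx = shape[3];
        double[] out = new double[time.length() * z.length() * y.length() * x.length()];
        int n = 0;
        for (int it = time.first; it <= time.last; it++) {
            for (int iz = z.first; iz <= z.last; iz++) {
                for (int iy = y.first; iy <= y.last; iy++) {
                    for (int ix = x.first; ix <= x.last; ix++) {
                        // row-major offset of element (it, iz, iy, ix)
                        out[n++] = data[((it * nz + iz) * ny + iy) * nx + ix];
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] shape = {2, 2, 3, 4};
        double[] data = new double[2 * 2 * 3 * 4];
        for (int i = 0; i < data.length; i++) data[i] = i;
        double[] cube = getDataCube(data, shape,
            new Range(1, 1), new Range(0, 0), new Range(1, 2), new Range(2, 3));
        System.out.println(java.util.Arrays.toString(cube));
    }
}
```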

Page 31: Unidata’s Common Data Model and the THREDDS Data Server

Image/Swath

Page 32: Unidata’s Common Data Model and the THREDDS Data Server

Standardizing NetCDF Formats

• Grid: CF-1 Convention
  – Need improvements for regional models (WRF), GIS info

• Radar: “Radar Exchange Format”
  – With radar community (led by NCAR ATD)

• Point Observations
  – Unidata Observation Dataset Conventions

Page 33: Unidata’s Common Data Model and the THREDDS Data Server

CDM implementations: NetCDF-4 and NetCDF-Java 2.2

Page 34: Unidata’s Common Data Model and the THREDDS Data Server

NetCDF-4 C Library

[Figure: architecture diagram with boxes labeled netCDF-3 Interface, netCDF-4 Library, and HDF5 Library]

Page 35: Unidata’s Common Data Model and the THREDDS Data Server

NetCDF-4 Status

• 4.0 Beta implements CDM access layer
  – complete, but waiting for HDF5 release 1.8 to finalize file format

• 4.1: adding Coordinate Systems

• 4.?: merge OPeNDAP access (pending funding)

Page 36: Unidata’s Common Data Model and the THREDDS Data Server

NetCDF-Java 2.2 (nj22)

• Prototype implementation of CDM

• File formats:
  – General: NetCDF, HDF5, OPeNDAP
  – Grids: GRIB1, GRIB2
  – Radar: NEXRAD, NIDS, DORADE
  – Satellite: DMSP, GINI

• Access to THREDDS catalogs

• Implements NcML

Page 37: Unidata’s Common Data Model and the THREDDS Data Server

Common Data Model

• Scientific Datatypes: Grid, Point, Radial, Trajectory, Swath, Station

• Coordinate Systems

• Data Access

Page 38: Unidata’s Common Data Model and the THREDDS Data Server

NetCDF-Java version 2.2 architecture

[Figure: architecture diagram. Labels: Application, Scientific Datatypes, Datatype Adapter, NetcdfDataset, CoordSystem Builder, NetcdfFile, I/O service provider, NetCDF-3, NetCDF-4, HDF5, GRIB, GINI, NIDS, Nexrad, DMSP, …, OPeNDAP, ADDE, THREDDS, Catalog.xml]

Page 39: Unidata’s Common Data Model and the THREDDS Data Server

NetCDF-Java 2.2 Status

• Data Access layer: Beta quality
  – also waiting for HDF5 release to finish NetCDF-4, commit to API

• Coordinate Systems: early Beta
  – Finishing docs, runtime pluggability

• Data Types: Alpha, still experimenting with APIs

Page 40: Unidata’s Common Data Model and the THREDDS Data Server

NetCDF Markup Language (NcML)

• XML representation of netCDF metadata (like ncdump -h)

• Create new netCDF files (like ncgen)

• Modify existing datasets
  – Add/delete/rename
  – Create logical sections of existing variables.

• Create unions and aggregations of multiple existing datasets.

Page 41: Unidata’s Common Data Model and the THREDDS Data Server

NcML example

<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2" location="/data/nids/N0R_20041119_2147">
  <attribute name="DataType" value="Radar" />
  <remove type="attribute" name="password" />
  <variable name="Reflectivity" orgName="R34768">
    <attribute name="units" value="dBZ" />
  </variable>
</netcdf>
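
Because NcML is ordinary XML, its contents can be inspected with any XML parser once the quotes are plain ASCII. The sketch below uses only the JDK's DOM parser to list which variables the example renames. NcmlInspectSketch and renames() are hypothetical names for illustration; it is the NetCDF-Java library, not this helper, that actually interprets NcML.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper that only reads NcML; it does not apply it to any dataset. */
class NcmlInspectSketch {

    /** The NcML from the slide, written with plain ASCII quotes. */
    static final String NCML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<netcdf xmlns=\"http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2\""
        + " location=\"/data/nids/N0R_20041119_2147\">\n"
        + "  <attribute name=\"DataType\" value=\"Radar\" />\n"
        + "  <remove type=\"attribute\" name=\"password\" />\n"
        + "  <variable name=\"Reflectivity\" orgName=\"R34768\">\n"
        + "    <attribute name=\"units\" value=\"dBZ\" />\n"
        + "  </variable>\n"
        + "</netcdf>\n";

    /** Returns a map of original name to new name for every renamed variable. */
    static Map<String, String> renames(String ncml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);   // the NcML elements live in a namespace
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(ncml)));
            NodeList vars = doc.getElementsByTagNameNS("*", "variable");
            Map<String, String> result = new LinkedHashMap<>();
            for (int i = 0; i < vars.getLength(); i++) {
                Element v = (Element) vars.item(i);
                if (v.hasAttribute("orgName")) {
                    result.put(v.getAttribute("orgName"), v.getAttribute("name"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse NcML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(renames(NCML));
    }
}
```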

Page 42: Unidata’s Common Data Model and the THREDDS Data Server

NcML Aggregation

• Union

• Join Existing

• Join New

• Forecast Model Run


Page 43: Unidata’s Common Data Model and the THREDDS Data Server

NcML Aggregation Example

<netcdf xmlns="http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2">
  <aggregation dimName="time" type="joinNew">
    <variableAgg name="Temperature"/>
    <variableAgg name="Pressure"/>
    <scan location="C:/data/goes/" suffix=".gini"/>
  </aggregation>
</netcdf>
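
The difference between the two join types can be sketched with plain arrays: joinNew stacks same-shaped variables along a new outer dimension (here "time"), while joinExisting concatenates along an outer dimension the inputs already have. This is a conceptual illustration only. AggregationSketch and its methods are hypothetical, and the real aggregation is done by the NetCDF-Java library when it reads the NcML.

```java
import java.util.List;

/** Hypothetical sketch of the array shapes produced by joinNew and joinExisting. */
class AggregationSketch {

    /** joinNew: each input is one [y][x] grid; the result is [time][y][x]. */
    static double[][][] joinNew(List<double[][]> grids) {
        double[][][] out = new double[grids.size()][][];
        for (int i = 0; i < grids.size(); i++) {
            double[][] g = grids.get(i);
            double[][] first = grids.get(0);
            if (g.length != first.length || g[0].length != first[0].length) {
                throw new IllegalArgumentException("grid " + i + " has a different shape");
            }
            out[i] = g;   // one input file becomes one step along the new dimension
        }
        return out;
    }

    /** joinExisting: each input is already [time][y][x]; time steps are concatenated. */
    static double[][][] joinExisting(List<double[][][]> pieces) {
        int total = 0;
        for (double[][][] p : pieces) total += p.length;
        double[][][] out = new double[total][][];
        int n = 0;
        for (double[][][] p : pieces) {
            for (double[][] step : p) out[n++] = step;
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2, 3}, {4, 5, 6}};
        double[][] b = {{7, 8, 9}, {10, 11, 12}};
        double[][][] joined = joinNew(List.of(a, b));
        System.out.println(joined.length + " x " + joined[0].length + " x " + joined[0][0].length);
    }
}
```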

Page 44: Unidata’s Common Data Model and the THREDDS Data Server

THREDDS Data Server

• Integrates data access with THREDDS catalogs and services

• Tomcat/Servlet, 100% Java, single war file

• Data input is netCDF Java 2.2 library

• Data output:
  – OPeNDAP
  – HTTP Server
  – OGC Web Coverage Server (gridded)

Page 45: Unidata’s Common Data Model and the THREDDS Data Server

THREDDS Data Server

[Figure: server diagram. Labels: HTTP Tomcat Server, hostname.edu, THREDDS Server Application, NetCDF-Java library, Datasets, Catalog.xml, IDD Data; services: OPeNDAP, HTTPServer, WCS]

Page 46: Unidata’s Common Data Model and the THREDDS Data Server

TDS as WCS Gateway

[Figure: server diagram. Labels: HTTP Tomcat Server, hostname.edu, THREDDS Server Application, NetCDF-Java library, Catalog.xml; services: OPeNDAP, HTTPServer, WCS; OPeNDAP Server at anotherHost.org]

Page 47: Unidata’s Common Data Model and the THREDDS Data Server

TDS and NcML

[Figure: server diagram. Labels: HTTP Tomcat Server, hostname.edu, THREDDS Server Application, Netcdf-Java, Catalog.xml, Datasets, NcML; services: OPeNDAP, WCS]

Page 48: Unidata’s Common Data Model and the THREDDS Data Server

TDS and NcML

• Server serves the dataset “wrapped” by the NcML
  – Client sees OPeNDAP or WCS, not NcML

• Can “fix” metadata problems

• Can augment metadata

• Use NcML aggregation on the TDS
  – replaces the old “Aggregation Server”

Page 49: Unidata’s Common Data Model and the THREDDS Data Server

TDS and Digital Libraries

[Figure: server diagram. Labels: HTTP Tomcat Server, THREDDS Server Application, NetCDF-Java library, Datasets, Catalog.xml, hostname.edu, otherhost.gov, OPeNDAP Server, OAI Harvester, DL Records; services: OPeNDAP, HTTPServer, WCS]

Page 50: Unidata’s Common Data Model and the THREDDS Data Server

TDS and Digital Libraries

• Framework to add metadata
  – By hand (collection level)
  – Automatic extraction from datasets

• Send records to existing DLs
  – No search

• Both collection and inventory level

Page 51: Unidata’s Common Data Model and the THREDDS Data Server

Future Plans

• NetCDF-Java
  – Get APIs stable, docs, runtime pluggability
  – NetCDF-4 (!)
  – HDF4, HDF-EOS, BUFR (need funding)

• NetCDF-4 C Library
  – DataTypes too immature to port
  – NcML?
  – Java on the server

Page 52: Unidata’s Common Data Model and the THREDDS Data Server

TDS Future Plans

• Aggregation
  – Driven by IDD data (motherlode)

• Pluggable Authorization

• access control by dataset

• Performance

• Services
  – Coordinate System Verifier (e.g. CF-1)
  – Data access
  – Subset and get netcdf file

Page 53: Unidata’s Common Data Model and the THREDDS Data Server

Conclusion

N + M instead of N * M things on your TODO List!

[Figure: diagram with File Format #1, File Format #2, …, File Format #N on one side of the CDM, and Visualization & Analysis, NetCDF file, OpenDAP Server, and WCS Service on the other]
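
The arithmetic behind the conclusion can be made concrete. As an illustrative assumption, take N = 10 (the formats listed on the NetCDF-Java slide) and M = 4 (the four consumers named on the conclusion slide). The class and method names are hypothetical.

```java
/** Hypothetical sketch: counting converters with and without a common data model. */
class AdapterCountSketch {

    /** Every consumer reads every format directly: one converter per pair. */
    static int pairwise(int formats, int consumers) {
        return formats * consumers;
    }

    /** Each format is mapped to the common model once; each consumer reads the model once. */
    static int viaCommonModel(int formats, int consumers) {
        return formats + consumers;
    }

    public static void main(String[] args) {
        // Assumed N: NetCDF, HDF5, OPeNDAP, GRIB1, GRIB2, NEXRAD, NIDS, DORADE, DMSP, GINI
        int formats = 10;
        // Assumed M: Visualization & Analysis, NetCDF file, OpenDAP Server, WCS Service
        int consumers = 4;
        System.out.println("N * M = " + pairwise(formats, consumers));
        System.out.println("N + M = " + viaCommonModel(formats, consumers));
    }
}
```

With these assumed counts the direct approach needs 40 converters and the common-model approach needs 14, and the gap widens as either side grows.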