Federated Network for Sharing Air Quality Data and Processing Services
Center for Air Pollution Impact and Trend Analysis (CAPITA)
Washington University, St. Louis, MO 63130
April 2005, [email protected]
DRAFT
Project Coordinators:
Software Architecture: R. Husar
Software Implementation: K. Höijärvi
Data and Applications: S. Falke, R. Husar
National Emissions Local Inventory Satellite Fire Locs
Status and Trends
AQ Compliance
Exposure Assess.
Network Assess.
Tracking Progress
AQ Management Reports
‘Knowledge’ Derived from Data
Primary Data Diverse Providers
Data ‘Refining’ Processes Filtering, Aggregation, Fusion
Loosely Coupled InfoSystems: Flow of Data and Flow of Control
Provider Push User Pull
Each management task needs ‘actionable’ information that can be used for decision making. Ideally, therefore, the consumers/managers should specify their information needs and other features of the supporting Infosystem. However, they may not be fully aware of the available information resources and technologies, particularly in fast-changing conditions.
The information resources and tools are supplied by the data providers, custodians or integrator-mediators. Providers and custodians can help ‘push’ the information toward the consumers by making it accessible and attractive to the users. However, the choice of which information is actually used is made by the consumer.
Thus, data consumers, providers and mediators together form the info system.
Flow of Data
Flow of Control
AQ DATA
METEOROLOGY
EMISSIONS DATA
Informing Public
AQ Compliance
Status and Trends
Network Assess.
Tracking Progress
Data to Knowledge Transformation
DataFed Description
DataFed Vision
• Better air quality management and science through effective use of relevant data
DataFed Goals
• Facilitate the access and flow of atmospheric data from providers to users
• Support the development of user-driven data processing value chains
• Participate in specific application projects
Approach: Mediation Between Users and Data Providers
• DataFed assumes spontaneous, autonomous emergence of AQ data (a la Internet)
• Non-intrusively wraps datasets for access by web services
• WS-based mediators provide homogeneous data views, e.g. geo-spatial, time...
• End-user programming of data access and processing through WS composition (limited)
Applications
• Building browsers and analysis tools for distributed monitoring data
• Serving as a data gateway for user programs: web pages, GIS, science tools
• DataFed is currently focused on the mediation of air quality data
Mediator-Based Integration Architecture (Wiederhold, 1992)
• The job of the mediator is to provide an answer to a user query (Ullman, 1997)
• In the database-theory sense, a mediator is a view of the data found in one or more sources
• Heterogeneous sources are wrapped by translation software (local to global language)
• Mediators (web services) obtain data from wrappers or other mediators and process it …
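The wrapper/mediator relationship described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not DataFed code: the `Record`, `Wrapper` and `Mediator` names, the field names and the sample rows are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List


@dataclass(frozen=True)
class Record:
    """One observation in the (hypothetical) global schema."""
    site: str
    parameter: str
    value: float


class Wrapper:
    """Translates the rows of one source from its local layout to Records."""

    def __init__(self, rows: Iterable[Any], translate: Callable[[Any], Record]):
        self._rows = list(rows)
        self._translate = translate

    def fetch(self) -> List[Record]:
        return [self._translate(row) for row in self._rows]


class Mediator:
    """A view over one or more wrappers or other mediators."""

    def __init__(self, sources: Iterable[Any]):
        self._sources = list(sources)

    def fetch(self) -> List[Record]:
        records: List[Record] = []
        for source in self._sources:
            records.extend(source.fetch())
        return records

    def query(self, parameter: str) -> List[Record]:
        # Answer a user query against the homogenized view.
        return [r for r in self.fetch() if r.parameter == parameter]


# Two made-up sources with different local layouts.
tuple_source = Wrapper(
    [("STL", "PM25", "12.5")],
    lambda r: Record(r[0], r[1], float(r[2])),
)
dict_source = Wrapper(
    [
        {"station": "CIN", "species": "PM25", "conc": 9.0},
        {"station": "CIN", "species": "O3", "conc": 41.0},
    ],
    lambda r: Record(r["station"], r["species"], r["conc"]),
)

regional = Mediator([tuple_source, dict_source])
top = Mediator([regional])  # mediators can also read from other mediators
pm25 = top.query("PM25")
```

Because `Wrapper` and `Mediator` expose the same `fetch` method, a mediator does not need to know whether it is reading from a raw source or from another view.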
DataFed Multidimensional Data Model
4D Geo-Environmental Data Cube (X, Y, Z, T)
Environmental data represent measurements in the physical world which has space (X, Y, Z) and time (T) as its dimensions.
The specific inherent dimensions for geo-environmental data are: Longitude X, Latitude Y, Elevation Z and DateTime T.
Finding, sharing and integrating geo-environmental data requires that the data be ‘coded’ in this 4D data space, at the minimum.
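As a rough illustration of what coding data in the 4D space makes possible, the sketch below stores each observation with its X, Y, Z, T coordinates and derives views by constraining some dimensions. The `Observation` class, the `select` helper and the sample values are assumptions made for this example and do not come from DataFed.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Observation:
    lon: float      # X, degrees
    lat: float      # Y, degrees
    elev: float     # Z, metres
    time: datetime  # T
    value: float


def _inside(x, bounds: Optional[Tuple]) -> bool:
    # An unconstrained dimension (None) accepts every value.
    return bounds is None or bounds[0] <= x <= bounds[1]


def select(
    observations: Iterable[Observation],
    lon: Optional[Tuple[float, float]] = None,
    lat: Optional[Tuple[float, float]] = None,
    elev: Optional[Tuple[float, float]] = None,
    time: Optional[Tuple[datetime, datetime]] = None,
) -> List[Observation]:
    """Return the observations inside the requested slab of the cube."""
    return [
        o for o in observations
        if _inside(o.lon, lon) and _inside(o.lat, lat)
        and _inside(o.elev, elev) and _inside(o.time, time)
    ]


# Made-up sample values.
cube = [
    Observation(-90.2, 38.6, 150.0, datetime(2005, 1, 31, 12), 12.5),
    Observation(-90.2, 38.6, 150.0, datetime(2005, 1, 31, 13), 14.0),
    Observation(-84.5, 39.1, 200.0, datetime(2005, 1, 31, 13), 9.0),
]

# Spatial (map) view: fix T, leave X and Y open.
hour = datetime(2005, 1, 31, 13)
map_view = select(cube, time=(hour, hour))

# Time-series view: fix X and Y, leave T open.
time_view = select(cube, lon=(-90.5, -90.0), lat=(38.0, 39.0))
```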
DataFed Software
Software for the User
• Data Catalog for finding and browsing the metadata of registered datasets
• Dataset Viewer/Editor for browsing specific datasets, linked to the Catalog
• Data Views: geo-spatial, time, trajectory etc. views prepared by the user
• Consoles: collections of views on a web page for monitoring multiple datasets
• Mini-Apps: small web-programs using chained web services (e.g. CATT, PLUME)
Software for the Developer
• Registration software for adding distributed datasets to the data federation
• Web services for executing data access, processing and rendering tasks
• Web service chaining facility for composing custom-designed data views
DataFed Technologies and Architecture
• Form-based, semi-automatic, third-party wrapping of distributed data
• Web services (based on web standards) for the execution of specific tasks
• Service Oriented Architecture for building loosely coupled application programs
Software Issues
• Reliability: distributed computing issues such as network reliability, bandwidth, etc.
• Chaining: orchestrating distributed web services to act as a single application
• Links: linking users to providers and other federations (e.g. OGC, OPeNDAP)
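The idea of chaining access, processing and rendering services into a single application can be sketched with plain function composition. The three stand-in services below are hypothetical local functions with made-up data; in a real federation each step would be a remote web service call, with the reliability issues noted above.

```python
from functools import reduce
from statistics import mean
from typing import Any, Callable

Service = Callable[[Any], Any]


def chain(*services: Service) -> Service:
    """Compose services so the output of each feeds the next."""
    def run(request: Any) -> Any:
        return reduce(lambda data, service: service(data), services, request)
    return run


def access(request: dict) -> list:
    # Stand-in for a data access service (values are invented).
    table = {"PM25": [10.0, 20.0, 30.0]}
    return table[request["parameter"]]


def aggregate(values: list) -> float:
    # Stand-in for a processing service.
    return mean(values)


def render(value: float) -> str:
    # Stand-in for a rendering service.
    return f"mean={value:.1f}"


view = chain(access, aggregate, render)
result = view({"parameter": "PM25"})
```

Swapping `aggregate` for another processing step changes the view without touching the access or rendering stages, which is the point of keeping the services loosely coupled.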
Anatomy of a Wrapper Service: TOMS Satellite Image Data
• Given the URL template and the image description, the wrapper service can access the image for any day and any spatial subset using an HTTP URL or the SOAP protocol.
• Wrapper classes are available for geo-spatial (incl. satellite) images, SQL servers, text files, etc. The mediator classes are implemented as web services for uniform data access, transformation and portrayal.
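A minimal sketch of how a URL template plus an image description could serve any day and any spatial subset is given below. The template string, the host `example.org`, the `ImageDescription` fields and the global 1-degree grid are all assumptions for illustration; the actual TOMS template and description are not given in this document.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple

# Hypothetical URL template with date placeholders.
TEMPLATE = "http://example.org/toms/{year:04d}/{month:02d}/{day:02d}/aerosol_index.gif"


@dataclass(frozen=True)
class ImageDescription:
    """Geographic extent and pixel size of the full image (assumed fields)."""
    lon_min: float
    lat_min: float
    lon_max: float
    lat_max: float
    width: int   # pixels
    height: int  # pixels


def image_url(template: str, day: date) -> str:
    """Fill the date placeholders of the template for one day."""
    return template.format(year=day.year, month=day.month, day=day.day)


def pixel_window(
    desc: ImageDescription,
    lon_min: float, lat_min: float, lon_max: float, lat_max: float,
) -> Tuple[int, int, int, int]:
    """Map a lon/lat box to (left, top, right, bottom) pixel indices.

    Assumes a regular lon/lat grid with row 0 at the northern edge.
    """
    x_scale = desc.width / (desc.lon_max - desc.lon_min)
    y_scale = desc.height / (desc.lat_max - desc.lat_min)
    left = round((lon_min - desc.lon_min) * x_scale)
    right = round((lon_max - desc.lon_min) * x_scale)
    top = round((desc.lat_max - lat_max) * y_scale)
    bottom = round((desc.lat_max - lat_min) * y_scale)
    return left, top, right, bottom


global_image = ImageDescription(-180.0, -90.0, 180.0, 90.0, width=360, height=180)
url = image_url(TEMPLATE, date(2005, 1, 31))
window = pixel_window(global_image, -100.0, 30.0, -80.0, 50.0)
```

The wrapper would fetch `url` and crop the returned image to `window`, so the provider's files stay untouched while the user sees a uniform day-and-bounding-box interface.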
• The web-program consists of a stable core and adaptive input/output layers
• The core maintains the state and executes the data selection, access and render services
• The adaptive, abstract I/O layers connect the core to evolving web data, flexible displays and a configurable user interface:
– Wrappers encapsulate the heterogeneous external data sources and homogenize the access
– Device Drivers translate generic, abstract graphic objects to specific devices and formats
– Ports connect the internal parameters of the program to external controls
– WSDL web service description documents
[Figure: web-program architecture. Labels: Data Sources, Controls, Displays; I/O Layer: Wrappers, Device Drivers, Ports; Core: App State, Data, Flow Interpreter; Web Services, WSDL]
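The split between a stable core and replaceable I/O layers might look roughly like the sketch below. It is an assumption-laden toy: the `Core` class, the `date_picker` control, the sample table and the text "device" are invented here to show the roles of wrappers, device drivers and ports, not how DataFed implements them.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


class Core:
    """Stable core: keeps the state and runs access followed by render."""

    def __init__(
        self,
        wrapper: Callable[[State], List[float]],
        driver: Callable[[List[float], State], str],
    ):
        self.state: State = {}
        self._ports: Dict[str, str] = {}
        self._wrapper = wrapper  # input layer: hides the data source
        self._driver = driver    # output layer: hides the display device

    def add_port(self, control: str, parameter: str, default: Any) -> None:
        # A port binds an external control to an internal parameter.
        self._ports[control] = parameter
        self.state[parameter] = default

    def set_control(self, control: str, value: Any) -> None:
        self.state[self._ports[control]] = value

    def run(self) -> str:
        data = self._wrapper(self.state)
        return self._driver(data, self.state)


def table_wrapper(state: State) -> List[float]:
    # Hypothetical source: an in-memory table keyed by date.
    table = {"2005-01-30": [1.0, 2.0], "2005-01-31": [3.0, 5.0]}
    return table.get(state["date"], [])


def text_driver(data: List[float], state: State) -> str:
    # Hypothetical device: plain text instead of an image or chart.
    return f"{state['date']}: " + ", ".join(f"{v:.1f}" for v in data)


core = Core(table_wrapper, text_driver)
core.add_port("date_picker", "date", "2005-01-30")
first = core.run()
core.set_control("date_picker", "2005-01-31")
second = core.run()
```

Replacing `table_wrapper` or `text_driver` changes the data source or the display without any change to `Core`.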
Datasets Used in FASTNET
• Data are accessed from autonomous, distributed providers
• DataFed ‘wrappers’ provide uniform geo-time referencing
• Tools allow space/time overlay, comparisons and fusion
Near Real Time Data Integration
Delayed Data Integration
A Sample of Datasets Accessible through ESIP Mediation
Near Real Time (~ day)
It has been demonstrated (project FASTNET) that these and other datasets can be accessed, repackaged and delivered by AIRNow through ‘Consoles’
MODIS Reflectance
MODIS AOT
TOMS Index
GOES AOT
GOES 1km Reflectance
NEXRAD Radar
MODIS Fire Pix
NRL MODEL
NWS Surf Wind, Bext
FASTNET:
Inter-RPO pilot project, through NESCAUM, 2004
Web-based data, tools for community use
Built on DataFed infrastructure, NSF, NASA
Project fate depends on sponsor, user evaluation
Some of the Tools Used in FASTNET
– Data Catalog
– Data Browser
– PlumeSim, Animator
– Combined Aerosol Trajectory Tool (CATT)
Consoles: Data from diverse sources are displayed to create a rich context for exploration and analysis
CATT: Combined Aerosol Trajectory Tool for browsing back trajectories for specified chemical conditions
Viewer: General purpose spatio-temporal data browser and view editor applicable for all DataFed datasets
Midwest HazeCam Image Console
Image Archive and Browser
• Hourly Midwest HazeCam images are archived by the DataFed data access system
• Archived images for all cameras can be browsed through this console
• HazeCam URL for a day: http://www.datafed.net/consoles/MWH_WebCams.asp?image_width=400&image_height=300&datetime=2005-01-31T13:00:00
• URL for a site and day: http://webapps.datafed.net/datasets/webcam/cincinnati/20050131-13mwhcincinnati.jpg
• URLs can be embedded as links into emails, bookmarks, web pages, PPT and PDF files.
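Since the console and image addresses are plain URLs, they can be generated programmatically for any hour. The sketch below rebuilds the two example URLs shown above. The console parameters are taken from the example; the file-name pattern for site images is inferred from the single Cincinnati example and should be treated as an assumption, as are the function names.

```python
from datetime import datetime
from urllib.parse import urlencode

CONSOLE_BASE = "http://www.datafed.net/consoles/MWH_WebCams.asp"
IMAGE_BASE = "http://webapps.datafed.net/datasets/webcam"


def console_url(when: datetime, width: int = 400, height: int = 300) -> str:
    """URL of the HazeCam console for a given date and hour."""
    query = urlencode(
        {
            "image_width": width,
            "image_height": height,
            "datetime": when.isoformat(),
        },
        safe=":",  # keep the colons of the ISO timestamp unescaped
    )
    return f"{CONSOLE_BASE}?{query}"


def site_image_url(site: str, when: datetime) -> str:
    """URL of one archived camera image.

    The pattern <site>/<YYYYMMDD>-<HH>mwh<site>.jpg is an assumption
    inferred from one example URL.
    """
    return f"{IMAGE_BASE}/{site}/{when:%Y%m%d}-{when:%H}mwh{site}.jpg"


when = datetime(2005, 1, 31, 13, 0, 0)
console = console_url(when)
image = site_image_url("cincinnati", when)
```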
Midwest HazeCam Image Browser
[Figure: MW HazeCam Console. Callouts: select date and time; set image size and time; links to other FASTNET consoles]
Aerosol Event Catalog: Web pages
• Catalog of generic ‘web objects’ – pages, images, animations that relate to aerosol events
• Each ‘web object’ is cataloged by location, time and aerosol type.
Distribution of Responsibility
Distributed Responsibility in OPeNDAP
• The data lie with the data providers
• The data access protocol lies with OPeNDAP
• Application programs lie with the developers (Matlab, Excel, ...)
• Data discovery lies with the GCMD and NVODS

• The data lie with the data providers
• The wrappers and mediators lie with the DataFed community
• Application programs lie with the end user
• Data discovery lies with data and service registries