NOAA Data Management Activities Deirdre Jones, EDMC Chair Jeff de La Beaujardière, DM Architect Prepared for DAARWG 2011-11-15 1

Jan 11, 2016


Laurence Shaw
Page 1

NOAA Data Management Activities

Deirdre Jones, EDMC Chair
Jeff de La Beaujardière, DM Architect

Prepared for DAARWG 2011-11-15

Page 2

Outline

• Motivation
• Recent EDMC Accomplishments
• EDMC FY2012 Plans
• DM Framework in NEO Strategy
• Data catalog approaches

Page 3

Motivation

• NOAA Strategic Plan calls for:
  – Improved data interoperability and usability through application and use of common data management standards
  – Enhanced access and use of environmental data through data storage and access solutions, integration of systems, and long-term stewardship
  – Increased volume and diversity of data and information effectively integrated into models

Page 4

New EDMC Procedural Directives

• Data Management Planning
  – Directs managers of all projects and systems that produce data to write DM Plans
• Data Documentation
  – Directs NOAA programs to provide data documentation (metadata)
• Data Sharing by NOAA Grantees
  – Directs NOAA grantees to make their data publicly available

All 3 are agenda topics for tomorrow

Page 5

EDMC Plans for FY2012 (1/2)

• Implement approved procedural directives
  – EDMC developing detailed work plan
  – Further discussion tomorrow
• Begin to develop additional Procedural Directives
  – Data Access and Discovery
    • Goal: Enable users to find and retrieve NOAA data
    • Goal: Automate publication of NOAA data to data.gov and GEOSS
  – Data Citation
    • Goal: Enable datasets to be referenced by unique identifier to provide credit, enable usage metrics, and distinguish duplicates
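The Data Citation goal (credit, usage metrics, and duplicate detection via a unique identifier) can be illustrated with a minimal Python sketch. It assumes citations are recorded as (citing paper, dataset identifier) pairs; the identifier strings, record fields, and function names are hypothetical and do not come from any NOAA system.

```python
from collections import Counter

def usage_by_identifier(citations):
    """Count how many times each dataset identifier is cited."""
    return Counter(dataset_id for _paper, dataset_id in citations)

def distinct_datasets(records):
    """Collapse records that share an identifier into a single entry."""
    seen = {}
    for rec in records:
        # Keep the first record seen for each identifier.
        seen.setdefault(rec["id"], rec)
    return list(seen.values())

# Hypothetical citation records: (citing paper, dataset identifier).
citations = [
    ("paper-A", "dataset:0001"),
    ("paper-B", "dataset:0001"),
    ("paper-C", "dataset:0002"),
]
counts = usage_by_identifier(citations)

# Two copies of the same dataset held at different (made-up) locations.
copies = [
    {"id": "dataset:0001", "location": "archive-1"},
    {"id": "dataset:0001", "location": "mirror-2"},
    {"id": "dataset:0002", "location": "archive-1"},
]
unique = distinct_datasets(copies)
```

Because both functions key on the identifier alone, the same dataset is counted once no matter how many copies or access points exist.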

Page 6

EDMC Plans for FY2012 (2/2)

• Hold 3rd annual NOAA-wide EDM Conference
  – To engage stakeholders
• Host OGC Workshop
  – Coordination on data access standards
• Support DAARWG Meetings (twice annually)
  – To receive guidance from advisory board
• Support development of Archive Concept of Operations
  – Called for in CLASS External Review
  – Briefing after lunch today

Page 7

Data Management Framework
from National Earth Observations (NEO) Strategy, ch. 4 (inter-agency draft)

Jeff de La Beaujardière, PhD
NOAA DM Architect

Page 8

Data Management Framework

[Figure: Data Management Framework diagram. Labels: Principles, Governance, Architecture, Standards, Assessment, Data Lifecycle]

Principles
• Full and Open Access
• Preservation
• Information Quality
• Ease of Use

from National Earth Observations (NEO) Strategy - Data Management Chapter (in preparation 2011)

Page 9

Data Lifecycle

[Figure: Data Lifecycle diagram. Labels: Planning and Production Activities, Data Management Activities, Usage Activities]

from National Earth Observations (NEO) Strategy - Data Management Chapter (in preparation 2011)

Page 10

Data Lifecycle

[Figure: detailed Data Lifecycle diagram. Phase labels: Planning and Production Activities, Data Management Activities, Usage Activities. Activity labels: Requirements Definition, Planning, Development, Deployment, Operations, Collection, Processing, Quality Control, Documentation, Cataloging, Dissemination, Preservation, Stewardship, Usage Tracking, Final Disposition, Discovery, Reception, Analysis, Product Generation, User Feedback, Citation, Tagging, Gap Assessment]

from NEO Strategy - DM Chapter (in prep. 2011)

Page 11

Applicability of EDMC Directives

[Figure: the detailed Data Lifecycle diagram from Page 10, overlaid with directive labels: DM Planning, Data Documentation, Data Sharing, What-to-Archive, Cataloging, Data Citation, Data Services]

Page 12

Some of the possible feedback loops in the Data Lifecycle

[Figure: the detailed Data Lifecycle diagram from Page 10, with feedback loops drawn between activities]

Page 13

(proposed) NOAA Data Catalog Approach

Jeff de La Beaujardière, PhD
NOAA DM Architect

Page 14

Catalog Goals

• Users can find NOAA data for desired phenomenon, location and time
  – Without knowing Office/Program structure
  – Single starting point to find the data that is accessible via web services and well documented
• Data providers can register their services once, in a community catalog
  – And have their data be visible in a master catalog
• NOAA leadership can see improvements in NOAA data discovery & access

Page 15

Some Existing Community-Specific Catalogs

[Figure: existing catalogs sitting above Data and Services. Labels: IOOS Catalog, UAF Catalog, NGDC Geoportal, NODC Geoportal, NCDC Geoportal, CWIC CLASS Catalog, GeoPlatform (ArcGIS.com Portal)]

Page 16

Conceptual NOAA Distributed Catalog Architecture

[Figure: layered architecture diagram. Labels: Data; Services; Community Catalogs (NCDC, NODC, IOOS, UAF, NGDC, others...); federated search (or scheduled harvest); NOAA Master Catalog; API; UI; NOAA Web Site; data.gov; GEOSS; Analysis Tools; others...; Users & Clients]
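The "scheduled harvest" option in this architecture, combined with the catalog goal of finding data by phenomenon, location and time, can be sketched as follows. This is an illustrative Python sketch only: the record fields, catalog names, and function names are assumptions made for the example, not part of any actual NOAA catalog interface.

```python
from datetime import date

def harvest(community_catalogs):
    """Copy records from every community catalog into one master index."""
    master = {}
    for catalog_name, records in community_catalogs.items():
        for rec in records:
            # Remember which community catalog each record came from.
            master[rec["id"]] = {**rec, "source": catalog_name}
    return master

def search(master, phenomenon, lon, lat, when):
    """Find records matching a phenomenon, a point location, and a date."""
    hits = []
    for rec in master.values():
        west, south, east, north = rec["bbox"]
        if (rec["phenomenon"] == phenomenon
                and west <= lon <= east
                and south <= lat <= north
                and rec["start"] <= when <= rec["end"]):
            hits.append(rec)
    return hits

# Made-up community catalogs and records, for illustration only.
community_catalogs = {
    "community-A": [
        {"id": "a1", "phenomenon": "sea_surface_temperature",
         "bbox": (-180.0, -90.0, 180.0, 90.0),
         "start": date(2000, 1, 1), "end": date(2010, 12, 31)},
    ],
    "community-B": [
        {"id": "b1", "phenomenon": "wind_speed",
         "bbox": (-100.0, 20.0, -60.0, 50.0),
         "start": date(2005, 1, 1), "end": date(2011, 11, 15)},
    ],
}

master = harvest(community_catalogs)
hits = search(master, "wind_speed", lon=-75.0, lat=35.0, when=date(2010, 6, 1))
```

The user queries only the master index and never needs to know which community catalog, and hence which office or program, holds the record. A federated search would instead forward the query to each community catalog at request time.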

Page 17: NOAA Data Management Activities Deirdre Jones, EDMC Chair Jeff de La Beaujardière, DM Architect Prepared for DAARWG 2011-11-15 1.

(possiblycolocated)Archive

ConOps

Data Management Overview Graphic:Connections and Information Flow

17

DMPlan* Data

Documentation*(Metadata)

ArchiveDecision*

Data AccessService

OAISReference

Model

DataInventory

MetricsDashboard

CatalogService

ID

Tools

Result• paper• decision• policy• responseID

createwrite

assess

preserve

guide

add

publish*

understand

get find

register

compile measure

analyze

useDataProducer

DataUser

cite

*topic of current EDMC Directive

publish

Archive

[OV-2](Note: Not all

activities illustrated)Requirements Gap

Assessment assessguide

guide

NOAA Leadership

assess

Page 18

BACKUP SLIDES

Page 19

DM Principles from NEO Strategy

Principles
• Full and Open Access: Earth observations should be made fully and openly available to all users promptly, in a non-discriminatory manner, and free of charge.
• Preservation: Earth observations should be managed as an asset and preserved for future use.
• Information Quality: Earth observations should be of known quality and fully documented.
• Ease of Use: Earth observations should be easily discoverable and accessible online using interoperable services and standardized formats that encourage the broadest possible use.

from National Earth Observations (NEO) Strategy - Data Management Chapter (in preparation 2011)

Page 20

Procedural Directive: Data Management Planning (DMP)

• Summary
  – Directs managers of all projects and systems that produce data to write DM Plans
• Provides guidance on content of DM Plans, including:
  – General description of the data
  – Data documentation and standards
  – Data access methods
  – Initial data storage and long-term preservation
  – Provides a DMP template and FAQs
• Feedback
  – Hundreds of comments through briefings, workshops, and meetings shaped principles, concepts and final text.
  – 117 comments received during official 30-day comment period
• EDMC approval was unanimous
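The content areas listed for DM Plans lend themselves to a simple completeness check. The Python sketch below assumes a plan is held as a dictionary of free-text sections; the section keys and the function name are invented for illustration and are not taken from the actual DMP template.

```python
# Hypothetical keys standing in for the content areas named on the slide.
REQUIRED_SECTIONS = (
    "general_description",
    "documentation_and_standards",
    "access_methods",
    "storage_and_preservation",
)

def missing_sections(plan):
    """Return the required sections that are absent or left blank."""
    return [name for name in REQUIRED_SECTIONS
            if not plan.get(name, "").strip()]

# A made-up draft plan with two sections still unwritten.
draft_plan = {
    "general_description": "Hourly water level observations.",
    "documentation_and_standards": "   ",
    "access_methods": "Web service and bulk download.",
}
gaps = missing_sections(draft_plan)
```

A check like this only confirms that each section has some text; judging whether the text is adequate remains a matter for human review.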

Page 21

Procedural Directive: Data Documentation

• Summary:
  – Directs NOAA programs to provide data documentation (metadata)
  – Requires use of ISO 19115/19139
• Provides guidance on metadata content, including:
  – Metadata for Discovery
  – Metadata for Use
  – Metadata and Documentation for Understanding
  – Documentation of Collections
  – Documentation of Datasets
  – Documentation of Services
• Highlights metadata resources, tools and challenges
• EDMC approval was unanimous
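To make "Metadata for Discovery" concrete, the Python sketch below reads a title and abstract from an ISO 19139 (the XML encoding of ISO 19115) record using only the standard library. The sample record is a hand-written fragment with invented content; it omits many mandatory elements and would not validate against the schema. The function name is also an assumption for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by ISO 19139 XML.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Illustrative fragment only; not a complete or valid metadata record.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:title>
            <gco:CharacterString>Example dataset title</gco:CharacterString>
          </gmd:title>
        </gmd:CI_Citation>
      </gmd:citation>
      <gmd:abstract>
        <gco:CharacterString>Example abstract.</gco:CharacterString>
      </gmd:abstract>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>"""

def discovery_fields(xml_text):
    """Extract the title and abstract; a field is None if not present."""
    root = ET.fromstring(xml_text)
    base = "gmd:identificationInfo/gmd:MD_DataIdentification/"
    title = root.findtext(
        base + "gmd:citation/gmd:CI_Citation/gmd:title/gco:CharacterString",
        namespaces=NS)
    abstract = root.findtext(
        base + "gmd:abstract/gco:CharacterString", namespaces=NS)
    return {"title": title, "abstract": abstract}

fields = discovery_fields(SAMPLE)
```

A catalog would typically index fields like these, along with spatial and temporal extent, so that users can search without opening each record.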

Page 22

Procedural Directive: Data Sharing by NOAA Grantees

• Summary
  – Directs NOAA grantees to make their environmental data publicly available
  – Requires data sharing plan to be provided with new proposals and published at award
  – Data must be shared in a "timely" fashion but no later than two years after collection
  – Exceptions or extensions granted for legal reasons or on a case-by-case basis upon request
• Provides guidance on data sharing plans
  – Includes metadata
  – FAQs and template
• Feedback
  – EDMC approval
  – Feedback from Cooperative Institutes and Sea Grant Program
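The two-year limit is simple date arithmetic, sketched below in Python. The function names, the `extension_granted` flag, and the handling of a 29 February collection date are assumptions made for this example; the directive itself defines how exceptions and extensions are actually decided.

```python
from datetime import date

def sharing_deadline(collected):
    """Latest sharing date: two years after the collection date."""
    try:
        return collected.replace(year=collected.year + 2)
    except ValueError:
        # 29 February has no counterpart two years later;
        # this sketch falls back to 28 February (an assumption).
        return collected.replace(year=collected.year + 2, day=28)

def is_overdue(collected, today, extension_granted=False):
    """True if data are unshared past the deadline with no extension."""
    if extension_granted:
        return False
    return today > sharing_deadline(collected)

deadline = sharing_deadline(date(2011, 11, 15))
late = is_overdue(date(2011, 11, 15), today=date(2013, 11, 16))
```

Note that "timely" sharing may call for release well before this outer limit; the check above covers only the hard deadline.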

Page 23

Good Data Management supports NOAA Leadership Priorities

[Figure: flow diagram. Labels: NOAA Data; Good Documentation; Standardized Services; Data Inventory; Metrics Dashboard; Data Catalog; Ability to find, access, understand NOAA data; Visibility in data.gov and GEOSS; selected NOAA Leadership Priorities for NOAA data. Connector labels: +, enable, enables]

Page 24

Tagging Concept

Tags are not inserted into metadata records by data providers. Instead, the Catalog adds tags to indicate datasets relevant to a particular purpose. Datasets with a relevant tag are recorded by external catalogs.

[Figure: NOAA Master Catalog holding metadata records and a Tag Database. Tag labels: DWH, data.gov, GEOSS CORE, GEOSS StP, Purpose E, Purpose F. External Catalogs or Portals: DWH Response, data.gov, GEOSS Data CORE, other portal]
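The tagging concept can be sketched in Python as a tag table kept beside the metadata records rather than inside them. The rule predicates, record fields, and function names below are hypothetical; the slide does not say how the Catalog decides which tags apply.

```python
def tag_records(records, rules):
    """Build {dataset id: set of tags} without modifying the records."""
    tags = {}
    for rec in records:
        tags[rec["id"]] = {
            tag for tag, applies in rules.items() if applies(rec)
        }
    return tags

def records_for(tag, records, tags):
    """Select the records an external catalog would pick up for one tag."""
    return [rec for rec in records if tag in tags[rec["id"]]]

# Made-up metadata records as supplied by data providers (no tags inside).
records = [
    {"id": "d1", "public": True, "global_coverage": True},
    {"id": "d2", "public": True, "global_coverage": False},
    {"id": "d3", "public": False, "global_coverage": False},
]

# Hypothetical catalog-side rules, one per purpose.
rules = {
    "data.gov": lambda rec: rec["public"],
    "GEOSS": lambda rec: rec["public"] and rec["global_coverage"],
}

tags = tag_records(records, rules)
for_geoss = records_for("GEOSS", records, tags)
```

Keeping tags outside the records means a new purpose can be added by writing one new rule, without asking every data provider to edit and republish metadata.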

Page 25

Potential Relationship of GeoPlatform to NOAA Master Catalog

[Figure: four alternative arrangements, each shown as a small diagram]

A) No relation
   Labels: Master Catalog; Community Catalogs (Cat. 1, Cat. 2); GeoPlatform (Map Svcs Only); Map Services (WMS 1, WMS 2)
B) GeoPlatform is Master Catalog
   Labels: Community Catalogs (Cat. 1, Cat. 2, Cat. N); GeoPlatform (Map & Data Svcs)
C) GeoPlatform feeds Master Catalog
   Labels: Community Catalogs (Cat. 1, Cat. 2); Master Catalog (Map & Data Svcs); GeoPlatform (Map Svcs Only)
D) Master Catalog feeds GeoPlatform
   Labels: Community Catalogs (Cat. 1, Cat. 2, Cat. N); Master Catalog (Map & Data Svcs); GeoPlatform (Map Svcs Only)

Page 26

GeoPlatform and Master Catalog working together

[Figure: architecture diagram. Labels: data; service; Catalog 1, Cat. 2, Catalog 3; NOAA Master Catalog (Geoportal or t.b.d.); GeoPlatform (ArcGIS.com Portal); Web-based Map Viewer; UI; CS/W; Other API; data.gov; GEOSS; GCMD; other catalog; WAF; List of WMS; List of manual registrations; ArcGIS server; Shapefile; KML; Manual registration]

Some datasets might be registered directly in GeoPlatform

Page 27

UAF Distributed Catalog Architecture

[Figure: layered architecture diagram. Project Data & Services: gridded data (three sources), each served via DAP. Project Catalogs: THREDDS Catalog (three). Community Catalog: Unified Access Framework (UAF) Catalog. API. Analysis Tools: Matlab, IDV, ArcGIS, ERDDAP]

Page 28

Use Google instead of a Dedicated Catalog?

[Figure: architecture diagram with question marks on several links. Labels: Project Data & Services (data, service); agreed convention to identify geodata servers (e.g., /geodata.xml); Google & other search engine crawlers; Community Catalogs; NOAA Web Site; data.gov; GEOSS; Users & Clients]
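The "agreed convention" idea resembles other well-known-location schemes: a crawler that knows a server's address can derive where to look for a geodata description. The Python sketch below shows only that URL derivation. The function name and the example host are hypothetical, and the slide does not define what the /geodata.xml file would contain, so no parsing is attempted.

```python
from urllib.parse import urljoin

def geodata_url(server_url, well_known_path="/geodata.xml"):
    """Derive the conventional geodata location at the root of a server."""
    # A path starting with "/" replaces the whole path of the base URL.
    return urljoin(server_url, well_known_path)

# Hypothetical data server page found by a crawler.
found = "https://data.example.gov/thredds/catalog.html"
candidate = geodata_url(found)
```

Under such a convention, any crawler could check the same root-level location on each server it visits, without registration in a dedicated catalog.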

Page 29

Probably want both formal catalog & search engine support

[Figure: architecture diagram. Labels: Project Data & Services (data, service); Community Catalogs (Geoportal Server, GeoNetwork, WAF, THREDDS Catalog); NOAA Master Catalog (machine API, spatial & temporal queries, controlled vocabularies); API; external catalogs; UI; NOAA Web Site; Google (free-text search); simple search; general users; Users & Clients]
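The difference between free-text search and a catalog query against a controlled vocabulary can be shown with a small Python sketch. The vocabulary entries, record fields, and function names are invented for this example and do not reflect any particular NOAA vocabulary.

```python
# Hypothetical controlled vocabulary: user term -> preferred term.
VOCABULARY = {
    "sst": "sea_surface_temperature",
    "sea surface temp": "sea_surface_temperature",
    "sea_surface_temperature": "sea_surface_temperature",
}

def free_text_match(query, record):
    """Search-engine style: the query must appear literally in the title."""
    return query.lower() in record["title"].lower()

def vocabulary_match(query, record):
    """Catalog style: map the query to a preferred term, then compare."""
    preferred = VOCABULARY.get(query.strip().lower())
    return preferred is not None and preferred == record["phenomenon"]

# Made-up catalog record.
record = {
    "title": "Daily sea surface temperature analysis",
    "phenomenon": "sea_surface_temperature",
}

by_text = free_text_match("SST", record)
by_vocabulary = vocabulary_match("SST", record)
```

Here the abbreviation never appears in the title, so only the vocabulary lookup finds the record; conversely, free-text search handles queries the vocabulary does not cover, which is one reason to support both.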