Page 1
NOAA Data Management Policies & Implementation
GEO Plenary Side Event: Best Practices in Data Management Policy & Implementation
2017-10-23
Jeff de La Beaujardière, PhD
NOAA Data Management Architect
National Oceanic and Atmospheric Administration
Jeff.deLaBeaujardiere@noaa.gov
Page 2
NOAA has "Big Data" (Volume, Variety, Velocity, ...)
• Satellites
• Weather radars
• Ocean bathymetry
• Buoy networks
• Tide gauges
• Ships
• Aircraft
• Autonomous vehicles
• Human observers
• Numerical models
These data are unique, valuable, irreplaceable, and collected at public expense
2017-10-23  Jeff.deLaBeaujardiere@noaa.gov
Page 3
Vision for NOAA Data Management
All NOAA environmental data shall be Discoverable, Accessible, Usable, and Preserved for all types of users and applications.
Page 4
NOAA Administrative Order 212-15: Management of Environmental Data (2010)
NOAA Environmental Data Management Framework (2012-2013)
NOAA Data Policies (https://nosc.noaa.gov/EDMC/)
• Data Sharing Directive for NOAA Grantees (2012; rev. 2016)
• Data Documentation Directive (2011; rev. 2016)
• Data Management Planning Directive (2011; rev. 2015)
• Archive Appraisal Procedure (2008)
• Data Access Directive (2015)
• Data Citation Directive (2015)
Page 5
NOAA Data Catalog (est. 2013)
Collaborators: Chris MacDermaid, NOAA Catalog WG
NOAA OneStop for NCEI data (Ken Casey, OneStop team)
data.noaa.gov
Page 6
Data Access Support
Bob Simons, Roy Mendelssohn (NMFS)
Kevin O'Brien, Eugene Burger (OAR)
• ERDDAP
– Hosts & serves gridded & tabular data
• Unified Access Framework (UAF)
– Serves netCDF data on THREDDS & ERDDAP servers
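To illustrate how tabular data is requested from an ERDDAP server, here is a minimal sketch that builds a tabledap URL. It follows the general ERDDAP pattern of server, dataset ID, file type, variable list, and constraints. The server address, dataset ID, and variable names below are hypothetical placeholders, not actual NOAA endpoints.

```python
from urllib.parse import quote


def tabledap_url(server, dataset_id, variables, constraints=(), file_type="csv"):
    """Build an ERDDAP tabledap request URL.

    Pattern: {server}/tabledap/{dataset_id}.{file_type}?{var1,var2,...}&{constraint}&...
    """
    base = f"{server.rstrip('/')}/tabledap/{dataset_id}.{file_type}"
    # Requested variables are comma-separated; each constraint is appended with '&'.
    query = ",".join(variables)
    for constraint in constraints:
        # Percent-encode characters such as '>' and '<'; keep '=', ':', '-', '.' readable.
        query += "&" + quote(constraint, safe="=:-.")
    return f"{base}?{query}"


url = tabledap_url(
    "https://example.org/erddap",   # hypothetical server
    "hypothetical_buoy_dataset",    # hypothetical dataset ID
    ["time", "latitude", "longitude", "sea_water_temperature"],
    ["time>=2017-10-01T00:00:00Z", "time<=2017-10-23T00:00:00Z"],
)
print(url)
```

Changing `file_type` (for example to `json` or `nc`) asks the same server for the same subset in a different format, which is the main convenience such a service offers to downstream tools.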
Page 7
Dataset Identifier Project
Collaborators: NOAA Dataset ID WG
• DOI (Digital Object Identifier)
• NCEI Archive (National Centers for Environmental Info.) landing page
• Data & Metadata
DOI benefits:
• Permanent, citable ID.
• International standard (ISO 26324).
• Recognition by publishers.
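As a small illustration of the "permanent, citable ID" benefit, the sketch below turns a DOI string into a resolver link at https://doi.org. The regular expression is a common heuristic, not the formal ISO 26324 syntax, and the example DOI is a placeholder rather than a real NOAA dataset identifier.

```python
import re
from urllib.parse import quote

# Heuristic: "10.", a 4-9 digit registrant code, "/", then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi):
    """Return the https://doi.org resolver URL for a DOI string."""
    doi = doi.strip()
    # Accept the common "doi:" prefix used in citations.
    if doi.lower().startswith("doi:"):
        doi = doi[len("doi:"):]
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a plausible DOI: {doi!r}")
    # Percent-encode unusual characters in the suffix; keep the '/' separator.
    return "https://doi.org/" + quote(doi, safe="/")


print(doi_to_url("doi:10.5072/example-dataset"))  # placeholder DOI
```

Because the link goes through the resolver, a citation stays valid even if the landing page later moves to a different address.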
Page 8
Data Management is not the goal
We don't want to just "manage" data. We want to use and reuse data, and
extract maximum value from it.
Page 9
Data to Decisions:
Distill huge & complex data to ~1 bit: plant crop? evacuate? build wind farm? go skiing?
Support non-expert data users
Users need answers, not huge datasets (... or 100s of tiny datasets)
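A toy sketch of what "distilling to ~1 bit" can look like: a gridded forecast goes in and a single yes/no answer comes out. The function name, the threshold, and the grid values are all invented for illustration and carry no operational meaning.

```python
def should_evacuate(surge_forecast_m, threshold_m=2.0, min_cells=1):
    """Reduce a 2-D grid of forecast values (metres) to one yes/no answer."""
    # Count the grid cells whose forecast meets or exceeds the threshold.
    cells_over = sum(
        1 for row in surge_forecast_m for value in row if value >= threshold_m
    )
    return cells_over >= min_cells


# Made-up 3x3 forecast grid, in metres.
grid = [
    [0.4, 0.9, 1.1],
    [0.7, 1.6, 2.3],
    [0.5, 1.2, 1.9],
]
print(should_evacuate(grid))
```

The non-expert user sees only the final boolean; the grid, the threshold, and the counting rule stay behind the function boundary.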
Page 10
Challenges
• Data Volume
• Data Complexity
Page 11
Traditional Data Services Approach - theory
• User Tools: Decision Support Tools, Scientific Software, Value-Adding Reseller, Numerical Models
• Data.gov and Other Portals
• data services layer: Data Access Services, Data Search & Discovery Services
• shared standards: Data Documentation (Metadata), Compatible Formats and Vocabularies
• Data Sources: Satellite, Radar, Buoy, Ship, Sonar, Surveys, Models, Gliders
Page 12
Traditional Data Services Approach - reality
• Data Sources, each with its own data access: Satellite, Radar, Buoy, Ship, Sonar, Surveys, ROV/UAV, Models
• Data Discovery
• User Facilities and User Hardware, each with a copy of data
• Not scalable as data volumes increase
• Security risk of every on-premises service
• Maintenance burden of on-prem infrastructure
Page 13
Notional Cloud Deployment Scenario
Diagram labels: NOAA security boundary; On-premises Computing; Operational Processing; Forecast Models; Information Products; Master copy of NOAA Data; Operational customers; One-way push; Commercial Cloud; Decision-support functions; Public users
Derived from NOAA EDM Framework (2013), figure 8
Page 14
NOAA Big Data Project (R&D)
selected datasets
www.noaa.gov/big-data-project
Page 15
Wish #1: Full Use of the Cloud
Archive
Operational Customers (e.g., NWS)
Cloud Challenges:
• Egress costs vs free data
• Uncertain/unbounded costs
• Re-architecting for performance vs fork-lifting existing apps
• IT security policy mismatch
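The first two challenges can be made concrete with back-of-the-envelope arithmetic: when the data is free to users but the provider pays per gigabyte downloaded, cost grows with demand, which the provider does not control. The price and download sizes below are hypothetical placeholders, not quotes from any cloud vendor.

```python
def monthly_egress_cost(downloads_per_month, gb_per_download, usd_per_gb):
    """Egress cost = number of downloads x size of each download x price per GB."""
    return downloads_per_month * gb_per_download * usd_per_gb


HYPOTHETICAL_USD_PER_GB = 0.10  # placeholder price, for illustration only

for downloads in (1_000, 10_000, 100_000):
    cost = monthly_egress_cost(
        downloads, gb_per_download=5, usd_per_gb=HYPOTHETICAL_USD_PER_GB
    )
    print(f"{downloads:>7} downloads/month -> ${cost:,.2f}")
```

A hundredfold rise in popularity produces a hundredfold rise in the bill, with no natural upper bound unless access is throttled or someone else pays for egress.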
Page 16
Wish #2: Composable Functions for Decision Support
• Model Outputs, Earth Observations, Ancillary Data (complicated, multi-source data)
• Decision Support Functions
• Policy & Business Decisions (non-scientist users)
Composable functions to create workflows for:
• Derived information products
• Multi-source data integration
• Location-specific analysis
• Statistics & Trends
• Novel analyses & discoveries
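One way to read "composable functions" is as small single-purpose steps that can be chained into a workflow. The sketch below assumes that reading; the step names, the record layout, and the observation values are hypothetical.

```python
from functools import reduce
from statistics import mean


def compose(*steps):
    """Chain single-argument functions left to right into one workflow."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)


def within_box(lat_min, lat_max, lon_min, lon_max):
    """Location-specific step: keep records inside a bounding box."""
    def step(records):
        return [
            r for r in records
            if lat_min <= r["lat"] <= lat_max and lon_min <= r["lon"] <= lon_max
        ]
    return step


def values_of(field):
    """Extract one field from every record."""
    return lambda records: [r[field] for r in records]


def exceeds(threshold):
    """Final step: turn a number into a yes/no answer."""
    return lambda value: value > threshold


# Made-up observations (wind speed in m/s).
observations = [
    {"lat": 38.9, "lon": -77.0, "wind_ms": 6.0},
    {"lat": 39.1, "lon": -76.8, "wind_ms": 8.0},
    {"lat": 45.0, "lon": -93.0, "wind_ms": 3.0},
]

# Workflow: subset by location -> pull wind speeds -> average -> threshold.
windy_enough = compose(
    within_box(38.0, 40.0, -78.0, -76.0),
    values_of("wind_ms"),
    mean,
    exceeds(6.5),
)
print(windy_enough(observations))
```

Each step can be swapped or reused independently, for example replacing `mean` with `max` or changing the bounding box, without rewriting the rest of the workflow. Note that `mean` raises an error on an empty selection, so a production workflow would need a step that handles that case.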
Page 17
Questions?
Jeff de La Beaujardière, PhD
Jeff.deLaBeaujardiere@noaa.gov