std-doi Publication of Climate Data at WDCC
DataCite Summer Meeting, 7./8. June 2010
Publication of climate data
Heinke Höck
World Data Center for Climate (WDCC)
© DKRZ
• Climate Data and Metadata at WDCC
• Preconditions
• Workflow
  1. Permission
  2. SQA - Scientific Quality Assurance
  3. TQA - Technical Quality Assurance
  4. Publication
• Future
Content
Climate System
• Climate model results from global and regional climate models from different climate modelling centres (CCCma, CCSR/NIES, CSIRO, GFDL, HADLEY, MPIfM, NCAR), based on IPCC emission scenarios
• Data from scientific projects: HOAPS (satellite data), CARIBIC (civil aircraft data), GOP, COPS
• Model-like observations: reanalysis data
Climate Data at WDCC
General Statistics and Structure of Data
EXPERIMENTS: 1400 (std-doi publication; collection of datasets)
DATASETS: 170 000
Formats: GRIB (WMO), NetCDF …
WDCC database size: 428 Tbyte
Metadata at WDCC (CERA2)
[Figure: CERA2 metadata model. Blocks: Entry, Reference, Status, Distribution, Contact, Coverage, Parameter, Spatial Reference, Local Adm., Data Access, Data Org]
• Climate model results
  The std-doi publication scheme has been developed
  Data publications can be found in the TIB library catalogue
  Future: implementation for IPCC-AR5
• Data from scientific projects
  Project funded by DFG together with the University of Bonn and the Bonn-Rhine-Sieg University of Applied Sciences
  Development of an automated standard procedure for the std-doi publication process
std-doi publications at WDCC
• Long-term availability of data at WDCC
• Long-term availability of metadata at WDCC
• Open access to data and metadata
Preconditions
Workflow Processes
[Figure: workflow timeline with the lanes TIB, Scientist and WDCC. Steps along the TIME axis: Permission, SQA, TQA, Publication]
SQA: Scientific Quality Assurance
TQA: Technical Quality Assurance
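The four steps of the workflow can be read as a strictly ordered sequence. The following is a minimal sketch in Python, assuming an experiment passes through the steps one at a time and in the order shown on the slide. The names `Step` and `next_step` are illustrative and not part of any WDCC software.

```python
from enum import IntEnum
from typing import Optional


class Step(IntEnum):
    """Workflow steps, numbered in the order given on the slide."""
    PERMISSION = 1
    SQA = 2          # Scientific Quality Assurance
    TQA = 3          # Technical Quality Assurance
    PUBLICATION = 4


def next_step(current: Step) -> Optional[Step]:
    """Return the step that follows `current`, or None after publication."""
    if current is Step.PUBLICATION:
        return None
    return Step(current + 1)
```

For example, `next_step(Step.SQA)` yields `Step.TQA`, and `next_step(Step.PUBLICATION)` yields `None`.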
Permission
Who is allowed to initiate the std-doi publication process?
There is no standard process.
WDCC: the investigator of the experiment
• Today: by e-mail
• Future: browser interface with a user account and the corresponding list of experiments
SQA - Scientific Quality Assurance
[Figure: SQA process diagram (lane: Scientist), drawn in Business Process Modeling Notation (www.signavio.com/en.html)]
• std-doi profile sent to TIB
  Title
  Publication Date
  Author(s)
  Description
  Datasize
  Data Format(s)
• Metadata on the compact site that the DOI/URN resolves to
  Location(s)
  Spatial and Temporal Coverage
  Contact
  List of Datasets (Topics)
SQA Review Required Metadata
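A review of the required metadata could start with a simple completeness check. The sketch below covers only the six fields of the std-doi profile listed on this slide. The class `StdDoiProfile`, the function `missing_fields` and the choice of field types are assumptions made for illustration; they do not describe the actual CERA2 or TIB schema.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class StdDoiProfile:
    """The profile fields listed on the slide (types are assumed)."""
    title: str
    publication_date: str
    authors: List[str]
    description: str
    datasize: str
    data_formats: List[str]


def missing_fields(profile: StdDoiProfile) -> List[str]:
    """Return the names of all fields that are empty, in declaration order."""
    return [f.name for f in fields(profile) if not getattr(profile, f.name)]
```

A profile would pass this part of the review only if `missing_fields(profile)` returns an empty list.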
DOI Compact
SQA Data Scientist
• Approved by author(s)
• Short description of the quality checks done by the author(s)
• Quality check protocol files produced by the author(s)
Virtual window
Experiment: 10.1594/WDCC/CLM_A1B_2_D3
1) Quality documentation, see
• 'README, Plots and Reports for CLM regional climate model runs' in CERA2:
  http://cera-www.dkrz.de/WDCC/ui/Entry.jsp?acronym=CLM_PLOTS_2008
• 'CLM Technical Report', Chapters 4 and 6:
  http://www.mad.zmaw.de/fileadmin/extern/documents/reports/MaD_TechRep3_CLM.pdf
2) Control of time series:
• creation of minimum, maximum, mean, average time series of every record
• control of time series with statistical analysis
SQA Example Short Description
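The time series control in point 2 can be illustrated with a short sketch. It assumes that a "record" is the list of values stored for one time step, and it uses a plain range check as a stand-in for the statistical analysis, which the slide does not specify. The function names `summarize_records` and `out_of_range` are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Sequence


def summarize_records(records: Sequence[Sequence[float]]) -> Dict[str, List[float]]:
    """Build minimum, maximum and mean time series, one value per record."""
    return {
        "min": [min(r) for r in records],
        "max": [max(r) for r in records],
        "mean": [mean(r) for r in records],
    }


def out_of_range(series: Sequence[float], low: float, high: float) -> List[int]:
    """Return the time-step indices whose value lies outside [low, high]."""
    return [i for i, v in enumerate(series) if not low <= v <= high]
```

The resulting series can be plotted or screened, for example with `out_of_range(summary["max"], low, high)`, to spot time steps that need a closer look.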
• Number of data sets is correct and not equal to 0
• Size of every data set is not equal to 0
• The data sets and corresponding metadata are all accessible via the internet
• The data size is controlled and correct
• The time description (metadata) and the existence of data are consistent: complete, start and stop dates consistent, continuous time steps correct
• Format is correct
• Variable description and data are consistent
TQA - Technical Quality Assurance WDCC
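Some of these checks lend themselves to automation. The sketch below covers three of them (number of data sets, size of each data set, and consistency of the time description) under the assumption that the needed values have already been read from the database. It does not cover accessibility, format or variable checks. `DatasetInfo` and `tqa_problems` are illustrative names, not WDCC tools.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class DatasetInfo:
    name: str
    size_bytes: int
    time_steps: List[date]  # dates of the stored time steps, in order


def tqa_problems(datasets: List[DatasetInfo], expected_count: int,
                 start: date, stop: date, step: timedelta) -> List[str]:
    """Return a list of human-readable problems; an empty list means no finding."""
    problems = []
    if len(datasets) == 0 or len(datasets) != expected_count:
        problems.append(f"expected {expected_count} data sets, found {len(datasets)}")
    for ds in datasets:
        if ds.size_bytes == 0:
            problems.append(f"{ds.name}: size is 0")
        if not ds.time_steps:
            problems.append(f"{ds.name}: no time steps")
            continue
        # Start and stop dates in the data must match the metadata.
        if ds.time_steps[0] != start or ds.time_steps[-1] != stop:
            problems.append(f"{ds.name}: start/stop date differs from metadata")
        # Consecutive time steps must be exactly one `step` apart.
        gaps = [b - a for a, b in zip(ds.time_steps, ds.time_steps[1:])]
        if any(g != step for g in gaps):
            problems.append(f"{ds.name}: time steps are not continuous")
    return problems
```

Returning a list of findings instead of a single pass/fail flag keeps every problem visible in one run.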
Persistent identifiers (DOI/URN) need persistent objects (DATA)
Data
• no change is possible after std-doi publication
Metadata of distribution is fixed
• Datasize
• Data Format(s)
Metadata of citation is fixed
• Author(s), Title and Publication Date
Fixing Data and Metadata
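One way to picture "fixed" data and metadata is an immutable record plus a checksum taken at publication time. This is only a sketch: the slide does not say how WDCC enforces immutability, and the SHA-256 checksum, the class `PublishedRecord` and the function `is_unchanged` are assumptions introduced here.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PublishedRecord:
    """Citation and distribution metadata that may not change after publication."""
    authors: Tuple[str, ...]
    title: str
    publication_date: str
    datasize_bytes: int
    data_formats: Tuple[str, ...]
    sha256: str  # checksum of the data at publication time (assumption)


def checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of the data."""
    return hashlib.sha256(data).hexdigest()


def is_unchanged(record: PublishedRecord, data: bytes) -> bool:
    """Check that the data still match the size and checksum fixed at publication."""
    return len(data) == record.datasize_bytes and checksum(data) == record.sha256
```

Because the dataclass is frozen, an attempt to assign to a field such as `record.title` raises an error instead of silently changing the citation.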
Publication
[Figure: publication step. Actors: WDCC (Publication Agent) and TIB (Registration Agency). Labels: creation of STD-DOI metadata, creation of DOI/URN, integration, TIBORDER, DOI-Resolver, DOI-to-URL link, metadata and data access via internet, inform scientist]
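The DOI-to-URL link on this slide means that a resolver redirects a DOI to the URL registered for it. As a small sketch, the function below builds the resolver address for a DOI. It assumes the public resolver at https://doi.org/ and makes no network request; the function name `doi_to_resolver_url` is hypothetical.

```python
from urllib.parse import quote


def doi_to_resolver_url(doi: str, resolver: str = "https://doi.org/") -> str:
    """Build the resolver URL for a DOI; the resolver redirects to the registered URL."""
    # Keep "/" unescaped, since it separates the DOI prefix from the suffix.
    return resolver + quote(doi, safe="/")
```

For the experiment DOI quoted earlier in these slides, `doi_to_resolver_url("10.1594/WDCC/CLM_A1B_2_D3")` gives `https://doi.org/10.1594/WDCC/CLM_A1B_2_D3`.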
• SQA and TQA toolbox (examples) for standardization of quality control
• Two browser interfaces for workflow processes
  1. Scientist (virtual windows, deployment tests)
  2. Publication agent
Future
Thank you for your attention!
http://www.dkrz.de
http://www.wdc-climate.de
http://umwelt.wikidora.com