1 STFC testbed

1 STFC testbed. Testbed Aims Demonstrate complete solutions at different cost levels Produce an Analysis Methodology Produce Modelling Technique Produce.

Mar 27, 2015


1

STFC testbed


Testbed Aims

• Demonstrate complete solutions at different cost levels
• Produce an Analysis Methodology
• Produce Modelling Technique
• Produce preservation plans and a record of the decision making process which facilitate audit and review
• Produce exemplars and training materials which promote the adoption of tools
• Highlight organisational issues at STFC

2


Structured Management of Preservation Processes

3


Preservation Analysis Workflow

4


CASPAR Questionnaire

The CASPAR questionnaire contains key questions which allow you to carry out a preliminary investigation into an archive's data holdings. The questionnaire is strongly guided by OAIS and the CASPAR architecture. It lays out 13 key questions which allow you to:

• Understand the information extracted by users from data
• Identify Preservation Description and Representation information
• Develop a clearer understanding of the data and what is necessary for its effective re-use
• Understand relationships between the data files and what constitutes a digital object within the archive

While it is appreciated that this questionnaire is not an exhaustive list of the questions one may need to ask about a preservation target, it still provides sufficient information to commence the analysis process.

5


Stakeholder Analysis

After carrying out the questionnaire process for each archive it became necessary to carry out a stakeholder analysis for these archives. This is due to:

• Stakeholders having differing views of the knowledge a data set was capable of providing to an end user

• Stakeholders identifying different end users who possess varying skill sets and knowledge bases

• Stakeholders producing or being custodians of different information vital for re-use of data

6


Archive Evolution and Management

In addition to familiarizing oneself with the stakeholders from the different categories, it was beneficial to understand how an archive has evolved and been managed. This can be used to illuminate the different uses of data over time and the production of associated representation information vital for that type of use.

7

The diagram below is a graphical representation of the awareness the different stakeholders have of data use by scientists and their relationships to each other.


A tale of two archives

MST Data Archive

Ionosonde data


Factors which influenced the use and re-use of data over time

• Birth and development of a science
• Events which influence data use, such as the Second World War or global warming
• Development of countries' technologies and the emergence of global networks
• Publication of journals, technical manuals, interpretative handbooks, conference proceedings, minutes of user group meetings, software etc.
• Emergence of branches of science and associated organisations
• Stewardship of data and the influence of different custodians

This is not an exhaustive list, as many factors influencing data re-use are domain specific, as is the categorization of the stakeholders. The generic principle of carrying out stakeholder characterization and the identification of factors will be domain independent.

9


The Designated User Community

The definition of the skill set is vital as it determines the limit to the amount of information which must be contained within an AIP in order to satisfy a preservation objective. In order to do this the definition of the designated community must be

• Clear, with sufficient detail to permit meaningful decisions to be made regarding information requirements for effective re-use of the data.

• Realistic and stable, in so far as there is reasonable confidence in the persistence of the knowledge base and skill set.

While the need to define the designated user community is universal, the nature of a knowledge and skill set will tend to be domain specific. The following are typical examples from atmospheric science

• Ability of a community to successfully operate software, i.e. knowledge of the correct syntax to input commands into a UNIX command line.
• Ability to utilise correct analysis techniques with data to remove background noise or identify specific phenomena
• Comprehension of community vocabularies
• Appreciation of different scientific techniques employed during the production of data, their limitations and comparative success rates for picking up desired phenomena.
• Knowledge of atmospheric events or processes which may be affecting the atmospheric state being measured within a data set.

It is the appraisal of this knowledge and skills base as a permanent attribute of the designated user community which will determine whether it is necessary to preserve this information by including it within an AIP (Archival Information Package).
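One way to picture the appraisal described above is as a set difference: whatever knowledge a preservation objective requires, minus what the designated community can be assumed to retain, is what has to go into the AIP. The following is a minimal sketch only; the topic labels and the function name `information_to_include` are hypothetical and are not part of the CASPAR tooling.

```python
def information_to_include(required, community_knowledge):
    """Return the knowledge items that must be packaged in the AIP.

    required: topics needed to meet the preservation objective
    community_knowledge: topics the designated community is assumed to
        hold as a stable, persistent part of its knowledge base
    """
    return sorted(set(required) - set(community_knowledge))


# Hypothetical topic labels, loosely based on the examples above.
required = {
    "unix command line syntax",
    "noise removal techniques",
    "community vocabularies",
    "measurement technique limitations",
}
community_knowledge = {"unix command line syntax", "community vocabularies"}

gap = information_to_include(required, community_knowledge)
print(gap)
```

If the community definition changes at a later review, re-running the same comparison shows which items would need to be added to the AIP.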

10


Defining a preservation objective

The analysis carried out before this point may present you with a natural, easily defined preservation objective; alternatively, there may be a greater number of options which overlap and are more difficult to define. It is important to note that this type of analysis cannot advise you as to which preservation option to choose but merely clarifies the options available to you.

Preservation objectives should be

• Specific: well defined and clear to anyone with a basic knowledge of the domain
• Actionable: the objective should be currently achievable. It is important to note that the information ultimately to be extracted by a user should be established, and should not be an attempt to “predict the future”
• Measurable: it is critical to know when the objective has been attained in order to assess if any preservation strategy developed is adequate.

11


12

Create Preservation Information Flow


Preservation Plan

A preservation plan consists of a unique

• Set of information objects
• Set of supply relationships
• Set of preservation strategies

which allow you to carry out a series of clear actions in order to create an AIP. This allows you to take a number of plans to the cost/benefit stage.
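The three sets that make up a plan can be sketched as a small data structure. This is an illustration under assumptions, not the CASPAR data model: the class names, the reading of a "supply relationship" as a supplier/object pair, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SupplyRelationship:
    """Assumed meaning: a named supplier provides one information object."""
    supplier: str
    information_object: str


@dataclass
class PreservationPlan:
    name: str
    information_objects: set = field(default_factory=set)
    supply_relationships: set = field(default_factory=set)
    preservation_strategies: set = field(default_factory=set)

    def unsupplied_objects(self):
        """Information objects that no supply relationship provides yet."""
        supplied = {r.information_object for r in self.supply_relationships}
        return sorted(self.information_objects - supplied)


# Hypothetical example values.
plan = PreservationPlan(
    name="example plan",
    information_objects={"data files", "parameter dictionary"},
    supply_relationships={SupplyRelationship("archive", "data files")},
    preservation_strategies={"add representation information"},
)
print(plan.unsupplied_objects())
```

Keeping each plan as a separate object makes it straightforward to carry several candidate plans forward to the cost/benefit stage.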

13


Ionosonde Simple Scenario

A user from a future designated community should be able to extract the following fourteen standard Ionospheric parameters from the data for a given station and time. They should also be able to understand what these parameters represent: fmin, foE, h_E, foEs, h_Es, type of Es, fbEs, foF1, M(3000)F1, h_F, h_F2, foF2, fx, M(3000)F2

14


Cost/Benefit Analysis

Plan options can then be assessed according to

• Costs to the archive directly, as well as the resources, knowledge and time of archive staff
• Benefits to future users which ease and facilitate re-use of data
• Risks: what are the risks inherent in the preservation strategies, and are they acceptable to the archive?
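As a rough illustration of how plan options might be compared on these three criteria, the sketch below drops any option carrying a risk the archive has judged unacceptable and orders the rest by benefit and then cost. The ordering rule, the numeric scales and all names are assumptions made for the example; the slides do not prescribe a scoring method.

```python
from dataclasses import dataclass, field


@dataclass
class PlanOption:
    name: str
    cost: float      # hypothetical units, e.g. staff days
    benefit: float   # hypothetical score agreed by the archive
    unacceptable_risks: list = field(default_factory=list)


def shortlist(options):
    """Drop options with unacceptable risks, then order the rest by
    benefit (high first) and cost (low first)."""
    acceptable = [o for o in options if not o.unacceptable_risks]
    return sorted(acceptable, key=lambda o: (-o.benefit, o.cost))


# Hypothetical plan options.
options = [
    PlanOption("A", cost=5, benefit=2),
    PlanOption("B", cost=20, benefit=4),
    PlanOption("C", cost=8, benefit=4),
    PlanOption("D", cost=1, benefit=5,
               unacceptable_risks=["single point of failure"]),
]
print([o.name for o in shortlist(options)])
```

In practice the final choice would remain a judgement for the archive; a ranking like this only organises the options for discussion.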

15


16


Sometimes solutions are very simple

17


IO1.1 New RepInfo

18


IO1.2 DEDSL dictionary

19


Ionosonde Complex Solutions

20

The second preservation scenario for the Ionosonde can only be carried out for 7 European stations but will allow a consistent Ionogram record for the Chilton site, which dates back to the 1920s. A user from a future designated community should be able to reproduce an Ionogram from the raw mmm/sao data files and have access to the Ionospheric Monitoring group's website, the URSI handbooks of interpretation and Lowell technical documentation.


Being able to preserve the Ionogram record is significant as it is a much richer source of information, more accurately able to convey the state of the atmosphere when correctly interpreted.

21


This Objective requires a separate AIP as the content is different

22


IO2.1 SAO Explorer

23


24

IO2.9 EAST description

We can use EAST as a backup strategy. While it is preferable to use the archived software, this solution is likely to fail eventually, and scientists can then refer to the EAST description to recreate the Ionogram.


IO2.2 &2.3 Documentation

25

The Network changes in reaction to shifts in the designated community. For example, if the Ionospheric Monitoring group is disbanded we can add a bibliography of their recommended texts.


IO2.4 IO2.5 Ionospheric Monitoring group website

26

Note we reuse solutions from the MST data set


IO1.3 Authenticity

27


MST Scenario 1

A user from a future designated user community should be able to extract the following information from the data for a given altitude and time

• Horizontal wind speed and direction
• Wind shear
• Signal Velocity
• Signal Power
• Aspect
• Correlated Spectral Width

28


29


MST Scenario 2

In addition future users should have access to User group notes, MST conference proceedings and peer reviewed literature published by previous data users.

MST Scenario 2 has a higher level preservation objective and can be considered an extension of Scenario 1, as the AIP information content is simply extended. The significance of this is that future data users will have access to important information which will help in studying the following types of phenomena captured within the data

• Precipitation
• Convection
• Gravity Waves
• Rossby Waves
• Mesoscale and Microscale Structures
• Fallstreak Clouds
• Ozone Layering

30


31


Modelling the Solution

32


Risks, Tolerances and Termination

33

Websites can still supply required information after loss of images

Tolerances can also be the differences between two objectives


NetCDF keeping the good

34

• NetCDF is a portable, self-describing binary data format, so it is ideal for the capture of provenance, descriptive and semantic information.
• NetCDF is network-transparent, meaning that it can be accessed by computers that store integers, characters and floating-point numbers in different ways. This provides some protection against technology obsolescence.
• NetCDF datasets can be read and written in a number of languages, including C, C++, FORTRAN, IDL, Python, Perl and Java. The spread of languages capable of reading these datasets ensures greater longevity of access, because as one language becomes obsolete the community can move to another.
• The different language implementations are freely available from UNIDATA, and NetCDF is completely and methodically documented in UNIDATA's NetCDF User's Guide, making capture of the necessary representation information a relatively easy, low cost option.
• Several groups have defined conventions for netCDF files, to enable the exchange of data. BADC has adopted the Climate and Forecasting (CF) conventions for netCDF data and have created a standard names
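To make the network-transparency point concrete, the sketch below reads the first bytes of a netCDF header using only the Python standard library. It relies on two facts about the file formats: classic-format files begin with the bytes "CDF" and a version byte followed by a big-endian 32-bit record count, and netCDF-4 files are HDF5 files that normally begin with the HDF5 signature. The function name and the synthetic sample bytes are illustrative assumptions; real work would normally go through one of the UNIDATA libraries.

```python
import struct

# Classic-format files start with b"CDF" plus a version byte; netCDF-4
# files are HDF5 files and normally start with the HDF5 signature.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"
CLASSIC_VERSIONS = {1: "classic", 2: "64-bit offset"}


def describe_header(header):
    """Identify the netCDF variant from the first bytes of a file.

    For the classic and 64-bit offset variants, also return the record
    count, which is stored as a big-endian 32-bit integer regardless of
    the byte order of the machine that wrote the file.
    """
    if header.startswith(HDF5_SIGNATURE):
        return ("netCDF-4 (HDF5)", None)
    if (len(header) >= 8 and header[:3] == b"CDF"
            and header[3] in CLASSIC_VERSIONS):
        (numrecs,) = struct.unpack(">I", header[4:8])
        return (CLASSIC_VERSIONS[header[3]], numrecs)
    return ("not netCDF", None)


# Synthetic header bytes for illustration, not taken from a real file.
sample = b"CDF\x01" + struct.pack(">I", 3)
print(describe_header(sample))
```

Because the byte order is fixed by the format rather than by the writing machine, the same few lines give the same answer on any platform, which is part of what makes the format a low-risk preservation choice.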


Solutions based on multiple strategies

35

Modelling networks facilitates the creation of labels in the registry, and identified risks/dependencies can be set up in the Knowledge/Gap manager


36

MST1.1 Meaningful reference to supporting organisations


MST1.2 GAP manager and NetCDF documentation

37


MST 1.4 CF standard names

38

Integrating the POM with standard community dissemination channels such as JISCMail


MST1.5 &1.6 Website

39

Archiving a website is about more than zipping up a downloaded version and can be supported in a number of ways


MST1.7 Research resulting from use of data

40

References need to be more than a standardised citation. They need to identify a repository which can be monitored.


MST1.8 MST International Workshop

41

Some materials need to be directly included in the AIP


MST1.9 User group minutes

42

Where repositories have finite funding, an accepted risk should be attached to the node in the network. This alerts an archive to a situation which needs to be monitored when AIPs are reviewed.
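Attaching accepted risks to nodes can be sketched as follows, so that a review of the AIP can list what needs monitoring. This is a hypothetical illustration: the `Node` class, its fields and the example node names are assumptions and do not reflect the actual network model or registry used in the testbed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    depends_on: list = field(default_factory=list)
    accepted_risks: list = field(default_factory=list)


def nodes_to_monitor(network):
    """Return (node name, risk) pairs to check when the AIP is reviewed."""
    return [(n.name, r) for n in network for r in n.accepted_risks]


# Hypothetical network with one externally hosted dependency.
network = [
    Node("data files"),
    Node("user group minutes", depends_on=["external repository"]),
    Node("external repository", accepted_risks=["finite funding"]),
]
print(nodes_to_monitor(network))
```

Recording the risk on the node itself, rather than in a separate document, keeps the warning next to the dependency it concerns.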


IO1.3 Authenticity

43


Testbed Aims

• Demonstrate complete solutions at different cost levels
• Produce an Analysis Methodology
• Produce Modelling Technique
• Produce preservation plans and a record of the decision making process which facilitate audit and review
• Produce exemplars and training materials which promote the adoption of tools
• Highlight organisational issues at STFC

44


Structured Management of Preservation Processes

45


46

Questions?