Meta-data Resources in Europe: within NSIs and from Dosis Projects

Wilfried Grossmann

Department of Statistics and Decision Support Systems

University Vienna

29.3.2000 Metadata Resources in Europe 2

Contents

Introduction

Contents of Meta-data

IT Structures for Meta-data

Processing Meta-data

Conclusions

Introduction

Continuing hot topics in the meta-data discussion

Content-orientation versus IT-orientation

There is a lack of communication between these two groups

Introduction

Meta-data providers versus meta-data users

Who provides which type of information for whom?

Contents of Meta-data

What kind of objects should be documented?

Basic statistical structures: Variables, Values, Data sets

____________________

Statistical output, Statistical systems, Statistical processing

Contents of Meta-data

Approaches towards meta-data content

The template oriented approach

The data warehouse approach

The process oriented approach

Contents of Meta-data The template oriented approach

Templates defined by a number of working groups

For micro data and data sets: DDI, Dublin Core

For (economic) macro data: OECD, IMF, ECE (Internet)

Contents of Meta-data The template oriented approach

The OECD Template:

Concepts and sources

Data Collection

Data manipulation by national source

Data quality

Data Transmission

International Standards

Data Storage and Manipulation by OECD

Output preparation and delivery by OECD

Contents of Meta-data The template oriented approach

The IMF Template:

Coverage

Periodicity

Timeliness

Quality of disseminated data

Integrity of disseminated data

Access by the public

Contents of Meta-data The template oriented approach

Although the OECD approach seems more reliable from a statistical point of view, the IMF template is favoured at the moment by international organisations (EUROSTAT)

Contents of Meta-data The warehouse approach

Integration of the data inside the NSIs in a data warehouse

Output and dissemination as first step

Meta-data are oriented towards the needs of the data warehouse

Contents of Meta-data The warehouse approach

Projects in this direction in many NSIs

Best documentation: Australian Office

Definitional meta-data

Procedural meta-data

Operational meta-data

Systems meta-data

Datasets meta-data

Contents of Meta-data The process oriented approach

Combines statistical and IT considerations

Statistical data are considered not as final products but as the result of a process chain

More detailed consideration of statistical terminology

Contents of Meta-data The process oriented approach

Starting point was the SCB-DOC model (Rosen and Sundgren, 1991)

• A sequence of templates accompanying the statistical production process

• Ongoing activities at Statistics Sweden

• A number of NSIs want to adopt the model

Contents of Meta-data The process oriented approach

The IDARESA model

Object oriented representation based on SCB-DOC with emphasis on possible semi-automatic processing

Contents of Meta-data The process oriented approach

The US Bureau of the Census model

(Gillman, Appel et al., ongoing project):

Statistical system defined as an identifiable process .... to produce one or more deliverables

Contents of Meta-data Summary

The process oriented approach seems favourable for a number of reasons

Two Examples:

Classification servers

Data Quality

Contents of Meta-data Summary: Classification server

A classification server should

Support unified use of terminology inside NSIs or international organisations

Support harmonisation between (international) standard classifications and locally defined (adapted) classifications

Contents of Meta-data Summary: Classification server

Requirements for a classification server

• A data base supporting easy and user friendly manipulation of hierarchy trees

• A mapping tool supporting the definition of correspondence tables between classifications

• A management strategy for implementation
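The first two requirements can be illustrated with a minimal sketch. It assumes a classification is stored as a parent-pointer hierarchy tree and a correspondence table as a many-to-many mapping between codes. The class names `Classification` and `CorrespondenceTable`, their methods, and all codes are hypothetical and are not taken from any of the servers mentioned in this talk.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Classification:
    """A classification held as a hierarchy tree (each code maps to its parent)."""
    name: str
    parent_of: Dict[str, Optional[str]] = field(default_factory=dict)

    def add(self, code: str, parent: Optional[str] = None) -> None:
        # A parent must already exist, so the structure stays a tree.
        if parent is not None and parent not in self.parent_of:
            raise KeyError(f"unknown parent code: {parent}")
        self.parent_of[code] = parent

    def ancestors(self, code: str) -> List[str]:
        """Codes on the path from the given code's parent up to the root."""
        path = []
        current = self.parent_of[code]
        while current is not None:
            path.append(current)
            current = self.parent_of[current]
        return path


@dataclass
class CorrespondenceTable:
    """Links codes of a source classification to codes of a target classification."""
    source: Classification
    target: Classification
    links: Dict[str, Set[str]] = field(default_factory=dict)

    def link(self, source_code: str, target_code: str) -> None:
        # Only codes that exist in their classification may be linked.
        if source_code not in self.source.parent_of:
            raise KeyError(f"unknown source code: {source_code}")
        if target_code not in self.target.parent_of:
            raise KeyError(f"unknown target code: {target_code}")
        self.links.setdefault(source_code, set()).add(target_code)

    def translate(self, source_code: str) -> Set[str]:
        """Target codes corresponding to a source code (empty set if none)."""
        return set(self.links.get(source_code, set()))


# Hypothetical codes, for illustration only.
standard = Classification("standard")
standard.add("A")
standard.add("A1", parent="A")
standard.add("A2", parent="A")

local = Classification("local")
local.add("X")
local.add("X10", parent="X")

table = CorrespondenceTable(source=local, target=standard)
table.link("X10", "A1")
table.link("X10", "A2")
```

Here a locally defined code maps to two codes of the standard classification, which is the kind of one-to-many correspondence a mapping tool would have to handle. The third requirement, a management strategy, is organisational and is not covered by the sketch.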

Contents of Meta-data Summary: Classification server

Up to now there are only a few successful implementations, and only for partial solutions

EUROSTAT (SIMONE-Server)

New Zealand

Contents of Meta-data Summary: Data Quality

Criteria for the quality of statistics are well known (relevance, accuracy, timeliness, accessibility, comparability, coherence, completeness)

The problem

• Achieve quality in the production process

• Document quality by appropriate meta-data

Contents of Meta-data Summary: Data Quality

Experience shows that the quality of documentation is rather poor as soon as documentation is separated from the production process

Example of an integration project: the SIDI approach by ISTAT

IT Structures for Meta-data

The Internet and data warehouses offer new opportunities for

Meta-data and data repositories

Meta-data access and exchange

This leads towards a more open policy in data dissemination

IT Structures for Meta-data Meta-data repositories

Approaches towards repositories

The thesaurus approach

The template oriented approach

The Data Warehouse oriented approach

IT Structures for Meta-data Meta-data repositories

Example of a thesaurus oriented approach

EUROSTAT servers for concepts and definitions

• Advantage: available on the Internet

• Problem: Navigation not so easy

IT Structures for Meta-data Meta-data repositories

• Contents

– Descriptions (dictionaries)

– Semantic (coverage, standard classifications, coherence of information)

– Administration (responsible persons)

– Selection (keywords, search facilities)

IT Structures for Meta-data Meta-data repositories

Example of the template oriented approach

StatBase: supporting access to meta-data as well as data and reports

• Meets the requirements of the OECD data template quite well

• No direct connection between data and meta-data

IT Structures for Meta-data Meta-data repositories

Example of the warehouse oriented approach

StatLine (CBS): Based on data access from multidimensional tables (cubes)

• Accompanying meta-information is only in Dutch

• Extraction of special meta-data items is not as easy as in StatBase

IT Structures for Meta-data Meta-data access and exchange

Ongoing work in access and exchange

New Standards for access and exchange

Accessing distributed sources

Combination of information

IT Structures for Meta-data Meta-data access and exchange

Current trends in standardization

• Traditional standards for data and meta-data exchange like GESMES or CLASET will probably switch to an XML platform.

• New standards from the Object Management Group (OMG)

IT Structures for Meta-data Meta-data access and exchange

Example MOF (Meta Object Facility)

– Extensible Framework for meta-data model definition

– Programming interface for storage and access of meta-data

– Integration facilities across domains

But note: This is a general approach for warehouses, not necessarily tied to statistics

IT Structures for Meta-data Meta-data access and exchange

Example of accessing and processing distributed sources

ADDSIA: Accessing and processing distributed sources for analysis purposes

• Minimum requirements for standardisation in advance

• Orientation towards statistical problems

Processing Meta-data

Goal: Data and meta-data are processed together

<OldDataSets, OldMetadataSets> → <NewData, NewMetadata>
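As a rough illustration of this goal, the sketch below applies one operation to a data set and its meta-data in a single step, so that the documentation is produced by the same call that produces the data. The function `select_rows`, the dictionary layout of the meta-data and the sample survey are assumptions made for this example and do not come from any of the projects discussed here.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
Metadata = Dict[str, Any]


def select_rows(
    data: List[Row],
    metadata: Metadata,
    variable: str,
    keep: Callable[[Any], bool],
    description: str,
) -> Tuple[List[Row], Metadata]:
    """Filter the data and return the matching meta-data in the same call."""
    new_data = [row for row in data if keep(row[variable])]

    # Copy the old meta-data, then update only what the operation changed.
    new_metadata = dict(metadata)
    new_metadata["record_count"] = len(new_data)
    new_metadata["history"] = list(metadata.get("history", [])) + [
        f"selection: {description}"
    ]
    return new_data, new_metadata


# Hypothetical input pair <OldData, OldMetadata>.
old_data = [
    {"region": "north", "income": 120},
    {"region": "south", "income": 80},
    {"region": "north", "income": 95},
]
old_metadata = {
    "title": "hypothetical income survey",
    "record_count": 3,
    "history": [],
}

# Output pair <NewData, NewMetadata>.
new_data, new_metadata = select_rows(
    old_data, old_metadata, "income", lambda v: v >= 90, "income >= 90"
)
```

The old meta-data are left untouched, so both pairs remain available for documentation of the process chain.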

Processing Meta-data

Advantages:

Reduction of documentation effort

More consistency in meta-data

Requirements:

Software tools supporting this view

Operational models for meta-data

Processing Meta-data

Up to now there are only prototypes, with emphasis on different aspects of processing

The planning approach

The throughput approach

The transformation approach

Processing Meta-data The planning approach

Develop software tools (workbench) for setting up meta-data documentation

BRIDGE/IMIM:

A desktop for planning surveys and statistical production

Meta-data generated in the planning phase are managed by the system

No data are processed

Processing Meta-data The planning approach

Improvement and adaptation of meta-data models for new tasks like quality and use of administrative sources

SIDI (Statistics Italy):

Integration of quality in the statistical production process

Standardization of the production process

Processing Meta-data The throughput approach

Use as much meta-data as possible from OldMeta-data to obtain NewMeta-data

CBS (ongoing work):

Use BLAISE meta-data as input

Produce StatLine meta-data as output

Processing Meta-data The transformation approach

Define meta-data algorithms for all types of data algorithms

Throughput meta-data

Modified meta-data

New meta-data

Meta-data summarization
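The idea of pairing each data algorithm with a meta-data algorithm can be sketched for one operation, aggregation by sum. The slide only names the four kinds of result, so the reading used here is an assumption: throughput items are passed on unchanged, modified items are rewritten, new items are created by the operation, and summarization condenses record-level notes. The function `aggregate_sum`, the field names and the sample records are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
Metadata = Dict[str, Any]


def aggregate_sum(
    data: List[Row], metadata: Metadata, group_by: str, measure: str
) -> Tuple[List[Row], Metadata]:
    """Data algorithm (sum per group) together with its meta-data algorithm."""
    totals: Dict[Any, float] = defaultdict(float)
    for row in data:
        totals[row[group_by]] += row[measure]
    new_data = [{group_by: key, measure: total} for key, total in sorted(totals.items())]

    new_metadata = {
        # Throughput meta-data: passed on unchanged.
        "source": metadata["source"],
        # Modified meta-data: the statistical unit changes with aggregation.
        "unit": f"{group_by} (aggregated from {metadata['unit']})",
        # New meta-data: created by the operation itself.
        "aggregation": f"sum of {measure} by {group_by}",
        # Meta-data summarization: distinct record-level flags, condensed.
        "flags": sorted({row["flag"] for row in data if row.get("flag")}),
    }
    return new_data, new_metadata


# Hypothetical input, for illustration only.
data = [
    {"region": "north", "turnover": 10.0, "flag": "estimated"},
    {"region": "north", "turnover": 5.0, "flag": None},
    {"region": "south", "turnover": 7.0, "flag": None},
]
metadata = {"source": "hypothetical business survey", "unit": "enterprise"}

new_data, new_metadata = aggregate_sum(data, metadata, "region", "turnover")
```

A full implementation would need one such meta-data rule for every type of data operation, which is the requirement stated on the slide.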

Processing Meta-data The transformation approach

IDARESA project

Meta-data algorithms for elementary data base operations

ISMIS

Identification of added value in meta-data (new meta-data)

Pursuit of the production process inside EUROSTAT

Processing Meta-data The transformation approach

[Figure: process chain diagram. Each box pairs a data set with its meta-data: Input data 1 to 7 with Input meta-data 1 to 7, Interim data 1 to 5 with Interim meta-data 1 to 5, and Output data with Output meta-data.]

Conclusions

Is there progress in meta-data research and development?

Yes, but rather slowly, because:

There is a lack of co-ordination in research

(Probably improved by a forthcoming meta-data working group)

There is an information gap between meta-data research groups and NSIs

NSIs seem to prefer their own solutions
