Day 4 Metadata Statistics Canada December 1st 2011 SIMPII – Workshop on Information Technology

Apr 01, 2015


Kolby Betterton
Transcript
Page 1: Day 4 Metadata Statistics Canada December 1st 2011 SIMPII – Workshop on Information Technology.

Day 4

Metadata

Statistics Canada

December 1st 2011

SIMPII – Workshop on Information Technology

23-04-11 Statistics Canada • Statistique Canada

Outline

• What is metadata?
• Standards
• Why is it important?
• Implementation example with Social Surveys Common Tools

What is metadata?

Definition: “Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource”*

*NISO (2004) Understanding Metadata. Bethesda, NISO Press

Describes content, quality, condition and other characteristics about data

What is metadata?

Metadata answers questions about your data:
• What is the concept?
• Where is the input source?
• What is it used for?
• When did it change?
• Who changed the variable last?

Helps to improve communication between:
• Data developers, data users and organizations
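The questions on this slide map naturally onto the fields of a metadata record. The following is a minimal sketch, assuming a hypothetical `VariableMetadata` class; the variable name and concept are borrowed from the Data Dictionary Tool output later in the deck, while the date, user id and usage text are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class VariableMetadata:
    """Hypothetical record holding the answers to the five questions."""
    name: str            # the variable being described
    concept: str         # What is the concept?
    input_source: str    # Where is the input source?
    used_for: str        # What is it used for?
    last_changed: date   # When did it change?
    changed_by: str      # Who changed the variable last?


def describe(meta: VariableMetadata) -> str:
    """Render the record as a one-line, human-readable summary."""
    return (
        f"{meta.name}: {meta.concept} "
        f"(source: {meta.input_source}; use: {meta.used_for}; "
        f"last changed {meta.last_changed.isoformat()} by {meta.changed_by})"
    )


example = VariableMetadata(
    name="CELL_03A",
    concept="Reasons to get a cell phone - Gift",
    input_source="Question CELL_Q03",
    used_for="illustration only",      # made-up value
    last_changed=date(2011, 12, 1),    # made-up date
    changed_by="analyst_1",            # made-up user id
)
print(describe(example))
```

Keeping the answers in one structured record is what lets developers, users and organizations read the same description instead of each keeping their own notes.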

Standards

Intended to establish a common understanding of the meaning or semantics of the data

As an example, at StatCan we use:
• DDI (Data Documentation Initiative): a standard for technical documentation describing social science data

Why is it important?

• Records basic information about your data
• Provides a common understanding of your data
• Allows for reuse during the Survey Development Life Cycle
• Facilitates connections between systems & services
• Supports archiving & preservation

Example

“dog” vs. “golden retriever puppy”

Clearly, this more specific search term is better. But it only works if someone has taken the time to associate the metadata.

Example

This puppy example illustrates not only the effectiveness of metadata but also the importance of tagging content with metadata.

If users don’t take the time to attach metadata when they create, upload, or edit documents, the benefits will be lost.

Metadata fields attached to a document (DOC):
• document name
• audience
• expiration date
• version
• department
• project
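The puppy example can be sketched in a few lines of code. This is only an illustration, assuming a hypothetical `Document` class that carries the fields named on the slide plus a free-text `tags` set (the tags and file names are invented); it shows that a specific search only finds documents someone took the time to tag.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A document plus the metadata fields named on the slide."""
    document_name: str
    audience: str = ""
    expiration_date: str = ""   # ISO date string, e.g. "2012-12-31"
    version: str = ""
    department: str = ""
    project: str = ""
    tags: set[str] = field(default_factory=set)  # free-text keywords (assumption)


def search(docs: list[Document], *terms: str) -> list[str]:
    """Return the names of documents whose tags contain every search term."""
    wanted = {t.lower() for t in terms}
    return [d.document_name for d in docs
            if wanted <= {t.lower() for t in d.tags}]


docs = [
    Document("photo_001", tags={"dog"}),
    Document("photo_002", tags={"dog", "golden retriever", "puppy"}),
    Document("photo_003"),  # nobody attached metadata, so no search can find it
]

print(search(docs, "dog"))                        # ['photo_001', 'photo_002']
print(search(docs, "golden retriever", "puppy"))  # ['photo_002']
```

The untagged `photo_003` never appears in any result, which is the point of the slide: the benefit is lost when metadata is not attached.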

Enterprise Metadata Classification

Common Tools Logo

Common Tools Technical Architecture

Solution Overview

Social Survey Metadata Environment (SSME)
• Supporting environment of a metadata-driven processing system

Interfaces are developed to access and manipulate the appropriate metadata in support of a particular business process:
• Questionnaire Development (QDT)
• Data Dictionary (DDT)
• Processing and Specifications (PST)
• Derived Variable (DVT)

Solution Overview

Social Survey Processing Environment (SSPE)
• A set of generalized processes that can be used in the processing activities of the Survey Life Cycle.

The purpose of these processes is to allow subject-matter and survey support staff to specify and run the processing of a survey in a timely fashion, with high-quality outputs.
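To make "generalized process" concrete, here is a minimal sketch of a metadata-driven edit check. It is not the SSPE implementation: the `VARIABLE_METADATA` structure and the `check_record` function are hypothetical, and the code list is taken from the Data Dictionary Tool output shown later in the deck. The point is that the rules live in metadata, so the same code can serve any survey.

```python
# Hypothetical variable metadata, as a data dictionary might hold it.
VARIABLE_METADATA = {
    "CELL_03A": {"length": 1, "valid_codes": {"1", "2", "6", "7", "8", "9"}},
}


def check_record(record: dict, metadata: dict) -> list:
    """Generic edit check: rules come from metadata, not survey-specific code."""
    problems = []
    for name, spec in metadata.items():
        value = record.get(name)
        if value is None:
            problems.append(f"{name}: missing")
        elif len(value) != spec["length"]:
            problems.append(
                f"{name}: expected length {spec['length']}, got {len(value)}"
            )
        elif value not in spec["valid_codes"]:
            problems.append(f"{name}: code {value!r} not in code list")
    return problems


print(check_record({"CELL_03A": "1"}, VARIABLE_METADATA))  # []
print(check_record({"CELL_03A": "5"}, VARIABLE_METADATA))
```

Adding a new survey variable then means adding a metadata entry rather than writing new processing code.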

Questionnaire Development Tool screenshot

Questionnaire Development Tool screenshot

QDT Auto-generated Report

CELL_Q03
For which of the following reasons did she get her cell phone?
Pour quelles raisons, parmi les suivantes, a-t-elle acquis son téléphone cellulaire?

INTERVIEWER: Read categories to respondent. Mark all that apply.
INTERVIEWEUR : Lisez les catégories au répondant. Choisissez toutes les réponses appropriées.

01  It was a gift / C'était un cadeau
02  In case of emergency / En cas d'urgence
03  Peer influence / Influence des pairs
04  Work requires it / Requis pour le travail
05  To browse the Internet / Pour naviguer Internet
06  To replace a regular landline phone / Pour remplacer un téléphone régulier
07  To replace another multimedia player / Pour remplacer un autre appareil multimédia
08  Other / Autres
    DK, RF / NSP, RF
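A report like this one can be generated from stored question metadata rather than typed by hand. The sketch below assumes a hypothetical dictionary layout for a question (it is not the QDT's actual data model) and uses a shortened category list taken from the report.

```python
# Hypothetical storage layout for one bilingual question.
QUESTION = {
    "name": "CELL_Q03",
    "text": {
        "en": "For which of the following reasons did she get her cell phone?",
        "fr": "Pour quelles raisons, parmi les suivantes, "
              "a-t-elle acquis son téléphone cellulaire?",
    },
    "categories": [  # (code, English label, French label)
        ("01", "It was a gift", "C'était un cadeau"),
        ("02", "In case of emergency", "En cas d'urgence"),
        ("08", "Other", "Autres"),
    ],
}


def render_report(question: dict) -> str:
    """Build a plain-text bilingual report from the question metadata."""
    lines = [
        question["name"],
        question["text"]["en"],
        question["text"]["fr"],
        "",
    ]
    for code, en, fr in question["categories"]:
        lines.append(f"{code}  {en} / {fr}")
    return "\n".join(lines)


print(render_report(QUESTION))
```

Because both languages come from the same record, a change to a category label is made once and every generated report picks it up.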

Processing Specifications Tool

Processing Specifications Tool

Processing Specifications Tool

Data Dictionary Tool output

Variable Name: CELL_03A
Length: 1
Position: 5
Question Name: CELL_Q03
Concept: Reasons to get a cell phone – Gift
Question: For which of the following reasons did you get your cell phone? – Gift
Universe: Respondents who answered CELL_1=1

Code   Answer Categories   Frequencies   Population     %
1      Yes                      22,345    4,746,561    17
2      No                      108,655   23,080,670    82
6      Valid skip                  950      201,801     1
7      Don’t know                    3          637     0
8      Refusal                       1          212     0
9      Not Stated                    5        1,062     0
       Total                   131,959   28,030,943   100
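The totals and percentages in this output can be reproduced from the rows themselves. The sketch below assumes the % column is the share of the weighted population, rounded to a whole number; the slide does not say so, although the unweighted frequencies happen to give the same rounded values.

```python
# (code, answer category, frequency, weighted population) from the slide.
ROWS = [
    ("1", "Yes", 22_345, 4_746_561),
    ("2", "No", 108_655, 23_080_670),
    ("6", "Valid skip", 950, 201_801),
    ("7", "Don't know", 3, 637),
    ("8", "Refusal", 1, 212),
    ("9", "Not Stated", 5, 1_062),
]

total_freq = sum(freq for _, _, freq, _ in ROWS)
total_pop = sum(pop for _, _, _, pop in ROWS)

# Assumption: % = share of the weighted population, rounded to an integer.
percents = [round(100 * pop / total_pop) for _, _, _, pop in ROWS]

for (code, label, freq, pop), pct in zip(ROWS, percents):
    print(f"{code}  {label:<12} {freq:>9,} {pop:>12,} {pct:>4}")
print(f"   {'Total':<12} {total_freq:>9,} {total_pop:>12,} {sum(percents):>4}")
```

Run as written, the computed totals match the Total row on the slide (131,959 and 28,030,943).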

Common Tools Entity Relationship Diagram

Common Tools Portal

SDMX

Statistical Data and Metadata eXchange (born in 2002)
- Standardization for statistical data and metadata access and exchange
- Between NSOs and international organizations
- Within a national statistical system
- Within an organization
- For dissemination

Sponsors: BIS, ECB, EUROSTAT, IMF, OECD, UN, World Bank

1) Technical standards (v1: ISO 17369)
- XML-based message formats (SDMX-ML)
- GESMES and the UN/EDIFACT-based message formats
- Guidelines for SDMX web service implementations
- SDMX registry specification (“yellow pages”)

2) SDMX Content-Oriented Guidelines
- Statistical subject-matter domains (to locate data and working groups)
- Cross-domain concepts/code lists (incl. metadata concepts, mapping if difficult to agree)
- Metadata common vocabulary (terminology)
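The "mapping if difficult to agree" idea for cross-domain code lists can be illustrated with a small translation table. This is a sketch only: the shared codes below are modelled loosely on a sex code list and should be treated as illustrative, and the local codes and the `to_shared` function are hypothetical.

```python
# Shared (cross-domain style) code list; codes are illustrative.
SHARED_SEX_CODES = {"M": "Male", "F": "Female", "U": "Unknown"}

# Hypothetical local codes used by one survey, mapped to the shared list.
LOCAL_TO_SHARED = {"1": "M", "2": "F", "9": "U"}


def to_shared(local_code: str) -> str:
    """Translate a local code; fail loudly when no mapping was agreed."""
    try:
        shared = LOCAL_TO_SHARED[local_code]
    except KeyError:
        raise ValueError(f"no mapping for local code {local_code!r}") from None
    if shared not in SHARED_SEX_CODES:
        raise ValueError(f"mapping target {shared!r} is not in the shared list")
    return shared


print(to_shared("1"), SHARED_SEX_CODES[to_shared("1")])  # M Male
```

An organization can keep its own codes internally and still exchange data, as long as the mapping to the shared list is maintained alongside the metadata.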

SDMX Plans for Statistics Canada

• Create SDMX-ML outputs from CANSIM
• Investigate OECD implementation of SDMX using .STAT software
• Participate in Statistical network: Innovation in dissemination, machine-to-machine transfer stream with Stats New Zealand and the Australian Bureau of Statistics
• Investigate implementation of SDMX Reference Infrastructure from Eurostat

Conclusion

• Communication is key to collaboration
• Helps decision making
• Reduces system and data redundancy
• Enables enterprise-wide application development

Jean Labbé
Field IT Manager
Statistical Information System Division
Informatics Branch
(613) [email protected]

Xie xie (thank you)